A straightforward question: how do you tell which institution is which? Easy enough, you may think. Every institution has a unique name. Ah – would that it were so! Institutions change names surprisingly often, to reflect regional changes, aspirations, mergers, changes in status, or marketing whims. While this is all fun and games for some, it can cause endless trouble for a wonk looking to collate a time series.
In their ineffable wisdom, the university and college bits of the four devolved governments got their heads together as the UK Register of Learning Providers and allocated every institution they knew about a UK Provider Reference Number (UKPRN). This is similar to, but distinct from, the DfE's Edubase, which covers the compulsory sector, FE and HE in England and Wales, and from a whole bunch of other identifiers like URNs, Independent Schools Council Number…
But UKPRN is the primary identifier used in HE. Or it should be. And no matter how the name of an institution changes, the number remains the same.
HESA also uses INSTIDs, for internal reasons best known to itself. But every INSTID maps to one, and only one, UKPRN – and HESA nicely demonstrates how not even a merger of institutions can disrupt the strength and smoothness of this system. HESA publishes UKPRNs with its datasets too. Well played, HESA.
If you want to be King Wonk, this (on data.ac.uk – kudos to Chris Gutteridge for some sterling data-wonk work) is the dataset you’ll be wanting to map everything else to. It includes, alongside the UKPRN, a bunch of other information: mission groups, map coordinates, Wikipedia pages. Because the UKPRN is the common identifier, you could map data from, say, HESA or HEFCE to the codes and quickly sort things by mission group or whatever else you fancy.
This is possible because when HEFCE and HESA publish anything that involves a list of institutions and data, they publish it as a spreadsheet or .csv file and use UKPRNs to identify institutions.
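For the curious, here is a minimal sketch of that kind of mapping in Python, using nothing beyond the standard library. Everything in it is an assumption made for illustration: the column names (ukprn, mission_group, students), the institution names and the numbers are all invented, and the real downloads from data.ac.uk, HESA or HEFCE will be laid out differently. The point is only that a shared UKPRN column turns the join into a dictionary lookup.

```python
import csv
import io

# Invented stand-ins for two downloads. Column names and values are
# assumptions for illustration, not the layout of any real file.
LOOKUP_CSV = """ukprn,name,mission_group
10000001,University of Example,Group A
10000002,Sample Metropolitan University,Group B
"""

STATS_CSV = """UKPRN,students
10000002,21000
10000001,15500
10000003,900
"""


def index_by_ukprn(text, key):
    """Read CSV text into a dict keyed by the UKPRN column."""
    return {row[key].strip(): row for row in csv.DictReader(io.StringIO(text))}


def join_on_ukprn(stats_text, lookup_text):
    """Attach lookup fields to each stats row; collect UKPRNs with no match."""
    lookup = index_by_ukprn(lookup_text, "ukprn")
    joined, unmatched = [], []
    for row in csv.DictReader(io.StringIO(stats_text)):
        ukprn = row["UKPRN"].strip()
        match = lookup.get(ukprn)
        if match is None:
            # Keep a note of anything missing from the lookup instead of
            # silently dropping it.
            unmatched.append(ukprn)
            continue
        joined.append({
            "ukprn": ukprn,
            "name": match["name"],
            "mission_group": match["mission_group"],
            "students": int(row["students"]),
        })
    return joined, unmatched


joined, unmatched = join_on_ukprn(STATS_CSV, LOOKUP_CSV)
by_group = sorted(joined, key=lambda r: r["mission_group"])
for r in by_group:
    print(r["mission_group"], r["name"], r["students"])
```

UKPRNs are kept as strings throughout, so a spreadsheet's habit of reformatting numbers has less room to cause mischief, and the unmatched list tells you which institutions the lookup did not know about.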
QAA also uses UKPRNs as identifiers within the database behind its institutional report pages – look at the end of the web address (URL) of each institutional page. But because QAA doesn’t publish this information as a .csv, you have to get into screen scraping to get information into a form you can play with. Seriously, QAA – give us a spreadsheet.
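If you do end up scraping, pulling the UKPRN off the end of a web address is the easy part. The sketch below assumes a UKPRN is an eight-digit number beginning with 1, and the example URLs are hypothetical (example.org, not QAA's real address scheme), so check the pattern against the pages you are actually looking at.

```python
import re
from urllib.parse import urlparse, parse_qs

# Assumption: a UKPRN is eight digits and starts with 1.
UKPRN_PATTERN = re.compile(r"1\d{7}")


def ukprn_from_url(url):
    """Return the UKPRN found at the end of a URL, or None if there is none."""
    parsed = urlparse(url)
    # Candidate 1: the last segment of the path, e.g. /provider/10000002
    candidates = [parsed.path.rstrip("/").rsplit("/", 1)[-1]]
    # Candidate 2: any query-string value, e.g. ?UKPRN=10000001
    for values in parse_qs(parsed.query).values():
        candidates.extend(values)
    for candidate in candidates:
        if UKPRN_PATTERN.fullmatch(candidate):
            return candidate
    return None


# Hypothetical addresses, for illustration only.
print(ukprn_from_url("https://reports.example.org/provider?UKPRN=10000001"))
print(ukprn_from_url("https://reports.example.org/provider/10000002/"))
print(ukprn_from_url("https://reports.example.org/about"))
```

Using fullmatch rather than a loose search means a longer number that merely contains eight digits will not be mistaken for a UKPRN.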
Unistats is too cool for school. It does use UKPRNs, and offers its data up as XML and through a documented API. For those of us who don’t speak nerd, that means that you can talk to its datasets from within your computer program over the internet. This offers amazing possibilities if you can code, but you haven’t learnt to code and you probably never will.
UCAS has always used its own special institutional codes, which are not UKPRNs, for reasons lost in the mists of time. It does offer .csv downloads now, after a long time offering only .pdfs (which were completely useless) – but it doesn’t map these to UKPRNs, even though this would be trivial. If you need to, you can do it via data.ac.uk, but it’s just more hassle that you don’t want.
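To show quite how trivial, here is a rough sketch of the extra step in Python. It assumes you have already got hold of a lookup table with one column of UCAS institution codes and one of UKPRNs. The column names (ucas_code, ukprn, inst_code, applications) and every code and figure below are invented for illustration, and are not the real layout of the UCAS or data.ac.uk files.

```python
import csv
import io

# Invented crosswalk and UCAS-style download; column names and values
# are assumptions for illustration only.
CROSSWALK_CSV = """ucas_code,ukprn
X01,10000001
X02,10000002
"""

UCAS_CSV = """inst_code,applications
x02 ,5400
X01,3100
X99,120
"""


def load_crosswalk(text):
    """Build a dict from normalised UCAS code to UKPRN."""
    return {
        row["ucas_code"].strip().upper(): row["ukprn"].strip()
        for row in csv.DictReader(io.StringIO(text))
    }


def add_ukprn(ucas_text, crosswalk):
    """Return the UCAS rows with a ukprn column added ('' when unknown)."""
    rows = []
    for row in csv.DictReader(io.StringIO(ucas_text)):
        # Normalise case and stray spaces before looking the code up.
        code = row["inst_code"].strip().upper()
        rows.append({**row, "inst_code": code, "ukprn": crosswalk.get(code, "")})
    return rows


rows = add_ukprn(UCAS_CSV, load_crosswalk(CROSSWALK_CSV))
for row in rows:
    print(row["inst_code"], row["ukprn"] or "(no UKPRN found)", row["applications"])
```

Codes with no match come back with an empty UKPRN rather than vanishing, so you can see exactly which institutions still need chasing by hand.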
And then we come to gov.uk, which published the TEF eligibility list as a pdf with no identifiers at all. Slow handclap, gov.uk. We wrote about that in more depth here.
In conclusion, if you are publishing something that is a list of institutions, or you are developing something that you might one day publish that is a list of institutions: include the UKPRN. And do not, on any account, publish it as a pdf.