A straightforward question: how do you tell which institution is which? Easy enough, you may think. Every institution has a unique name. Ah – would that it were so! Institutions change names surprisingly often, to reflect regional changes, aspirations, mergers, changes in status, or marketing whims. While this is all fun and games for some, it can cause endless trouble for a wonk looking to collate a time series.
In their ineffable wisdom, the university and college bits of the four devolved governments got their heads together as the UK Register of Learning Providers and allocated every institution they knew about a Provider Reference Number (UKPRN). This is similar to, but different from, the DfE Edubase, which covers the compulsory sector, FE and HE in England and Wales, and to a whole bunch of other identifiers like URNs, Independent Schools Council Number…
But UKPRN is the primary identifier used in HE. Or it should be. And no matter how the name of an institution changes, the number remains the same.
HESA also uses INSTIDs, for internal reasons best known to itself. But every INSTID maps to one, and only one, UKPRN – and HESA nicely demonstrates how not even a merger of institutions can disrupt the strength and smoothness of this system. HESA publishes UKPRNs with its datasets too. Well played, HESA.
If you want to be King Wonk, this (on data.ac.uk – kudos to Chris Gutteridge for some sterling data-wonk work) is the dataset you’ll be wanting to map everything else to. It includes, alongside the UKPRN, a bunch of other information: mission groups, map coordinates, Wikipedia pages. Because the UKPRN is the common identifier, you could map data from, say, HESA or HEFCE to the codes and quickly sort things by mission group or whatever else you fancy.
This is possible because when HEFCE and HESA publish anything that involves a list of institutions and data, they publish it as a spreadsheet or .csv file and use UKPRNs to identify institutions.
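For anyone who wants to see what that mapping looks like in practice, here is a minimal sketch in Python. It assumes two CSV files that both carry a UKPRN column; the column names (Name, Group, Provider, Students) and every value in the sample data are invented for illustration and are not taken from any real HESA, HEFCE or data.ac.uk file.

```python
import csv
import io

# Hypothetical stand-ins for two published files. The column names and
# every value below are invented for illustration only.
LOOKUP_CSV = """UKPRN,Name,Group
10000001,University of Example,Group A
10000002,Sample Metropolitan University,Group B
"""

FIGURES_CSV = """UKPRN,Provider,Students
10000002,Sample Met,12000
10000001,Example University,8000
10000003,Unlisted College,500
"""


def read_by_ukprn(text):
    """Index the rows of a CSV by the value in their UKPRN column."""
    reader = csv.DictReader(io.StringIO(text))
    return {row["UKPRN"].strip(): row for row in reader}


def join_on_ukprn(lookup_text, figures_text):
    """Attach the lookup's Group column to each figures row with a matching UKPRN.

    Returns the joined rows plus the UKPRNs that had no match in the lookup.
    """
    lookup = read_by_ukprn(lookup_text)
    joined, unmatched = [], []
    for row in csv.DictReader(io.StringIO(figures_text)):
        key = row["UKPRN"].strip()
        if key in lookup:
            joined.append({**row, "Group": lookup[key]["Group"]})
        else:
            unmatched.append(key)
    return joined, unmatched


joined, unmatched = join_on_ukprn(LOOKUP_CSV, FIGURES_CSV)
for row in joined:
    print(row["UKPRN"], row["Provider"], row["Students"], row["Group"])
print("No match in lookup:", unmatched)
```

Note that the two files spell the provider names differently and the join still works, because only the number is compared. The unmatched list is worth keeping: it shows you which providers your lookup has not caught up with.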
QAA also uses UKPRNs as identifiers within the database that serves the pages with institutional reports on – look at the end of the web address (URL) from each institutional page. But because QAA doesn’t publish this information as a .csv, you have to get into screen scraping to get information into a form you can play with. Seriously, QAA – give us a spreadsheet.
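If you do end up scraping, pulling the identifier off the end of a web address is the easy part. The sketch below is only an illustration: the example.org addresses are made up, the real QAA URL layout is not reproduced here, and the code assumes a UKPRN is an eight-digit number sitting either in the last path segment or in a query-string value.

```python
import re
from urllib.parse import parse_qs, urlparse

# Assumption: a UKPRN is written as exactly eight digits.
UKPRN_PATTERN = re.compile(r"\d{8}")


def ukprn_from_url(url):
    """Return the first eight-digit identifier found at the end of a URL, or None."""
    parsed = urlparse(url)
    # Try the last path segment first, then any query-string values.
    candidates = [parsed.path.rstrip("/").rsplit("/", 1)[-1]]
    for values in parse_qs(parsed.query).values():
        candidates.extend(values)
    for candidate in candidates:
        if UKPRN_PATTERN.fullmatch(candidate):
            return candidate
    return None


# Made-up addresses, purely to show the two layouts the function handles.
print(ukprn_from_url("https://example.org/reports/provider/10000001"))
print(ukprn_from_url("https://example.org/reports/provider?UKPRN=10000002"))
print(ukprn_from_url("https://example.org/about"))
```

Once every scraped page has its number attached, the rest of the data can be matched to other sources in the usual way.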
Unistats is too cool for school. It does use UKPRNs and offers its data up as XML and a documented API. For those of us who don’t speak nerd, that means that you can talk to their datasets from within your computer program over the internet. This offers amazing possibilities if you can code, but you haven’t learnt to code and you probably never will.
UCAS has always used its own special institutional codes, which are not UKPRNs, for reasons lost in the mists of time. It does offer .csv downloads now, after a long time offering only .pdfs (which were completely useless) – but it doesn’t map these to UKPRNs even though this would be trivial. If you need to, you can do it via data.ac.uk, but it’s just more hassle that you don’t want.
And then we come to gov.uk which published the TEF eligibility list as a pdf with no identifiers at all. Slow handclap, gov.uk. We wrote about that in more depth here.
In conclusion, if you are publishing something that is a list of institutions, or you are developing something that you might one day publish that is a list of institutions: include the UKPRN. And do not, on any account, publish it as a pdf.
I used to maintain my own lookup/mapping list of HESA codes in the days before UKPRN – the biggest problem was keeping it up to date. How can the sector be confident the list is up to date?
UKRLP are covered by FoI, so a list can be got from them if you can wait (they’ve told me in the past they plan to publish a list, but it’s still not there). HESA aren’t covered by FoI, but did follow through on their agreement to publish the quarterly update they get from UKRLP (link on the same page as the Unistats dataset download) – albeit that they sometimes have to be reminded to put up the latest version. The best source is probably the Unistats dataset itself – all providers and regularly kept up to date by them – though you need the skills to extract it from the XML.
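The extraction step needs less skill than it sounds. As a rough sketch only: the element names below (PROVIDERS, PROVIDER, UKPRN, NAME) and the sample values are assumptions made up for this example, not the documented Unistats schema, so check the real file and adjust the tag names before relying on it.

```python
import xml.etree.ElementTree as ET

# Invented sample; the real dataset's element names may well differ.
SAMPLE_XML = """<PROVIDERS>
  <PROVIDER><UKPRN>10000001</UKPRN><NAME>University of Example</NAME></PROVIDER>
  <PROVIDER><UKPRN>10000002</UKPRN><NAME>Sample Metropolitan University</NAME></PROVIDER>
  <PROVIDER><NAME>Record With No Identifier</NAME></PROVIDER>
</PROVIDERS>"""


def providers_from_xml(xml_text, record_tag="PROVIDER"):
    """Build a {UKPRN: name} lookup from every record element in the XML."""
    root = ET.fromstring(xml_text)
    providers = {}
    for record in root.iter(record_tag):
        ukprn = record.findtext("UKPRN")
        if ukprn:  # skip records that carry no identifier
            providers[ukprn.strip()] = (record.findtext("NAME") or "").strip()
    return providers


for ukprn, name in sorted(providers_from_xml(SAMPLE_XML).items()):
    print(ukprn, name)
```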
UCAS should be put out of our misery.
“UCAS should be put out of our misery.” Hear hear!
If we’re being ambitious, let’s get the Home Office in on this.
One of my favourite pastimes is checking the Tier 4 register of sponsors. The Home Office publish this pretty much every day as there are constant little changes to it. Except they never publish an exceptions report, so you can see the totals shift, but you have to go through 1300 names on 73 pages of a PDF to spot who might have moved on or off. They don’t publish an identifier of any sort, so if you’re trying to do a historical comparison, you have to contend with both major and minor name changes. They don’t even bother to suppress the ‘The’ at the start of names – there are 10 universities listed under ‘The’ (which can get periodically exciting when universities such as Cambridge aren’t where you’d expect them to be).
This is probably one of the longer lists of HE providers published, and it would be nice to link it into the others. But you don’t even get a link through the HEFCE register; that just sends you off to look through the 73-page PDF.
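Without an identifier, the best you can do with a register like this is compare names after tidying them up. Here is a minimal sketch of that workaround, assuming the names have already been pulled out of two versions of the PDF into plain lists; the names in the example are invented. It copes with a stray ‘The’, capitalisation and punctuation, but a genuine rename will still show up as one removal plus one addition, which is exactly the problem an identifier would solve.

```python
import re


def normalise_name(name):
    """Lower-case, collapse punctuation and spaces, and drop a leading 'The'."""
    cleaned = re.sub(r"[^a-z0-9]+", " ", name.lower()).strip()
    return re.sub(r"^the ", "", cleaned)


def compare_snapshots(old_names, new_names):
    """Return (added, removed) names between two snapshots of a register."""
    old = {normalise_name(n): n for n in old_names}
    new = {normalise_name(n): n for n in new_names}
    added = sorted(new[key] for key in new.keys() - old.keys())
    removed = sorted(old[key] for key in old.keys() - new.keys())
    return added, removed


# Invented names, standing in for two days' worth of the register.
yesterday = ["The University of Example", "Sample College", "Old Town Institute"]
today = ["University of Example", "Sample College.", "New Town Academy"]

added, removed = compare_snapshots(yesterday, today)
print("Added:", added)
print("Removed:", removed)
```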
Mike, you’re totally right to call out the Home Office on this too.
This is a really difficult area and, while the UKRLP is a significant step forward, it doesn’t work for every case.
When UKRLP was originally created we agonised about exactly what it was identifying. We landed on a UK legal entity as the thing that can get a UKPRN – this works for many cases, but not all. Sometimes applications or research grants are made to things that are some sort of joint endeavour between two organisations. In the research space collaboration with non-UK organisations is quite normal, and there might be a need to identify a department or institute that is part of a larger legal entity.
UKPRNs are, as the article suggests, basically fab; but the diverse and complex nature of reality often means that one approach cannot solve all problems. In this respect UKPRN is a fine example of the broader challenges we all face in data.
Would love you to pitch a HECoS article – that stuff is next-level wonkery and is going to change everything starting in 2019 when JACS finally dies…
Thanks David – good thought… there is an interesting story to tell, I think…
And of course the SLC also have their own codes, which seem designed to maintain the pre-92 divide, as they still use PO as the prefix for a number of the post-92s.
Following the earlier comment about the challenges in this space regarding research data (what is an organisation, for the purposes of research administration?), there is a very useful study here that I was involved with a couple of years ago: https://jisccasraipilot.jiscinvolve.org/wp/2015/03/06/organisational-identifiers-working-group-outputs-and-update/
The question of “what is an organisation?” is almost as interesting as “What is a Course?” 😉
Worth noting that UCAS has been publishing data files in a csv format since 2012 (and excel downloads going back at least a decade before that). We do collect UKPRN data but mapping it isn’t a trivial exercise. We’re happy to discuss if there is significant demand.
What is the difference between PUBUKPRN and UKPRN exactly?
I came across this after trying to find a method of obtaining a list of PUBUKPRNs. All the API calls on HESA use this as a root identifier, so obtaining a list of PUBUKPRNs is the route into a goodie box of application possibilities.
However, it is locked away – see https://www.ukrlp.co.uk/ – only those with a PUBUKPRN are allowed access.
https://www.whatdotheyknow.com/request/list_of_all_the_registered_train – was a request for the list under freedom of information, but they came back first with something he hadn’t asked for and then with website links, which again don’t provide the information.
This therefore prevents any business, individual, charity etc from using the API on HESA fully – and my cynical point of view is that actually it is for a select few.
Yeah, I’ve never been able to find a complete canonical list either. It would be hugely useful. For my purposes, institutions that return to HESA are the majority of the subset I use; the others I just look up manually. But it should and could be easier than this to do it properly.
I managed to find a beta service on the GOV.UK site at:
https://get-information-schools.service.gov.uk/Establishments/Search?tok=8TQILM6z
You can download information for 51,091 establishments which includes their UKPRN numbers.
Oh – good spot Leon!