David Kernohan is Deputy Editor of Wonkhe

Just before the excitement of conference season, your local Facebook group lost its collective mind. And it shows no sign of calming down.

Given everything else that is going on, you’d think that reinforcing the joins between key government data sources and giving more visibility to the subjects of public data would be the kind of nerdy thing that the likes of me write about.

But no. Somebody used the secret code word. ID Cards.

Who is she and what is she to you?

I’ve written before about the problems our government faces in reliably identifying people. Any entitlement- or permission-based system needs a clear and unambiguous way of assuring the state that a person is indeed who they claim to be, and has the attributes or documentation they claim to have.

As a nation, we are astonishingly bad at this. Any moderately serious interaction with the state requires a parade of paperwork – your passport, driving licence, birth certificate, bank statement, bank card, degree certificate, and two recent utility bills showing your name and address. Just witness the furore over voter ID – to be clear, a pointless idea aimed at solving a problem that the UK has never faced – and the wild collection of things that you might be allowed to pull out of your voting day pocket that do not include a student ID.

We are not immune from this problem in higher education. I’ve been asking for years why you need to apply to a university via UCAS, and apply for funding via the Student Loans Company, via two different systems. It has also never been clear to me why you then need to submit largely similar information to your university when you enrol.

Sun sign

Given that organs of the state have this amount of your personal information, it is then alarming that the only way they can work out what you earn after graduating is by either asking you directly (Graduate Outcomes) or by seeing if anyone with your name, domicile, and date of birth turns up in the HMRC database.

That latter one – administrative matching – is illustrative of the government’s current approach to identity. If it can find enough likely matches of personal information in multiple government databases it can decide (with a high degree of confidence) that records refer to the same person.

That’s how they make LEO data. They look for National Insurance Number (NINO), forename, surname, date of birth, postcode, and sex in both HESA student records and the Department for Work and Pensions’ Customer Information System (which itself links to the tax database). Keen Wonkhe readers will have spotted that NINO isn’t returned to HESA – to get this they use “fuzzy matching” with personal data from the Student Loans Company, which does hold it. The surname thing is even wilder – they use a sound-based algorithm (SOUNDEX) to allow for flexibility on spellings.
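For the curious, here is roughly what that looks like in practice. This is a minimal sketch and emphatically not the actual LEO matching code: the `soundex` function follows the standard American Soundex rules, while the `Person` record, the `agreement_score` function, and the threshold of five agreeing fields are all assumptions made up for illustration. The real process uses its own rules about which combinations of fields count as a match.

```python
from dataclasses import dataclass

# Standard American Soundex letter groups.
_CODES = {}
for _letters, _digit in (
    ("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
    ("L", "4"), ("MN", "5"), ("R", "6"),
):
    for _ch in _letters:
        _CODES[_ch] = _digit


def soundex(name: str) -> str:
    """Return the four-character Soundex code, or "" if there are no A-Z letters."""
    letters = [c for c in name.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    encoded = letters[0]
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not break up a run of the same code
        code = _CODES.get(ch, "")
        if code and code != prev:
            encoded += code
        prev = code  # vowels reset the run, so a repeated code is kept
    return (encoded + "000")[:4]


@dataclass(frozen=True)
class Person:
    """Hypothetical record holding the fields named in the article."""
    forename: str
    surname: str
    date_of_birth: str  # ISO format, e.g. "1999-04-01"
    postcode: str
    sex: str
    nino: str = ""  # often missing on one side


def _tidy_postcode(postcode: str) -> str:
    return postcode.replace(" ", "").upper()


def agreement_score(a: Person, b: Person) -> int:
    """Count how many of the six fields agree (assumed, simplified rule)."""
    checks = [
        bool(a.nino) and a.nino == b.nino,
        a.forename.strip().lower() == b.forename.strip().lower(),
        soundex(a.surname) != "" and soundex(a.surname) == soundex(b.surname),
        a.date_of_birth == b.date_of_birth,
        _tidy_postcode(a.postcode) == _tidy_postcode(b.postcode),
        a.sex == b.sex,
    ]
    return sum(checks)


def is_probable_match(a: Person, b: Person, threshold: int = 5) -> bool:
    """Treat two records as the same person if enough fields agree."""
    return agreement_score(a, b) >= threshold


# Invented example records: no NINO on the student side, surname spelled differently.
student = Person("Catherine", "Smyth", "1999-04-01", "L1 8JQ", "F")
taxpayer = Person("Catherine", "Smith", "1999-04-01", "l18jq", "F", nino="QQ123456C")
stranger = Person("Karen", "Jones", "1998-01-01", "M1 1AA", "F")
```

Here `is_probable_match(student, taxpayer)` is true even though the NINO is missing and the surname is spelled two ways, because “Smyth” and “Smith” share a Soundex code. The `stranger` record agrees on sex alone and is rejected. A plain count like this treats every field as equally informative, which is one reason a real linkage exercise needs something more careful.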

This kind of nonsense actually has a match rate of more than 90 per cent (though this is lower for ethnically Chinese graduates because sometimes forenames and surnames can switch depending on the cultural knowledge of whoever prepared the data).

It’s impressive as a piece of data engineering. But given that all of this information was collected and stored by arms of the same government it is really quite poor.

The tale of the student ID

Another higher education example. If you were ever a student you had a student ID. It was printed on your student card, and may have turned up on various official documents too. Perhaps you imagined that every student in the UK had a student number, and that there was some kind of logic to the way that they were created, and that there was a canonical national list. You would be wrong.

Back in the day, this would have been a HESA ID, itself created from your UCAS number and your year of entry (or your year of entry, HESA provider ID, and an internal reference number if you applied directly). Until just a few years ago, the non-UCAS alternative was in use for all students – even including the use of the old HESA provider ID rather than the more commonly used UKPRN. Why the move away from UCAS? Well, UCAS had changed how they did identifiers and HESA’s systems couldn’t cope.

You’re expecting me to say that things are far more sensible now, but no. They are not. HESA has finally fixed the UKPRN issue within a new student ID field (SID). This otherwise replicates the old system but with one important difference: it is not persistent.

Under the old approach, the idea was you had one student number for life – if you did an undergraduate degree at Liverpool, a master’s at Manchester Met, and a PhD at Royal Holloway these were all mapped to the same ID. There was even a lookup service for new providers if the student didn’t have their old number. I probably don’t even need to tell you why this is a good idea if you are interested – in policy terms – in the paths that students take within their careers in higher education. These days we just administratively match if we need to. Or – as in LEO – assume that the last thing a student studied was the key to or cause of their glittering (or otherwise) career.

The case of the LLE

Now I hear what you might be thinking. These are pretty terrible examples, but they are just bodges – workarounds for bad decisions made in the distant past. But we have the chance to get it right in the next couple of years.

The design of the Lifelong Learning Entitlement means that the government needs tight and reliable information about who does what bit of learning in order that funds can be appropriately allocated. So you’d think that there would be a rock-solid, portable, unique learner number underpinning everything.

There is not. Instead, we appear to be standardising on the Student Loans Company customer reference number. This is supposed to be portable for life, but it doesn’t appear in any other sector datasets (the “student support number” is in HESA, but that is somehow different – you get two identifiers from SLC, lucky you). SLC also holds your NINO (you need one to get funding!), and has capacity to hold one additional number of an institution’s choice, but not (routinely) your HESA student ID or your UCAS identifier.

There’s also space to add a Unique Learner Number (ULN) but at this stage I’m too depressed to go into what a missed opportunity that is.

Why is standardising on a customer reference number not a good idea? Well, think of all the data SLC doesn’t hold but HESA does. Think about being able to refer easily back to a school career and forward into working life on various government data. Think about how it is HESA data and not SLC data that underpins LEO. Think about the palaver I have described above and ask yourself why you wouldn’t fix it when you had the opportunity.

Learning to love Big Brother

I’ll be frank, I’m not crazy about how much the government knows about me – but honestly compared to people like Google, Meta, or – yikes – X (formerly Twitter) it doesn’t hugely worry me.

I’ve been a No2ID zealot in my past (any employee of those three companies could tell you that) but these days I am resigned to the fact that people need to know who I am, and I’d rather be more than 95 per cent confident that they could get it right.

I’m no fan of filling in forms, but I am a fan of streamlined and intelligent administration.

So why do we need ID cards? Simply because in proper countries we don’t need to go through stuff like this every time we want to know if a person that pays tax and a person that went to university are the same person. Because the current state of the art is a mess.
