This article is more than 2 years old

Time to get serious about data burden

As yet another task force gets to work on the eternal problem of data burden, David Kernohan offers some advice from history.

David Kernohan is Deputy Editor of Wonkhe

Periodically, the time comes when someone in a position of authority says that it is time to sort all this stuff out – this “stuff” being the multiplication of data burden on higher education providers.

As boundaries between sectors have blurred, the rationale for separate regulatory data requirements for higher education, further education, apprenticeships and the rest of the skills sector has become less clear. Any one of the many providers that deliver qualifications between level 3 and level 9 on more than one of these routes will be making an unreasonable number of data returns to an unreasonable number of people.

Where the data goes, no-one knows

What does that mean? We’ve covered this before on Wonkhe, and I take no pleasure in pointing out that this is still the case:

  • For higher education, information about students, staff, and finances goes to HESA…
  • … apart from some detail on student numbers and financial projections that goes direct to OfS.
  • For non-HE provision – there are 14 Individualised Learner Record (ILR) returns to the Education and Skills Funding Agency.
  • For students getting funding via the Student Loans Company, there are attendance returns to be made there too.
  • And there is data on RPAs to be returned to UCAS every year.
  • Some apprenticeships data now goes to OfS.
  • And then there’s stuff that goes to Ofqual.
  • Research councils and other funders expect data.
  • And let us not forget specific returns for medical, nursing, and teacher training courses.
  • And the Home Office.
  • The buff copies go to accounting, the fuchsia ones go to purchasing, the goldenrod ones go to Roz.

You would perhaps assume that even though the destinations and coverage may be different, the core definitions and protocols attached to data would be similar enough to let you reuse stuff. You would be wrong. There are many ways to count students.

It is a huge mess, and additions to the system – not least the spectre of module-level data funded by the Lifelong Loan Entitlement – are only going to make things more complex.

A taskforce assembles

The standard response to any issue in English higher education these days is for Michelle Donelan to hold a summit – or, for really complex issues, to convene a taskforce. Rejoicing in the name “HE Data Reduction Taskforce”, a group of stakeholders from the organisations named above and experts is charged with rationalising the current mess.

It’s not the first time an attempt has been made to tidy things up – you can see parallels in the MIAP programme in the 00s or in the more consensual bottom-up approach employed by HEDIIP (kicked off in the Students at the Heart of the System White Paper). Data professionals do get sentimental about HEDIIP, but we also have to reckon with the fact that it clearly didn’t work in quite the way that was hoped.

The proposals were decent – after identifying an astonishing number (97!) of data returns in higher education the report made all the right noises about rationalisation. HEDIIP proposed overarching governance for data collectors (to avoid duplication and near-duplication), alignment of external requirements with internal provider data needs, common data standards (no more bespoke definitions for basic stuff like student numbers!), and more data-sharing between data collectors.

Data router

The most eye-catching provisions, and the ones – arguably – that have caused most of the trouble, were the ideas that HESA (later to become the Designated Data Body) would be the standard data collector of first resort in most cases, and that we’d see a shift to in-year data collections. The difference between HEDIIP and this new group is that the latter has burden reduction, rather than tidiness, as a remit.

HEDIIP won hearts and minds, but it didn’t really result in meaningful change. It was there at a top level, but despite the White Paper backing it didn’t really have the ability to immediately address practice.

Data Futures, which is what we are left with after the hope around HEDIIP evaporated, was supposed to meet the needs of all data users, rather than just one. And if you spotted the omission in HEDIIP – where was the stuff about reducing burden? – that continued through the various iterations of Futures.

But the Taskforce is explicitly rooted in burden reduction. It’s short – some would say perilously short at just seven meetings in six months – and all of the key players, plus some great sector representatives, are on board from the start. It has the ear of the Minister. What could go wrong?

Things that could go wrong

There is a risk that scope is too tight. Any conversation about the future data landscape needs to be rooted in the reality of data collection – and this needs to be focused on the new reality that LLE will bring about in 2025. We need a decision on tertiary sector identifiers, agreement on regulatory data needs (outcomes data metrics from individual modules? Good luck with that), and it all needs to be a lot more responsive.

The temptation for these groups is to fix the thing that is in front of them – the current state of sector data collection. But if that is set to change – and it is – even the best-intentioned streamlining tweaks would just be adding burden in the interim. We need clarity on the post-2025 end point before we spend all of 2023-24 replumbing. The sector cost of Data Futures delays and reworks is already adding up – so let’s try to nip it in the bud here.

One of the things that HEDIIP struggled with was balancing power relations – the landscape report even suggested, hopefully, that:

Some organisations may have to concede their own preferences for the greater benefit of the HE sector

A noble sentiment, but some agencies can ask for concessions while others are only able to make them. The place of OfS (and indeed ESFA) at the centre of sector data is problematic – and serves to highlight that not all data users are equal. We have a Designated Data Body for higher education in England that should be leading on all this stuff on behalf of the sector – other agencies should be reusing what the DDB collects.

Sections 64 and 66 of HERA provide this statutory link only for the Secretary of State, UKRI, and the Office for Students, and Section 65 grants OfS power to make decisions over what is collected and how it is collected. Granting the DDB more power to make its own decisions – and making some regulatory recognition that the DDB has multiple customers that need to be considered in collection design – would be hugely helpful here.

In particular, the task force needs to include representation from the devolved regulators. There’s been some movement away from UK-wide (and indeed international) standards and expectations in the quality space – it would be a shame if the reality of the UK sector was torn asunder by an England-centric datagrab… and cross-border recruitment means that this would inevitably add burden anyway.

Everybody should be able to use the same, standard, sector collections – so this taskforce needs to clamp down hard on innovations and special cases, and on anomalies between reporting standards. Again I see a DDB role here (especially given the actuality of HESA’s pan-UK work) but it would require certain regulators to let go of areas of the data landscape they may be keener to hang on to.

4 responses to “Time to get serious about data burden”

  1. The usual reminder that ESFA also collect HE information via the 14 ILR returns for degree apprenticeships (thus duplicating the HESA collection of HE provision)

  2. Reducing burden is easy. Only ask institutions to return what they collect as part of their normal business. Do away with FUNDCOMP, MODOUT, STULOAD, ELQ, Highest Qual on entry. Ask us to return module results, progression decisions, basis of entry. Massive reduction in burden immediately.

  3. Just picked up on this piece David – we were very used to seeing plenty of data requests (including FOIs) and benchmark surveys/ 3rd party return requests going direct to HEIs and thus multiplying effort and increasing burden again – avoiding HESA aka not incurring cost! The DDB being the single provider of de-facto authoritative versions (note the “s”) of the many (if not all) HE truths is the way to go.
