In its recent report on the OfS, the Lords Industry and Regulators Committee called for the Department for Education Data Reduction Taskforce to be reconvened.
This recommendation – which is addressed directly to the DfE rather than the OfS – echoes recommendations made by Universities UK in its recent study on regulatory burden and by GuildHE in its recent Regulation Briefing Series.
The issue of data burden seems to have been on the agenda for ever and yet it feels no closer to resolution now than it was thirty years ago. How did we get here, and what would it take to make real lasting progress on this perpetual problem?
A brief history of data burden
Funders and regulators have always used data to inform funding and policy; universities have always complained about the burden of supplying this data. In the early 1990s a senior vice chancellor told me that setting up HESA as a sector-owned agency would enable the sector to push back on “all these wretched demands for data”. In the 2000s the debate around data burden was wrapped up in the broader issue of regulatory burden as HEFCE ran a series of three Accountability Reviews. These were followed by the Better Regulation Task Force and then the HE Data & Information Improvement Programme (HEDIIP), which launched ten years ago.
Despite all these concordats, programmes, frameworks and agreements, expectations of the value that can be derived from data have increased relentlessly – and so too has the demand for granular, timely, high-quality data.
What drives data burden?
Although many initiatives have sought to address the problem of data burden, there remains very little consensus about the specific drivers of burden and how it varies. Earlier this year I launched the Data Burden Project to analyse the lifecycle of activities that institutions go through when making data submissions to funders and regulators. The main areas of activity are:
- Understanding the reporting requirements and preparing systems and processes
- Data capture and processing
- Making the data submissions to data collectors
- Reconciliation of data submissions with other data sources
- Engaging with funding and regulatory metrics
Within each of these broad areas the specific tasks involved were analysed and the nature of the burden was assessed.
The study found that overall data burden has increased significantly in recent years. This has been driven by an increase in the use of funding and regulatory metrics and by a general increase in complexity across all of the data interactions. This in turn increases demands on data systems and on the data professionals working in the sector.
The analysis of tasks found that data burden is not significantly driven by the size of the institution. For each task, the burden is either a fixed amount or an amount that varies according to the complexity of the institution’s data and processes and how well they fit the external data model and algorithms.
It follows that burden reduction initiatives by individual funders and regulators can only have a marginal impact on the burden experienced by each institution, and that this impact will often be difficult to predict and unevenly distributed. Because burden does not correlate with institution size, smaller institutions are disproportionately burdened relative to their scale.
The final conclusion of the project is that the duplication of data collections across HE is a far more significant problem than the burden associated with any individual data collector.
What needs to happen?
Despite so many initiatives to address the problem, the Gordian Knot of data burden remains. I think there are three key elements that need to be put in place to make a real and lasting reduction in data burden.
There is a need to standardise the data definitions used by funders and regulators across the sector. Across the range of returns we have to make, we often find that the same concepts have frustratingly different definitions. Jim Dickinson recently explored the myriad different definitions of “full-time” that we have to contend with; OfS alone has two different definitions – one for B3 metrics and one for funding calculations. The requirement for institutions to simultaneously map their internal data to all these different definitions adds layers of complexity and cost to the data submission processes. It also creates a whole new category of burden when institutions are asked to explain why their submissions to different bodies don’t appear to match.
And there is scope to rationalise the number of data collections that run. Funders and regulators often repeat the mantra “collect once, use many times”, but the extent to which data is shared and reused remains frustratingly low. Institutions have to engage separately with each collection: separate portals, user accounts, submission processes, validation checks and sign-off processes, each one running to a slightly different timetable.
These first two elements are essentially technical: standardising and rationalising the machinery of data collections. The third necessary element plays to a different set of rules.
HE providers in England have to submit data to many silos of funding and regulation: OfS, ESFA for apprentices, SLC for student finance (similar collections of acronyms exist in other parts of the UK). In each funding/regulatory silo, the data machinery serves a funding/regulatory framework operated by a funding/regulatory (arms-length) body, which in turn operates within a set of legislative rules and policy objectives. Each level in each silo is optimised to serve the requirements of the level above it. The extent to which data collections are duplicated and unstandardised is perhaps an inevitable consequence of a lack of standardisation and coherence across the funding and regulatory landscape more generally.
If we are to make progress on standardising and rationalising the data machinery, there needs to be a way to pursue that goal without having to rebuild the entire funding and regulation system. The funding and regulatory bodies will not do this voluntarily because they would have to cede some control of their own data machinery; previous attempts at leadership, oversight and governance in both the HE and FE sectors have failed due to a lack of hard mandate in this respect.
Therefore, there needs to be some kind of independent, authoritative entity that can steer and mandate change across this data landscape. This would require political will: a commitment from Mr Halfon – or maybe Mr Western – to support a real and lasting solution to the age-old problem of data burden.
Without this commitment, we could be having this debate for the next thirty years.