Could your data end up in the dock?

The recent decision by the High Court allowing the Office for Students to publish its registration refusal for Barking and Dagenham College shines a bright light on the way that the regulator uses data for the assessment and monitoring of providers on the register.

At the heart of the OfS decision to refuse registration were three data indicators covering student continuation and completion rates, qualification outcomes and graduate employment. OfS decided that the college performed poorly in these indicators and refused registration under Condition B3 Quality (student outcomes). The college contended that OfS gave little or no weight to contextual factors, and that the detail of the algorithms used to create these metrics was not shared with the college until after the decision to refuse registration had been taken.

The actual decision of the court focused entirely on whether the OfS should delay publication in order that a full appeal could be heard. The judgement was that such “interim relief” would not be available, primarily because it was felt that current students had a “right to know” the status of their provider. But we also get to see some of the arguments that would underpin a full appeal, and these were primarily focused on the quality and applicability of data used in OfS’ decision-making process.

Hard knocks

The new regulatory framework has a critical dependence on data and metrics and, as this case demonstrates, hard judgements can and will be based on hard metrics. This represents a real shift from the way Hefce used data. While on the surface the funding algorithms provided a direct line between the data and the decision, the reality was softened by the ability of Hefce to retrospectively adjust funding should the initial data hypothesis prove to be incorrect or unwarranted. The (negative) adjustment of £36 million made to London Metropolitan University’s grant in 2009 is probably the most infamous example of this.

The new framework – and the 2017 legislation under which it operates – has a very different perspective on data and I think there are three key issues that the Barking and Dagenham case surfaces.

Who collects?

First is the fact that OfS depends on other bodies to define, collect and quality assure the data it uses for these regulatory decisions. As a further education college, the data for Barking and Dagenham will have been collected by the Education and Skills Funding Agency (ESFA) via its Individualised Learner Record (ILR). The ILR is the further education sector’s equivalent of the HESA Student Record, and it has been designed primarily to describe FE-level activity and to meet the needs of the FE funding and regulatory structures.

I have written previously about the messy complexities of data around the HE/FE borders and the challenges of creating meaningful bridges across these divisions. OfS (and Hefce before it) invests significant effort in attempting to create fair and robust comparisons between the two sectors – and this case is likely to reignite the questions about the extent to which these different data sources can be brought together in this way.

Responsibilities

The second issue is one of structures and responsibilities for sector-level data operations. Higher education stands alone in having its main dataset collected by something other than a government body. The establishment of HESA as the sector-owned UK data body in 1992 followed the precedent of the Universities Statistical Record, and made sense in a realignment of the sector that saw the binary divide between universities and polytechnics replaced by a splitting of HE funding and regulation across the four administrations of the UK.

The 1992 legislation was virtually silent on data issues and the services HESA provided to its statutory customers across the UK were largely based on custom and practice and always subject to some element of negotiation. The 2017 HERA says a lot about data, creating the role of the Designated Data Body (which is HESA) and setting into statute the powers that OfS has over data collections and the onward use of data.

The centrality of data to the OfS regulatory mechanism – including the fact that regulatory decisions based on data can be challenged in the courts like this – means that OfS has to exercise a degree of control and oversight of the data collections that goes significantly beyond what we have seen in the previous 25 years.

Feel the algorithms

The final issue sits somewhere between transparency and complexity. The data processing that goes into the plethora of OfS metrics is extensive and complex. Not only does OfS have a responsibility to get these sums right (something that, in my experience over the past year, has been problematic), but higher education providers also need to develop the capability and capacity to engage meaningfully with these algorithms, given their critical role in the regulation.
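To give a flavour of what “engaging with the algorithms” involves, here is a deliberately minimal, hypothetical sketch of a continuation-style indicator. The field names, exclusion rule and threshold are all invented for illustration — the real OfS specifications for cohort definitions, exclusions and benchmarking run to many pages and are far more complex than this.

```python
# Hypothetical sketch of a continuation-style indicator.
# Field names, the exclusion rule and the threshold are invented
# for illustration -- the real OfS algorithms are far more complex.

def continuation_indicator(students, threshold=0.85):
    """Share of an entry cohort still active one year on,
    and whether that share clears a (made-up) threshold."""
    # Exclude records outside the measured population (e.g. dormant
    # or transferred-out students) -- the real rules are extensive.
    cohort = [s for s in students if not s.get("excluded", False)]
    if not cohort:
        return None, False
    continued = sum(1 for s in cohort if s["active_year_2"])
    rate = continued / len(cohort)
    return rate, rate >= threshold

# Toy cohort of four records, one of which is excluded.
students = [
    {"active_year_2": True},
    {"active_year_2": True},
    {"active_year_2": False},
    {"active_year_2": True, "excluded": True},
]
rate, meets = continuation_indicator(students)
print(round(rate, 2), meets)  # 0.67 False for this toy cohort
```

Even in this toy form, the point is visible: which records are excluded, and where the threshold sits, can flip the outcome — which is exactly why providers need early sight of the detail.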

In the Barking and Dagenham College case, the failure of the OfS to disclose the detail of the algorithms at a suitably early stage in the registration process arguably places the provider at a distinct disadvantage.

The situation for Barking and Dagenham College was further complicated by the fact that only a portion of its HE provision fell within the scope of the OfS regulatory framework; the college submitted to the court that the OfS metrics appeared to cover only that portion of its total HE provision. While that approach seems rational from the perspective of the OfS, it adds to the opaqueness of the metrics and builds in further complexity for the provider to untangle.

Much has been said and written over the past couple of years about data-driven regulation under the OfS and this case surfaces what it means in practice for OfS, for HESA and for HE providers. Your data could end up in the dock.
