This article is more than 5 years old

Changing behaviours across the data landscape

The work of the Data Landscape Steering Group aims to address duplication and lack of coherence in HE data collections. Simon Gaskell, their Chair, introduces us to supply and demand side codes of practice.

Simon Gaskell is Chair of the Quality Assurance Agency.

The journey to create a more effective and less-burdensome HE data landscape has been underway for a number of years now – ever since the Regulatory Partnership Group (RPG) commissioned the first studies in this area following the 2011 White Paper ‘Students at the Heart of the System’.

Those early RPG studies identified a lack of oversight and leadership as the cause of many inconsistent and duplicative behaviours of the various data collectors. While much of the work undertaken since then has focused on issues like the standardisation and rationalisation of data flows, the fundamental need has been for a change in behaviours.

The Data Landscape Steering Group has been established to provide an overview and leadership of ideas across the data landscape, and to encourage a change in behaviour in HE data collections. The group has developed and consulted on a set of behaviours, enshrined in codes of practice, that should address those issues of duplication and a lack of coherence in data collections.


There are two codes of practice – one for the demand-side (the data collectors) and one for the supply-side (the HE providers that are supplying data). The supply-side code of practice is based on one that was developed by the HE funding councils across the UK for the data collections that they run, either directly or through HESA. It is based on three principles: honesty, impartiality and rigour, and has been expanded to apply to all data returns that HE providers make.

The supply-side code of practice states that the data submitted by HE providers should genuinely reflect the characteristics and events being reported on, and that the providers should act in a transparent manner and not withhold information. The preparation of data should be impartial and objective, and should follow repeatable, documented processes.


The demand-side code of practice reflects those principles of honesty, impartiality and rigour back to the data collectors and sets out expected behaviours in the management and governance of data collections.

Data collectors should be honest and open about the intended uses for data and should respond to scrutiny of collection requests. Data requests should be defined objectively and aim to offer the best evidence for the stated use. Data definitions should align with relevant published standards and any duplication of an existing data collection must be supported by a strong rationale. Data collectors should be able to justify requests from an informed position concerning the impact on data suppliers.

The demand-side code of practice requires the assessment of burden, and a methodology has been developed to support this. The methodology provides a framework for assessing the impact of data requests and aims to create a better shared understanding of the issues that contribute to the overall burden.

What does it all mean?

With so many independent and autonomous organisations collecting data, the Data Landscape Steering Group does not have (and does not seek) the sort of hard authority that could mandate a change in behaviours. What we do have, through the development and publication of these codes of practice, is an opportunity for organisations on both the demand and supply sides of data collection to create a new compact that will improve the management and governance of data assets as they flow across the sector. For HE providers, this means ensuring that the high standards applied to HESA and funding council returns are extended to all external data returns. For data collectors, this means reviewing processes against the code of practice and taking actions to address deficiencies. Engagement in a comprehensive and mature conversation about burden is a responsibility that falls on us all.

The building blocks of a new data landscape are starting to fall into place; more technical developments like the HECoS subject coding system and the HESA Data Futures project need to be aligned with a change in behaviours around the governance and management of HE data flows. The increasing focus on data-driven funding and regulation makes this work more important than ever.

One response to “Changing behaviours across the data landscape”

  1. I can certainly agree with the sentiment and principles of the Codes of Practice, on both sides, but the reality is quite different.

    HEIs, with fairly minimal input, are assigned fields that we have a statutory requirement to supply. The burden is on HEIs to source that data, store it, and ensure its quality. We have to meet HESA’s hard-set timeline and their coding structure and their validation. The cost and resources to meet these demands are all on the HEI. Failure to comply means not only losing HEFCE funding, what little there is, but also being barred from Student Loans. On top of this, HEIs then PAY HESA for this privilege: an annual subscription fee to give them data, in the way they want it and on time, or else a substantial fine for late submission. Any queries regarding the quality of data are directed at the HEI to resolve to HESA and HEFCE’s satisfaction. The implication is that failure to satisfy these masters draws an audit down on the HEI, something which costs the HEI more time and money than it does HEFCE.

    HESA then, as far as I’m aware, sell this data to the government. They also can sell this to any third party interested in the data. Possibly for league tables, possibly to fast-food restaurants who want to know where students eat. As far as I know, no validation or quality assurance is carried out on their work. How could it be? They didn’t create the data so they can hardly be held responsible for it.

    Now we have Data Futures. A sensible move to reduce duplication of data. However, so far this data landscape has chosen to ignore the ITT return, and also the new ILR-required Higher Degree Apprenticeship returns. It has also deemed a number of other returns and external demands as “outside of scope”. In fact, it is almost identical to the current HESA Student Return. Except for a few minor changes.

    An increase in the number of fields to be supplied. And the requirement to supply dates to these fields, effectively tripling large sections of the data provided. It also completely changes the framework, terms of reference, field names and structure of the entire return, while largely keeping the same information. There’s also the expectation to supply this data on a near-quarterly basis, presumably because HESA’s data demanders want the data now, and as they’re paying for it I’m not surprised.

    HESA no doubt has to adapt to all these changes but as they are the leaders in this change they seem to have a good seat prepared. Meanwhile, every single HEI must bear the cost and resource drain to meet these demands with absolutely zero funding.

    So despite the data being more timely, this process seems to have the exact opposite effect of reducing the burden of submitting data. It seems to have also multiplied the number of data demanders. Also, none of the changes seem to serve students’ interests at all. It is effectively the same information and students already have access to it. There seems to be no benefit to the internal processes of HEIs at all. Therefore I can only conclude it serves external processes, such as ensuring that HESA monopolises HE data for their own profit.
