This article is more than 8 years old

Beginning a revolution in data collection

Andy Youell writes on what may be the best opportunity in years to address data gathering in higher education: a new, far-reaching consultation from HESA, which is conducting a fundamental rethink of the sector's relationship with data.

Andy Youell is Executive Director: Regulation at UCEM

Thursday 1 December 1994 was a significant milestone in the world of HE data. At the stroke of midnight an employee of the West London Institute of HE sent the first ever student return to the HESA servers in Cheltenham. For the first time the sector had a UK-wide, post binary-divide data and information system.

In 1994 submitting data to HESA could be done on a floppy disk, a CD-ROM, half-inch magnetic tape or, as in this case, using something called File Transfer Protocol (FTP) across the new-fangled internet. The machine at HESA which crunched all of this data had a massive (by 1994 standards) 26 gigabytes of storage; the mobile phone in your pocket today probably has 32 or 64 gigabytes.

Much has changed since the early 90s; the data collection process has evolved to deliver a world-class data quality assurance system and to provide a far more user-friendly interface for institutions to manage the submission process. However, the fundamental aspects of HESA data collections have not changed a great deal since the 1990s. Data still goes through a number of stages of quality assurance and a failure at any stage sends the institution back to the start of the submission process.

All of the data about a student must be submitted in one go; everything from entry qualifications and course information through to modules studied and qualifications awarded. Nowadays making the annual HESA student return involves a massive burst of activity at institutions during September and October, much of it processing data that institutions captured during the registration process twelve months earlier. By modern standards the fundamentals of the data collection process remain inefficient and slow, with the first release of student data from the annual HESA collection happening in January, around 15 months after the students registered.

HESA has embarked on a programme to develop a more efficient approach to data collection that delivers more frequent and timely data. The vision involves both process changes and a significant technology upgrade. HESA has launched a sector-wide consultation on these proposals in order to assess the level of support for this vision and to gain a better understanding of the challenges that lie ahead.

The HE information landscape now involves a myriad of organisations (93 we think) that collect student data from the sector every year. HEDIIP conducted an in-depth study into the requirements and operations of these organisations in 2014, which confirmed that there is significant scope for the rationalisation of these collections. Access to timely data is a key requirement for these data-collecting organisations. If we are to achieve the rationalisation of data collections that is fundamental to the revised architecture set out in the HEDIIP New Landscape report, then HESA – working with the sector – needs to create this new approach to data collections.

I am struck by the extent to which this aligns with the broader conversations around efficiency that have been taking place across the sector over the past few years. The current information landscape is chaotic, inefficient and ineffective; the vision that is set out by HEDIIP and HESA promotes efficiency and effectiveness at a system level.

At a time when there is so much other change around, embarking on this journey might seem daunting. But I think that a failure to grasp this opportunity would be a big mistake. The 2011 White Paper gave a clear mandate to be bold; it called for us to redesign the information landscape. The burdens of multiple data collections and the problems of getting high-quality timely information out of the system cannot be addressed by tweaking around the edges. We have invested time and resources to fully understand the problems and to identify a vision of the future that carries broad support. Now is the time for action.

This is the best opportunity to address these issues that we have had in the past twenty-one years. We might not get another opportunity like this for a very long time.

4 responses to “Beginning a revolution in data collection”

  1. “For the first time the sector had a UK-wide, post binary-divide data and information system.”

    Depends what you mean by ‘the sector’ Andy… a new ‘binary-divide’ in the (English) public sector was created in 1993/1994 between statutorily defined FECs and HEIs (and in the definition of HE courses), and a further divide of significance (between public and private/alternative sectors) has grown since then. Some very important HE, maybe now approaching 10% (?) by volume of student numbers, is outside the ‘sector’ of statutory HE institutions.

    There have been some moves to bring data collection together (including HESA style data collected on HE delivered by FE and more recently certain courses ‘designated’ for student loan purposes delivered by private providers). But there are still plenty of questions over how comprehensive and integrated the data collection is and how much provision has no central data collection.

    It was certainly a landmark 21 years ago to bring three previously separate data systems into one.
    [Three? Yes, as well as the Polytechnic and other former Local Authority institutions, the Open University’s data had until then been separate from the rest of the Universities.]

    I had completely forgotten that the first institution to submit its student return was one that did not survive the subsequent few years (and its former campus is now long gone and host to luxury housing). Thanks for reminding us Andy that even being the first in the queue to submit data is no guarantee of a long and successful life!

  2. Thanks for sharing your thoughts Mike.

    You’re right of course that there continue to be boundaries in this landscape – FE is a particular case where reporting is different; the alternative providers are starting to come into the system – I understand that HESA are running their first AP Student collection now and they are, of course, interacting with the SLC systems. I suspect that, as time goes on, the difference in reporting between APs and the ‘publicly-funded’ sector will diminish.

    The pedant in me can’t let your point about 3 pre-HESA data systems pass; I think it was worse than that. USR covered the pre-92 universities across the UK. There were separate reporting arrangements for the Polytechnics and Colleges in England, Scotland, Wales and Northern Ireland. The OU were joined by Cranfield and the Royal College of Art in having separate, direct arrangements with the then DFE.

    Yes, the first institution to submit that night was West London Inst of HE and it was performed by Debbie Price, who I’m pleased to say I’m still in touch with. Here on Ada Lovelace Day, I think that is something we should take a moment to celebrate.

  3. Thanks Andy. I’ve shared your link with the West London HE dream team of Rodney Chapman and Alan Cash who pressed the button at 00.01 on that day.

  4. As a very recent subscriber, let me be brief for now. Earlier this year, the European Commission started the project called Data4Policy; all of its material to date is available online. Educational statistics have been brought to the attention of this EU project through the Groningen Declaration Network. In the coming 4 to 5 years, the EC plans to fund a number of Data4Policy projects, up to a total investment of almost €0.5 billion – not just in education, though. I would strongly recommend that those of you who have not taken notice take an interest.
