Earlier this week we learned of another delay to HESA’s platform for in-year reporting.
The headline is that Data Futures will not be live until at least 2021/22. The subtext is that the HEDIIP vision for a high-utility collection landscape is pretty much dead. That is a whole other discussion, and one which lacks the urgency of “Where do we reallocate those precious staff assigned to Data Futures projects?”
But that’s the wrong question.
Most of the activities to support in-year collection are focussed initially on driving up the quality of – primarily – student data, and on maintaining well-understood hygiene factors to negate the need for a long quality assurance tail. That is both necessary to meet the collection timelines and, somewhat incidentally, within touching distance of best practice data management.
That’s important because right now we’re nowhere near best practice.
Many – if not most – universities trigger their HESA student return processes in the spring. While there are many differences in how operational data is cleaned, massaged, algorithmically swept and beautified to create a “data double” of the institution, certain conventions hold.
The source data is not fit for purpose. The quality attributes – completeness, accuracy, validity, and so on – all need work, and lots of it. Certain key attributes tend to be prioritised, but that is just the start: creating a dataset considered good enough for external validation takes months and months of work from skilled colleagues.
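To make the hygiene-factor point concrete, most of these quality attributes are checkable automatically. Here is a minimal sketch of such checks in Python; the field names and value sets are hypothetical illustrations, not taken from the actual HESA student record.

```python
# A minimal sketch of automated data-quality checks on student records.
# Field names and coded value sets are hypothetical, not the HESA model.

REQUIRED_FIELDS = ("student_id", "course_code", "entry_qualification")
VALID_MODES = {"FT", "PT"}  # hypothetical mode-of-study codes


def quality_issues(record: dict) -> list:
    """Return a list of quality issues found in one student record."""
    issues = []
    # Completeness: every required field must be present and non-empty.
    for field in REQUIRED_FIELDS:
        if not record.get(field):
            issues.append("missing " + field)
    # Validity: coded fields must come from the agreed value set.
    mode = record.get("mode_of_study")
    if mode is not None and mode not in VALID_MODES:
        issues.append("invalid mode_of_study: " + mode)
    return issues


clean = {"student_id": "S001", "course_code": "CS101",
         "entry_qualification": "A-level", "mode_of_study": "FT"}
dirty = {"student_id": "S002", "course_code": "",
         "entry_qualification": None, "mode_of_study": "XX"}

print(quality_issues(clean))  # -> []
print(quality_issues(dirty))  # -> three issues flagged
```

Run continuously at the point of entry rather than once a year, checks of this kind are what shorten the long quality assurance tail described above.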
What differentiates institutions is the value of that dataset after submission to HESA. Some institutions treat it as a trusted (if somewhat tarnished by entropy) source of student data; others submit it and forget it. What’s missing are all the other uses for that data that have nothing to do with HESA at all, but would benefit significantly from the same quality efforts.
That’s where we came in. The mantra of cleaning data at the point of entry is well understood; the reason it isn’t done, less so. A representative example may help here. Consider entry qualifications. Clearly these are incredibly important to admissions staff, who need to ensure offers are made to the right prospective students in line with academic and recruitment targets. But that same data is also used to derive tariff, to measure course performance, and to inform tactical and strategic planning, along with a host of other scenarios.
And yet our primary use fixes the data quality threshold. If it’s good enough for admissions, it’s good enough. Other use cases are someone else’s problem. It’s not that people don’t care – I don’t believe colleagues get out of bed aiming to make a fellow staff member’s life a misery as they comb individual entry qualification records for weeks on end – but they are incentivised on what their own team needs to do.
This is nuts. It really is. Collecting non-admissions qualifications that are tariff-affecting is not a huge undertaking – not compared to contacting individual students six months later, which is both extremely costly in time and not great in terms of the student experience. Yet this is often standard practice.
Data Futures was changing that. It introduced the concept of Data Governance. Appropriate governance around data is predicated on understanding the entirety of need across the university. It is based on strong principles: considering data as an institution-wide asset, managing it actively in support of short- and medium-term goals, sharing it where appropriate, and comparing it with other datasets.
Further, it crucially defines everyone who touches that data as a steward of it. The minimum compliance with that role is ‘don’t make it worse for anyone else’. The outcome is data utility – making less data work harder. It reduces rekeying, it increases data integration, and it starts to break data out of its silos.
Sure, it doesn’t eradicate the spreadsheet culture, but it starts to make the case for how that might eventually happen. It stops conversations about whose data is right, and starts discussions about what the data means. It provides accountability, so staff can actually find data rather than creating yet another copy.
Doing it right or doing it twice
Universities are already managing their data, but primarily in silos. The wasted effort and poor outcomes of this are evidenced in the lives of staff and students every day. When I hear the concern that ‘good data is really expensive’, I respond with ‘Do you know the cost of bad data?’
If the delay to Data Futures retrenches the silo mentality, what we are essentially saying is “having higher quality data that the whole university can trust and use isn’t important to us”. This is really a proxy for cultural blindness around data.
Too many universities’ approach to the management of their data asset appears to be “we don’t have time to do it right, but we do have time to do it twice”. Yet whether it’s process improvement, better decision making, improved insight, richer scenario planning or learning analytics, understanding and managing our data as an asset in the same class as finance, estates and staff is foundational.
This is not easy. If it was, we’d already have done it. Data Futures gave us that reason to start professionalising our approach to data. The delay does not negate the central assertion that quality data is woven into the success of the institution.
The question is not “What should we do instead?” but “How can we do more?”