Data delays shouldn’t mean data disorganisation

Earlier this week we learned of another delay to HESA’s platform for in-year reporting.

The headline is that Data Futures will not be live until at least 2021/22. The subtext is that the HEDIIP vision for a high-utility collection landscape is pretty much dead. That is a whole other discussion, and one which lacks the urgency of “Where do we reallocate those precious staff assigned to Data Futures projects?”

But that’s the wrong question.

Most of the activities to support in-year collection are focussed on initially driving up the quality of – primarily – student data, and maintaining well understood hygiene factors to negate the need for a long quality assurance tail. That is both necessary to meet the collection timelines and, somewhat peripherally, within touching distance of best-practice data management.

That’s important because right now we’re nowhere near best practice.

Getting there

Many – if not most – universities trigger their HESA student return processes in the Spring. While there are many differences in how operational data is cleaned, massaged, algorithmically swept and beautified to create a “data double” of the institution, certain conventions hold.

The source data is not fit for purpose. The quality attributes – completeness, accuracy, validity, and so on – all need work and lots of it. While certain key attributes tend to be prioritised, this is just the tip of a spear requiring months and months of work from skilled colleagues to create a data set considered good enough for external validation.
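The quality attributes mentioned above lend themselves to simple automated checks. As a minimal sketch – using entirely hypothetical field names and coding frames, not HESA’s actual specification – completeness, validity, and cross-field consistency checks on a student record might look like this:

```python
def check_record(record):
    """Return a list of quality issues for one student record (illustrative only)."""
    issues = []

    # Completeness: required fields must be present and non-empty.
    for field in ("student_id", "date_of_birth", "entry_qualification"):
        if not record.get(field):
            issues.append(f"missing {field}")

    # Validity: values must come from an agreed coding frame
    # (the codes here are invented for the example).
    if record.get("mode_of_study") not in {"FT", "PT"}:
        issues.append("invalid mode_of_study")

    # Accuracy (as a proxy): cross-field consistency checks.
    if record.get("start_date") and record.get("end_date"):
        if record["end_date"] < record["start_date"]:
            issues.append("end_date before start_date")

    return issues

records = [
    {"student_id": "S1", "date_of_birth": "2001-04-02",
     "entry_qualification": "A-level", "mode_of_study": "FT",
     "start_date": "2019-09-01", "end_date": "2022-06-30"},
    {"student_id": "S2", "mode_of_study": "XX"},
]

for r in records:
    print(r.get("student_id"), check_record(r))
```

The point is less the code than where it runs: applied at the point of entry these checks are cheap; applied months later, each failure becomes a manual investigation.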

What differentiates institutions is the value of that dataset post submission to HESA. Some institutions treat it as a trusted (if somewhat tarnished by entropy) source of student data, others submit it and forget it. What’s missing are all the other uses for that data that have nothing to do with HESA at all, but would benefit significantly from the same quality efforts.

That’s where we came in. The mantra to clean data at the point of entry is often well understood; the reason it isn’t done is less so. A representative example may help here. Let’s consider entry qualifications. Clearly these are incredibly important to admissions staff, ensuring offers are made to the right prospective students in line with academic and recruitment targets. That same data is used to derive tariff, to measure course performance, and to inform tactical and strategic planning, along with a host of other scenarios.

Data reuse

And yet our primary use fixes the data quality threshold. If it’s good enough for admissions, it’s good enough. Other use cases are someone else’s problem. It’s not like people don’t care – I don’t believe colleagues get out of bed aiming to make a fellow staff member’s life a misery as they comb individual entry qualification records for weeks on end – but they are incentivised on what their own team needs to do.

This is nuts. It really is. Collecting non-admissions qualifications that are tariff-affecting is not a huge undertaking – not compared to contacting individual students six months later, which is both extremely costly in time and not great for the student experience. This, though, is often standard practice.

Data Futures was changing that. It introduced the concept of data governance. Appropriate governance around data is predicated on understanding the entirety of need across the university. It is based on strong principles of considering data as an institution-wide asset, managing it actively in support of short- and medium-term goals, sharing it where appropriate, and comparing it with other datasets.

Further, it crucially defines everyone who touches that data as a steward of it. The minimum compliance with that role is ‘don’t make it worse for anyone else’. The outcome of this is data utility – making less data work harder. It reduces rekeying, it increases data integration, it starts to break data out of its silos.

Sure, it doesn’t eradicate the spreadsheet culture, but it starts to make the case for how that might eventually happen. It stops conversations about whose data is right, and starts discussions about what the data means. It provides accountability, so staff can actually find data rather than creating yet another copy.

Doing it right or doing it twice

Universities are already managing their data – but primarily in silos. The wasted effort and poor outcomes of this are evidenced in the lives of staff and students every day. When I hear the concern that ‘Yes, but good data is really expensive’, I respond with ‘Do you know the cost of bad data?’

If the delay to Data Futures retrenches the silo mentality, what we are essentially saying is “having higher quality data that the whole university can trust and use isn’t important to us”. This is really a proxy for cultural blindness around data.

Too many universities’ approach to the management of their data asset appears to be “we don’t have time to do it right, but we do have time to do it twice”. Yet, whether it’s process improvement, better decision making, improved insights, richer scenario planning or learning analytics – understanding and managing our data as an asset in the same class as finance, estates and staff is foundational.

This is not easy. If it was, we’d already have done it. Data Futures gave us that reason to start professionalising our approach to data. The delay does not negate the central assertion that quality data is woven into the success of the institution.

The question is not “What should we do instead?”, but “How can we do more?”

2 responses to “Data delays shouldn’t mean data disorganisation”

  1. You ask the rhetorical question about the cost of bad data; I’m tempted to reply that there is no cost. Obviously that’s wrong, but if I rephrase the question as “Is there a saving to be gained through good data?” then I suggest we get closer to the challenge.

    Cleaning, rekeying, validating and manipulating poor data is a high cost for universities (at least every uni that I’ve seen), but it is also embedded as small parts of many jobs, which makes saving FTEs very difficult; it’s effectively a hidden cost that’s difficult to recover (I’m ignoring efficiency benefits here). Whereas getting great data means (re)joining systems together, potentially investing in data warehouses, retraining staff etc., which are very visible actual costs, without even considering all the change management.

    Do we need to do it? Yes! Is the business case easy to make? Not so far… (feel free to post a link to a great business case, I’m happy to learn.)

  2. I know what you mean – leaving data as it is is allegedly free. The business case only makes sense if you value the benefits – for example, to decision-making – that good data brings. If you can’t present these benefits you don’t have a business case. All data can be assigned a value based on its importance to the organisation. There’s no sense in improving the quality of data with zero value, because that would be wasted money.

    I think we can learn from lean manufacturing – ultimately, quality reduces cost. And cleaning the ‘inventory’ of poor data, for which you outline the costs, is a separate exercise from fixing processes so that poor-quality data doesn’t enter the system. I’m not certain the solutions are systems or warehouses – data quality is a people and process issue.
