The data dark ages

by David Kernohan

21/10/14

David Kernohan is Deputy Editor of Wonkhe

What’s going on?

With so much of our understanding of social policy issues coming through survey data, problems like these feel almost existential in scope. Online survey tools have made it easier to design and conduct surveys – and often design in the kind of good survey development practices that used to be the domain of specialists. Theoretically, it should be easier to run good quality surveys than ever before – certainly we see more of them (we even run them ourselves).

Is it simply a matter of survey fatigue? Or are people less likely to (less willing to?) give information to researchers for reasons of trust?

In our world of higher education, we have recently seen the Graduate Outcomes response rate drop below 50 per cent for the first time, casting doubt on its suitability as a regulatory measure. The survey still has accredited official statistics status, and there has been important work done on understanding the impact of non-response bias – but it is a concerning trend. The National Student Survey (NSS) is an outlier here – it had a 72 per cent response rate last time round (so you can be fairly confident in validity right down to course level), but it does enjoy an unusually good level of survey population awareness, even despite the removal of a requirement for providers to promote the survey to students. And of course, many of the more egregious issues with HESA Student have been founded on student characteristics – the kind of thing gathered during enrolment or entry surveys.

A survey of the literature

There is a literature on survey response rates in published research. A meta-analysis by Wu et al (Computers in Human Behavior, 2022) found that, at this point, the average online survey response rate was 44.1 per cent – finding benefits for using (as NSS does) a clearly defined and refined population, pre-contacting participants, and using reminders. A smaller study by Daikeler et al (Journal of Survey Statistics and Methodology, 2020) found that, in general, online surveys yield lower response rates (on average, 12 percentage points lower) than other approaches.

Interestingly, Holtom et al (Human Relations, 2022) show an increase in response rates over time in a sample of 1014 journals, and do not find a statistically significant difference linked to survey modes.

ONS itself works with the ESRC-funded Survey Futures project, which:

aims to deliver a step change in survey research to ensure that it will remain possible in the UK to carry out high quality social surveys of the kinds required by the public and academic sectors to monitor and understand society, and to provide an evidence base for policy

It feels like timely stuff. Nine strands of work in the first phase included work on mode effects, and on addressing non-response.

Fixing surveys

ONS have been taking steps to repair LFS – adopting some of the recontacting and reminder approaches that have been successfully used and documented in the academic literature. There’s a renewed focus on households that include young people, and a return to the larger sample sizes we saw during the pandemic (when the whole survey had to be conducted remotely). Reweighting has led to a bunch of tweaks to the way samples are chosen, and to how non-responses are accounted for.
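To make the idea of reweighting concrete, here is a minimal sketch of post-stratification, one common way of adjusting for uneven response across groups. It is not a description of the ONS method: the age bands, population shares, respondent counts and employment rates are all invented for illustration, and the function names are hypothetical.

```python
# Illustrative post-stratification weighting. Every figure below is invented
# for the example and is NOT taken from the LFS or any ONS publication.

def poststratification_weights(population_shares, respondent_counts):
    """Return one weight per group so the weighted sample matches the population mix."""
    total = sum(respondent_counts.values())
    weights = {}
    for group, pop_share in population_shares.items():
        sample_share = respondent_counts[group] / total
        # Under-represented groups get weights above 1, over-represented below 1.
        weights[group] = pop_share / sample_share
    return weights


def weighted_rate(weights, respondent_counts, group_rates):
    """Weighted average of a per-group rate (for example, share in employment)."""
    numerator = sum(weights[g] * respondent_counts[g] * group_rates[g] for g in weights)
    denominator = sum(weights[g] * respondent_counts[g] for g in weights)
    return numerator / denominator


# Hypothetical population mix, and a sample that over-represents older people.
population_shares = {"16-34": 0.30, "35-64": 0.50, "65+": 0.20}
respondent_counts = {"16-34": 150, "35-64": 500, "65+": 350}
# Hypothetical employment rates observed among respondents in each group.
group_rates = {"16-34": 0.70, "35-64": 0.80, "65+": 0.10}

weights = poststratification_weights(population_shares, respondent_counts)
unweighted = sum(
    respondent_counts[g] * group_rates[g] for g in group_rates
) / sum(respondent_counts.values())
weighted = weighted_rate(weights, respondent_counts, group_rates)

print(f"unweighted: {unweighted:.3f}, weighted: {weighted:.3f}")
```

With these made-up numbers the unweighted employment rate comes out at 0.540 and the weighted rate at 0.630, because the under-represented younger group is counted twice over. The catch is that weighting can only correct for characteristics you can observe: if the young people who do respond differ from those who do not, the bias remains.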

Longer term, the Transformed Labour Force Survey (TLFS) is already being trialled, though the initial March 2024 plans for full introduction have been revised to allow for further testing – important given a bias towards older age group responses, and an increased level of partial responses. Yes, there’s a lessons learned review. The old LFS and the new, online-first, TLFS will be running together at least until early 2025 – with a knock-on impact on APS.

But it is worth bearing in mind that, even given the changes made to drive up responses, trial TLFS response rates have been hovering just below 40 per cent. This is a return to 2020 levels, addressing some of the recent damage, but a long way from the historic norm.

Survey fatigue

More usually the term “survey fatigue” is used to describe the impact of additional questions on completion rate – respondents tire during long surveys (as Jeong et al observe in the Journal of Development Economics) and deliberately choose not to answer questions to hasten the end of the survey.

But it is possible to consider the idea of a civilisational survey fatigue. Arguably, large parts of the online economy are propped up on the collection and reuse of personal data, which can then be used to target advertisements and reminders. Increasingly, you now have to pay to opt out of targeted ads on websites – assuming you can view the website at all without paying. After a period of abeyance, concerns around data privacy are beginning to reemerge. Forms of social media that rely on a constant drive to share personal information are unexpectedly beginning to struggle – for younger generations participatory social media is more likely to be a group chat or Discord server, while formerly participatory services like YouTube and TikTok have become platforms for media consumption.

In the world of public opinion research the struggle with response rates has partially been met via a switch from randomised phone or in-person polling to the use of pre-vetted online panels. This (as with the rise of focus groups) has generated a new cadre of “professional respondents” – with huge implications for the validity of polling even when weighting is applied.

Governments and industry are moving towards administrative data – the most recognisable example in higher education being the LEO dataset of graduate salaries. But this brings problems in itself – LEO lets us know how much income graduates pay tax on from their main job, but deals poorly with the portfolio careers that are the expectation of many graduates. LEO never cut it as a policymaking tool precisely because of how broad-brush it is.

In a world where everything is data driven, what happens when the quality of data drops? If we were ever making good, data-driven decisions, a problem with the raw material suggests a problem with the end product. There are methodological and statistical workarounds, but the trend appears to be shifting away from people being happy to give out personal information without compensation. User interaction data – the traces we create as we interact with everything from ecommerce to online learning – are for now unaffected, but are necessarily limited in scope and explanatory value.

We’ve lived through a generation where data seemed unlimited. What tools do we need to survive a data dark age?

3 Comments

dolphinmagneticc3aee434a5

7 months ago

I wonder whether declining response rates reflect lower commitment to collective action and responsibility – declining social capital.

Baron, Stephen, John Field, and Tom Schuller (eds), Social Capital: Critical Perspectives (Oxford, 2000; online edn, Oxford Academic, 31 Oct. 2023), https://doi.org/10.1093/oso/9780198297130.001.0001, accessed 7 Dec. 2024.

Sue Grey

7 months ago

The uncritical acceptance that ‘the world’ is driven by data, and that all we need to do to improve that world is improve the quality of data, lies at the heart of the ‘failures’ outlined in the article, and reflects the thinking of global economic organisations like the OECD. Until there is a recognition that data does not always lead to optimal social policies, and that there is a humanistic angle which cannot be captured in large scale datasets, the usefulness of data in tackling societal problems will not reflect the faith placed in it.

Julian Westwood

7 months ago

When it comes to taught courses, universities are in the business of changing people (students) and their futures. Yet they do so without ever meaningfully addressing their progress or outcomes. Engaging with a regular (twice each year) 20-30 question survey covering the main aspects of student progression and experience should be central to the student / University contract. It should be mandatory for students to complete and for Universities to publish the results.
