Trained to stop learning: How students are experiencing assessment and learning in an age of AI

New research from Wonkhe finds that assessment design – not AI policy – is what determines whether students use AI to learn or to produce. Jim Dickinson and Mack Marshall explain

Jim is an Associate Editor (SUs) at Wonkhe

Mack Marshall is Wonkhe SUs’ Community and Policy Officer

We already know that the vast majority of UK students use AI, that most use it for assessed work, and that ChatGPT dominates.

We know students reach for it primarily to save time and improve quality, that they worry about cheating accusations and hallucinations, and that most institutions now have a policy. That ground has been covered.

What the existing evidence has not addressed is the set of questions underneath the adoption statistics. Do students feel they have actually learned what they have produced? What are they weighing up when they decide how to use AI on a specific piece of work? Do they think their assessments test understanding – and if not, what would? Those are the questions we set out to answer in our new research.

The findings are uncomfortable. Students are submitting work they cannot fully explain, facing AI policies that do not function in practice, and responding strategically to assessment systems that reward production over understanding. The costs of that incoherence fall hardest on the most conscientious students, while disabled students are sourcing cognitive support through AI that they describe their universities as not providing – in a policy environment where that use is ambiguous at best.

But the findings are not only about institutional failure. They also reveal something the sector has not reckoned with – the seriousness with which many students are thinking about what AI means for their learning. Students in this research are not passively working through confusing policy. Many are actively constructing personal ethical frameworks – theories of practice about the relationship between tools, effort, learning, and identity – that are often more considered than the institutional guidance they receive. They are doing this work largely alone, with little support and no recognition. That intellectual and moral seriousness is itself a finding, and a resource the sector has barely begun to draw on.

The research identifies what appears to change student behaviour more reliably than any policy – the presence of a visible future accountability moment. Students who know they will need to demonstrate understanding describe using AI to test themselves, generate counter-arguments, and check their reasoning. Students without that downstream moment have no equivalent incentive. Assessment design emerges from this research as the factor most strongly associated with whether AI supports learning or replaces it.

The recommendations that follow are not about better detection or stricter policy. They are about whether higher education still asks students to be present – to think, to struggle, to understand – or whether it has settled for asking them to produce. That question predates AI. What AI has done is make the answer visible, because a system that cannot tell the difference between a student who understood and a student who produced something plausible was already in trouble. The sector’s response should not start with what to do about AI. It should start with what assessment is for – and whether the answer it is currently giving is one it is willing to defend.

Headlines

Finding 1: The gap between submitting work and understanding it predates AI – and assessment design is making it wider

Nearly half of students worry their grades don’t reflect what they actually know, and 38 per cent admit submitting work they couldn’t fully explain. AI has widened and accelerated a pre-existing gap, but the strongest predictors are assessment design features, not AI use itself.

Finding 2: AI use is not one thing, and treating it as one thing has left students without useful guidance

Students describe at least six distinct modes of AI use – from search replacement to production acceleration – each with different relationships to learning and different ethical implications. The same student routinely moves between modes on the same assignment, meaning any policy that treats AI use as a single behaviour will misfire.

Finding 3: Students know what properties genuine assessment should have, are rarely experiencing them, and have designed alternatives

Only 21 per cent feel their course primarily rewards thinking and reasoning, yet students across disciplines converge on what real assessment should look like – visible individual understanding, application, accountability, and developmental feedback. Many have designed specific alternatives that would test understanding, but none resemble a traditional essay submitted to a VLE.

Finding 4: Visible accountability moments are disappearing without replacement, and students use AI very differently when they know accountability is coming

Students who know they will need to demonstrate understanding in person use AI to test themselves, interrogate answers, and check their reasoning. When no downstream verification exists, the same students describe using AI on autopilot – the accountability moment changes not whether AI is used but how.

Finding 5: Discipline variation is structural, not incidental, and uniform policy will misfire in almost every context

The relationship between AI and learning differs fundamentally across creative arts, computing, healthcare, and humanities – shaped by professional identity, the nature of the knowledge, and students’ ethical commitments to their field. A blanket institutional policy cannot account for this variation and will produce either meaningless restrictions or meaningless permissions in most disciplines.

Finding 6: AI policy is present in almost every institution and functional in almost none – and AI declarations may be making things worse

Students report tiered frameworks, declaration forms, and module-level guidance that varies between tutors on the same programme – none of it specific enough to tell them what to do when they sit down with an assignment. Declarations are penalising honest students while catching none of the heavy users, functioning as liability management rather than transparency mechanisms.

Finding 7: Policy incoherence is a distributional justice problem that consistently punishes the most conscientious students – and the cost is emotional as well as academic

The costs of unclear AI policy fall hardest on students most trying to comply – some have received lower grades for cautiously avoiding tools they couldn’t confirm were permitted. Meanwhile, the most risk-tolerant students face no equivalent burden, and 59 per cent of all respondents worry AI could reduce their critical thinking.

Finding 8: Women are far less likely to use AI for assessment, and many carry anxiety about AI disadvantage without using it themselves

The gender gap in AI use is the largest demographic difference in the dataset – more than 20 percentage points – and does not disappear when controlling for other factors. Among non-users who worry about competitive disadvantage, 74 per cent are women, carrying the psychological costs of an AI-saturated environment without using the tools themselves.

Finding 9: Disabled students are using AI to meet needs that formal adjustments are not, in their experience, addressing

Students with dyslexia, ADHD, and related conditions describe AI as the most effective cognitive support they have encountered – often more useful than any formal university adjustment. Blanket AI restrictions risk removing this support in the name of academic integrity, in a context where the formal system has demonstrably failed to provide an equivalent.

Finding 10: Feedback timing is breaking the developmental function of assessment – and driving students towards AI

Feedback routinely arrives after students have started the next assignment, making the assessment sequence functionally summative regardless of what module handbooks claim. Where marking criteria are unclear, students turn to AI as a sensemaking tool – institutional opacity around expectations appears to drive AI adoption directly.

Finding 11: AI is compensating for gaps in institutional provision that students can see but that remain unaddressed

A substantial proportion of AI use described in this research would disappear if universities fixed problems within their control – inadequate library search, unclear briefs, inaccessible teaching, thin practical preparation, and slow academic support. Every heavy AI use case is diagnostic information about what the institution is not providing.

Finding 12: Time poverty is a structural driver of AI adoption that students are told is a moral or commitment failing

Students managing paid work, caring responsibilities, and unrealistic reading loads describe AI as the most efficient tool available for work they cannot bring themselves to care about. Where AI is expected or encouraged, the quality gap between paid and free tools is becoming a socioeconomic access issue that higher education should close, not widen.

Finding 13: Peer learning is a reliable route to genuine understanding and a largely untapped resource

Every moment of real learning described in focus groups involved other people – and every student asked whether explaining something to a peer felt intimidating or empowering said empowering, without exception. The unofficial curriculum of navigational knowledge flows through social networks that universities do not control, and formalising peer learning would distribute it more equitably.

Finding 14: Career-confident students are distinguished less by assessment format than by whether their course rewards thinking

The strongest correlates of career confidence are not vocational assessment formats but intellectual honesty – whether feedback develops thinking, whether stated values match actual rewards, and whether assessment tests understanding. Oral examinations have the strongest positive correlation with career confidence while placement-based outputs are among the weakest.

Finding 15: Students who feel they belong reach for AI less – because their course already provides what AI substitutes for

Belonging is one of the most powerful correlates of learning experience in the dataset, and students with stronger belonging are markedly less likely to use AI for assessments. Where belonging is absent, students experience their course as a production line and reach for AI accordingly – making the absence of resourced peer learning a structural driver of AI adoption.

4 Comments
Catherine Hobbs
30 days ago

Great work. It would be even better to have some recommendations of what universities should be doing to improve the situation alongside this diagnosis of the issues. Were there any examples of good practice you found or can suggest?

richard waites
29 days ago

I love a Wonkhe report that lays out a problem using authentic research from the student voice and experience. I am struck by this comment: “The design challenge is to build accountability moments that are experienced as developmental conversations rather than surveillance events – and that are accessible to students for whom live performance under pressure is itself a barrier.” Only two days after the report was released, I am already sensing incoming “surveillance events” rather than the “developmental conversations” we all need in education. It is up to us to make sure the developmental conversations remain at the core of what we do.

Selwyn Seymour
29 days ago
Reply to richard waites

I suspect this article has overreached itself a bit. Whilst it produces thought-provoking ideas, it is driven by three research questions but then goes on to claim 15 ‘findings’, some of which – gender and the feeling of belonging, for example – are not even implicated in the RQs.