Back in the summer of 2023, sat in a conference hall in Denver watching anti-cheating firms attempt to sell detection tools that their own inventors had already given up on, I was doodling – and sketched out a little framework.
Imagine students in three circles – those determined to cheat, those who become tempted to cheat, and those who would never try.

The danger with detection-focused approaches, I argued, was that they catch the desperate middle and miss everyone else.
Two and a half years later, having listened to students describe their actual experience of AI and academic integrity across three focus groups and a national survey of over a thousand students, I can now say something a bit stronger.
The enforcement apparatus disproportionately targets the anxious middle – but it also actively harms the students who have chosen not to cheat, and it barely inconveniences the students who have.
Running rings
Maybe the determined cheaters were never going to show up to a focus group and confess, even if it was being run by Wonkhe. But their presence was felt throughout – described by peers who could see what was happening even when the system couldn’t.
A software engineering student described classmates who submit AI-generated code without even checking whether it works:
Some students use the AI for assignments, to create a code, create an assignment. But most of the times, they are wrong. Some students don’t even try to check if the assessment is right. They just go and submit it.
Those students – the lazy end of the determined group – are occasionally caught, precisely because they can’t be bothered to quality-check their own dishonesty. But the implication of that testimony is obvious.
If the ones submitting broken code are sometimes detected, the ones who spend thirty seconds checking the output before submitting are invisible. A computer science student confirmed the logic from the other direction.
Realistically, you could get a passing grade in any written module. If you throw ethics out the window and get AI to write it all and just quickly correct the glaring mistakes.
That’s a student who has chosen not to do it, describing how straightforward it would be if they did. Detection tools built on perplexity and burstiness – how predictable the text is to a language model, and how much the sentences vary in length and structure – are trivially defeated by anyone who knows to prompt with a persona or rephrase a few lines. The determined cheaters have always known it. The system has not caught up.
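For the technically curious, it is worth seeing just how shallow that signal is. A minimal sketch of the “burstiness” idea – a toy illustration for this piece, not any detection vendor’s actual implementation – reduces to little more than the spread of sentence lengths:

```python
import statistics

def burstiness(text: str) -> float:
    """Toy burstiness score: how much sentence lengths vary across a text."""
    # Crude sentence split on end punctuation; real tools tokenise properly.
    sentences = [s.strip() for s in text.replace("!", ".").replace("?", ".").split(".") if s.strip()]
    lengths = [len(s.split()) for s in sentences]
    if len(lengths) < 2:
        return 0.0
    # Uniform sentence lengths (a low score) read as "machine-like".
    return statistics.pstdev(lengths) / statistics.mean(lengths)
```

Anything that restores natural variation – a persona prompt, a few hand-edited lines – pushes a score like this straight back into the “human” range, which is exactly what the computer science student above was describing.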
The anxious middle
If the first circle is largely invisible to the enforcement apparatus, the second – those who become tempted – is where all the alarm bells ring. And the data makes clear why they’re tempted, how they feel about it, and who pays the price.
A paramedic science student described the experience of using AI on an assignment where the brief was unclear and the university’s guidance amounted to “you’re level six learners, do it yourself”.
I kind of felt like I was cheating, because I kept having to say – don’t give me sentences. Just give me really brief points. Because if you give me sentences, I feel like subconsciously I will start copying them, because you can write it way better than I can. But I also thought that, because it’s so easily accessible and everyone else is using it, would I put myself at a disadvantage for not using it?
That’s someone simultaneously trying to limit their own use, feeling guilty about the use they are making, and fearing competitive disadvantage if they stop. It is the anxious middle in miniature – caught between conscience and pragmatism, with no institutional support for either.
One of the most miserable accounts came secondhand. A postgraduate student described a peer – a diligent international student who had worked hard, avoided AI, and been awarded a lower second despite having been an exceptional school student.
They told me they were now forced to use AI. Now they’re in third year, and they were saying – if you ask me if I’ve learned anything in my undergrad, I cannot say it. I can’t be very certain. But now, because I’m using AI, I’m getting good grades, and those good grades would help me in getting into a good master’s.
That student was dragged from the principled group into the tempted group by the grade competition itself. When the system rewards AI-assisted production and punishes unassisted effort, a conscientious student who switches to AI isn’t committing a moral failure. They’re making a rational response to a set of incentives that has made honesty costly.
The time pressure then compounds the competitive pressure and the guilt.
You’re not gonna sit there and read a 500-page document on anything. You know, who’s got the time for that when you’ve got deadlines and exams?
The system says you’re a full-time student. Many students can’t be full-time students – and when deadlines bunch, AI becomes a coping mechanism for a workload problem the sector created.
And then there’s the fear. A student on a health-related course captured the specific dread that defines this group – not the determined cheater’s swagger, but the anxious user’s constant vigilance.
I would be worried that if other people are using it, it would give the same answers and stuff, and then you’re more likely to be caught using it. I think that’s probably what puts me off – the “what if I get caught out for using it, and then get booted off my course.”
The determined cheaters aren’t losing sleep over detection. The principled non-users aren’t either. It’s the students in the middle – the ones agonising over every paragraph, trying to work out where the line is, terrified of accidentally crossing it – who carry the full psychological weight of the enforcement system.
Penalised for integrity
If the anxious middle is where the system’s net falls, the principled group is where its absence does damage. Students who have chosen not to use AI – or to use it only in ways they consider ethically defensible – described being actively harmed by a system that cannot distinguish their work from the work of those who cheated their way to a higher grade.
A history and film student put it like this.
It can be really frustrating when I’ve done a piece of work without AI and then someone goes “oh yeah I used AI and I got a 75” and I’m like – well I got in the 60s without using AI but you got a first with using AI, so why is the university not picking that up?
She went further. Because of the unreliability of AI detection tools, she had started self-censoring her own writing – avoiding punctuation that might trigger a false positive.
I get scared to even use some grammar – so I completely don’t use semicolons because I’m too scared that it’s going to pick it up as AI.
Ouch. A principled student who has chosen to do her own work is now afraid of her own punctuation. The detection system has made an honest student write worse – not because she’s done anything wrong, but because the tools that purport to catch cheating can’t tell the difference between AI-generated text and a student who likes semicolons.
And then there’s the pressure to capitulate:
It’s kind of like – well, should I start using AI? But then that goes against my own ethics. It just doesn’t feel rewarding if I were to use it, because it’s not my own work, it’s not my own thoughts.
When asked who would do better if both she and the AI-using student were called in to discuss their essays, she said she’d like to hope it would be her. She’s probably right. But nobody called them in. The mark on the transcript is the mark on the transcript – and hers is lower.
In another session, a student rep stated it without qualification:
It does feel like students who don’t use AI are disadvantaged at this point because of how our work is graded, even when the student actually understands the topic much more than if they had used AI.
Who gets caught
Map the three circles onto the detection apparatus and the distributional problem is clear.
The determined cheaters – those using AI competently and confidently – sail through undetected. The more sophisticated their use, the less likely any tool or academic is to flag it. As I wrote in 2023, the whole point of generative AI is to mimic human language. The tools designed to spot it are playing a game they structurally cannot win against a determined user.
The anxious middle – students using AI under pressure, often clumsily, often guiltily – are exactly the group most likely to be flagged. Their use is less polished, more detectable, and more likely to produce the patterns that both software and academics are trained to spot. These are the students submitting work at 3am after a shift, using AI to get through a module they didn’t choose on a course that hasn’t given them clear guidance. They’re the ones most likely to end up in a misconduct hearing.
And the principled non-users get no credit at all. Their grades are lower because they’re competing against AI-assisted work without the time savings. Their writing may even be flagged by detection tools for entirely innocent stylistic choices. The system treats their output identically to the output of the competent cheater – because in the final artefact, it simply can’t tell the difference.
The result is an enforcement regime that leaves genuine integrity and sophisticated dishonesty equally untouched – because it cannot tell them apart in the output – while punishing financial stress, deadline pressure, and the kind of messy, guilty, half-hearted AI use that characterises students who are actually trying to do the right thing but can’t quite manage it.
Group work makes it worse
Everything described above applies to individual assessment. Group work puts rockets up it.
A game development student described year-long group projects involving ten or eleven people across a range of disciplines – programmers, artists, and designers – each with very different relationships to AI. Some are in disciplines that demand AI fluency. Others have a deep vein of suspicion about it. All are assessed together.
As we noted here, that’s a mixture of students in different disciplines, and students with different risk tolerances around AI use, all sharing accountability for a joint output. The student who avoids AI because the academic regulations are “such a grey area between getting you into trouble immensely, or actually being properly referenced” is graded alongside the student who has been told by their discipline to embrace it.
Existing survey data on group work dissatisfaction is already astonishing. Add AI into the mix and you have a scenario where principled students – the third circle in the original framework – are forced into shared outputs with students operating in the first or second, with no mechanism for disentangling who contributed what or how.
One student put the question directly:
Do you think that creates a problem when you compare students who won’t use AI, for whatever reason, and those that do?
Another replied:
I 100% agree that students who don’t use AI are disadvantaged, but think that’s a reason not to mandate its use unilaterally.
Weighing up what to do about others’ AI usage becomes an additional burden:
Oh it was bad enough when you just had someone lazy, but someone in my group just straight up pasted ChatGPT into the doc and didn’t even check it. And I’m sat there like – do I snitch? Do I just fix it myself at 2am? Why is that on me? I’m not the AI police.
And it cuts both ways:
I’ve got groupmates who are terrified to even use AI for brainstorming because they think they’ll get thrown off the course. So now I’m spending half my time going “no honestly it’s fine for this bit” and the other half actually doing the project. Like – when did that become my responsibility?
The burden of managing other people’s AI ethics in a group project – where the assessment design gives you no mechanism for doing so – falls on the students who have thought hardest about those ethics. Everyone else just gets on with it.
What the system can’t see
If you’re someone reading this and thinking “well, we just need better detection” – the students have already anticipated you.
The academic judgement framework that protects assessment decisions from challenge was designed for a world in which a marker with a red pen could point to copied sentences. It held, just about, when Turnitin could trawl the internet for matching text.
It does not hold when a marker is asked to determine whether a student used AI based on the frequency of em dashes, whether the subheadings use title case, or whether the essay has a suspicious fondness for the word “foster”.
As I’ve argued before, extending the scope of unchallengeable “academic judgement” to cover AI-misconduct determinations puts students in an impossible position. They can’t challenge the decision because the university has deemed it a matter of academic expertise. But the “expertise” is often an algorithmic guess dressed up as professional judgement – and the student has no way to prove a negative.
Meanwhile, the students in these focus groups – the ones who have thought carefully about AI use, who can distinguish between six different modes of engagement, who can describe with precision when real learning happens and when it doesn’t – don’t feel consulted on any of it. They’re being governed by policies that most of them have outrun.
The wrong question
I get why, but any fixation on detection is a fixation on the wrong question. Instead of asking how to catch students using AI, the right thing to do is to ask why an assessment system produces a set of incentives that punishes the most conscientious students and rewards the least conscientious.
The three circles framework was always about incentives, not morals. The determined cheaters have made a choice the system can’t reach. The principled non-users have made a choice the system doesn’t reward. And in between sits the vast anxious middle – students who could go either way depending on whether the system gives them a reason to learn or merely a reason to produce.
Every hour spent on detection tools, AI declarations, and misconduct procedures aimed at the anxious middle is an hour not spent on the thing that would actually change the picture – assessment that includes an accountability moment where students have to show they understand what they’ve produced.
Students told us, repeatedly, that this single structural feature changes everything about how they use AI. When they know they’ll be asked to explain their work, they use AI to learn. When they don’t, they use it to produce. The sector can keep building nets that catch one group and miss two. Or it can redesign the water.