James is Operations Coordinator at Wonkhe

Charles Knight is Assistant Director, Knowledge and Innovation at Advance HE

David Kernohan is Deputy Editor of Wonkhe

We’re sure, like us, you’ve seen it all in recent weeks: from articles suggesting AI can create academic papers good enough for journals, to lecturers being urged to review their assessments in light of ChatGPT’s disruptive capabilities.

But are AI text generation tools really the problem? Or do they reveal more serious issues around assessment practices and the academic/student relationship?

Academic integrity

If we continue with current assessment methods, there’s no clear solution on the horizon to mitigate the use of AI tools. We’ve seen some efforts to employ the “detection tool” approach used for other forms of academic malpractice – but every single one of them has been beaten in practice, and many flag the work of humans as AI-derived.

Simply restricting access is not an option – the technical landscape is moving quickly, with Microsoft and others releasing a range of AI-enhanced tools (such as Bing Search with ChatGPT) and platforms (such as Microsoft Teams). A myriad of new AI large language models (LLMs) are in the works, or soon to be released, such as Google’s Bard, or NVIDIA’s NeMo. Beyond text, LLMs are moving into sophisticated image and video generation. It’s impractical and improbable for students to avoid using such tools as they become ubiquitous in society – given how ‘frictionless’ they are designed to be, a “ban” on such tools would make about as much sense as banning the use of Word’s spell checker.

A starting point for many is that academic integrity is tarnished through students engaging in malpractice (however defined), and even if we accept this deficit framing, it presents a range of both practical and moral problems, as Jan McArthur has argued on Wonkhe in the past in relation to use of plagiarism detection software. Outside of academia, graduates write and generate ideas drawing on a range of sources using a wide variety of tools and approaches – we need to be considering supporting students in using AI as a part of effective academic work and preparation for graduation (and the participation in work and wider society that follows).

Protect, or accept and progress

“Computer-based maths” is a movement with the aim of reforming maths education. The principle is that curricula (and by extension, assessment) should “assume computers exist”, and move students towards a more developmentally effective learning experience. Higher education as a whole cannot get away from “assuming AI exists” and that students will use it – so there is a need to build curricula and assessments around this reality.

So rather than simply focusing on protecting the status quo, we could be working with students to design the assessment of the future. Academics have a responsibility to teach students how to use all kinds of tools and resources (search engines, academic literature databases, open data) in a useful but moral and ethical fashion – and already design assessments with a core focus on academic integrity, but augmented with a sense of trust and an understanding that these tools exist and will (and should) be used.

One of the fundamental shifts in assessment is therefore likely to be around defining the level of creativity and originality lecturers expect from students, and what these terms will mean. The future of knowledge work is likely to expect people to range across multiple domains of expertise – so asking the right questions (whether this is of people, data, or artificial intelligence) becomes paramount.

For instance, we can see that the current iteration of AI tends to “hallucinate” many things, providing fluently phrased answers that lack factual rigour. There have been numerous reports of AI generating fake references or inventing entirely spurious facts. Although it is claimed that ChatGPT can write an essay deserving a 2:2, in practice the user requires a great deal of critical reasoning and research to bring the average response up to that standard. Such skills have always been in demand – and it is possible to imagine “improving” an AI answer in this way as an interesting assessment design.

The QAA recently published a briefing note to support the sector with the challenges to academic integrity that it is facing. Its advice is clear: communicate early with students and discuss the technology with them; design assessments around co-creation, iteration, and critical thinking skills; and recognise that detection tools are unverified and ineffective. This semester’s forthcoming assessment period may be turbulent across the sector; we’re never returning to pen and paper assessment, just like we’re never going to deny electricity exists. But surely a culture of academic integrity can only be built upon mutual trust and understanding; students don’t inherently wish to cheat, and framing these tools as cheating doesn’t help anyone.

The next steps

To paraphrase William Gibson, AI is here – but it is not evenly distributed. The currently available technology is provided on a semi-commercial basis. Capabilities and terms of use could change at any point at the whim of owners and operators. For example, ChatGPT is a trained language model owned and operated by some of the biggest tech investors in the world: Peter Thiel, Sam Altman, and Elon Musk to name a few. There is already a premium model (faster responses, more reliable access) and it is not unreasonable to suppose that paid-for AI will outstrip the current free and freemium models.

We clearly cannot design assessments based on student access to a paid product without planning to ensure all students can access it on the same terms. We can’t build learning and assessment around tools that could disappear or change radically at any moment. We need to assess students based on what they can do, not what they can afford.

As with any new technology there are risks and opportunities to the use of AI in education. Older readers may recall similar concerns as the internet became widely accessible – and downloading an essay from a website replaced the dominant model of buying one from a student in the year above. The technological arms race that followed brought about the rise of the essay mill, and some of the most effective academic malpractice that money could buy. Now that approach has been banned, maybe the latest and greatest AI will pick up the premium misconduct mantle – or something else will arise in the darker corners of the web.

A failure to iterate the design of assessment, and a reliance on the technology of “detection” as a stopgap, doesn’t just leave academic integrity in question – it poses fundamental questions about structural inequality too.

8 responses to “ChatGPT, assessment and cheating – have we tried trusting students?”

  1. Great article that recognises what many seem to be struggling with – this is a tool which is here to stay and therefore we need to change how we educate (and what future careers look like) in light of this. Trying to “beat” it is like trying to say we should be doing repetitive calculations by hand rather than with a spreadsheet. Current students will definitely be using it in industry once they graduate – we need to prepare them for that reality.

  2. Students of marketing-related subjects have been using this technology for some time through critique and analysis of the AI outputs, and there may well be other examples.

    I agree with the general direction of the article, but I am not sure I fully agree with this statement, as it suggests a very constrained learning and assessment approach: “We can’t build learning and assessment around tools that could disappear or change radically at any moment.” If we embrace authentic assessment approaches such as inquiry into and exploration of the world of AI, we can do this without a concern for particular platforms.

    1. When it comes to “authentic” assessments – a major problem is that current LLMs (e.g. ChatGPT) are arguably much better at completing more “authentic” tasks than rigorous academic writing. The reason is pretty obvious – when you look at the vast quantity of material the LLMs were trained on – most of it was not rigorous academic writing – it was stuff like non-academic books, newspaper articles, blog posts, forum contributions, product brochures, public health guidelines, advice pages from charities/NGOs, song lyrics, poems etc.

  3. ChatGPT is not suitable as a tool for assisting academic research. Quite simply it is a pathological liar. It makes up facts in the same way that it writes bad poetry. It also makes up references. At first glance the results look real and compelling. But dig deeper and it is possible to find many flaws with its output. Furthermore, on pointing out these flaws it apologises, admits it got things wrong and proceeds to provide an alternative, convincing but equally wrong, response. In short ChatGPT cannot be trusted, and any student using it risks getting low marks.

  4. A thoughtful article and the idea of asking students to critique a ChatGPT answer is a good one (though I expect ChatGPT could simply end up critiquing its own work). Even before the proliferation of this technology there has been significant concern about cheating in assessments, whether purchasing answers from essay mills (or borrowing from peers). The arrival of take home exams has led to many exams being answered following discussion via WhatsApp (and in the case of MCQs not even discussion). I have even heard students claim that Universities implicitly tolerate it because “they know everyone’s doing it”. If we are serious about restoring some integrity there must be an argument, despite the obvious limitations, for a return to timed, invigilated, in-person exams. We could end up with vivas for all assessments so we check the essay is the student’s own work in the way we check PhD theses are. But I suspect that could not be resourced.

  5. An interesting article

    AI has been used in education for several decades, and over the years, it has been applied in various areas of education such as personalizing learning, improving student assessment, and facilitating research. Some well-known examples of AI in education include IBM’s Watson, Carnegie Learning’s Cognitive Tutor, and Knewton’s Adaptive Learning Platform.

    However, the use of AI language models like ChatGPT in higher education is the subject of ever-increasing controversy. Many people may remember the ‘excitement’ of Second Life, a virtual platform introduced in 2003 that was expected to revolutionize higher education by offering immersive, interactive learning experiences. However, despite early excitement, the platform failed to gain traction in the academic world. Similarly, MOOCs or massive open online courses were once touted as the future of higher education, but they have largely failed to live up to their potential in the academic world.

    There are the obvious concerns about cheating in assessments with the arrival of ChatGPT, but as highlighted by the comments and in the article here the model is only as ‘clever’ as the initial search terms used to initiate the response. Also, the inclusion of assertion reasoning questions (ARQs), a type of test question that assesses a person’s ability to evaluate and analyze information, renders the use of AI obsolete.

    While ChatGPT has great potential, we do need to be cautious about its use in education to avoid repeating past failures such as Second Life and MOOCs. If we don’t tread carefully, we risk relegating ChatGPT to the same fate. The ability to interact with AI is fast becoming a ubiquitous feature in modern society, and we want our undergraduates to excel in this skill. Thus, we need to find a way to harness the potential of ChatGPT while ensuring that it doesn’t compromise the integrity of education.
