During a recent discussion that I attended on generative AI, there was that rare and fleeting thing in higher education – a panel in unanimous agreement, within the first five minutes, no less.
With heads nodded and backs patted regarding the impossibility of ever really being able to ban (or accurately detect) students’ use of AI, the panel moved on to a much more interesting discussion – the further complexities of what to actually do with AI tools like ChatGPT, Microsoft Copilot and Google Bard.
But as I sat in front of my Zoom webinar window watching the comments cycle furiously across the screen, I got to thinking – why aren’t we considering the other side of the question: what might AI be able to do for students?
There was an interesting moment where the panel looked at some early examples of staff and student guidelines on the use of generative AI in assessment. The more progressive end of the spectrum still seemed to have fallen on some iteration of the following:
Students must declare use of any AI tools, and refrain from using them without the express permission of their tutor.
For me, while maybe a little better than an outright ban, this approach still seems to wildly underestimate the impact these tools are about to have on our lives.
Prohibition has a long and storied history of not really working, and that’s for things that, sometimes, are hard to get your hands on in the first place.
As has been pointed out on Wonkhe before, in the case of generative AI tools like Microsoft’s soon-to-be-launched Copilot, the approach above would be more like giving every student a car and then asking them to declare if they intend to use the brakes.
Any staff member in charge of assessment and regulation needs to begin watching what the next phase of AI is going to look like.
This isn’t a student logging into an external site like OpenAI’s ChatGPT and typing their essay question in. This is a student logging onto their university-provided laptop, using the university-provided Microsoft package, and being prompted, while they sit in the university coffee shop, whether they’d like Copilot to turn their lecture notes into a presentation, using the university style guides they’ve got saved in their local files.
If you find the above situation an affront to academic integrity, you’re going to need to start thinking a whole lot more progressively – because that’s likely just scratching the surface of what the technology can do.
A bear in the woods
In much of the coverage about the impact of AI on higher education, I’ve also been surprised by the lack of consideration of how the sector is regulated.
From a regulatory perspective, for now at least, universities don’t just offer education for education’s sake. We are regulated by outcomes – employment, experience, prospects. Regardless of what any of our personal views are, in the eyes of the regulator, more so than ever before, students are consumers.
Universities have to ensure that their students are getting something (read: prospects and job readiness) or tempt the wrath of a certain B3 bear with its Mickey Mouse degree-beating stick.
This, to me, cuts to the heart of current debates around AI.
If academic integrity is viewed as an abstract ideal – something immovable and unchanging – then a student using Microsoft Copilot to help with an assessed presentation should be punished.
But surely, in reality, academic integrity is not this. It is self-defined, enshrined in individual institutions’ regulations and in how individuals within those institutions interpret those regulations.
As such, if we as a sector have control over how we view academic integrity, we must also have the power to shift the dial.
In doing so, we have to look at the world around us in order to better design our rules and regulations to suit the needs and behaviours of our students.
In a world where one university is preparing students for the world of work, encouraging the use of all digital tools at their disposal (including AI) and designing assessments that encourage the creative use of such tools, and another is seeking to restrict the use of such technology to fit with an antiquated regulatory framework it can’t be bothered to change, which institution do we think students will want to choose?
A new flavour of assessment: authentic and original
There’s a reason academics and university support staff are losing sleep over generative AI tools, and that reason is plagiarism.
However, as pointed out in this short but excellent presentation by Dr. Philippe De Wilde, if we as a sector shifted our fixation from seeking out plagiarism at every opportunity to rewarding originality, many of our fears would be allayed.
Imagine, for a second, we paused to ask ourselves – do we want students to simply compile resources and parrot the conclusions back in a relatively structured way (someone with a better sense of irony might say, in quite a robotic way), or do we want to design assessments that utilise research skills, new technologies, fact-checking and critical thinking, delivered through alternative means including vivas and presentations?
This opportunity to shift towards portfolio-based assessments which better prepare students for the kind of activity they will actually be carrying out in the world of work should not be underestimated.
It should not be viewed as the gradual decay of academic integrity. If we reduce the number of essays and assessments we ask of our students, we give those students (and academics) more time to engage critically with materials, and ensure when they are assessed, they are asked for justifications of their tools and research methodologies (including the use of technologies and potential ethical issues).
The student gets the opportunity to build skills using the kind of tools that will make them more employable, and the course leaders get a potentially much richer, more varied marking experience, driven by in-person conversations about subject expertise.
AI literacy is not something we can, or should, be punishing for or seeking to prevent. It is something we need to actively build into our curricula and assessment practices. And fast.
3 responses to “Generative AI can change assessment for the better”
“do we want to design assessments that utilise research skills, new technologies, fact-checking and critical thinking, delivered through alternative means including vivas and presentations.”
The problem is that new AI tools will be better at all these things than our students will be (and in some cases already are). We need to move beyond thinking that AI can’t do creativity, critical thinking, fact checking and make ethical judgements. It can. There is no magic spark that only humans have.
Will the viva then just be the student describing to the marker how they used the AI to produce their portfolio? And how do they develop subject expertise if all the work is handed off to the AI to produce for them? It is hard to critically assess the product if the critical thinking of the AI is superior to the student’s.
I agree with the conclusion that use of AI needs to be integral to some assessments. But if we want universities to still be around in the future, we need to assert the primacy of human beings knowing stuff and doing stuff, like writing and thinking for themselves. Practically speaking, in terms of assessment, this means more exams, along with assignments where the students pass on their assessment instructions for the AI to do for them.
“We need to move beyond thinking that AI can’t do creativity, critical thinking, fact checking and make ethical judgements. It can.”
Let’s be more precise about this: it can produce text output which resembles the above at a higher standard than many humans are capable of. That’s not the same as reaching that text output by a method which we would consider to be “critical thinking” or “ethical judgment”.
That distinction should provide plenty of space to produce assessments which check for what you actually want to check for, remembering that modern AIs fail comically badly once the implicit scope set by their training data is exceeded.
Unfortunately, word meanings are not precise, and what we consider words to mean ultimately ends up in a philosophical debate. For the purposes of assessment, what counts is what is graded as demonstrating “critical thinking” etc. My contention is that AI can already, or at least soon will, equal or exceed most if not all students in most areas we currently assess.
This article would be fine if we were just stuck with GPT-3.5-level models from now on. But modern AIs now very rarely fail comically. That solace did not last long (since GPT-4 was launched). Where there are failure areas, they are being addressed with developments such as plug-ins. On the contrary, my experience with GPT-4 even now makes it hard to see much space for assessments that fit current models of coursework where we assign credit to students based on their own work.
Given how slow-moving higher education can be, we have to plan for assessment based not on the current best in class (GPT-4) or the more widely used previous generation (GPT-3.5 and earlier), but on future generations (GPT-5 and beyond), which will likely arrive within a year. However, at the moment no-one really has a good idea of how we do this with the current configuration of higher education.