
Marking? Explain it in football terms

Andy Grayson ponders the difference between a referee making offside decisions and an academic marking essays.

Andy Grayson is Associate Professor in Psychology at Nottingham Trent University. 

Who’d be a football referee? I mean, seriously, who’d do that job? You have to make difficult decisions about complex matters, under time constraints.

On occasion, mistakes are made. Sometimes, even when multiple experts replay those complex matters over and over, there is still no consensus on whether a particular outcome is “correct”. And literally millions of people might be watching.

Thankfully, in higher education, we don’t have to deal with this. Not, at least, with the “millions of people watching” bit. However, the rest is familiar. We too have to make difficult decisions about complex matters, under time constraints. Take for example the business of marking an essay.

Bill Shankly

Of course, I’m not pretending that the outcome of marking a student’s work is comparable in importance to the outcome of judging whether a defender’s arm is in an unnatural position when struck by the ball in the area during a crucial Premier League match. Those two scenarios are not at all equivalent. No, the judgement about the student’s piece of work is considerably more important. Because life isn’t a game.

So, what can a referee guarantee the players of each team? That the outcomes of all their decisions will be correct? That every other referee will agree with their decisions? I think not. And we can’t make those guarantees to students either.

Marking a piece of assessed work at university involves the exercise of academic judgments. We shouldn’t pretend to students that the same academic judgments would always be made by any given marker. And we shouldn’t pretend that they will always be “correct” in some sort of objective sense. We should also be prepared to have grown-up conversations with students to this effect, in the same way that Jürgen Klopp and Mark Clattenburg might talk once the heat of the match has receded.

After the final whistle

It’s long been a concern of mine how we handle conversations with students who are not happy about the grade that they have received for a piece of assessed work. It’s my experience that the more honest we can be with them, the more satisfied they are with the outcome of such conversations.

That might sound paradoxical given that part of that honesty entails saying, “yes, it is the case that a different marker might have given your work a different grade.” How on earth does that “admission” lead to greater student satisfaction?

The answer lies in the notion of empowerment. If we are open and honest with students, we empower them to be confident in their ability to evaluate their own work. If we hide behind the shield of, “we know best, and this is the only grade that could be awarded” then we take away their capacity to disagree. University life should be all about enabling disagreement.


This is clearly not to say that “anything goes”. So let’s map out the parameters of what we can (and cannot) promise to students, assuming that we are competent and sufficiently expert to mark their work in the first place.

The type of assessed work that I am talking about in this piece is the essay (or “report” or “dissertation” or any of those longer pieces that require a marker to come to an overarching, holistic judgment).

In my view, when assessing student work of this type, we are in a position to promise students three things. We can guarantee to mark their work:

  • Carefully
  • In line with published criteria
  • In good faith

That’s it. Nothing more, and crucially, nothing less. And that is more or less what a referee can promise. Except their criteria are generally referred to as “rules”.

Having guaranteed students these things, we have to be honest about the fact that the conclusion we come to about their work is thereafter a matter of academic judgement. And that’s the point at which different markers can legitimately, and fairly, come to different conclusions. In the context of the specialised, complex grading decisions that have to be made at degree level, we have to look for fairness in the process of assessment, more than in the outcomes of it.


The alternative is to find ways of assessing work that are less judgment dependent and more algorithmic. For some of the time, and for some types of subject matter, that’s fine. We can use short answer questions and multiple-choice tests (for example) for which there can be much higher levels of inter-marker agreement. The outcomes of such tests can be “fairer”. This is rather like the process of drawing horizontal lines across a screen and measuring whether the striker’s toe is nearer the goal than the defender’s heel. A kind of academic VAR.

However, just as VAR cannot be used to determine the outcome of every decision made on a football field (no matter how much money you spend on it), our more “objective” tests are not suited to every kind of assessment that we need to set. Relying too much on this approach to measuring student performance, in order perhaps to achieve demonstrably higher inter-marker agreement, turns out to be unfair in other ways. It privileges those, for example, who are able to marshal a lot of facts and set-piece answers, to the detriment of those who are stronger at the higher-level skills of working with ideas inventively, synthesising, thinking critically, and so forth.

Publication of criteria is crucial to this implicit contract between marker and marked. The student has to have a way of understanding what is going on in the head of the person who is charged with assessing their work. And there has to be the possibility of discussion (‘dialogue’) about the grading of the piece of work, framed by reference to those criteria.

In what ways does this empower students to evaluate their own work? Well, if we simply exercise the power inherent in our role, and assert (in effect), “I am right, you are wrong, and all my colleagues would agree on that” then the student who disagrees with us has nowhere to go with their own view of what they have done.

If, on the other hand, they are permitted to see that disagreement can exist with regards to these judgments, then they are in a position to make informed decisions about how to tackle subsequent tasks. They can be encouraged to be confident in working on their approach to these matters. And confidence in studying is an important commodity.

Red or yellow card?

Bear in mind, the types of judgements we are talking about are inherently complex. They often revolve around deciding whether someone has performed “excellently” on a criterion, or merely “very well”. The overall grade for a piece of work depends on such fine-cut judgements on multiple criteria, many of which can arguably (and fairly) fall one way or another.

And that is not dissimilar to the way in which the results of a football match can depend on whether a player is sent off for “using excessive force” instead of merely being cautioned for a “reckless” challenge, or whether the forward who was standing in an offside position when the winning goal went in was adjudged to have been interfering with play or not. The fact that these sorts of judgments are made in higher education when not so very many people are watching makes it extra important to be open and honest about them.

4 responses to “Marking? Explain it in football terms”

  1. Hey Andy, keep this up! (As if you needed any encouragement.) In my reading one of your key propositions is that ‘objective’ testing “privileges those, for example, who are able to marshal a lot of facts and set-piece answers, to the detriment of those who are stronger at the higher-level skills of working with ideas inventively, synthesising, thinking critically, and so forth.”

    In my (internationalist) view this box-ticking approach to exams that prioritises the measure over the measured (much like the British box-ticking approach to, well, everything) is what’s most wrong with the British educational system, not only at universities but especially in secondary schools. My son is seven and a half and in Year 3; I quake at the prospect of his facing this homogenisation and I am beginning to seek out opportunities for enrichment, as an antidote, whether in the UK or elsewhere.

    In my post at a post-1992 university (which, because of its social media policy, I cannot name here) almost every one of the handful of excellent undergraduate research students with whom I’ve had the privilege to work has ended up there because s/he’s been failed (I mean under-rated, not categorically ‘failed’) by the exams system. In short, the system’s mismeasure of them has been my good fortune.

    I remember a conversation some years ago with Bill Durden, the former president of Dickinson College who later was recruited to Bath Spa University, during which, appealing to the German system, he extolled the virtues of oral exams, both in terms of testing one’s conversancy (literally!) with the subject matter and in terms of efficiency of evaluation. Of course, unlike the written format, in orals there is no VAR, although perhaps that will change in the era of Zoom.

    1. Thanks for those encouraging words, Matthew! I think we are very much on the same page with this. Picking up your final point, I am also particularly keen on the whole notion of oral assessments which go beyond the ‘presentation’ format and become more like a viva/discussion format. I do agree that these are a really good way of assessing conversancy (command) of a subject area. Putting the resource implications to one side for the moment, we shouldn’t be afraid of the difficulties inherent in ‘assessing’ such events, because those difficulties are no more ‘difficult’ than the difficulties of assessing extended pieces of written work (like essays and so forth). We simply need to take responsibility for the complex judgments that we are then charged with making, and not try to subcontract that responsibility to more objective (but, in many contexts, probably less valid/meaningful) additive/standardised measures.

  2. Couldn’t agree more Andy. It’s such a jarring shock for our students though, to come from a school system which prioritises tick box marking, into a system where we have to admit, nay are honest enough to say with confidence, that it’s a judgement. It mirrors the change in the way they’re asked to study too – no more rote learning “facts” to be recited, now we ask them to demonstrate evaluation and critical thought. Some students really love that transition from black and white to grey, and they seem to be the ones who do really well at uni. Some hate it, and I do confess to feeling sorry for them – I feel like they’ve been set up to think education (and assessment) is one thing and then find out that it isn’t. Horrible trick to play.
