TEF results - Who moved up and who fell down?

The TEF panel used ‘flags’ to indicate where each individual core and split metric was significantly lower or higher than would be expected.

Reading through the provider statements, it is generally clear where the flags have been noted – sometimes implicitly – using language like “substantially below benchmark”.

The TEF guidance documented how flags would be used in the assessment process thus:

A provider with three or more positive flags (either + or ++) and no negative flags (either – or – – ) should be considered initially as Gold.
A provider with two or more negative flags should be considered initially as Bronze, regardless of the number of positive flags. Given the focus of the TEF on excellence above the baseline, it would not be reasonable to assign an initial rating above Bronze to a provider that is below benchmark in two or more areas.
All other providers, including those with no flags at all, should be considered initially as Silver.
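
For readers who prefer rules as code, the three starting-point rules above can be sketched as a short Python function. This is only an illustration of the guidance as quoted here: the function name and the idea of passing in simple flag counts are our own assumptions, and it does not attempt to model the panel's later moderation.

```python
def initial_hypothesis(positive_flags: int, negative_flags: int) -> str:
    """Return the starting rating implied by the core-metric flags.

    Hypothetical helper: counts are the number of core metrics flagged
    positive (+ or ++) and negative (- or --). It covers the initial
    hypothesis only, not the panel's final judgement.
    """
    if positive_flags < 0 or negative_flags < 0:
        raise ValueError("flag counts cannot be negative")

    # Two or more negative flags: Bronze, regardless of positive flags.
    if negative_flags >= 2:
        return "Bronze"

    # Three or more positive flags and no negative flags: Gold.
    if positive_flags >= 3 and negative_flags == 0:
        return "Gold"

    # Everyone else, including providers with no flags at all: Silver.
    return "Silver"


if __name__ == "__main__":
    for pos, neg in [(3, 0), (5, 2), (0, 0), (3, 1)]:
        print(pos, neg, initial_hypothesis(pos, neg))
```

Note that a single negative flag is enough to rule out an initial Gold, however many positive flags a provider has, but it takes two to force an initial Bronze.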

At the time, Wonkhe wrote a full explanation of the flagging process.

However, this initial hypothesis was always going to be moderated by an examination of split metrics and provider statements, although – again – the extent of this was supposed to be limited.

“The likelihood of the initial hypotheses being maintained after the additional evidence in the provider submission is considered will increase commensurately with the number of positive or negative flags on core metrics. That is, the more clear-cut performance is against the core metrics, the less likely it is that the initial hypothesis will change in either direction in light of the further evidence.”

So the question raised today was: to what extent did the panel choose to exercise their discretion and overrule the tyranny of metrics in order to arrive at a final outcome?

The answer is: a lot. 64 times (out of 295 judgements – 22%), to be precise, and 35 times for higher education institutions and alternative providers.

How did the panel exercise their judgements?

Three institutions had their final assessments downgraded from their initial hypothesis: BPP University, Bucks New University, and the British School of Osteopathy.
Thirty-three institutions had their final assessments upgraded from their initial hypothesis, including eight in the Russell Group, and twelve in London.
Seventeen institutions were upgraded from a Bronze to Silver, including University College London, King’s College London, and the University of Bristol.
Fifteen institutions were upgraded from a Silver to a Gold, including Imperial College London, the University of Nottingham, and the University of Birmingham.
The Royal Veterinary College was upgraded from a Bronze to a Gold!
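
As a quick sanity check, the figures above can be tallied in a few lines of Python. The counts are taken straight from this article; the variable names are ours, and nothing here comes from the underlying TEF dataset.

```python
# Counts of institutions by (initial hypothesis, final rating), as listed above.
movements = {
    ("Bronze", "Silver"): 17,
    ("Silver", "Gold"): 15,
    ("Bronze", "Gold"): 1,
}
downgrades = 3  # BPP, Bucks New and the British School of Osteopathy

RANK = {"Bronze": 0, "Silver": 1, "Gold": 2}

# An upgrade is any movement that ends on a higher rating than it started.
upgrades = sum(
    count
    for (start, end), count in movements.items()
    if RANK[end] > RANK[start]
)
changed = upgrades + downgrades

# Share of all judgements where the final rating differed from the metrics.
share_of_all = 64 / 295

print(f"Upgrades: {upgrades}")
print(f"Upgrades plus downgrades: {changed}")
print(f"Share of all 295 judgements changed: {share_of_all:.1%}")
```

The upgrades do sum to the thirty-three stated. Adding the three downgrades gives 36, one more than the 35 quoted earlier; the figures given here do not show which number is off, so treat the totals as approximate.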

The TEF panel will no doubt have many reasons for this, and their statements give an indication as to whether provider submissions were deemed to address negative flags well enough to justify an upgrade. Some of these reasons are suggested in the panel judgements that we have already covered. Institutions in London in particular appear to have been given some significant leeway for poor NSS scores. As my colleague Ant Bagshaw put it:

Once again, TEF has failed by trying to capture the full and varied nuance of a whole institution into a single word category and a few bullet points of unvalidated text. This can be remedied by reading the statements and the data on performance. But these will take some work (if it’s possible) to be understandable to a less wonkishly-inclined audience. While the TEF process has aimed for transparency, it has missed the mark. In doing so, it may have made things much more opaque than they needed to be.

Well, we will at least try our best to make these deviations from the data clear to our own ‘wonkishly-inclined audience’.

Here are those higher education institutions which were upgraded in graph form, together with the Wonkhe ‘flag score’ we have used elsewhere (explanation here). The labels denote the ‘initial hypothesis’ as determined by the metrics, and the colours denote those institutions’ final outcomes. Upgrades for institutions with higher ‘flag scores’ might have been easier judgements for the panel to make than others, but we have no doubt that there will be (and already is) some disgruntlement about how that discretion was exercised.
