by David Kernohan

26/06/17

Fun with flags (pt 2): How did metrics work in TEF outcomes?

How did the TEF's flagging system actually work and what can it tell us about the final outcomes? We've taken a look at the data.

David Kernohan is Deputy Editor of Wonkhe

Ant Bagshaw is a Senior Advisor in L.E.K. Consulting’s Global Education Practice and co-editor, with Debbie McVitty, of Influencing Higher Education Policy

In the run-up to the TEF results, one of the features of the exercise that raised the most questions was the way in which providers’ benchmarked scores were ‘flagged’.

In December, Wonkhe published a guide to the flagging process and shared HEFCE’s provisional data on how institutions had fared on the metrics. Flags are vitally important to the TEF process as they indicate the extent to which an institution is beating, or falling short of, the benchmark expected for its students.

Where flags appear on the data sheets, they can be positive or negative, or even ‘double positive’ or ‘double negative’ for more significant results in either direction. To be more precise, a double flag deviates from the benchmark by at least 3 percentage points, a single flag by at least 2. Fans of the UK Key Performance Indicators (UKPIs) will recall that 3% is the significance bar for that exercise. There can also be a neutral score, which means that the institution is performing more or less as expected.
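As a rough illustration of the thresholds just described, here is a minimal Python sketch. The function name `flag_for` and the flag symbols are our own shorthand, not anything from the TEF guidance. It assumes the difference from benchmark is given in percentage points and that higher is better, and it ignores any statistical significance test (such as the z-scores) that the official flagging also takes into account.

```python
def flag_for(difference_pp: float) -> str:
    """Return a flag symbol for a difference from benchmark.

    difference_pp is the indicator minus the benchmark, in percentage
    points (an assumption; the official process also looks at z-scores).
    """
    magnitude = abs(difference_pp)
    if magnitude >= 3:
        # Double flag: at least 3 points away from the benchmark.
        return "++" if difference_pp > 0 else "--"
    if magnitude >= 2:
        # Single flag: at least 2 points away from the benchmark.
        return "+" if difference_pp > 0 else "-"
    # Neutral: performing more or less as expected.
    return "="


if __name__ == "__main__":
    for diff in (3.5, 2.0, 0.5, -2.4, -3.0):
        print(diff, flag_for(diff))
```

The sample differences in the final loop are invented for illustration and do not correspond to any provider.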

While the exercise’s published guidance stated that these flags would not cancel each other out – the notion was that it should have been difficult to receive a Gold award with any negative marks against you – a number of negatively marked providers received Gold. It seems, therefore, that there has been some balancing off between the ups and the downs.

However, some may read the calculations that follow as an approximation of a ‘panelless’ TEF, and we should be very careful not to allow that interpretation. In viewing these graphs we should be clear that none of this work shows what an institution ‘should have had’: the panel has always had the final say in allocating awards based on metrics, context and the provider statement. Neither should we assume that anything here approximates the raw allocations that the panel used as a starting point. None but the panel and secretariat can know what these were and, as before, they are irrelevant in the context of determining TEF awards.

‘Implied awards’

We tested how far the flag criteria were applied by allocating an ‘implied award’ based on the criteria given in the guidance. Any institution with three or more positive flags and no negative flags was given a Gold, any institution with two or more negative flags was given a Bronze, and all the others were given a Silver. The chart below shows all those institutions whose ‘implied award’ differed from their final outcome.

Chart: TEF final outcomes changed from initial hypothesis
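The ‘implied award’ rule described above can be sketched in a few lines of Python. This is our own illustration rather than the panel’s method. The function name and flag symbols are hypothetical, and we assume that double flags count towards the positive and negative totals in the same way as single flags.

```python
POSITIVE = {"+", "++"}
NEGATIVE = {"-", "--"}


def implied_award(flags: list[str]) -> str:
    """Apply the three-way rule to one provider's list of flags.

    Assumption: a double flag counts as one positive (or negative)
    flag, exactly like a single flag.
    """
    positives = sum(f in POSITIVE for f in flags)
    negatives = sum(f in NEGATIVE for f in flags)
    if positives >= 3 and negatives == 0:
        return "Gold"
    if negatives >= 2:
        return "Bronze"
    return "Silver"


if __name__ == "__main__":
    # Invented flag profiles, not real provider data.
    print(implied_award(["++", "+", "+", "=", "=", "="]))
    print(implied_award(["+", "+", "+", "-", "=", "="]))
    print(implied_award(["-", "--", "+", "+", "+", "="]))
```

Note that a single negative flag is enough to keep a provider out of Gold under this rule, however many positives it has, which is why Golds awarded to negatively flagged providers stand out.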

Flag scoring

Next, we tried to use the flags to produce a ranking. Our scoring system combines the positive and negative flags. We’ve allocated a score of +2 to a double positive, +1 to a positive and 0 to neutral. A negative flag gets -1, and a double negative -2. We shouldn’t labour under the assumption that double flags are, in every case, ‘twice as good’ or ‘twice as bad’ as singles. You can see from the ‘z-scores’, which express each provider’s distance from its benchmark in units of standard error, that some providers are much further away than others. However, the assessors and panel were instructed to take the flags into account in the first instance when looking at providers’ performance. Thus our combination of the flags, which creates a 25-point scale running from -12 to +12 across the six core metrics, gives a good idea of the spread of performance.
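The scoring described above amounts to a simple lookup and sum. The sketch below is illustrative only: the names `FLAG_POINTS` and `flag_score` and the flag symbols are ours, and it assumes six core metrics, which is what a 25-point scale from -12 to +12 implies.

```python
# Points per flag, as set out in the paragraph above.
FLAG_POINTS = {"++": 2, "+": 1, "=": 0, "-": -1, "--": -2}


def flag_score(flags: list[str]) -> int:
    """Sum the points for one provider's flags."""
    return sum(FLAG_POINTS[f] for f in flags)


if __name__ == "__main__":
    # Invented profile: two double positives, one positive, one negative.
    print(flag_score(["++", "++", "+", "=", "=", "-"]))
    # With six metrics the extremes are -12 and +12, i.e. 25 possible scores.
    print(len(range(-12, 13)))
```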

Flags as UKPIs

We also decided to see what would happen if we treated the TEF core metrics as if they were UKPIs. In this case we simply allocated a score of 1 for a double-positive flag and of -1 for a double-negative flag. This 13-point scale gives a tighter spread but a better indication of what would, in other settings, be seen as significant over- and under-performance.
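The UKPI-style variant can be sketched the same way, counting only double flags. Again the names are hypothetical, and the six-metric assumption is inferred from the 13-point scale (-6 to +6).

```python
# Only double flags score; single and neutral flags count as zero.
UKPI_POINTS = {"++": 1, "--": -1}


def ukpi_score(flags: list[str]) -> int:
    """Score one provider's flags as if they were UKPIs."""
    return sum(UKPI_POINTS.get(f, 0) for f in flags)


if __name__ == "__main__":
    # Invented profile: one double positive, one double negative, rest ignored.
    print(ukpi_score(["++", "+", "+", "=", "-", "--"]))
```

Because single flags drop out entirely, two providers with very different flag scores can end up with the same UKPI-style score.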

2 responses to “Fun with flags (pt 2): How did metrics work in TEF outcomes?”

Andy Leighton says:

Jun 26 2017 at 10:40 am

You’ve got to wonder at an assessment that gives London Met [-11/25 or -5/6, whichever you prefer] a Bronze…. anything other than a FAIL is ridiculous.

David Morris says:

Jun 26 2017 at 4:54 pm

I think that’s a question that should be asked of the ‘rigorous threshold standards’ enforced by QAA that everyone’s been keen to stress since Thursday…
