(Bench)mark my words: managing preparations for subject-level TEF

As we eagerly await the results of the Teaching Excellence Framework, it’s a good time to look at how plans for future exercises are coming along. With subject-level TEF still promised, if a little further away than originally intended, there’s a lot of detailed data to think about as institutions aim to position themselves as well as possible for future rounds of evaluation.

Many a member of the House of Lords shared their own, or their lobbyists’, unhappiness with TEF, its ratings and the metrics. The ‘lessons learned’ review – certainly a test of the DfE TEF team’s ability to balance many competing interests – will no doubt return to the peers’ concerns. Yet by the time it is conducted, we will know a lot more about the public impact of the results on institutions.

Masters of all they survey

Universities have for some years now been working on their data gathering capabilities. Senior managers and planners now have dashboards of data with which to compare modules, programmes, departments, faculties and more.

From those internal data sources comes predictive modelling, intended to enable rapid responses where an area of the university is underperforming and likely to drag down scores in the National Student Survey (NSS). Universities are also making use of qualitative data from students’ comments to understand concerns, respond where necessary, and – importantly for this survey-heavy era – be seen to respond to students.

Staff engagement with this surveying is just as important as students’. The quest for high response rates, challenging at a time of regular complaints about students’ ‘survey fatigue’, is not just about robust data. It is also about the need to convince academic and professional staff both that the results meaningfully reflect students’ views and that they will be used in ways which make the collection of data worthwhile.

Looking further ahead, it’s also possible that students’ survey responses could form part of the assessment of learning gain, much sought after by those seeking to refine measures of teaching quality.

While it’s interesting to compare disciplinary performance within a university, that’s not the nature of TEF’s game. The benchmarking of NSS results in TEF uses the principal JACS codes (the highest level of subject coding, such as ‘social studies’ or ‘engineering and technology’) as well as students’ age, ethnicity, gender and disability. That gives 1,620 benchmarking groups: for TEF, institutions are compared against similar profiles of students rather than in a set peer group of other providers.
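To make the idea of comparing against “similar profiles of students” concrete, here is a minimal sketch of a benchmark calculated as a weighted sector average, with the weights taken from a provider’s own student mix. It illustrates the general technique rather than the official TEF calculation: the function name `benchmark`, the two-factor groups (subject and age band only) and every figure are assumptions invented for the example.

```python
def benchmark(provider_counts, sector_rates):
    """Weighted average of sector rates, weighted by the provider's own student mix.

    provider_counts: {group: number of the provider's students in that group}
    sector_rates:    {group: sector-wide positive response rate for that group}
    """
    total = sum(provider_counts.values())
    if total == 0:
        raise ValueError("provider has no students")
    weighted = sum(n * sector_rates[group] for group, n in provider_counts.items())
    return weighted / total


# Hypothetical benchmarking groups keyed by (subject, age band).
# All rates and counts below are invented for illustration.
sector_rates = {
    ("social studies", "young"): 0.80,
    ("social studies", "mature"): 0.90,
    ("engineering and technology", "young"): 0.70,
}
provider_counts = {
    ("social studies", "young"): 50,
    ("social studies", "mature"): 50,
}

provider_score = 0.88  # the provider's own observed rate (invented)
provider_benchmark = benchmark(provider_counts, sector_rates)
difference = provider_score - provider_benchmark

print(f"benchmark: {provider_benchmark:.2f}, difference: {difference:+.2f}")
```

The point of the design is that two providers with identical raw scores can sit above or below their benchmarks depending on who their students are, which is why the comparison is not with a fixed peer group.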

Many universities are therefore thinking about how they can augment the disciplinary benchmarking in their internal analysis beyond the data coming from NSS (for which the results are not benchmarked for students’ characteristics). Module survey data can go some way to helping with this. For example, Electric Paper (which supplies evaluation surveys) is able to compare students’ responses against a subject benchmark built on its data from across institutions, at both subject area and individual module level, adding a level of detail not otherwise available.

The data burden

The weight placed on the available data – particularly NSS and JACS – is significant: that data is already crucial to the information supplied to prospective students and to quality assessment.

NSS wasn’t created for the latter purpose, and there should be recognition that this extension of its purpose means that the topics covered by the survey are bound to get more attention than other aspects of the student experience. TEF, in its current form, only uses the first twelve questions of NSS. Course organisation, libraries and IT facilities, and communication skills are currently outside the exercise’s scope. Making the case for investment in these areas is going to be harder when the emphasis of evaluation lies elsewhere. Investment may be directed to areas where there isn’t real need for improvement, in order to make only marginal gains for TEF.

Student experiences can vary greatly across the different departments in a university. More often than is supposed, this is due to factors within universities’ control, such as student-to-staff ratios, tutoring systems, contact time, or other more nebulous aspects of departmental cultures. Data can be used to identify where there are real differences, after accounting for baseline differences between academic disciplines. But that level of nuance presumes that universities have the capacity to systematically harvest and process data, the tools to communicate data effectively, and the necessary support for action to improve. Those that do will have a significant advantage.
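One simple way to separate “real” differences from noise is to express each department’s gap from its discipline baseline in standard errors, so that a large gap based on a handful of responses is not over-read. The sketch below is illustrative only: the helper `z_score`, the department figures, the baselines and the 1.96 cut-off are all assumptions made for the example, and the calculation treats responses as independent yes/no answers.

```python
import math


def z_score(dept_rate, n_respondents, baseline_rate):
    """Gap between a department and its discipline baseline, in standard errors.

    Uses a simple binomial standard error around the baseline rate.
    """
    se = math.sqrt(baseline_rate * (1 - baseline_rate) / n_respondents)
    return (dept_rate - baseline_rate) / se


# (department, observed positive rate, respondents, discipline baseline)
# All values are invented for illustration.
departments = [
    ("History", 0.90, 100, 0.80),
    ("Physics", 0.88, 25, 0.80),
]

THRESHOLD = 1.96  # conventional cut-off, assumed here for illustration

flagged = [
    name
    for name, rate, n, baseline in departments
    if abs(z_score(rate, n, baseline)) >= THRESHOLD
]

print(flagged)
```

In this invented data both departments sit well above their baseline, but only the one with 100 respondents clears the cut-off. The gap for the department with 25 respondents is too uncertain to act on by itself.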

Metrics mania

The last few years have seen a shift from qualitative evaluation methods towards metrics. There is some well-founded criticism of the metrics approach, but this can be clouded by an uncritical resistance to change in some quarters of the sector.

Yet as well as becoming more prevalent, metrics are also getting more complicated. TEF benchmarks are a perfect example of this when set against the hitherto simple use of absolute NSS scores in league tables.

The noble motive behind TEF – to create parity of esteem for teaching and research – should be lauded. But this needs to be balanced by appreciation of the realities of the data we have, and its limitations in realising this endeavour. Institutions must develop their data gathering and analysis capacity to make sure data is used for enhancement. And we should ensure that attention and investment are devoted to areas important to students’ learning, not just to those metrics that happen to be selected for assessment.

With thanks to Corony Edwards and colleagues managing student surveys for their assistance with this article. If you’d like to explore the key TEF questions further, join the debate at Wonkhe’s conference on 8 June.