David Kernohan is Deputy Editor of Wonkhe

One of the ostensible rationales for the Research Excellence Framework (REF) and its predecessors is to provide the evidence and scaffold for a policy of research funding selectivity.

In arguing about research assessment exercises it is all too common to conflate the management and delivery of the exercise itself with the policy it supports. However, it is possible to support the idea of research funding selectivity while having concerns about the way the quality of research is assessed – and it is an entirely reasonable position to accept a system of peer review as a measure of excellence without signing up to selectivity as the best way to organise research funding.

Acknowledged excellence

I’ve previously told the tale of how research came to be something you would expect to find in universities. These days it is almost axiomatic that a university is a place where research is performed, but this idea (seen from the perspective of around a millennium of university activity) came comparatively late to the story. Newman’s The Idea of a University is often considered the foundational text for UK higher education – and he was dead against research being part of it. If you wanted to put a date on the moment this all changed, the 1945 Percy Report expressed it thus:

Research is a necessary concomitant of all higher teaching

Why so? Coming out of the Second World War, it was very clear that research was our national pathway to economic growth. State and industrial investment in defence-focused research and development had grown, and the peacetime overhang of this capacity is what drove the huge increases in productivity and living standards in the 1950s and beyond. The higher education system was not (at that point) churning out the scientists and technicians needed for this brave new world – to do so, universities needed a greater engagement in innovation, and this engagement needed to inform teaching.

This initially happened via a mechanism that dates back to Haldane. Universities received what basically amounted to block grants from the state, in ever-increasing amounts and separately from any project funding the government might also offer, to support research (and teaching!) as a general activity.

At this point funding (some £412m by 1964-65) was allocated primarily by volume of activity, but – as the 1963 Robbins Report notes:

It is therefore unavoidable that […] there should be some differences in achievement and reputation as between institutions. It is also unavoidable that because of the varying expense of different kinds of education and research different institutions should receive different subventions. What is important is that what differences there are should rest clearly on differences of function on the one hand, and on acknowledged excellence in the discharge of functions on the other. There should be no freezing of institutions into established hierarchies; on the contrary there should be recognition and encouragement of excellence wherever it exists and wherever it appears.

From research funding to science funding

As Jon Agar tells it, Lord (Victor) Rothschild’s report A Framework for Government Research and Development was central to the shift from seeing universities as supporting “research” (as a praxis and as an outcome in itself) to supporting “science” as an output that could drive wealth creation and productivity. Rothschild was a 1971 appointment to Edward Heath’s new Central Policy Review Staff (CPRS). A former Shell executive, Cambridge academic, and MI5 operative, he was brought in to revitalise a drifting Heath administration with big policy ideas and insights into delivery – very much the Dominic Cummings (a self-described “rationalist” and a thorn in the side of the civil service) of his day.

At the time the “Haldane principle” – the idea that scientists are the best people to judge the work of scientists – held sway. As Calver and Parker suggest, Rothschild’s position was that “users of the end-product are the best people to control and finance applied research.” As you may imagine, this – and the recommendation that research funding should be transferred from the research councils to government departments – was not a universally popular position at senior levels of academia.

The minister responsible for science policy at the time was briefed by the department to oppose these changes, but on meeting Heath and Rothschild came down in the other direction. Her name was Margaret Thatcher.

Acceptable in the 80s

If we fast-forward to Thatcher’s first term as Prime Minister, we can see these ideas being put into practice: the privatisation of much of the government’s remaining scientific infrastructure, a policy of full fees for international students (leading to a 20 per cent drop in numbers over five years), and the 1981 cuts (8.5 per cent over three years) to funding flowing via the University Grants Committee – cuts that specifically squeezed non-project science research, since the research councils were ringfenced and UGC allocations were a single block that also had to cover the costs of teaching. It was no longer acceptable to fund researchers simply for being researchers, and the emphasis was very much on the state as a customer of specific research activity.

Importantly, the misery was not spread equally – the first flowerings of selectivity were visible in the differing scale of the funding cuts experienced by each university: while some were largely spared the pain, others (for example former Colleges of Advanced Technology like Aston, Bradford, and Salford) saw cuts of almost a third.

There were concerns, both in government and outside it, about the overall health of the system. Thatcher’s response (as reported) was:

The Research Councils have been protected for 10 years but have done nothing to manage their cash limits. There has been no real shift towards useful science and money is still lavished on grand but useless projects such as CERN. At the same time other nations have benefitted from our science because our university scientists are too toffee-nosed to get involved in applications. We can no longer afford to do science for prestige, it must be science for economic benefit.

Around the middle of the decade (Agar links it to the resignation of Michael Heseltine in 1986) this policy began to shift. “Near-market” research was now something that the private sector, not government, should be supporting – the role of the state was to refocus on “curiosity-driven” research. For me, a key driver in this shift was the decision of the University Grants Committee to carry out the first “Research Selectivity Exercise” across 1985 and 1986. For the first time, non-project-based research funding would be allocated based on a formal measure of research excellence.

To be fair, 1986’s efforts did not constitute the REF we all know and criticise in think tank reports today. UK universities were asked to complete a four-part questionnaire – covering research income and expenditure, research plans, and research output. This was assessed in-house (by UGC subject sub-committees, with recourse to external – and anonymous – assessors). To be scrupulously fair, though it was the first systematic review, it wasn’t the first review of the quality of subject-level research in universities – that honour goes to the Times Higher Education Supplement, which had been running one since 1982 to an increasingly hostile response.

A history of being rude about research assessment

A genuinely wonderful 1988 paper by A J Phillimore surveys the history of the antagonism that has accompanied this kind of research assessment. He relates that the 1986 exercise was seen, in part, as a demonstration of “value for money” for non-specific research funding. Again, the complaints concerned the differentials in outcomes and funding between universities (even though the 1986 exercise was linked to only a fraction of the total resource).

The Research Selectivity Exercise was famously secretive: the results were never formally revealed to the public, appearing only via a leak to the press. The assessment method, too, was opaque. It was known that evaluation was based on:

  • a two-page description of the research achievements and a list of five of the best publications from the previous five-year period, submitted by each cost centre;
  • numbers of research grants, studentships and “new blood” lectureships;
  • income from industry and other external sources;
  • fellowships, prizes and other honours awarded to faculty;
  • peer-review judgements of research performance.

But it was not known what weight was given to each factor, or if these weightings were the same across all cost centres. Even the development of cost centres and the allocation of subjects to them were not entirely clear. There were no appeal mechanisms, and little feedback that could allow “below average” departments to improve. Academic observers laid into the validity of basing judgements on the available data – there was a perception that not enough weight was given to research publications, for instance. And the perennial complaints about comparing apples with oranges made an early appearance.

1989

Someone must have enjoyed the exercise, as it was in essence repeated in 1989. There were many methodological changes (as Bence and Oppenheim describe) – an astonishing 152 subject-area units of assessment, 70 peer review panels (this time with published membership), the assessment of two outputs per member of staff (alongside income, future plans, and research student information), and a five-point rating scale revealed very late in the overall process.

This time, the sector was unhappy about the burden of the exercise. Research selectivity was clearly now a big deal, and with high stakes the focus was on making the best possible submission. The Universities Funding Council (as it had now become) noted that “publication data was found to be unreliable, and where it was reliable, it said nothing about the quality of the output”.

Reputation

The tail end of the 1980s was a turbulent time in higher education policy, with a real sense that an era was drawing to a close. The next Research Assessment Exercise was tentatively planned for 1993 – but the admission of former public higher education providers (the polytechnics and others) to the competition and the sector, and the birth of the devolved funding councils, saw this brought forward to 1992.

This exercise also saw the birth of the concept that became known as “REFability”, with providers allowed to select “research active” members of staff in post on 30 June 1992 for submission. There were 72 Units of Assessment this time, and submissions (more than 2,800 from 192 providers) were formally audited. A successful legal challenge by the Institute of Dental Surgery would see a welcome increase in transparency for assessment criteria and methodologies in all future exercises.

Of the research funding available following this exercise – the first time that separate quality-related research (QR) funding allocations were made – the traditional universities received 91 per cent.

Love and folklore

A much-loved facet of the earlier exercises was the production of “the list” – details of every academic publication or activity carried out within a given timeframe in the unit of assessment (one of a possible 69) in question. The 1996 RAE moved away from this approach, becoming a purely sample-based exercise. It also began, for the first time, to consider research culture alongside research outputs. Research funding was not available to departments that received the bottom two of the seven possible quality grades: beyond this, allocations were informed by volume, quality, and the cost of performing research in a given field.

The first iteration of the new millennium (2001) saw attempts to standardise (as far as was possible) the ratings offered by different Unit of Assessment panels – we saw the invention of sub-panels and what were then called umbrella groups (now the four main panels) to help address the frequent challenge that some UoAs are “easier” to get good results in than others.

Also new for 2001 was consideration of staff circumstances (including a partial answer to the issue of “research poaching”, where a provider would appoint a star researcher from elsewhere late in the submission cycle), and further clarity that research assessment is not (and should never be) simply an assessment of published articles.

The final RAE (2008) brought a further major shift – rather than an overall grade for a UoA at a provider, each research output (four per full-time researcher) was given a grade, with each submission given the quality profile we are familiar with today. This addressed another common criticism: that a department could rate well based on the contributions of a small number of excellent researchers.
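For the quantitatively minded, here is a minimal sketch of the difference a profile makes – the grades, grade labels, and counts below are entirely hypothetical and are not drawn from any real submission or from the official funding formula.

```python
# A minimal sketch with hypothetical grades: how a submission's quality
# profile is built up from individual output grades, rather than the unit
# receiving one overall grade as in earlier exercises.
from collections import Counter

# Hypothetical output grades for one submission (4* "world-leading" down to U "unclassified")
output_grades = ["4*", "4*", "3*", "3*", "3*", "2*", "2*", "2*", "1*", "U"]

counts = Counter(output_grades)
profile = {
    grade: counts.get(grade, 0) / len(output_grades)
    for grade in ["4*", "3*", "2*", "1*", "U"]
}

for grade, share in profile.items():
    print(f"{grade}: {share:.0%}")
# A couple of outstanding outputs no longer lift the whole department's grade:
# the profile makes the full spread of quality across the submission visible.
```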

Quality and volume

We could continue to trace this story of criticism and response through the Research Excellence Framework years (2014, 2021 and on to 2029). The process has always shifted to respond to criticisms, but has never seriously looked as if it would be abandoned entirely. Even though the allocation of QR (and related quality-based research funding) still has as much to do with volume as quality, and although older universities still dominate both in assessment results and allocations, and although QR remains a small proportion of the income of the universities that dominate it, there has never really been a better answer to the allocation of limited funding to satisfy limitless demand.

But I can’t really pass up the chance to note Gordon Brown’s 2006 budget. Its promise of a “radically simplified allocation of the research funding that goes direct to universities” was probably the only time in nearly a century that a Chancellor of the Exchequer had attempted to direct what had always been seen as a matter of academic judgement. The Red Book expands on this:

The Government’s firm presumption is that after the 2008 RAE the system for assessing research quality and allocating ‘quality-related’ (QR) funding from the DfES will be mainly metrics based.

Noting the administrative costs, behavioural impacts, problems with assessing interdisciplinary research, and a perceived focus on curiosity-driven rather than user-focused research, the main thrust of the argument was that QR allocations closely correlated both with research council income and with other research income (both sources being project-based, and thus directed by a combination of peer and customer review) – so QR should simply be allocated based on this other income. This was widely perceived as embedding the so-called Matthew Principle, where:

For to him who has will more be given, and he will have abundance; but from him who has not, even what he has will be taken away (Matthew 13:12)

This is hauntingly familiar from recent proposals, and it is worth seeing whether the same issues apply.
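To make the concern concrete, here is a minimal sketch of a purely income-driven allocation – the institutions, income figures, and pot size are hypothetical assumptions for illustration only, not real QR or income data.

```python
# A minimal sketch with hypothetical figures - not real QR or income data.
# If QR is allocated in proportion to existing project income, each
# institution's share of the pot simply mirrors the income it already has.

qr_pot = 100.0  # hypothetical total QR pot, arbitrary units

existing_income = {  # hypothetical research council + other research income
    "University A": 80.0,
    "University B": 15.0,
    "University C": 5.0,
}

total_income = sum(existing_income.values())

qr_allocation = {
    name: qr_pot * income / total_income
    for name, income in existing_income.items()
}

for name, amount in qr_allocation.items():
    print(f"{name}: {amount:.1f} ({amount / qr_pot:.0%} of the pot)")
# University A, already the biggest earner, takes 80 per cent of QR as well -
# "to him who has will more be given", with no independent test of quality.
```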

The full consultation (managed by the Department for Education and Skills) focused largely on information based on the research council grant allocation process, and on other research income – but it also retained the idea of volume measures (research-active staff, postgraduate research students, TRAC data). New to consideration were bibliometric measures (citation metrics, in other words – these became available to assessment panels to aid peer review but have never played a formal role in the system), impact measures (spin-outs, activity with industry – as used in the Higher Education Innovation Fund), and peer review information from staff.

One common criticism, perhaps surprisingly, was cost. As the initial HEPI response (from Bahram Bekhradnia and Tom Sastry, two former HEFCE staff members) noted, using the research councils’ system of project proposal peer review would inevitably place more demand on that system – and that system was even more expensive to run than the RAE. The overwhelming majority of responses did not think that any of the proposed alternatives to the RAE were suitable – there was general support for a “lighter touch” RAE, but not for any specific way of making this happen.

Universities UK spoke for the sector in noting that income measures “are proxy indicators of quality, rather than measures of quality. Therefore, expert assessment must also continue within the process, for example overseeing the validity and selection of the metrics, interpreting and, where necessary, moderating their outcomes.” Some suggested that the research council peer review rating was itself informed by the “gold standard” of RAE assessments; others noted that funding decisions are based on the quality of research project proposals – not the quality of conducted research.

Where next?

QR (and similar devolved funds) is the vestigial remnant of the need to support what is often described as “blue skies” research. What it in fact supports, in many cases, is the cost of doing business – the full economic cost of carrying out project research is almost never covered by government, industrial, or charitable contracts: and even if it were, that would do nothing to maintain capacity in the gaps between projects, or in areas where civic and societal good is harder to put an economic value on.

If this is the new purpose of QR funding, there is scope to rethink the way it is allocated. To my mind a HEIF-like mechanism linked to impact and outreach (if these things remain public priorities) may have value, but I can imagine the howls of outrage if I put this forward as a serious proposal.

There is always the spectre of bibliometrics, altmetrics, or other things that publishers would like to sell to the sector. It would be inequitable: not all citations are equal in value or meaning, not all academics and universities have an equal chance of publication, and journal and academic prestige is as much about a kind of historical miasma as about actual research quality. And it would be impractical: in faster-moving fields the journal has been superseded by other, faster forms of publication – in other fields, other forms of output (monographs, artwork, professional practice) are far more important.

But if we want to link funding to the quality of research (in the broadest sense – research culture and capacity are at least as important as research outputs) then we need some kind of agreed mechanism to decide who does the best research. The only way we can agree on that is to let expert assessors have a say. And that – call it RSE, RAE, REF or something else – is (with perceived bias and burden baked in) pretty much what we currently have on the table.
