When it comes to assessing research excellence in the UK, the Research Excellence Framework (REF) is the gold standard. But as universities ramp up preparations for the next REF cycle, a question looms large: is the extensive use of peer review in this process really necessary?
The findings of a study we recently published in Research Policy suggest it might not be: expert peer review scores and journal-ranking-derived scores are not as different as we might have imagined. This conclusion is based on an analysis of REF 2021 data covering 13,973 publications across 108 institutions in the business and management discipline (unit of assessment 17), one of the REF's largest and most heterogeneous fields.
The strong correlation between journal rankings and expert scores, particularly for top-tier journals, indicates that journal prestige may be influencing the expert review process despite journal-blind evaluation. In other words, research published in a highly ranked journal tended to receive a correspondingly high REF score, even though the assessment is, on paper, journal-blind.
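To make the kind of comparison we are describing concrete, here is a minimal sketch (not the code or data from our paper) of how one might test whether journal-ranking-derived scores track expert scores at the institution level, using a rank correlation. The institution names and numbers below are purely illustrative.

```python
# Illustrative only: hypothetical institution-level scores, not REF data.
import pandas as pd
from scipy.stats import spearmanr

df = pd.DataFrame({
    "institution": ["A", "B", "C", "D", "E"],
    "ajg_derived_score": [3.4, 2.9, 3.8, 2.5, 3.1],  # derived from journal ratings
    "ref_expert_score":  [3.3, 3.0, 3.7, 2.6, 3.0],  # derived from expert review
})

# Spearman's rank correlation: how closely do the two score orderings agree?
rho, p_value = spearmanr(df["ajg_derived_score"], df["ref_expert_score"])
print(f"Spearman rho = {rho:.2f} (p = {p_value:.3f})")
```

A rho close to 1 would indicate that the journal-based ordering and the expert-based ordering of institutions are largely the same, which is the pattern our study found in business and management.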
This correlation persisted even among institutions that have endorsed the Declaration on Research Assessment (DORA), which promotes responsible assessment practices. While many universities publicly endorse DORA and wider responsible assessment principles, our data suggest this endorsement often amounts to little more than "institutional peacocking": signalling superficial adherence to principles without truly challenging entrenched practices.
This raises critical questions about the future of research evaluation and the transparency of the REF processes. If peer review is to be retained, more openness is required to allow for further equality impact assessment and potential improvements. However, this could lead to even greater costs for an already pricey assessment exercise, suggesting a broader rethink of responsible assessment may be necessary.
The case for rethinking peer review
Peer review has long been considered the cornerstone of academic rigour, especially in the context of responsible assessment. Its central role in the REF emerged as a reaction to the widespread use—and often misuse—of research metrics like the Journal Impact Factor (JIF). Metrics, while helpful, have been criticised for distorting scientific practices, incentivising quantity over quality, and promoting a narrow view of research impact.
Proponents of peer review argue that it offers a nuanced understanding of research quality that metrics cannot capture. By emphasising expert judgement, peer review aims to ensure assessments reflect the originality, significance, and rigour of scholarly work. Seen in this light, peer review is a safeguard against over-reliance on flawed metrics, ensuring fair, transparent, and merit-based evaluations. However, despite these intentions, our evidence suggests that peer review often fails to diverge significantly from what metrics already indicate.
Journal metrics vs. peer review
Journal rankings have become ingrained in academic culture, especially in disciplines like Economics and Management, where rankings guide research dissemination, hiring, and departmental evaluations. While responsible research assessment movements such as DORA and the Leiden Manifesto aim to reduce dependence on these metrics, our findings suggest that traditional rankings still carry substantial weight in expert evaluations.
The troubling part is not that metrics play a role, but that the peer review process, purportedly free from such biases, often mirrors what metrics already tell us. In Business and Management, the CABS Academic Journal Guide (AJG) ranking, which categorises journals by perceived prestige, aligned closely with the expert peer review scores in the 2021 REF. This raises the question: if expert peer review results simply echo journal rankings, why conduct the review at all? Likewise, does the internal peer review that universities carry out in preparation for the REF add any value?
The unique challenges of internal review
The push for rigorous internal peer review as part of REF preparation presents several challenges. First, it places an additional burden on already overstretched academic staff who must juggle teaching, research, and administrative duties. Second, there is the issue of consistency—can internal panels at different institutions apply the same standards as the REF panels, or are we at risk of creating fragmented interpretations of quality?
Furthermore, there is a danger that internal review processes can become an exercise in confirmation bias, where institutions simply reaffirm the rankings provided by journal metrics. If this is the case, then internal reviews may not add significant value to the REF process but instead serve as a costly duplication of effort.
Towards a more efficient approach
What if we took a more streamlined approach to REF submissions? Universities could redirect their efforts away from disciplines where journal metrics are robust and widely accepted, such as business, management and economics, and towards areas where peer review can truly add value—such as interdisciplinary research or emerging fields where metrics are less developed.
By reducing the emphasis on peer review for disciplines with strong existing metrics, we could free up valuable time and resources for researchers to engage in more innovative and impactful work. This approach would not only align with the principles of responsible assessment, but also ensure that the REF process remains both rigorous and efficient.
A call for debate
Ultimately, the question of whether we need peer review for the REF is not about abandoning rigour or lowering standards. It’s about finding the most effective way to measure research excellence and ensuring that the process reflects the realities of different disciplines. As the higher education sector continues to evolve, so too must our approaches to assessment.
We are not the first to raise this question, and we won’t be the last, as there are still aspects of peer review which are unclear. How do we ensure that peer review remains free from both conscious and unconscious bias? How can we make the process more transparent, and who is responsible for auditing these decisions? In the name of fairness, if peer review happens behind closed doors, how can we be sure it is working as intended?
Is it time to rethink our reliance on peer review for the REF? The debate is open, and it’s one that we, as a sector, need to have.
I think I can spot a rather obvious flaw in the claims made here. Can anyone else?
Thank you for being the first to comment on this article, Andrew. Might the full text of the paper provide more clarity? https://www.sciencedirect.com/science/article/pii/S0048733324001422
Why even attempt this analysis when Thelwall et al, on behalf of the FRAP (for REF 2021), and the Metric Tide (for REF 2014) had access to the actual scores and have demonstrated that, while there is some correlation between REF score and JIF, it isn't that strong? https://link.springer.com/article/10.1007/s11192-023-04735-0
The Metric Tide is particularly revealing on unclassified outputs with high JIF https://www.ukri.org/wp-content/uploads/2021/12/RE-151221-TheMetricTideFullReport-REF2014Scores.pdf
Thank you for your comment, Anne. We are indeed familiar with the studies you mentioned, as well as another similar one. However, there were valid reasons for conducting our analysis, which isn’t a replication but rather uses a different methodology—one that is replicable, unlike those relying on actual REF scores. Perhaps the full text of our paper might offer further clarity: https://www.sciencedirect.com/science/article/pii/S0048733324001422
I accept the analysis, but this has been discussed many times before and it ends with us keeping peer review.
Indeed. Might we improve the transparency around the process, though? For example, might we allow for responsible data sharing (e.g. via an application process) for the purpose of informing service improvement?
Could we also facilitate the implementation of responsible assessment principles by resourcing the expert review process better? (Lack of resourcing might lead to an implicit, if not explicit, reliance on metrics.)
These are questions worth engaging with, in our view. They are, however, implications, extended thoughts emerging from our study – available here, open access: https://www.sciencedirect.com/science/article/pii/S0048733324001422
I think the sector has debated this issue at length over the years, yet invariably arrives at the same destination, a sort of Churchillian view of peer review being the “worst form of review, apart from all the others”.
Agreed, Patrick. We could do better with it, though: the principle is reasonable, but the devil is in the details, as there are various ways of implementing it, with different resourcing and transparency options.
I have lost count of the number of times people have purported to have cracked research evaluation using UoA-level or, worse, university-level correlations which sort of work. It is perhaps the least surprising result ever that at the UoA level there is a correlation between the best research and the most prestigious journals; prestigious journals don't get to be that by accepting vast swathes of lower-quality research. The critical points are that not everything in those top journals will be world class and, more importantly, not everything world class will be in those journals.
Agreed on the low level of surprise, Richard Puttock. Which raises the question: why do we pay over the odds for peer review / expert scoring in the REF? At least in the disciplines where journal prestige is acknowledged and generally accepted.
Adina, thanks for taking the time to reply, but I fear you missed my point: while there is a good correlation, it is the exceptions that matter, and they deserve the rigour of peer review independent of journals to identify weak articles in highly rated journals and strong articles in less well-ranked journals. There is also the undesirable feature of putting public policy in the hands of commercial publishers. Happy to continue offline.
Spot on Richard!
Thank you for this article, and for the longer peer-reviewed article upon which it is based. Overall it seems clear that the REF's emphasis on labour-intensive peer review may be inefficient, as it appears largely to validate the results of the original peer review of published articles. Does this imply that future REF exercises might be better if they directed resources towards other aspects of research performance and culture? What might this look like, I wonder?
I’d have felt this a more worthwhile read if it had assessed a branch of the academy other than economics and business. From conversations with colleagues working in those areas, my understanding is that these are the most metric-driven sectors of the academy in general, with some atrociously dysfunctional outcomes: last REF, every VC wanted their Business School in the top ten, a desire which unsurprisingly led to a large number (number 11 onwards, as I recall) suddenly finding themselves under pressure of closure or disinvestment. As the hierarchy of economics and business journals is so fixed, the current metric-heavy system also structurally disadvantages new ideas from young academics in favour of articles from long-standing, high-profile economics professors, no matter how good the work itself. It also disadvantages new journals, which inevitably cannot secure good articles as they sit too far below the metrics cut-off acceptable to the sector’s leaders.
Try running the numbers on panel D – if there is the same level of correlation as you found above, that would be something worth publishing.
Could you share a bit more information about how citations were calculated please? The data sources are listed as “Scopus and Google Scholar” but these typically give different numbers, with some overlap in citations.
Was the date an institution signed DORA taken into account? For example, Coventry University became a signatory in 2023; it would therefore be illogical to evaluate the impact of its signing DORA by looking at REF 2021 data. https://www.coventry.ac.uk/research/about-us/dora/