John Blake was last week announced as the new Director for Fair Access and Participation at the Office for Students (OfS).
In a move which indicates a renewed focus on university and school partnership, the appointment puts a former school teacher and senior leader in one of the most critical positions in higher education.
This offers an exciting opportunity. John will no doubt bring a wealth of expertise and lessons from schools and has already signalled a focus on working with universities to “improve attainment for disadvantaged young people throughout their schooling”.
But beyond a welcome focus on attainment, and on developing closer university and school links, there is a more fundamental lesson that higher education – and, in particular, those who work to improve student success – can learn from the school sector: the importance of “What Works”.
Where’s your evidence?
In 1999, Professor Rob Coe, in his inaugural lecture at Durham University, set out a manifesto for evidence-based education. “Education may not be an exact science”, he explained, “but it is too important to allow it to be determined by unfounded opinion, whether of politicians, teachers, researchers or anyone else”.
Optimistically, Rob predicted that “Before long, everything fashionable, desirable and good will be ‘evidence-based’. We will have Evidence-Based Policy and Evidence-Based Teaching, Evidence-Based Training – who knows, maybe even Evidence-Based Inspection.” Rob was right.
In the following 20 years, the education sector in England underwent an evidence-informed transformation. This was driven, in large part, by the establishment of the Education Endowment Foundation in 2011, the ‘What Works’ centre for education in England that aims to use evidence to close the attainment gap.
When Rob presented his manifesto in 1999, you could count on one hand the number of truly robust randomised controlled trials (widely considered the gold standard of evidence) that had taken place in English schools. Today, more than half the schools in England have participated in an RCT. Most of the ethical objections to RCTs in schools have fallen away, and headteachers across the country show an increasing desire to engage in the creation and use of evidence.
More broadly, evidence literacy across schools has continually improved. Aside from getting involved in research, teachers and senior leaders now cast a discerning eye on programmes, approaches, and policies, expecting them to be evidence-based.
The sector’s leaders have followed suit. Significant policy change is now regularly informed by evidence, most notably demonstrated by the sector’s response to the pandemic. The announcements of a National Tutoring Programme and of a targeted programme of small-group language support for early years children were both underpinned and justified by robust evidence. Meanwhile, Ofsted’s latest inspection framework is built on an evidence review, and the inspectorate has even begun producing its own reviews of evidence.
These fundamental changes, accompanied by a proliferation of organisations intent on improving the use of evidence in schools (such as the Chartered College of Teaching or Evidence Based Education), mean that we can now (convincingly) claim to have one of the most evidence-informed school sectors in the world.
Meanwhile in higher education
Thankfully, colleagues in this sector, specifically in access and student success, have started on the same journey. Recognising the promise of evidence-based practice, various institutions and organisations have begun exploring how we can use evidence to improve access, retention, attainment, and progression, particularly for underrepresented groups.
My institution, King’s College London, for instance, founded a What Works team in 2018. The team, now part of the Social Mobility & Widening Participation Department, aims to contribute to the understanding of what works in enabling people to access and succeed at university, and has made rapid progress since its foundation.
Alongside conducting social research with over 100 students and delivering several institution-wide surveys (measuring critical student success concepts such as belonging and self-efficacy), the team has run several RCTs involving nearly 12,000 pupils and students. This includes an RCT run in collaboration with TASO to measure the effectiveness of K+, our flagship multi-intervention post-16 widening participation (WP) initiative.
Beyond King’s, the establishment of TASO is a hugely exciting development. Set up in 2019 with funding from the OfS and designated as the What Works centre for access and student success, their stated mission is to improve lives through evidence-informed practice in higher education. Their work has begun at pace, and an independent review published last week noted that they are set to fill some of the priority evidence gaps in access and participation.
HE professionals should continue to embrace this new direction towards evidence-informed practice. Admittedly, we still have some way to go. As Advance HE’s literature review on student success interventions identified this month, institutions must fix the ‘leaky pipeline’ of impact evidence. Specifically, as other recent reviews have argued, there is a particular ‘dearth of causal evidence’ which we must address. But we should not be disheartened.
The experience of the school sector in England in the last 20 years demonstrates what can be achieved for pupils if we focus on “what works”. Let us commit to our own manifesto of evidence-based access and student success.
Reading this, I can’t help but think of the persisting inequalities in school outcomes experienced by working-class children. Inequalities, for example between free school meals (FSM) recipients and other children, have remained stubbornly static over the last 10 years. If the answer to these inequalities is to be found in the kind of changes to learning and teaching that an RCT can test, how long should we expect it to take for a ‘what works’ approach to translate into meaningful reductions in inequality? Is there reason to believe that the same approach applied in WP will be able to ameliorate gaps in attainment among school children, or in employment outcomes between graduates? Or are universities being tasked with solving social inequality that extends well beyond the reach of education settings? I’m not expecting you to have the answers to these questions, Joe (obviously!), but these are some things I’ve been thinking about since the policy changes were announced.
A really fair challenge, Jessica! Thanks for such a thoughtful comment.
I don’t think small numbers of RCTs are the full answer – but they are a start. Some of the EEF’s most important RCT results have demonstrated substantial improvements for FSM children and are now being scaled up to thousands of schools across the country (the best example being the Nuffield Early Language Intervention). Trials like this, and the subsequent scale-up work, can help to close the gap. It is my understanding that the gap was closing prior to COVID, but it is very difficult to know exactly why, of course!
EEF RCTs have also identified some very popular programmes, on which schools were spending lots of resource, that were having no impact (or even a negative impact) on pupil outcomes – also a hugely useful endeavour (see the EEF’s evaluation of Achievement for All). And the more causal evaluations you can do, and build up, the more useful and informative the eventual meta-analyses and reviews of those evaluations will be – these can really inform practice, but it does take time!
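To make that accumulation point concrete, here is a minimal sketch of the arithmetic a simple fixed-effect meta-analysis rests on – the effect sizes and standard errors are invented for illustration, not real EEF results:

```python
# Minimal sketch: inverse-variance (fixed-effect) pooling of trial results.
# The effect sizes and standard errors below are invented, illustrative values.
import math

# (effect size, standard error) from three hypothetical RCTs of the same intervention
trials = [(0.12, 0.08), (0.05, 0.06), (0.20, 0.10)]

weights = [1 / se ** 2 for _, se in trials]  # weight each trial by 1 / variance
pooled = sum(w * es for (es, _), w in zip(trials, weights)) / sum(weights)
pooled_se = math.sqrt(1 / sum(weights))

print(f"Pooled effect size: {pooled:.3f} (SE {pooled_se:.3f})")
# Each additional trial adds weight and shrinks the pooled standard error,
# which is why a growing body of causal evaluations sharpens later reviews.
```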
More broadly, I totally agree that RCTs aren’t the only answer – they’re just one method of answering a particular type of question. Indeed, in recent years the EEF has used methods beyond RCTs to explore different types of questions (such as whether schools should teach GCSEs over two or three years). These used quasi-experimental design (QED) methods, and the EEF has also started experimenting with different types of trials (such as their teacher choices trials, which test different decisions teachers make in the classroom). It is also worth noting that not all RCTs are equal! One of the great things about EEF RCTs is that they are all accompanied by detailed qualitative process evaluations, which should provide more information about why something did or didn’t work in that particular context.
We should definitely be exploring a range of methods to answer all the questions we have; this should help us build a richer body of evidence. It is probably fair to say, though, that the type of research we are currently most lacking in HE is that which demonstrates causal impact, so RCTs are likely needed – but they are certainly not the only answer!
Thanks again for such a thoughtful comment! Always happy to discuss!
(By the way, this is summed up far better than I ever could by the fantastic Triin and Milly (who set up EEF’s approach to evaluation) here: https://journals.sagepub.com/doi/full/10.1177/2096531120913039 )
This is a dangerously naive argument which is as likely to harm students’ education as to benefit it.
A well-conducted, carefully designed and analysed RCT never warrants a policy conclusion as simple as “intervention X works”. At best it warrants the highly localised conclusion “on average, for the sample who took part in the trial, intervention X resulted in higher scores than intervention Y for a particular outcome”. Unless the planned policy context happens to be a situation in which intervention Y is the existing approach, intervention X is the planned replacement, the outcome one intends to impact is the same and the students are the same as those who took part in the RCT, that RCT alone is little justification for expecting a positive outcome on average when X is implemented as policy.
Suppose a particular RCT shows that a specific pattern of flipped classroom teaching results in better average scores on a given test of physics facts than a particular pattern of lectures and examples classes for students in the second year of the BSc Applied Physics degree at the University of Upper Worming. This finding does not justify the statement “flipped classrooms work”. This study does not itself justify introducing flipped classrooms even in the startlingly similar context of the first year of the same degree programme at the same institution, let alone for the BA in ancient history at the University of Lower Worming.
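To see just how local that conclusion is, here is a minimal simulation of the arithmetic behind such a trial – the cohort size and score distributions are invented – which yields nothing more than an average difference, with uncertainty, for that particular sample, outcome and comparison:

```python
# Minimal sketch: what a single two-arm RCT estimates.
# All values are hypothetical; this is not data from any real trial.
import math
import random
import statistics

random.seed(1)
n_per_arm = 120  # invented cohort size, split evenly between the two arms

# Simulated physics-test scores under the two teaching conditions
flipped = [random.gauss(62, 10) for _ in range(n_per_arm)]
lectures = [random.gauss(59, 10) for _ in range(n_per_arm)]

diff = statistics.mean(flipped) - statistics.mean(lectures)
se = math.sqrt(statistics.variance(flipped) / n_per_arm
               + statistics.variance(lectures) / n_per_arm)

print(f"Average difference: {diff:.1f} points "
      f"(approx. 95% CI {diff - 1.96 * se:.1f} to {diff + 1.96 * se:.1f})")
# The estimand is the average effect for this sample, outcome and comparison;
# nothing in the calculation licenses the general claim "flipped classrooms work".
```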
Quite apart from their unsubstantiated appeal to the rhetorical power of words like “evidence” and “science”, exponents of using the RCT as the foundation of educational policy often appeal to the apparent success of RCTs in particular areas of medicine. However, the situations are rarely similar enough to underpin that appeal. In the best medical RCTs, allocation to treatment can be masked (no one knows until the final analysis whether a particular individual received the vaccine or the placebo). This is not achievable in educational RCTs, yet adequate masking is critical to the validity of the argument. The best medical RCTs are based on well-established basic biological mechanisms which apply with wide generality, while in education there are few, if any, basic and general mechanisms in which planned trials can be grounded. The best medical RCTs use simple and objective positive outcomes (e.g. being alive); educational outcomes are rarely as simple to measure, nor is there uniform agreement on what counts as a positive outcome. The best medical RCTs focus on particular harms as well as average benefits – a painkiller observed to be much more effective on average than existing off-the-shelf pills is unlikely to be recommended if it causes substantial harm to even a small number of participants in a very large-scale trial. Few education RCTs examine the possibility that any participants might be harmed.
Instead of spending large amounts of precious funding on education RCTs which provide little policy-relevant knowledge, resource should be diverted to understanding how local contexts interact with the mechanisms through which different interventions act to produce particular outcomes, and what consequences (intended, unintended or harmful) might result from changing those contexts, mechanisms or outcomes. Instead of spreading naive messages about “what works”, Wonkhe journalists should dig deeper into the evidence behind ‘evidence-based education’, not least noting that the countries that have bought into the “evidence-based education” slogans over the last 20 years have not noticeably leapt to the top of international comparison tables, nor otherwise seen a substantial positive transformation of their educational outcomes, however measured.
Thanks for such a detailed and thoughtful response, Adrian!
You certainly raise some of the challenges of using RCTs, but I don’t think our positions are completely at odds. I agree that one RCT doesn’t mean you can say ‘this intervention works’ in all contexts. As I’ve said in the comment above, not all RCTs are equal – of course, they need large enough samples to support making decisions from them, along with appropriate outcome measures, limited attrition, and other markers of robustness. Critically, they should also be accompanied by high-quality process evaluations, so that the context, the mechanisms, and the theory of change of an intervention may be fully explored and understood. Much care also needs to be taken, of course, so that results aren’t misinterpreted.
More broadly, not every question is most appropriately answered by an RCT – we need a wide range of methods to answer different questions.
Milly Nevill sums up lots of these points (and more, including the importance of data archiving and analysis over time) brilliantly, I think, though I understand you may disagree: https://educationendowmentfoundation.org.uk/news/eef-blog-randomised-controlled-trials-or-how-to-train-your-dragon
I think my broad argument still stands: we need to continue ramping up our efforts to effectively evaluate more of what we’re doing in access and participation – and this isn’t just a discussion about RCTs. However, I recognise you may disagree! Always happy to discuss! (I’m on joe.collin@kcl.ac.uk).
Sorry, but the argument here is just as flawed. Larger samples will not help with the task of transporting a result from an RCT to policy. Larger samples at best help with the precision of parameter estimates and with ruling out the randomisation process itself as the sole cause of the local difference in outcomes. Similarly, reduced attrition will help answer the question “x worked better than y on average for the sample in the study, in the context of the study”, but will not substantially improve our knowledge about whether x is good policy anywhere.
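A small illustration of that precision point, under textbook assumptions (equal arms and a common outcome standard deviation – the value of 10 is invented): quadrupling the sample halves the standard error of the estimated difference, but the quantity being estimated is still the same local, sample-specific average.

```python
# Minimal sketch: sample size buys precision, not transportability.
# The outcome standard deviation is an invented, illustrative value.
import math

sd = 10.0  # assumed common standard deviation of the outcome measure

for n_per_arm in (50, 200, 800, 3200):
    se = math.sqrt(2 * sd ** 2 / n_per_arm)  # SE of a difference in two means
    print(f"n per arm = {n_per_arm:>4}: SE of estimated difference ~ {se:.2f}")
# The standard error shrinks as n grows, but the estimand remains the average
# difference for this sample and context, not for any other setting.
```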
There is something to be said for good process evaluations: they can begin to address the complex relationship between context, mechanisms and outcomes that provides a much better chance of reliably informing policy than RCTs do. The problem is that I have yet to see a good process evaluation in an education RCT report. The so-called “logic models” which have become the standard drop-in to those reports are usually shallow to the point of vacuity, and the occasional interesting nugget in the discussion of implementation issues disappears when results are summarised, leaving only a misleading “effect size” (or, worse still, “months’ progress”) from which policy gets magically conjured, to the likely detriment of learners.
The Nevill blog post to which you link only exacerbates the error. It also talks about RCTs as evaluating “what workS”, which is simply not a reasonable interpretation of the knowledge they generate. Even if perfectly conducted (massive sample sizes, zero attrition, obvious and non-controversial outcome measures, zero spillover, fully masked participants, etc.), RCTs evaluate what workED better than some alternative, in some context, on average, for some people. Nevill does not even touch on the difficulties caused by the inability of almost all education trials to meet the basic analytic assumptions needed for the logic of the RCT to hold.
Some of the critical flaws in these “what works” arguments are superbly highlighted in Deaton, A., & Cartwright, N. (2018). Understanding and misunderstanding randomized controlled trials. Social Science & Medicine, 210, 2-21. Well worth a read.