Can student reviews provide us with meaningful insights into the quality of higher education at a provider? There is good reason to assume not: few students have experienced multiple providers in the way that TripAdvisor’s contributors have experienced multiple restaurants.
Reviews are good predictors
A couple of years ago I was highly sceptical about the value of user reviews. Tiring of hearing the perennial promises that the Care Quality Commission (CQC), England’s health and social care regulator, would look at social media posts to identify poor quality care, my colleague and I decided to investigate. Much to our surprise, we found that patient reviews and social media posts were good predictors of the outcome of CQC’s in-depth inspections. When the data from multiple sources was combined, it proved even more effective than any of the individual data sources. Collectively, despite the majority having no clinical training and only interacting with a fraction of the services offered by a hospital, we found that patients provided meaningful insights into quality.
Once these findings were made public, the Quality Assurance Agency (QAA) asked my colleagues and me to investigate whether reviews of higher education providers might also provide some effective insight. To find out, we gathered over 200,000 reviews from Facebook, Whatuni.com and StudentCrowd.com, cleaned and combined the data, and calculated a 365-day moving average “collective-judgement score”.
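To make the idea of a 365-day moving average concrete, here is a minimal sketch in Python. It is an illustration only, not the method used in the research: the function name `collective_judgement_score`, the 1 to 5 rating scale, the unweighted mean and the sample reviews are all assumptions made for the example.

```python
from datetime import date, timedelta


def collective_judgement_score(reviews, as_of, window_days=365):
    """Mean rating of the reviews dated within the `window_days` up to and
    including `as_of`. Returns None if no reviews fall inside the window.

    `reviews` is a list of (review_date, rating) pairs. An unweighted mean
    is an assumption; the published score may be calculated differently.
    """
    window_start = as_of - timedelta(days=window_days)
    ratings = [rating for review_date, rating in reviews
               if window_start < review_date <= as_of]
    if not ratings:
        return None
    return sum(ratings) / len(ratings)


# Made-up reviews for one hypothetical provider, rated from 1 to 5.
reviews = [
    (date(2016, 1, 10), 2.0),   # more than a year old on 1 June 2017, so ignored
    (date(2016, 9, 1), 4.0),
    (date(2017, 3, 15), 5.0),
    (date(2017, 6, 1), 3.0),
]

score = collective_judgement_score(reviews, as_of=date(2017, 6, 1))
print(score)  # 4.0
```

Recalculating the score for each day gives a running picture of how students' collective view of a provider changes over time, with older reviews dropping out as they pass the one-year mark.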
We looked at these collective-judgement scores immediately before the outcomes of HEFCE’s 2016-17 Annual Provider Review (APR), Teaching Excellence Framework (TEF) year two ratings, and the overall satisfaction scores from the National Student Survey (NSS) for 2015-17, examining how well collective judgement tied in with those outcomes. And what we found mirrored our findings in healthcare: individual data sources were effective predictors of other quality measures and, when combined, they were even better.
Of course, the results are not perfect. Using TEF as an example, there were some Gold providers with a poor collective-judgement score prior to the outcome of the awards, and conversely some Bronze providers with a good collective-judgement score. However, the average collective-judgement score of a Gold provider was higher than that of a Silver provider, which was in turn higher than that of a Bronze provider. Across all of the “mainstream” measures we compared against, we found that the higher a provider’s collective-judgement score, the more likely it was to perform well.
But why is this the case? The reason student reviews are able to predict the outcome of other quality measures appears to be linked to the “wisdom of crowds” phenomenon – which suggests that, under the right circumstances, groups can be remarkably insightful. This can be the case even if the majority of people within a group are not especially well informed. While we as individuals seldom have all the necessary facts to make an accurate assessment, and are subject to numerous heuristics and biases, when our individual assessments are aggregated in the right way, our collective assessment is often highly accurate.
Although an individual review may be focused on just one of hundreds of programmes a provider offers, and not representative of the wider view of that programme, when all the reviews are combined the result is usually an accurate, balanced view on quality.
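The wisdom-of-crowds effect described above can be illustrated with a small simulation. This is a sketch under strong assumptions, not a model of real review data: it supposes a hypothetical “true” quality of 3.5 on a 1 to 5 scale, and that each review is that value plus independent, unbiased random noise.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

TRUE_QUALITY = 3.5   # hypothetical "true" quality, assumed for illustration
N_REVIEWS = 2000

# Each simulated reviewer sees only a noisy slice of the provider:
# their rating is the true quality plus independent random error.
reviews = [TRUE_QUALITY + random.gauss(0, 1.0) for _ in range(N_REVIEWS)]

# How far the aggregated (mean) rating is from the truth.
crowd_error = abs(statistics.fmean(reviews) - TRUE_QUALITY)

# How far a typical single review is from the truth.
individual_errors = [abs(r - TRUE_QUALITY) for r in reviews]
typical_individual_error = statistics.fmean(individual_errors)

print(f"error of the crowd average:     {crowd_error:.3f}")
print(f"typical error of single review: {typical_individual_error:.3f}")
```

The averaged rating lands much closer to the assumed true value than a typical single review does, because independent errors largely cancel out. If reviewers shared a common bias the cancellation would not happen, which is why the effect holds only “under the right circumstances”.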
Real-time, public data
This “wisdom of students” means the collective-judgement score is an effective predictor of other quality measures, but it also has a number of other attractive qualities. Collective judgement is available in a more timely manner than many existing data sets, often at a more granular level, offers new insights at different stages of the student experience, and adds no burden to providers’ existing duties.
It does of course have drawbacks too. Measures such as APR, TEF and NSS are not without their critics, and one must question whether agreeing with them, to whatever degree, is necessarily a positive.
In our research we have been careful only to use reviews that students have actively made public (e.g. we have not searched individuals’ Facebook profiles), and any future use of this metric must be mindful to maintain the privacy of reviewers. Finally, there is the clear incentive for providers to enter their own reviews to project a positive image. Steps can be taken to identify and reduce the impact of (or penalise) such activities, and the impact will always be limited by the large and growing volume of genuine feedback, but it cannot be wholly discounted.
The report published today by QAA is only the first step, and the findings suggest there is merit in exploring this idea further. We will work to add more data sources to increase the accuracy and coverage of the approach, and, from autumn 2018, QAA will pilot this approach to quality management with ten providers. The pilot will not form part of a formal quality assessment, but will test how this pattern holds over time and what improvements can be made. For now though, we can say that student reviews can indeed provide us with meaningful insights into the quality of higher education at a provider, and this offers exciting opportunities for both quality oversight and quality improvement.