Sampling isn’t the answer to the OfS B4 requirement burden

It now looks like providers won’t need to retain every piece of assessed work for five years. But for Paul Greatrix, even retaining a sample represents a burden

Paul Greatrix is Registrar at The University of Nottingham, author and creator of Registrarism and a Contributing Editor of Wonkhe.

One of the big items on the ever-growing worry list for higher education in England is the implementation of the B4 condition of registration by the Office for Students.

There is significant concern in the sector about how this condition of registration – which is intended to ensure that students are assessed effectively, that assessment is valid and reliable, and that awards are credible – is going to work in practice.

The B4 condition includes the following guidance:

335U. As part of its approach to assessing compliance with this condition, the OfS is likely to need access to students’ assessed work, including for students who are no longer registered on a course. A provider is therefore expected to retain appropriate records of students’ assessed work for such regulatory purposes for a period of five years after the end date of a course. Where possible, a provider is expected to retain records of student assessments in an anonymised form by removing students’ personal data from the records.

This is further amplified by a subsequent piece of guidance in the same document about the consequences of the absence of such records:

335W. The absence of records of students’ assessed work may lead the OfS to make negative inferences about a provider’s compliance and/or may result in the OfS taking targeted regulatory action to address the risk that it is unable to monitor compliance and regulate effectively.

The calm before the storm

This issue was explored by the House of Lords Industry and Regulators Committee in its discussion with the OfS Chair and Chief Executive on 9 May 2023.

In response to questions on retention of assessed work, OfS Chief Executive Susan Lapworth said:

We know that to make judgments about whether the assessment for this group of students in this institution is rigorous and credible we may need to look at samples of student work. That is the driver and that feels perfectly appropriate to me as regulatory policy.

We have not said that providers should keep all work for ever. Our guidance says that we expect them to keep appropriate examples of assessed work for five years, and that is broadly consistent with previous guidance from a sector body, so I do not think there is anything completely new here… We recognise that the sector wants more explanation about this and for us to be a bit clearer about how we interpret that guidance. That is why we have set up a working group with representatives from the sector to try to work through those practical issues and find a way for us to unpack the guidance a bit more so that we are clearer for everybody about how we interpret “appropriate” in that context. It is an example of us hearing the point, understanding the practical consequences, and being willing to work with the sector to clarify what we mean. We intend that we will end up being able to do the regulatory job that we have set out to do.

There is an awful lot riding on the word “appropriate” here, as Lapworth acknowledges, but reports of discussions with the OfS prior to this point suggest that it was being taken to mean everything: all assessments for all students retained for five years after the end of a course. This would represent a staggering burden, requiring institutions to establish new systems to collect, collate, organise and securely store many hundreds of thousands of items, of various forms, each year.

A recent survey estimated a minimum cost in the range of £270,000 to £1m per institution, depending on size. This seems to me very much at the conservative end, and my own estimate is that this will cost the English sector significantly more than £500m in set-up and running costs over the next five years.
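As a rough back-of-envelope check – and to be clear, the provider count here is approximate and the annual running cost is my own illustrative assumption rather than a survey figure – with around 400 providers on the OfS register, even the survey’s floor compounds quickly:

\[
\underbrace{400 \times \pounds 0.27\text{m}}_{\text{set-up (survey floor)}} \;+\; \underbrace{400 \times \pounds 0.2\text{m} \times 5}_{\text{assumed annual running cost over five years}} \;\approx\; \pounds 108\text{m} + \pounds 400\text{m} \;>\; \pounds 500\text{m}
\]

On these assumptions, the set-up floor alone comes to over £100m before a single year of running costs is counted.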

And all of this is to allow for the possibility, however remote, that an inspector may, at some point, want to assess the judgements on standards of marking and assessment in relation to a particular course at an institution. In other words, the English sector would be required to retain tens of millions of pieces of assessed work, at massive expense, on the off-chance that the OfS had a concern about standards on a particular course. This would be, by any definition, a disproportionate burden.

Tears before bedtime

But things have moved on. Following further reflection and some serious engagement with the sector, as reported to the Lords committee, the OfS has adjusted its position. Draft guidance has been in circulation which recognises some of the challenges articulated by the B4 critics and sets out ways in which universities can ensure that a retained sample of student work is appropriate for the purposes envisaged by the OfS.

The OfS deserves credit for engaging with the sector, responding to these concerns and making the shift from the retention of all assessed work to sampling. This feels to me like a significant step. Indeed, I was one of the people involved in this engagement, as a representative of the Association of Heads of University Administration (AHUA), and I have to commend the openness of those from the OfS who participated.

However, the B4 requirement still represents a major logistical issue for universities and colleges and the scale of the effort required to retain even an appropriate sample of assessed work will be huge.

Readers of a certain age will remember the old TQA/Quality Assessment era, when each subject was inspected across all institutions and universities had to assemble a base room containing all relevant materials for the week-long visit. Part of this was a requirement to collect a representative sample of student work across all modules, years and courses – this generally comprised a large part of any base room, given the range of materials to be provided. If you imagine that sampling undertaken every year for every course in every institution, you get a sense of the scale of the endeavour required to meet even a curtailed B4 expectation – the equivalent of a permanent base room (physical or virtual) for every subject across every institution in England.

Whilst the move by the OfS to a sampling methodology is certainly a positive step, the whole approach remains, in my view, fundamentally flawed for a number of reasons:

  • Even with sampling, the retention of work is still complex, costly and challenging for universities to do. The size of the sample and the retention period are lesser issues than the additional staffing, systems and structures needed to collect and retain all of these items.
  • It assumes that, by looking at a range of assessed work, judgements about the standards set by academics and achieved by students can be reached. They cannot. You can’t assess standards or unexplained grade inflation from this evidence base alone – inspectors would be trying to second-guess academic judgement, or to replicate it, and that simply is not possible without the full context of every other dimension of the assessment process.
  • The retained material is therefore never going to allow the expected judgements to be made – the exercise essentially amounts to re-marking by external inspectors and is simply not a viable strategy for assessing academic standards.
  • This method also represents a challenge to the autonomy of institutions as awarding bodies responsible for the standards of their awards (although admittedly this is baked into HERA).
  • It remains, therefore, fundamentally the wrong approach. Sampling is better in that it reduces the burden – slightly. But it still will not achieve what is hoped for: a vehicle for assessing standards externally over time.

A far better approach would be to use data already collected – which, where it demonstrated concern about grade inflation, would trigger a potential investigation. It would then be for the university to demonstrate how it was assuring and maintaining academic standards. One of the means by which it might choose to do this is to provide a sample of retained work. But pre-specifying that for the entire sector in England is hugely disproportionate, given that over 99 per cent of the material retained will never, ever be inspected.

The day before you came

Back in 2016 I offered some suggested amendments to what was then the Higher Education and Research Bill as it was being considered in the Lords.

Unsurprisingly this had no effect. But I also made some observations about the assurance of standards which are pertinent here. In order to assure that standards are being appropriately set and achieved by students, there is a need to trust the judgement of academic professionals and the range of proxies devised over many years to assure the legitimacy of their collective decisions.

We’re going back a bit here, but as Lindop said in relation to public sector higher education in 1985:

the most reliable safeguard of standards is not external validation or any other outside control; it is the growth of the teaching institution as a self-critical academic community.

The Lindop committee is also quoted for similar reasons by Roger Brown – a quality wars veteran with the medals to prove it – in his recent piece for HEPI on the problems of the current regulatory regime which seems to offer little to support the quality of education.

And Reynolds (again nearly 40 years ago, in 1986) argued that, given this and the dependence on “the professionalism of the university teacher”, there need to be regular reviews of provision, and this

within a framework of maximal university autonomy, constitutes the best guarantee of the maintenance of standards in British universities.

The Reynolds report effectively accepted a suite of proxies for standards, including the external examiner system (although I would argue now that this is past its sell-by date), underpinned by the notion of a self-critical academic community in which the examiners were both the arbiters and the assurers of standards (inevitable in a self-regulated system where peer review is a critical feature).

By implication, then, assuring standards, which are multi-dimensional, requires a suite of inter-related measures, including:

  • the involvement of professional and statutory bodies;
  • robust staff selection arrangements;
  • confidence in the quality of academic staff appointed;
  • the allocation of resources needed to provide facilities for education;
  • the qualifications required of admitted students;
  • the procedures established within institutions for the approval, review and monitoring of courses;
  • a national quality assurance framework.

It is these things combined which provide the comfort that standards are being appropriately set and assured by institutions.

Wake me up before you go-go

In order to be assured about academic standards we need a self-critical academic community, institutional autonomy and a set of proxy measures which, combined, offer the reassurance required that standards are being appropriately set and achieved.

The B4 requirement on retention of assessed work does not achieve this. It would be difficult for the OfS to backtrack having made the commitment to the approach, but the change in the forthcoming guidance shows that movement is possible. I really hope, therefore, that they will consider further whether this really is the best way to address concerns about grade inflation, and then take a further step away from the proposed retention model.

I am sure we could all think of better ways to spend that £500m over the next five years.

This article represents the personal views of the author

3 responses to “Sampling isn’t the answer to the OfS B4 requirement burden”

  1. “In order to be assured about academic standards we need a self-critical academic community, institutional autonomy”

    These two things can run counter to each other. The latter can remove motivation for the former.

    “and a set of proxy measures which, combined, offer the reassurance required that standards are being appropriately set and achieved.”

    Proxy measures can more readily be, and are being, gamed.

  2. Not to state the obvious, but the ability to be self-critical on assessments and marking (both internally and through external examiners) is entirely dependent on adequate formal training in assessment design and examining, which very few academics (including myself) receive or have time to seek out in depth. The lack of adequate formal training results in the blind leading the blind – or at the very least ensures that quality is highly variable. A focus on initial (and INSET) teacher training is essential – HEA fellowships, while useful for reflection, are not an adequate substitute for formal training. Without improving this, there’s no chance of standardised and reliable internal quality assurance.
