This article is more than 4 years old

The data science behind Clearing Plus

For Clearing Plus to be trusted, it must be transparent and clear about how matches are made. Paul Chandler shows us around.

Paul Chandler is Principal Data Scientist at UCAS

As with any new service that uses individuals’ data to draw conclusions, it’s important to be transparent about what’s under the bonnet and explain how results are generated.

Life-changing decisions could be made on the suggestions generated. By explaining the data science, we want to make sure Clearing Plus justifies the trust that will be placed in it, and to reassure both students and admissions teams that UCAS will be closely monitoring how it works in practice.

How it works

From early July, those not holding an offer or place can see their individual list of matched courses in Track (their online UCAS account) by clicking a button. From there, they can easily send an expression of interest to their chosen universities. After a conversation, the student can decide whether to officially add the course to their application. As ever, admissions teams have the final say over who they admit onto their courses.

Clearing Plus works by suggesting courses to students that are typically favoured by similar applicants, and that they are eligible for.

Two critical factors are involved:

  • Available courses and a university’s own recruitment criteria.
  • A match score of students and courses based on historical acceptances.

The first of these is very simple to imagine. University of X wants to recruit to their physics course, and therefore submits physics to Clearing Plus, stipulating that it is only visible to applicants with a confirmed A level grade B in maths. They will then receive the details of all unplaced applicants who have clicked on their course to register interest. Applicants won’t see the course if they don’t have the required B (or higher) grade, so admissions teams can have confidence in those registering interest. This means that the applicant’s achieved regulated grade is used, as it would be in any other year.
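The eligibility check described above can be sketched in a few lines of code. This is an illustrative sketch only, not UCAS’s implementation: the grade scale, data shapes, and function names are all assumptions made for the example.

```python
# Minimal sketch of the eligibility filter: a course is only visible to
# applicants whose confirmed grade meets or exceeds the stipulated minimum.
# (Hypothetical names and structures -- not UCAS's actual code.)

GRADE_ORDER = {"A*": 5, "A": 4, "B": 3, "C": 2, "D": 1, "E": 0}

def meets_requirement(applicant_grades, subject, min_grade):
    """True if the applicant's confirmed grade in `subject` is at
    least `min_grade` on the A level scale."""
    achieved = applicant_grades.get(subject)
    if achieved is None:
        return False
    return GRADE_ORDER[achieved] >= GRADE_ORDER[min_grade]

def visible_courses(applicant_grades, courses):
    """Filter the course list down to those the applicant is eligible for."""
    return [
        c for c in courses
        if all(meets_requirement(applicant_grades, subject, grade)
               for subject, grade in c["requirements"].items())
    ]

courses = [
    {"name": "Physics at University of X", "requirements": {"Maths": "B"}},
    {"name": "Engineering at University of Y", "requirements": {"Maths": "A"}},
]
```

With confirmed grades of B in maths and C in physics, the sketch would surface the physics course (minimum B met) but not the engineering course (minimum A not met), mirroring the University of X example above.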

Participation and relevancy

The widening participation opportunities are obvious. Admissions teams can also choose to use POLAR and SIMD as part of their criteria to effectively reach underrepresented applicants, helping them achieve a diverse student population and supporting this important agenda for all of us working in higher education.

The second of these is slightly harder to conceptualise and is commonly referred to as ‘the algorithm’. The output of the algorithm is a match score (between 0 and 1) representing the relevancy of each course to an applicant. It informs the order that available courses will be displayed to each student – a good match would be top of the list.

The match score is derived from the following pieces of information:

  • The grade/subject profile of the applicant at Level 3 or equivalent (to begin with A levels, BTECs, Scottish Highers, Access to HE Diplomas, and International Baccalaureates will be used, which will be expanded in the future).
  • Any of the applicant’s main scheme course choices.


These pieces of information are used to create a ‘cluster’ which groups a student with peers of similar characteristics from previous cycles, so the historical outcomes of ‘people like them’ can be seen. We’ve been asked if the algorithm allows for ‘mixed qualifications’, which it does, as it is the combination of qualifications that is used to create the applicant cluster. For example, if a student with both A level and BTEC grades enters Clearing, they will see the most strongly related courses that have a good record of accepting their combination (or one very similar to it).

Alongside this, every course has been grouped into ‘course clusters’ using the same pieces of information:

  • The proportion of applicants with each grade/subject profile at Level 3.
  • The proportion of applicants with specific other applications in the main scheme.
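One way to picture course clustering on these proportions is to treat each course’s intake profile as a vector and group courses whose profiles are similar. The profiles, similarity measure, and threshold below are all assumptions for illustration, not UCAS’s actual method.

```python
import math

# Illustrative only: each course is described by the proportion of its intake
# holding each grade/subject profile, and two courses fall in the same cluster
# when their profiles are sufficiently similar. Because proportions (not raw
# counts) are compared, provider size does not matter.

def cosine(u, v):
    """Cosine similarity between two sparse profile dictionaries."""
    keys = set(u) | set(v)
    dot = sum(u.get(k, 0.0) * v.get(k, 0.0) for k in keys)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

course_profiles = {
    "Physics at X": {"A-level Maths": 0.9, "A-level Physics": 0.8},
    "Physics at Y": {"A-level Maths": 0.85, "A-level Physics": 0.9},
    "History at Z": {"A-level History": 0.9, "A-level English": 0.6},
}

def same_cluster(a, b, threshold=0.8):
    """Group two courses together when their intake profiles are similar."""
    return cosine(course_profiles[a], course_profiles[b]) >= threshold
```

Under this sketch, the two physics courses would sit in one cluster despite being at different providers, while the history course would not.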

Each combination of applicant cluster and course cluster has a match score. This means each individual applicant and specific course also has a match score. The advantage of clusters is that relevant recommendations can be made without attempting to exactly replicate previous cycles. Pertinent links can be made that may not have been thought of before because the data can show that “people like you were accepted to courses like this”.
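The cluster-pair scoring described above can be sketched as follows. Everything here is a hypothetical toy, assuming a simple definition of the score as the share of an applicant cluster’s historical acceptances that fell in each course cluster; the cluster labels and data are invented for the example.

```python
from collections import Counter

# Toy sketch (not UCAS's algorithm): estimate a match score for each
# (applicant cluster, course cluster) pair from historical acceptances,
# then rank course clusters for an applicant, best match first.

# Hypothetical history of acceptances: one entry per placed applicant.
history = [
    ("A-level: Maths+Physics", "Physical sciences"),
    ("A-level: Maths+Physics", "Physical sciences"),
    ("A-level: Maths+Physics", "Engineering"),
    ("BTEC: Applied Science", "Engineering"),
]

acceptances = Counter(history)
per_applicant_cluster = Counter(a for a, _ in history)

def match_score(applicant_cluster, course_cluster):
    """Share of the applicant cluster's historical acceptances falling in
    this course cluster -- a value between 0 and 1 used for ordering."""
    total = per_applicant_cluster[applicant_cluster]
    if total == 0:
        return 0.0
    return acceptances[(applicant_cluster, course_cluster)] / total

def ranked_courses(applicant_cluster, course_clusters):
    """Order course clusters for an applicant, highest match score first."""
    return sorted(course_clusters,
                  key=lambda c: match_score(applicant_cluster, c),
                  reverse=True)
```

Because scores attach to cluster pairs rather than to individual applicant–course pairs seen before, an applicant can be matched to a course no one with their exact history has ever taken, which is the “people like you were accepted to courses like this” idea.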


Courses are grouped with other courses that have a similar intake profile of students (regardless of the size of the university or college). The match score is also agnostic of the number of applicants, with the likelihood of similar applicants being offered and accepting a place being a key factor in the matching process.

We are naturally aware of the questions people have about algorithms and the bias they can introduce or perpetuate. Clearing Plus could be seen as reinforcing existing patterns of acceptances, such as men being more likely to choose STEM subjects. However, by basing matches on clusters of students who have previously been placed on courses, using the factors mentioned earlier (e.g. grades and not sex), students will discover courses which may not have been on their radar in the past, but which they are qualified to succeed on. As with all parts of the admissions process, we’ll be monitoring, measuring, and reporting on how Clearing Plus works to expand student choice.

This matching process is more in line with other online services and industry standards that applicants are used to in their daily lives, but crucially is one that could have the most impact on their lives to date, setting them up for their step into higher education, and possibly future careers.

This article is published in association with UCAS. 

8 responses to “The data science behind Clearing Plus”

  1. Thank you for this piece, which is reassuring to those of us with concerns about how CP might be implemented, particularly in light of the ways in which it was initially framed by the government.

    It’s important for the sector that the algorithm’s assumptions remain as transparent as possible, and this is a good first step.

  2. This whole process is impossible to navigate through and is leaving dyslexic students like myself behind. College is closed, no support available for me.
    I don’t want to slip through the net.
    I have worked hard to get a predicted distinction in my foundation Level 3 Extended in Art and Design.
    I’m struggling to apply and read what I have to do.
    Where is the help now?

  3. Thanks for this insight into the service Paul which is very helpful.

    Important for providers will be the balance between the two ‘forces’ described within this system. The first force is the data-led ‘discovery’ of possible links between providers and students when neither is aware of them. This is driven mainly by the level 3 grade/subject mix and its similarity to historic recruitment at the course (i.e. what providers actually do rather than what they aspire to do). This is in the vein of the previous PMDS/DCS systems and can lead to those rewarding “I had never even heard of X at Y but it is perfect for me!” / “We wouldn’t normally have considered a nursing applicant but we were pleasantly surprised by how suitable they were” conversations. But also, with about the same frequency, to bemusement and bafflement on both sides.

    The second broad force is more akin to a directory look-up. This is seen in those elements where the provider’s and applicant’s expressed preferences are used (so, required minimum grade/subject, building on the existing HE subject choices, POLAR specs, etc.). This has a different philosophy, more “I know what I want, now just help me efficiently pick it out of these tens of thousands of courses/applicants”. For providers, it is closer to commercial email segment selection. This force acts to optimise (and reinforce) existing beliefs about good applicants and courses.

    Neither approach is right or wrong, but they are different. The first will probably be better for broadening intakes (and guarding against provider intake grade optimism), whilst the second will probably be more efficient in terms of contact conversion, but might risk cutting the pool down unfairly. Providers might want to use the system in a way that effectively plumps for one of these two underlying logics, to avoid the risk of forces pulling in opposite directions. If majoring on this service, going through the algorithm code itself to see how it all plays out, though taxing, is the way to gain transparency on what it will do.

  4. Can you clarify what Chinese walls will exist between this service and those that UCAS Media sells to universities to target candidates? Will universities be charged either for the core of this service or to effectively ‘boost their signal’?

  5. We made the decision to make Clearing Plus free, to maximise choice for students and reach for providers. For a small fee, providers can add their logo and short description, though this doesn’t improve their relevancy score.

    There’s no relationship between the Clearing Plus algorithm and UCAS Media services, they exist side-by-side, as we know students use multiple channels and sources of information and advice during Clearing. UCAS Media mailings are commissioned for a fee by providers based on their specified target criteria rather than a common algorithm.

  6. If the algorithm is so good then why not send providers some data about the numbers of unplaced applicants that they would have been matched to in 2019, their match ranking and for which courses (and how many of those applicants ended up with them in 2019). There must be a backup of the db from A level results day 2018 that this could be modelled on and it would help give some more understanding of how the algorithm works for various courses.
