It shouldn’t, on reflection, be this difficult to understand the ways in which the experience of students with particular characteristics differs from the average throughout the student lifecycle.
The screams of agony from sector data wonks had alerted me to the chaos around the Office for Students’ Access and Participation dataset – the reality of digging into more than a gigabyte of data made the problem even more apparent.
The issue was not with the content of the data itself, but with the way it was organised. Presentationally, the aim had been to let a user view either sector-wide statistics or data relating to a particular institution. These are reasonable choices, but they make it harder to compare one institution with another.
There’s been enough ink spilled around the curse of the league table to make me see this as refreshing and positive in certain moods. But I need to be clear that there are sound reasons for wanting to make comparisons. Many institutional planners have their own “benchmarking groups” – institutions chosen for their similarities with their own – used to understand where strengths and weaknesses lie, and to identify interventions that have worked for others in similar situations and may work for you.
Often you need a table or other visualisation to help you find these comparators. So that’s what I set out to build. It took me a while. I strongly advise you to use the full screen link here – for Tableau users I’ve enabled the ability to download the workbook. Do drop me an email with any questions, or if you’ve fixed it to make it work better!
Getting your head around the data
The first thing to bear in mind is the sheer variety of what is available. There is enough data to answer all kinds of questions about institutional differences – but the decision to release the data as a single lump makes it harder to understand and harder to work with.
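One practical consequence of the single-lump release is that the file is too big to comfortably load in one go. A minimal sketch of a workaround – assuming the release is a flat CSV, and with the file layout and column names invented purely for illustration – is to stream it row by row and keep only what you need:

```python
import csv
import io

def filter_rows(lines, provider_name, name_col="provider_name"):
    """Stream CSV lines and yield only rows for one provider,
    so the whole file never has to sit in memory at once."""
    reader = csv.DictReader(lines)
    for row in reader:
        if row[name_col] == provider_name:
            yield row

# Tiny in-memory stand-in for the (much larger) real file;
# these column names are hypothetical, not the OfS schema.
sample = io.StringIO(
    "provider_name,lifecycle_stage,value\n"
    "Example University,Access,0.42\n"
    "Other College,Access,0.38\n"
    "Example University,Continuation,0.91\n"
)
rows = list(filter_rows(sample, "Example University"))
print(len(rows))  # 2
```

The same streaming approach works on the real gigabyte-plus file – you only pay memory for the rows you keep.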
To start with the basics – there are four “life cycle stages”:
- Access – students entering the institution
- Continuation – students passing from the first year to the second year of study at an institution
- Attainment – students graduating with a first class or 2:1 degree
- Progression – students in employment or further study 6 months after graduation
For each of these you can look at full-time (and apprenticeship) students or part-time students, and a choice of possible course types (all undergraduate, first degree, other undergraduate, undergraduate with postgraduate).
If we start with the “single” options on my Tableau, you can choose a split type (broad category of characteristics) and split (precise characteristic) using the menus at the top.
The characteristics range from the expected – sex, ethnicity, the newly-controversial POLAR, disability marker – through to some lesser-seen options – disability type, index of multiple deprivation, free school meals. And then there’s some intersectional variables – IMD or POLAR against ethnicity or sex. Not the ethnicity and sex one that we’ve all been asking for, alas.
You then need to check my least favourite part of this data – the measure detail filter. Depending on the life cycle stage you’ll need to select a different measure –
- For Access – choose “proportion”
- For Attainment – choose “attainment rate”
- For Continuation – choose “continuation rate”
- For Progression – choose “progression rate”
This is daft to me because each of these is effectively the same calculation – the numerator (the number with the stated characteristic) divided by the denominator (the number of students in the whole population under consideration). I could have hard-linked these, but I wanted to leave the option to look at other underlying data for the super-keen. Really, from a data design perspective, I’d have liked to have seen these as separate columns so I could show the other data within the tooltip for the main value.
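To make the point concrete, all four measures boil down to the same sum. A sketch (the function name and the guard for empty cells are my own, not anything in the OfS release):

```python
def rate(numerator, denominator):
    """Proportion / attainment rate / continuation rate / progression
    rate are all the same thing: students with the stated
    characteristic divided by the whole population considered."""
    if denominator == 0:
        return None  # avoid dividing by zero for empty or suppressed cells
    return numerator / denominator

print(rate(150, 600))  # 0.25
```

Whatever the “measure detail” filter calls it, this is the number being plotted.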
For each of the dot graphs the plotted value is this rate for year 5, with the colour showing the difference between the value for that year and either year 4 or year 1. Darker dots mean a larger change – blue is an increase, orange is a decrease. The sector average for Year 5 is marked with a star. Of course – the year numbers mean different things for different measures:
- For Access indicators, Year 1 corresponds to 2013-14, and Year 5 to 2017-18.
- For Continuation indicators:
- for full-time and all apprenticeship students, Year 1 corresponds to 2012-13, and Year 5 to 2016-17.
- for part-time students, Year 1 corresponds to 2011-12, and Year 5 to 2015-16.
- For Attainment indicators, Year 1 corresponds to 2013-14, and Year 5 to 2017-18.
- For Progression indicators, Year 1 corresponds to 2012-13, and Year 5 to 2016-17.
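If you are working with the raw data rather than the dashboard, it can help to write this mapping down once. A sketch of the list above as a lookup table (the dictionary keys are my own labelling, not field values from the dataset):

```python
# (lifecycle stage, mode) -> (academic year behind "Year 1",
#                             academic year behind "Year 5")
# Values transcribed from the list above.
YEAR_SPANS = {
    ("Access", "all"): ("2013-14", "2017-18"),
    ("Continuation", "full-time"): ("2012-13", "2016-17"),
    ("Continuation", "part-time"): ("2011-12", "2015-16"),
    ("Attainment", "all"): ("2013-14", "2017-18"),
    ("Progression", "all"): ("2012-13", "2016-17"),
}

year1, year5 = YEAR_SPANS[("Continuation", "part-time")]
print(year1, year5)  # 2011-12 2015-16
```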
The all years graph is a bit of a mess if you look at all institutions; use the filters on the left to choose a few of interest. This is handy for looking at trends in the ratios. I’ve left in a sector average value for you to compare against.
If you want to look at the difference between the two groups within a given set of characteristics you’ll be wanting the “comparison” graphs. Though they look very similar to the “single” graphs, they tell quite different stories.
We’re looking at the percentage point gap between the ratios (as on the “single” graphs) for the characteristics set by split1 and split2. Not all combinations of splits work, and in the absence of much help in the official documentation I would suggest trial and error with the following caveats:
- Generally, the larger group needs to go in split one and the smaller group in split two. The exception is (with depressing inevitability) that “male” goes in split one and “female” in split two.
- The only splits available for “access” compare the population of 18 year olds with a given characteristic with 18 year olds at the institution with the same characteristic.
- If you select N/A, it’s as if you were looking at the “single” graphs, and you will need to change the measure detail as appropriate.
What this tells you is the size of the gap between split 1 and split 2 – a positive value means that the ratio for split 1 is higher than split 2, a negative value means the opposite. Again I’ve used colour to show the differences across years for the dot graph, and because we get information on statistical significance for these comparisons I’ve used shapes to denote where a difference is significant.
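The sign convention is easy to trip over, so here is a minimal sketch of the gap calculation (function name and rounding are mine, for illustration only):

```python
def pp_gap(rate_split1, rate_split2):
    """Percentage point gap between the two groups: positive means
    split 1's rate is higher, negative means split 2's is higher."""
    return round((rate_split1 - rate_split2) * 100, 1)

print(pp_gap(0.82, 0.70))  # 12.0  (split 1 ahead)
print(pp_gap(0.65, 0.71))  # -6.0  (split 2 ahead)
```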
This flag also means I can pull together the (surprisingly few) instances across the sector where there is a significant difference between the experiences of students with differing characteristics at a given institution.
My definition is this
OfS provides their own guidance for using the data and the dashboards they have created. Alongside a detailed set of field definitions, we also enjoy delights like an explanation of the correct use of filters in Microsoft Excel – “we recommend that users highlight the header row and choose filter from the data menu in MS Excel to add a filter to every column. Each column can then be filtered as required.”
The idea that someone new to Excel might be expected to work with this data (even the set for a single institution is around 5 megabytes) is something that concerns me greatly.
We also get a walk-through of the OfS Tableau dashboards. As above, these are set up to let you see either the sector in aggregate or an individual institution’s data – making it harder to make comparisons.