Are areas full of students more likely to be Covid hotspots?
A glance at the Wonkhe dashboards would suggest this is a reasonable conclusion to draw – there are no Middle Layer Super Output Areas (MSOAs) in England with more than 100 Covid-19 cases in the last 7 days that have fewer than 2,000 students in residence.
As you have probably come to expect, things are a bit more complicated than that.
A Middle Layer Super Output Area (MSOA) is just a collection of nearby Lower Layer Super Output Areas (LSOAs). And LSOAs are funny old things.
The clue is in the name. An LSOA is made up of a number of Output Areas (OAs) – tiny local areas used in the creation of UK Census estimates. Like with everything, these were first used in Scotland back in 1981 – only being used in England, Wales, and Northern Ireland for the 2001 Census. OAs themselves are built from postcode areas, but unlike postcode areas they are designed to reflect the demographic characteristics of the area. So, you’ll almost never find an OA that is half-urban and half-rural – or that is half student halls and half family homes. Because an OA is only around 125 households, this is not as difficult as it might sound.
OAs are great, but they are difficult to work with for two reasons. The first, and most obvious, problem is that they are really small and there are a lot of them. The second is that it is quite difficult to present data about them without risking identifying individual people – generally not a nice thing to do. For this reason the most commonly used small area statistical geographies are LSOAs.
So far, so lovely. However, whereas OAs are designed carefully to be areas with fairly homogeneous demographics, LSOAs are just a bunch of OAs lumped together. So you could have an OA full of student halls, an OA with young families, an OA with single professionals, and an OA full of well-off retired people all in the same LSOA. Where possible LSOAs are made up of similar areas (the size of an LSOA varies between 400 and 1,200 households for this reason) but this is not always possible.
To get round this, the Office for National Statistics has come up with a set of (frankly) horrible area group descriptions, which give the user of LSOA-resolution data some insight into what the area is like. Whoever came up with a name like “Asian Traits” needs to take a long hard look at themselves, but the area types we might be interested in rejoice in the not-quite-as-bad-but-still-awful name of “Cosmopolitan student neighbourhoods”. Seriously, this is an actual thing – check out map 8 on this page.
How do they come up with these names? There’s a methodology section, where you can learn which 60 chunks of census data underpin these groups – educational attainment, type of residence, age, ethnicity, and employment. Suffice it to say, it is complicated – there is not a direct relationship between the number or proportion of students in an LSOA and whether or not it is a “Cosmopolitan student area”.
What does this mean for Covid-19 data?
Four graphs – Covid-19 cases for week 40 on the side, and (from left to right and top to bottom) the median age in the LSOA, the percentage of students during term time in the LSOA, the number of students, and the density (people per square km) on the other axis. The colour shows the LSOA group, the size the population. You can filter by local authority area (top) and/or LSOA group name (bottom).
By default we are looking at “cosmopolitan student areas” over the whole of England. As you can see, even on a measure like proportion of students these are not directly comparable (if you are wondering, the handful of places with a student population greater than 100 per cent is an artifact of me using two different data sources, one of them being an estimate based on 10-year-old data). Looking at this plot by designation tells us something that should be very obvious – other factors are clearly having an impact on case rates.
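To see how a student share above 100 per cent can come about, here is a minimal sketch of the arithmetic. The figures are invented for illustration, and which of the two sources is the older one is an assumption on my part; the point is only that dividing a count from one source by a population from another can overshoot.

```python
def student_percentage(students_estimate, population_estimate):
    """Return term-time students as a percentage of the resident population."""
    if population_estimate <= 0:
        raise ValueError("population_estimate must be positive")
    return 100.0 * students_estimate / population_estimate


# Hypothetical LSOA (invented numbers): the population figure comes from an
# older estimate, the student figure from a more recent source, for example
# after a new hall of residence has opened.
older_population = 1500
newer_students = 1800

print(student_percentage(newer_students, older_population))  # 120.0
```

Anything over 100 here says more about the mismatch between sources than about the area itself.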
Looking at designated “student areas” means that we can be a bit more confident about relationships with other data. I’ve added trend lines on each graph to help with that. These are meaningless for a large data set like this, but are very handy if you look at a local area.
Are student areas also Covid hotspots?
For example, if you look at Manchester there is a strong positive correlation between Covid-19 cases and both the number and the proportion of students in an LSOA. But there is also a (very slightly weaker) negative correlation between Covid-19 cases and the median age of people living in the LSOA – places that skew younger have more cases than places that skew older. These effects are similar in other large university cities in areas of high viral prevalence (less so in Birmingham, oddly), but there is not a universal relationship. If you look at Sheffield there is a strong relationship between the proportion of students in an area and Covid-19 cases, but no relationship based on age or the number of students.
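If you want to try this kind of comparison yourself, the following is a rough sketch of a Pearson correlation calculated separately for each local authority. It is not how the dashboard itself is built. The field names (`authority`, `students`, `median_age`, `cases`) and every number in `rows` are made up for illustration.

```python
from collections import defaultdict
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient; None if either series is constant."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length, at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None  # correlation is undefined for a constant series
    return cov / sqrt(var_x * var_y)


def correlations_by_authority(rows, x_key, y_key="cases"):
    """Group LSOA rows by local authority and correlate x_key with y_key."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["authority"]].append((row[x_key], row[y_key]))
    return {
        authority: pearson([x for x, _ in pairs], [y for _, y in pairs])
        for authority, pairs in grouped.items()
    }


# Invented LSOA-level rows for two hypothetical local authorities.
rows = [
    {"authority": "Authority A", "students": 100, "median_age": 45, "cases": 3},
    {"authority": "Authority A", "students": 500, "median_age": 38, "cases": 10},
    {"authority": "Authority A", "students": 1200, "median_age": 27, "cases": 25},
    {"authority": "Authority A", "students": 2000, "median_age": 22, "cases": 40},
    {"authority": "Authority B", "students": 200, "median_age": 40, "cases": 8},
    {"authority": "Authority B", "students": 800, "median_age": 30, "cases": 5},
    {"authority": "Authority B", "students": 1500, "median_age": 25, "cases": 9},
]

by_students = correlations_by_authority(rows, "students")
by_age = correlations_by_authority(rows, "median_age")
for authority in by_students:
    print(authority, by_students[authority], by_age[authority])
```

In this toy data the first authority shows the Manchester-style pattern (cases rise with student numbers and fall with median age) while the second shows no clear relationship, which is why looking at one local authority at a time matters. With only a handful of LSOAs per area, a coefficient like this is a prompt for further digging rather than proof of anything.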
We should also be clear that not all LSOAs with high numbers of Covid-19 cases are “cosmopolitan student areas” – of the two LSOAs with the highest number of cases in England one (in Leeds) is a student area, and one (in Sheffield) is full of “Young ethnic communities”. Clearly there are other effects going on rather than just a student one.
But there does appear to be a student effect – with more students in an area being a decent indicator of the number of Covid-19 cases in that area. Whether this is down to communal living, student behaviour, initial viral load, or some other variable is not yet clear.