Datafication in higher education refers to collecting and processing data at all levels, from individual to institutional, national and beyond, impacting discursive and material practices in the sector.
Universities have always collected data on their students and staff, but contemporary digital technology enables the collection of much more varied data at a more granular level, at a greater velocity, and often in real time.
Universities must collect specific data for statutory reporting, such as HESA’s Data Futures, whose development has not been without challenges. Reporting requirements have expanded over time, prompting warnings about the significant burden this places on universities. The purposes of collecting and processing digital user data go beyond reporting to authorities. They include improving the student experience and personalisation of learning, automating processes, and improving institutional efficiency.
Digital technologies and platforms collect user data from university students and staff during teaching, learning, and other activities on platforms. User data come in different forms and include content produced by staff and students (like discussion forum posts in a learning management system), user behaviour registered by platforms (such as information about clicks, time spent on particular tasks, or the sequence of user movement on the platform), and metadata (IP address, machine number, platform access time). They are often merged with data provided by staff and students or their universities (including information on grades, modules taken, socio-economic background) to allow for different analyses.
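As a rough illustration of the merging step described above, the sketch below tags each platform event as content, behaviour, or metadata and attaches per-student summaries to records held by the institution. Every name and value in it (PlatformEvent, merge_with_records, the student IDs, modules, and grades) is hypothetical and chosen only for illustration; it does not describe how any particular platform or university handles its data.

```python
from dataclasses import dataclass


@dataclass
class PlatformEvent:
    """One item of user data registered by a platform (hypothetical schema)."""
    student_id: str
    kind: str             # "content", "behaviour" or "metadata"
    detail: str
    seconds: float = 0.0  # time spent, where the platform records it


def merge_with_records(events, records):
    """Attach per-student event summaries to institution-held records."""
    merged = {}
    for student_id, record in records.items():
        own = [e for e in events if e.student_id == student_id]
        merged[student_id] = {
            **record,
            "posts": sum(1 for e in own if e.kind == "content"),
            "clicks": sum(1 for e in own if e.kind == "behaviour"),
            "seconds_on_platform": sum(e.seconds for e in own),
        }
    return merged


# Invented example data: two students, four platform events.
events = [
    PlatformEvent("s1", "content", "forum post in week 3"),
    PlatformEvent("s1", "behaviour", "opened reading list", seconds=40.0),
    PlatformEvent("s1", "metadata", "platform access time logged"),
    PlatformEvent("s2", "behaviour", "opened quiz", seconds=300.0),
]
records = {
    "s1": {"module": "HIST101", "grade": 62},
    "s2": {"module": "HIST101", "grade": 55},
}

merged = merge_with_records(events, records)
print(merged["s1"])
```

Even in this toy version, a student who studies mostly away from the platform would show few posts and clicks, which is one reason such merged records can misrepresent engagement.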
User data is processed to produce various data products, such as analytics, recommendations, behavioural nudges, and automated decisions. These can be products that universities may procure – for example, student engagement metrics calculated on university data – or they can be integrated into digital platforms as features, such as learning analytics in a learning management system with a dashboard displayed to students or staff.
The pervasive idea is that data products are inherently valuable.
However, like other organisations and sectors, higher education institutions face significant challenges and struggles in making user data valuable and useful. Here are three key struggles we identified in a recently completed piece of research.
A mismatch between beliefs and realities
First up, we found an omnipresent belief in the value of digital data and data products in the university sector. All of the people we interviewed felt that data could support the sector. However, most participants reported that many university actors, particularly leadership, believed that data collection on its own would solve existing and future problems. Yet the value of data was not being realised, at least not to the extent that participants would wish.
Universities struggle to make digital data useful and valuable for several reasons. One is that data products often lack usefulness and reliability. Participants described analytics that do not measure what they promise to measure.
For example, they mentioned that student engagement scores do not represent academic engagement and that learning analytics do not represent learning – and for this reason, they are not being widely used in practice. Participants also talked about the lack of reliability, which mainly originates from the fact that many activities happening at universities are not digitally recorded and hence not included in analytics. Therefore, big promises of datafied technology made by many edtech companies have not been delivered.
Many university actors are mesmerised by the imagined certainty of quantified measures in the form of various metrics. However, our participants felt that universities long ago developed practices that support students and monitor their engagement. Standard practices such as formative assessment support students and give course tutors relevant information. Universities face significant internal mismatch and incoherence in how analytics and other quantified metrics are perceived and valued, so data products are often not used systematically even when they are produced.
Increased costs and work
Edtech and datafication promise institutional efficiencies. Not only is data envisaged to improve working conditions and higher education provision, but datafication is also supposed to help save time and money.
Instead, our research finds that datafication brings more work and costs to universities.
With thousands of students and staff, universities have to manage petabytes of data, which is a complex task technologically, financially, and legally. The costs of data storage alone have substantially increased, on top of other new costs related to establishing and maintaining the digital ecosystems required for datafication. Universities also deal with legacy software, problems integrating various systems and data flows, the need to ensure data security, cyber attacks, and more. All of this drives up universities' spending on technology.
Our participants reported a general lack of skills and resources to analyse data, understand data outputs, and act on them. New jobs need to be created, including data scientists and project managers, and universities need more IT specialists, vendor managers, and legal support staff with a wider range of skills.
We found that collecting and managing data requires more work from administrative and academic staff as well. This becomes even more demanding when universities decide to run pilot projects with edtech companies, such as introducing new analytical features into the virtual learning environment software.
Acting on analytics and data outputs brings new challenges. Participants consistently reported a considerable disconnect between data collection and processing on the one hand, and acting on data on the other. Effective actions are often too expensive. For example, an algorithm might identify students who are struggling in their studies, and an efficient solution might be to send them automatic emails offering support. However, many of our participants said students do not appreciate automated communication and respond only to academics. This is understandable, as students are closely connected to their academics, but personal follow-up by academics is expensive in financial terms.
Aims and governance
We found that people have different ideas about what data and data products could and should do in universities. There does not seem to be a common understanding of data landscapes within institutions. Those who raise questions or criticism are classed as anti-progress, and the discourse is often reduced to a simple binary of staff being for or against technology.
However, universities will not get buy-in for using data from their different internal constituents unless they overcome such unconstructive binaries.
Participants reported concerns about surveillance. They felt that there was little evidence of the positive impact of datafication, while collecting large amounts of data carried a high risk. Moreover, participants consistently felt that many companies in the edtech industry profit from university data. This concern is especially pressing as the edtech industry expands in higher education, building a “platform university”, which could be seen as a new form of privatisation.
Our participants felt that while universities respect rules and legislation on data privacy and collect user consent, individual staff and students are still not meaningfully and transparently informed about what user data is collected when they engage with various platforms, for what reasons, which products are developed based on their data, and what impact and effects these products have.
The need for evidence – and debate
Our analysis shows that datafication is uncontroversial and widely valued when it supports administrative and technical processes. For example, dynamic software licensing can save subscription costs because it ensures that only those who actually use software have access to it. Similarly, librarians can reduce the fees paid to academic publishers by monitoring actual access to digital texts. Technology as such is also widely embraced when it supports work and study, such as enabling remote meetings via video calls.
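To make the dynamic licensing example above concrete, here is a minimal sketch of the arithmetic, assuming a flat price per seat and a usage log listing who opened the software. The function names (seats_needed, saving) and all figures (1,000 enrolled users, 240 active users, a price of 50 per seat) are invented for illustration and are not drawn from the research.

```python
def seats_needed(usage_log, enrolled):
    """Count enrolled users who appear at least once in the usage log."""
    return len(set(usage_log) & set(enrolled))


def saving(enrolled, usage_log, price_per_seat):
    """Saving from licensing only active users instead of everyone enrolled."""
    blanket_cost = len(enrolled) * price_per_seat
    dynamic_cost = seats_needed(usage_log, enrolled) * price_per_seat
    return blanket_cost - dynamic_cost


# Hypothetical figures: 1,000 enrolled users, of whom 240 ever open the software.
enrolled = [f"u{i}" for i in range(1000)]
usage_log = [f"u{i % 240}" for i in range(5000)]  # 5,000 log entries, 240 distinct users

annual_saving = saving(enrolled, usage_log, price_per_seat=50)
print(annual_saving)  # 1000 * 50 - 240 * 50 = 38000
```

The calculation needs only a record of who accessed the software, not what they did with it, which may be part of why this kind of datafication attracts little controversy.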
However, there is less agreement and more significant controversy about technology collecting data and (especially) processing data to either augment or replace human decision-making and agency or to intervene in teaching, learning, and human cognition, sentiment, and action. To build greater support for these dimensions of datafication, universities must invest time in thoroughly and openly discussing use cases for data products. More open and collective debate is needed to determine what data will be collected, how, and to what ends.