Open research data is key to open science

‘The evidence for the benefits of sharing research data is overwhelming’ – the words of Iain Hrynaszkiewicz, Head of Data Publishing at Springer Nature, in his Wonkhe blog last month. It’s a powerful statement, and one that will likely be echoing around the halls of Westminster after the recent hearing of the House of Commons Science and Technology Select Committee on Research Integrity.

Sharing data, methods, and software supports research integrity by making research reproducible and transparent. Being able to verify and build on research results can help accelerate research and science at scale, but this is only possible if the data is accessible for reuse – and the road towards open access can be full of hurdles.

This year’s Open Access Week brought the focus firmly back to why the movement is critical to international research and collaboration with the theme ‘Open in order to…’ But what is it that we’re setting out to achieve?

Supporting open science

At Jisc it’s a theme which informs our work on a daily basis – we support our member universities and research institutes with a range of solutions to ensure the move towards open research data and open access to publications is achievable, and meets policy requirements. Open science is the goal on the horizon but to get there and play a part in the government’s plans for a flourishing research economy, we have to delve into the details, and the data.

Reuse of data can drive innovation – exactly what’s being called for in the UK industrial strategy and the Higher Education and Research Act – so what is stopping this process at present?

For starters, there’s an issue with interoperability – in other words, can the various bits of technical infrastructure talk to one another? It’s a well-known problem between primary and secondary care providers in the NHS, but perhaps less commonly envisaged as a problem for those clever folk in academia.

Secondary use, or analysis, of datasets means previously created data can be used for new projects. But without associated documentation, code, methods, or software, data may have little value. A lack of supplementary information alongside datasets can make them difficult to reuse, whether on their own or in combination with other data. Since data is usually collected with a particular research question in mind, reuse is rarely considered at the point of collection.

Other barriers to reuse of open data include a lack of licensing and provenance information; a lack of metadata, which makes it hard to search for useful datasets; and a lack of awareness among researchers of where to find open datasets. Researchers are also concerned that they may lose their competitive advantage if they share their data for use, reuse, or combination with other data.

The Science and Technology Committee hearing into research integrity considered the ‘crisis in reproducibility’ and how fraud can be stamped out. As Borgman reported earlier this year, reproducibility is often cited as a key reason to share data, in which case data will need to be routinely made open with its associated documentation, code, and software… whatever is needed to verify the results using the same methods. We still need to develop a better understanding of what prevents researchers (and their institutions) from adopting open practices so that interventions can be targeted successfully.

Developing policy

The Open Research Data Taskforce is developing a road map for the UK to meet the aspirations of the Concordat on Open Research Data, but should this go further given the concerns of the inquiry into research integrity?

In part what’s needed are the technological platforms that support research workflows. Alongside the nuts and bolts, though, there are of course policy, cultural and skills issues to be tackled before we can see open research data as business as usual for the research community (and their collaborators outside of academia). Changes take time, and the policy framework has to be right in order to drive forward the infrastructure solutions, and the funding, for a sector that aims to support modern research but is having to balance a number of government pressures and changes.

Meeting open research data aspirations requires services and tools that make it easy for a researcher to store, access and reuse data. That in turn requires curation, storage and access services, and the technical infrastructure behind them. At Jisc, we are currently working with 17 higher education institutions to develop a shared service solution with a good user experience and cost efficiencies built in. This Research Data Shared Service aims to smooth the path towards efficient research data management. It will facilitate open access to research data, in a sustainable way, and with the necessary archiving and preservation of datasets. Working collaboratively optimises the development of a sustainable solution in what is a new area of responsibility for universities.

Open science is not a lofty goal, but our universities and research institutions are at different stages along the road to producing open research data. Fortunately, the sector is keen to share its knowledge to support others, and there is a huge wealth of open access good practice to draw upon.

Without Government and the public research funding bodies also taking responsibility for ensuring that good practice is embedded in policy, how can we incentivise the behaviours and create a data environment conducive to open research?

The inquiring society will benefit from open science – but we need inquiring minds right now, to make sure we get this right.