The importance of policy for enabling research data sharing

The evidence for the benefits of sharing research data is overwhelming. Researcher credit and recognition, research impact, productivity, return on funding investment, scientific reproducibility and patient outcomes have all been found to improve when data are made more available.

One can envisage that, in the future, scholarly publishing will primarily facilitate knowledge dissemination and reuse by enabling researchers to translate new discoveries into practical applications in real time. But this will not be possible without a revolution in how we share and publish research data. Creating this kind of change may need to start with a relatively low-tech procedure: establishing a policy.

Policies everywhere

More than 50 funding agencies worldwide now have policies on research data sharing or management, although some research communities have supported or even mandated data sharing for more than 20 years. Specific journals have often enforced these policies – typically to reflect the needs of the academic communities they serve. This co-emergence of policies from funders and publishers (“top-down”) with policies of communities (“bottom-up”) has led to a research data policy landscape that is confusing for researchers and support staff and much in need of standardisation and harmonisation.

A Springer Nature survey this year, with 7719 responses, found that two thirds (67%) of researchers share data, either through repositories or as supplementary information to published articles. Three quarters (76%) said that it was important to them that their data are discoverable. Half (48%) said that organising their data in a presentable and useful way was the biggest barrier to data sharing, followed by more than a third (37%) who found copyright and licensing confusing.

This revolution in data reuse requires policy clarity from all stakeholders in research, particularly funders, about what is expected of projects and researchers. Simplicity is also needed from publishers in what they require when authors submit their work, and guidance from institutions is essential in helping researchers follow best practice from the day a project is first conceived.

All’s FAIR in research data

Some academics need help presenting their data in a Findable, Accessible, Interoperable and Reusable (FAIR) way, something funding agencies increasingly demand. Publishers can and should help with this, but the majority of journals do not, as yet, have strict research data policies or offer additional support. Policy has been reported, in other surveys of researchers, to be both a barrier – if requirements are not understood – and an incentive to data sharing.

Recognising these issues, publishers have begun promoting research data policy standardisation and are in a good position to share good practice widely across different research communities. In 2016 Springer Nature introduced a set of four standard data policies for all its journals – the first major publisher to do so. These policy types are:

Type 1: Data sharing and data citation is encouraged
Type 2: Data sharing and evidence of data sharing is encouraged
Type 3: Data sharing encouraged and statements of data availability required
Type 4: Data sharing, evidence of data sharing and peer review of data required

More than 1100 Springer Nature journals have so far introduced a standard policy. Each journal adopts one ‘type’ of data policy and, as there are only four types, the process is as simple and understandable for authors as possible. The data policies have been made available under a CC BY licence so they can be built on and used by the community, and we are now working with Jisc, the Australian National Data Service and Wiley via the global Research Data Alliance (RDA) community to develop widely agreed, community-owned standards. Over the summer of 2017 the RDA data policy standardisation and implementation interest group gathered requirements for journal and publisher data policies from a diverse community membership, and these were presented at this month’s RDA Plenary meeting in Montreal.

This policy initiative complements other publisher collaborations such as efforts, via FORCE11, to implement data citation consistently. And this year other major journal publishers, including Wiley and Elsevier, have introduced similar initiatives to standardise data policies of their journals.

What to look for in a journal research data policy

With all this work in progress there is not – yet – an international standard for research data policy, but there is growing consensus on the important features of a journal data policy, including considerations for researchers and support staff in deciding where to publish, and how data sharing may be required or supported. Researchers should consider the following questions before deciding where to submit their work – in fact, ideally before data collection has even begun:

What are my funder’s requirements?
What are my institution’s requirements?
What are the data policies of my preferred journals? Are they easy to understand?
Will they enable me to comply with my funder/institutional requirements?
Are they right for my methodology and my discipline?
Am I required to write a data availability statement?
Should my data be hosted:
- In an institutional repository?
- In a subject-specific repository?
- In a broad repository, such as Figshare or Zenodo?
Are citation and linking of data to my article supported?
What support from the publisher is available?

To support the implementation of the four standard data policies at Springer Nature, we have also launched a Research Data Helpdesk where journal authors and editors can receive free advice on finding data repositories and complying with and implementing policy.

Supporting further change

Publishers are using technology and partnering with data repositories to enable policy compliance and data sharing in more effective ways. Many publishers have partnered with trusted data repositories, such as Dryad, Dataverse, Figshare and Zenodo – as well as supporting institutional services. Some researchers indicate they are looking for more direct support in data deposition and curation, to help make their data more discoverable.

To meet this need, at Springer Nature we introduced pilot data support services for authors, which provide them with a drag-and-drop workflow for depositing data privately and securely, as well as support from Research Data Editors who carry out a range of checks and enhancements on their metadata (the information that helps discovery and interpretation of the files) and on the structure and presentation of their datasets. Authors publishing in journals including Nature and BMC Ecology have used these services to make the data supporting their articles more discoverable.

Other researchers seek support from dedicated teams within their institutions, advice from their funders, or help from external expert agencies such as Jisc or the DCC. Data sharing is likely to remain a mixed economy for the foreseeable future and, although simplifications and consolidations are on their way, the best person to support and advise a researcher on their specific situation varies from case to case.

Open access to data has the potential to transform the way research is done, speeding the pace of discovery and its benefits to society. Good data practice makes research more efficient, effective and fulfilling for researchers; but this can only work if publishers, funders and institutions make data sharing as simple as possible for researchers. And policy is the perfect tool to do that.

Iain Hrynaszkiewicz is grateful to Amy Bourke-Waite at Springer Nature for her contributions to an early draft of this article.