The purpose of this paper is to report a study of how research literature addresses researchers' attitudes toward data repository use. In particular, the authors are interested in how the term data sharing is defined, how data repository use is reported and whether there is a need for greater clarity and specificity of terminology.
To study how the literature addresses researcher data repository use, relevant studies were identified by searching Library Information Science and Technology Abstracts, Library and Information Science Source, Thomson Reuters' Web of Science Core Collection and Scopus. A total of 62 studies were identified for inclusion in this meta-evaluation.
The study shows a need for greater clarity and consistency in the use of the term data sharing in future studies to better understand the phenomenon and allow for cross-study comparisons. Furthermore, most studies did not address data repository use specifically. In most analyzed studies, it was not possible to segregate results relating to sharing via public data repositories from other types of sharing. When sharing in public repositories was mentioned, the prevalence of repository use varied significantly.
Researchers' data sharing is of great interest to library and information science research and practice, as it informs academic libraries that are implementing data services to support these researchers. This study explores how the literature approaches this issue, especially the strongly encouraged use of data repositories. This paper identifies the potential for additional study focused on this area.
Thoegersen, J.L. and Borlund, P. (2021), "Researcher attitudes toward data sharing in public data repositories: a meta-evaluation of studies on researcher data sharing", Journal of Documentation, Vol. ahead-of-print No. ahead-of-print. https://doi.org/10.1108/JD-01-2021-0015
Emerald Publishing Limited
Copyright © 2021, Jennifer L. Thoegersen and Pia Borlund
Published by Emerald Publishing Limited. This article is published under the Creative Commons Attribution (CC BY 4.0) licence. Anyone may reproduce, distribute, translate and create derivative works of this article (for both commercial and non-commercial purposes), subject to full attribution to the original publication and authors. The full terms of this licence may be seen at http://creativecommons.org/licences/by/4.0/legalcode
This study examines how researcher data sharing has been studied in the research literature. Over the past decade, there has been an increasing, international demand to make the data underlying research more available to the research community and the public. Pressure to share data has been placed on researchers by funding institutions and journal publishers, many of which have begun to encourage or require researchers to share data (MacMillan, 2014, p. 544). The reasons for this shift in expectations are varied, but Borgman (2012) presents four broad rationales for sharing research data: reproducibility, serving the public interest, asking new questions and advancing research (p. 1067).
However, there is not one single definition of data sharing. In an open science context, which emphasizes the importance of publicly sharing scientific knowledge as soon as practicable, data sharing is framed as research data being made publicly available with as few restrictions on reuse as possible and referred to as Open Data (Nielsen, 2011, para. 2; Open Knowledge Foundation, 2014). The FAIR Guiding Principles take data sharing a step further, focusing not just on public accessibility, but also on utility, encouraging researchers to make data findable, accessible, interoperable and reusable (Wilkinson et al., 2016, p. 3).
Adhering to these expectations presents challenges for researchers, including a lack of resources, the need for data management skills and time constraints (Sayogo and Pardo, 2013, p. S25). Complicating the issue further is the plethora of ways researchers can share data. They must decide how and where they will share their data; data dissemination methods include departmental and researcher websites, by request, cloud services, publications and data journals (Bishoff and Johnston, 2015, p. 11; Mischo and O'Donnell, 2014, p. 35). Funding agencies and journal publishers, as well as the open science and FAIR data movements, generally encourage the use of public data repositories (which Uzwyshyn (2016, p. 18) defines as “large database infrastructures set up to manage, share, access, and archive researchers' datasets”) when practicable and possible from a legal and ethical standpoint (Holdren, 2013, p. 5).
Academic libraries have taken a leading role in supporting and shaping campus and national research data management and sharing (Christensen-Dalsgaard et al., 2012). As academic libraries have taken on this mantle, Library and Information Science (LIS) research has begun to investigate data sharing prevalence among researchers and the factors influencing researcher data sharing. However, most studies are not focused specifically on the use of public data repositories, and there is no comprehensive look at researcher attitudes toward these repositories.
There are two recent reviews of data sharing literature. Examining the concept through three lenses (individual, institutional, international), Chawinga and Zinn's (2019) systematic literature review highlights existing barriers to data sharing and provides suggestions for overcoming these barriers. Alternatively, Perrier et al.'s (2020, p. 14) meta-synthesis of qualitative studies examines researchers' views on data sharing broadly and explores the disconnect between data sharing requirements and the still low level of sharing among researchers. Both studies focus specifically on “open data,” arguing for the importance of the public availability of data, with Chawinga and Zinn (2019) stating that the terms “data sharing” and “open data” are synonymous and defining data sharing as “a deliberate effort to make all raw research data fully available for public access” (p. 110).
The current study, which evaluates both qualitative and quantitative studies, also focuses on the public availability of data, though concentrating specifically on the use of public data repositories. However, in contrast to the previous reviews, this study begins by questioning how the literature uses the term “data sharing,” acknowledging the term's inherent ambiguity, and explores how research on data sharing is being conducted across a variety of disciplines.
The overall objectives of this study are to identify how the term “data sharing” is defined and operationalized in the literature, how sharing data in public data repositories is addressed and how researchers' attitudes toward data sharing relate to their data sharing behavior.
The remainder of this paper is structured as follows: Section 2 explains the methodology for identifying studies to include in the analysis. Section 3 presents the results of the study relating to how previous studies have addressed research data sharing and the use of data repositories. Section 4 discusses the results. Finally, Section 5 provides concluding remarks.
2. Methods and dataset
This section outlines the search for and identification of research literature focused on researchers' data sharing attitudes. While the main area of interest was literature in the LIS field, studies on data sharing have been published in many disciplines, and the literature search was also conducted to allow for disciplinary breadth. As such, searches were performed in two LIS databases – Library Information Science and Technology Abstracts (LISTA) and Library and Information Science Source (LISS) – and two multidisciplinary databases – Thomson Reuters' Web of Science Core Collection (WoS) and Scopus. Google Scholar was used as a check to ensure comprehensiveness of results returned from the databases. Incomplete metadata prevented Scholar from being included as a key database in this study.
The search combined the terms data repository, data sharing and open data with attitudes, beliefs and perceptions as well as researcher, scientist and faculty. An example Boolean search would be: (“data repository” or “data repositories” or “data sharing” or “open data”) and (attitudes or beliefs or perceptions) and (researcher or scientist or faculty).
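The structure of the example search can be sketched in a few lines of Python. This is only an illustration of how the three term groups combine; the function name, the uppercase operators and the quoting of multi-word phrases are assumptions, and each database applies its own field codes and syntax, which are not reproduced here.

```python
# Hypothetical helper: the term groups come from the search description above;
# the function name and operator casing are illustrative assumptions.
def build_boolean_query(*term_groups):
    """Join terms within a group with OR and join the groups with AND."""
    clauses = []
    for group in term_groups:
        # Quote multi-word phrases so they are searched as exact phrases.
        quoted = [f'"{term}"' if " " in term else term for term in group]
        clauses.append("(" + " OR ".join(quoted) + ")")
    return " AND ".join(clauses)


topic_terms = ["data repository", "data repositories", "data sharing", "open data"]
attitude_terms = ["attitudes", "beliefs", "perceptions"]
population_terms = ["researcher", "scientist", "faculty"]

query = build_boolean_query(topic_terms, attitude_terms, population_terms)
print(query)
```

Keeping the term groups as separate lists makes it easy to see which concept each clause of the search covers.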
The initial search was performed in all four databases in May 2019 and updated in March 2020. The results for the searches in LISTA, LISS, WoS and Scopus were 90, 105, 169 and 184, respectively, with a total of 548 results (see Table 1). In total, 328 results remained after duplicates were removed. Two studies were removed because they were reporting on the same study as other articles in the list, and six studies were removed for being in a language other than English. Finally, only detailed, empirical studies with researchers as study participants and a focus on influences on data sharing behaviors were included, as determined by a review of the article titles and abstracts and, if necessary, the text of the article. While the literature search was extensive, it was not exhaustive; the focus was on published, peer-reviewed studies, meaning gray literature was excluded.
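The counts reported above can be tallied as a quick consistency check. The sketch below assumes the removals were applied in the order described; the derived figures (duplicates removed, records screened, records excluded at screening) are computed here and are not reported in the text, and the variable names are illustrative.

```python
# Counts are taken from the text; variable names are illustrative assumptions.
results_per_database = {"LISTA": 90, "LISS": 105, "WoS": 169, "Scopus": 184}

total_results = sum(results_per_database.values())
after_deduplication = 328
duplicates_removed = total_results - after_deduplication

# Removals described after de-duplication, assumed to be applied in sequence.
same_study_removed = 2
non_english_removed = 6
screened_for_inclusion = after_deduplication - same_study_removed - non_english_removed

# Number of studies included in the meta-evaluation.
included_studies = 62
excluded_at_screening = screened_for_inclusion - included_studies

print(total_results, duplicates_removed, screened_for_inclusion, excluded_at_screening)
```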
The Appendix lists all studies included in this analysis. A total of 62 studies were analyzed and the following questions were addressed for each study:
Is the term “data sharing” defined and, if so, how?
Is public data repository use addressed and, if so, how?
How do researchers' attitudes toward data sharing compare to their data sharing behavior?
The purposes for most of the studies evaluated fall into three broad groups (see Appendix). The first group of studies (n = 35) is concerned with researcher attitudes (or perceptions) and practices related to data sharing (and sometimes more broadly, data management). The second group of studies (n = 14) is specifically concerned with the factors and barriers influencing researchers' data sharing. The final group of studies (n = 12) explores both researcher attitudes and influences on sharing. The remaining article focuses on the research data management needs of chemistry researchers.
The studies have a broad geographical scope. The location of the studies is shown in Figure 1. Of the 62 studies, 16 were conducted in the United States and 21 were conducted in multiple countries or in an international context. The remaining studies were conducted in individual countries in Africa (2), Asia (7), Australia (1), Europe (13), North America (1) and South America (1).
The studies investigated data sharing attitudes of researchers across an array of disciplines. These areas ranged from very broad, for example, social sciences (Bradić-Martinović and Zdravković, 2014; Kim and Adler, 2015; Polanin and Terzian, 2019), to more specific disciplines, for example, optical coherence tomography (Lurie et al., 2015). Participants in 13 of the studies were from health disciplines, while researchers in social sciences disciplines and natural science and engineering disciplines were the focus of 9 and 21 studies, respectively. The remaining eight studies included a mix of participants in fields across these broad disciplinary areas.
While the literature search included studies published between 1977 and 2020, those that met the inclusion criteria were published between 2008 and 2020, the majority (n = 46) since 2015.
For all the studies, researchers were the informants on issues related to data sharing, sometimes in addition to tangential topics such as data management, data literacy and open access. In 12 of the studies, there were additional participants, including library personnel (Mozersky et al., 2020; Scheliga and Friesike, 2014); research participants (Hate et al., 2015; Mazor et al., 2017; Merson et al., 2015); ethics board members (Mazor et al., 2017; Merson et al., 2015; Mozersky et al., 2020); and government employees (Merson et al., 2015; Schmidt et al., 2016). Zenk-Möltgen et al. (2018) first analyzed journal data policies, before surveying authors on data sharing. As this study was focused on researcher attitudes, results from these nonresearcher populations were excluded.
The studies had a mix of quantitative (35), qualitative (23) and mixed (4) approaches. Almost all of the quantitative studies used surveys. Zenk-Möltgen et al. (2018) also conducted a document analysis first. Andreoli-Versbach and Mueller-Langer (2014) based their analysis on researchers' online presence. Survey response rates ranged from 2.2% to 100% with an average of 31.99% (excluding the ten studies for which a response rate was not reported and could not be calculated) while the number of respondents ranged from 40 to 1,829.
Most of the qualitative studies involved interviews or focus groups, which were analyzed for themes. Laine (2017) used interviews to develop case studies relating to two open research projects, and Grubb and Easterbrook (2011) used written questionnaires with open-ended questions.
This section presents the analysis of the studies and addresses each of the study questions individually.
3.1 Is the term “data sharing” defined and, if so, how?
As Kurata et al. (2017, p. 2) discuss, the term “data sharing” can refer to a wide range of behaviors. In some contexts, it is used interchangeably with “open data” (Chawinga and Zinn, 2019, p. 110). In others, it is broader, encompassing any kind of data sharing. The first aim of this study was to identify how these studies define and operationalize data sharing.
For most of the included studies, a definition of “data sharing” was not explicitly stated, though an approximation could be inferred based on context and details of the study. Scheliga and Friesike's (2014) qualitative study focused on obstacles to Open Science among researchers. In this study, “data sharing” referred to sharing data publicly (though not necessarily in a repository). On the other hand, several studies in the health disciplines were focused on patient data, which suggests on-request or restricted access sharing (e.g. Hate et al., 2015; Jao et al., 2015).
There were 13 studies (21%) that explicitly defined data sharing. In addition, two studies (Pardo Martínez and Poveda, 2018, pp. 2–3; Tenopir et al., 2018, p. 892) provided a definition of Open Data, and Andreoli-Versbach and Mueller-Langer (2014, pp. 1624–1625) defined “voluntary data sharing”.
While the definitions are similar, they also vary in terms of what is being shared, how it is being shared and with whom. Four of the studies (Saeed and Ali, 2019, p. 290; Tenopir et al., 2015, p. 3; Wu and Worrall, 2019, p. 765; Zhu, 2019, p. 2) used broad definitions. Wu and Worrall (2019) quoted Borgman's definition directly as “the release of research data for use by others” (p. 765). Zhu (2019) used an almost identical definition: “releasing research data that can be used by others” (p. 2), while Saeed and Ali (2019) defined data sharing as “the practice of making data used for academic research available to other investigators” (p. 290). Tenopir et al. (2015) state that data sharing “...occurs when scientists intentionally make their own data available to other people for their use in research or other related scientific endeavors” (p. 3).
Most of these studies broadly refer to others when identifying with whom data is shared and list examples of data sharing methods ranging from using public data repositories to sharing privately. Tenopir et al. (2011) also used a broad definition of “providing access for use and reuse of data” (p. 1).
Some of the studies qualified sharing by type of data being shared. Kim and Zhang (2015) referred to “raw data sets that have informed published articles to other researchers outside one's own research group(s) through various means such as data repositories, public web spaces, supplementary materials, or personal communications upon request” (p. 189), and Bezuidenhout (2019) focused “on the sharing of non-human data by individual scientists as part of their daily research practice” (p. 16).
The remaining studies with data sharing definitions qualify the method of sharing. Andreoli-Versbach and Mueller-Langer (2014) focused “on researchers' institutional or personal websites and data entries of the researchers under study in public data repositories” (p. 1625). They defined voluntary data sharing as “the extent to which researchers voluntarily make their data available in a ‘clearly and precisely documented’ way and ‘readily available to any researcher’” (pp. 1624–1625).
Borghi and Van Gulick (2018) defined sharing to include “activities involving the dissemination of conclusions drawn from neuroimaging data as well as the sharing of the underlying data itself through a general or discipline-specific repository” (p. 10).
For Kim and Stanton (2013), data sharing behavior was defined “as the extent to which scientists provide other scientists with their research data and information related to their published articles by depositing them into data repositories and providing them upon request” (p. 4). Similar definitions were used by three other studies (Ju and Kim, 2019, pp. 583–584; Kim and Adler, 2015, p. 409; Kim and Nah, 2018, p. 125).
Based on an analysis of these studies, research on data sharing attitudes rarely explicitly defines the term “data sharing,” though the intended meaning can often be inferred by the context (e.g. public sharing in studies concerning Open Science). Among the studies that do define data sharing, definitions vary and often include a variety of methods of sharing, with several of the studies limiting the definition to particular methods of sharing.
3.2 Is public data repository use addressed and, if so, how?
For both accessibility and archiving purposes, the use of public data repositories is preferable to other forms of data sharing and is encouraged by many funders and journal publishers (Holdren, 2013; MacMillan, 2014). It is also more in line with the goals of the Open Data movement (Open Knowledge Foundation, 2014). Unless data need to be restricted for privacy, confidentiality or other ethical or legal reasons, the ideal method of sharing is through an open, public data repository.
Given the emphasis on public data repository use, this study was especially interested in how the included studies addressed public data repositories as opposed to other methods of sharing.
Most of the studies included did not explore public data repository use specifically. Several of the studies mentioned data repositories in their definition of data sharing (Borghi and Van Gulick, 2018, p. 10; Ju and Kim, 2019, pp. 583–584; Kim and Adler, 2015, p. 409; Kim and Nah, 2018, p. 125; Kim and Stanton, 2013, p. 4). However, other ways of sharing are included as well (e.g. via websites and providing data on request).
Similarly, several studies grouped data repository use with other forms of sharing. Investigating economics and management researchers, Andreoli-Versbach and Mueller-Langer (2014, p. 1625) reported on sharing via either public data repositories or websites (16.8% of respondents). Several studies discussed publishing data, but the method of publishing was unclear (Borghi and Van Gulick, 2018; Cheah et al., 2015). Lurie et al. (2015, p. 3) reported 4% (n = 52) of respondents shared data publicly, but not how.
Tenopir et al. (2011) reported high willingness to share research data in a “central data repository with no restrictions” (p. 15) among both research- (74%) and teaching-intensive (79%) respondents.
Of the included studies, 12 (19%) separately reported on some type of data repository use, six of which clearly indicated that the repository was “public” or “open.” Other studies referred to subject repositories, institutional repositories or simply repositories. In interviews with social sciences researchers (n = 30) who collected qualitative data, Mozersky et al. (2020) found that some participants were “unfamiliar with the very idea of sharing qualitative data with a repository” (p. 5), and only one participant reported sharing data in a repository. The highest reported repository usage was in Spallek et al.'s (2019, p. 70) survey of international dental researchers. All respondents (n = 42) indicated some level of support for data sharing and 64% were required (mostly by funding agencies) to share data in a data repository.
When sharing in public data repositories was mentioned specifically and separately, the percentage of respondents reporting public repository use varied from 3.26% to 39.26% (see Table 2). Aydinoǧlu et al. (2017, pp. 278–279) reported on open access data repository and institutional open repository use (8.3 and 3.2%, respectively). In their study of Arab universities, Elsayed and Saleh (2018, pp. 288–290) found 64.4% of respondents shared data, but only 5.1% of these (3.26% of total respondents) did so in an open data repository. Federer et al. (2015, p. 9) found high use of public repositories/databases in the field of health, as did Huang et al. (2012, p. 401) in the fields of biodiversity, biogeography and conservation.
3.3 How do researchers' attitudes toward data sharing compare to their data sharing behavior?
The final aim of this study was to explore how this literature addresses the relationship between researchers' data sharing attitudes and their data sharing behavior.
Throughout the literature on researchers' data sharing attitudes and behaviors, there is a tension between the ideal and reality. A large percentage of researchers support the idea of open data, but far fewer have actually shared their own data (Aydinoǧlu et al., 2017; Diekmann, 2012; Hall, 2013; Zhu, 2019). In a survey of UK-based researchers, though 86% of respondents indicated that sharing data online was important, only 21% had deposited data in an online repository (Zhu, 2019, p. 5). The survey found no significant differences in sharing between the four broad disciplinary areas studied (Medical and Life Sciences, Natural Sciences and Engineering, Social Sciences, Arts and Humanities). Similarly, Hall (2013, p. 383) found in interviews with environmental studies faculty at US academic institutions that though most participants believed data sharing was valuable, most also felt that their data would not be useful to others. Aydinoǧlu et al. (2017, pp. 279–280) and Diekmann (2012, p. 27) also found researchers with no data sharing experience but positive attitudes toward data sharing in their respective studies of Turkish researchers and American agricultural sciences researchers.
In interviews with Canadian neurology researchers, Ali-Khan et al. (2017, pp. 2–3) found that a lack of clarity around terms and expectations led to uncertainty and may inhibit data sharing among researchers who are generally favorable toward the concept of Open Science.
Some studies did indicate closer alignment of attitude and action. Interviews with US astronomy researchers showed that participants were apprehensive toward data sharing in principle and practice – due largely to the necessity for very detailed documentation to be able to reuse secondary data and the possibility for misinterpretation (Wynholds et al., 2011, p. 384). Australian social sciences researchers interviewed by Hickson et al. (2016, pp. 259–260) expressed negative attitudes toward data sharing, including concerns that their data either would not be useful to other researchers or would be used by others to publish. Laine's (2017, p. 7) case study of two Finnish interdisciplinary open research projects presents researchers who have enthusiastically embraced openness through most of the research process. Interviewed researchers viewed openness as an asset, as by making their research public early in the research cycle, they can demonstrate work in a particular research area well prior to publishing results.
These diverse views demonstrate the importance of both attitudes and practicalities in influencing data sharing behaviors. A series of studies by Kim and colleagues investigating data sharing behaviors support this assessment (Kim and Adler, 2015; Kim and Burns, 2016; Kim and Kim, 2015; Kim and Nah, 2018; Kim and Stanton, 2013; Kim and Zhang, 2015). These studies examined the relationship between data sharing behaviors and a variety of factors including internal researcher perceptions (career risk, career benefit and effort required to share data) and external factors (e.g. pressure from funders and journal publishers, availability of a data repository). Though results varied depending on population, in all cases both internal and external factors had a significant relationship to data sharing behaviors.
While sharing data is not in itself a new phenomenon, research into researcher data sharing is an emerging area. Research across disciplines, including LIS, is being conducted to better understand researchers' data sharing beliefs, motivations and actions, generally with the intent to identify strategies to guide data sharing practices. As LIS research delves deeper into this area, there is value to be gained by increasing the clarity and granularity in which we speak of, research and report on data sharing.
Given the relative novelty of the research area in LIS, the nebulous nature of the term “data sharing” in the literature is unsurprising. As pointed out by Kurata et al. (2017, p. 2), the vagueness of the term and the discrepancies of its use present a challenge. Many of the included studies did not explicitly define data sharing, and many of those that did used different definitions. When definitions are unclear or diverge, exploring and comparing researchers' attitudes across studies is difficult or impossible. This is especially true when focusing on public data repository use.
Borgman's (2012) broad definition of data sharing is a helpful umbrella term for the many ways in which researchers allow others to access and reuse their research data, and variations of this definition are widely used, including by some of the analyzed studies, providing a clear baseline to support mutual understanding.
However, as the data sharing attitudes and activities of researchers continue to be explored, focusing on specific kinds of sharing is important. The same researcher can report widely divergent attitudes and behaviors depending on the kind of sharing under discussion (Tenopir et al., 2011, p. 6). In data sharing, particulars are important. In addition, given the discrepancies in defining data sharing in the literature, participants are also likely to have very divergent understandings of the term. As such, more studies that clearly define the type of data sharing under study would be valuable, so that, as we continue to consider the data sharing attitudes and behaviors of researchers, we have the ability to clearly differentiate between different contexts, institutions, countries, disciplines and over time.
More specificity would be especially beneficial in increasing understanding of researchers' attitudes toward using public data repositories. While data repository use was reported on separately in some of the studies, several of the analyzed studies grouped it with other forms of data sharing. When there are no practical, ethical or legal reasons preventing it, the use of data repositories is strongly encouraged by funding agencies, journal publishers and libraries. The benefits of data sharing, often cited in LIS research and data management courses, while not entirely dependent on public accessibility, discoverability and reliable stewardship, are, generally speaking, enhanced by it. As such, researchers' attitudes and behaviors related to public data sharing in data repositories are of particular interest to LIS research. Exploring what influences changes in researcher attitudes and behaviors related to public data repositories could provide an avenue toward increasing their use.
Researchers' influences and attitudes toward data sharing, and their actual data sharing behaviors, are complex and dependent on many factors. Sharing data is viewed largely positively among researchers, but many researchers are hitting one of many walls preventing them from sharing their own data more broadly. Addressing these walls will be important for entities interested in increasing public data sharing.
The studies included in this analysis that explicitly discuss public data repositories found extremely different reported public repository use among researchers. While Zhu (2019, p. 5) found similar data repository attitudes and behaviors across the broad disciplinary areas, it is clear that disciplinary norms contribute to data sharing views (Kim and Adler, 2015, p. 415; Kim and Burns, 2016, p. 240), and certain disciplines (e.g. astronomy) have a much stronger culture of sharing data, especially publicly (Scheliga and Friesike, 2014, p. 9).
Laine (2017, p. 7) presents an interesting angle for transforming the conversation about data sharing, by shifting one of the perceived risks to data sharing (scooping) to a benefit. Researchers in the research projects examined in this study viewed openness as a way to advertise their work in an area. Framing openness in research in this way could create an incentive for researchers to share earlier and more often.
Another interesting area for exploration would be reducing the effort related to data sharing. A dilemma for librarians and policymakers who want to promote data sharing is what desired outcome to focus on. Public sharing of data in data repositories is preferable; however, it is also a less prevalent method of data sharing. Is it better to encourage any kind of data sharing initially in an effort to gradually modify attitudes and norms? Or is it better to push toward public data repository use specifically? If the latter, it will be necessary to identify ways to reduce barriers including the lack of incentivization and the effort required to identify and deposit in a data repository. This is especially true among researchers working with sensitive and qualitative data, which present unique challenges to public sharing in particular.
In order to better understand how to move forward in data sharing guidance, more studies that clearly differentiate types of data sharing would be beneficial, in both how participants are asked about data sharing and how results are reported. It is difficult to understand researcher attitudes and benchmark progress when definitions are inconsistent. Given the preference for the use of public data repositories to share data when possible, increasing research related to this method of sharing and clearly segmenting it from other methods of sharing in the study design and results allow librarians and policymakers to gain a better understanding of researchers' attitudes toward this method of sharing specifically and how they change over time and across disciplines. Based on the studies in this analysis, frequency of sharing in public data repositories is low among most groups of researchers. Several studies identified barriers to data sharing, and further research into approaches for reducing these barriers should be done. Conversely, additional research exploring why researchers do share in public repositories (as opposed to not sharing data or sharing using alternative methods) could bring insights that could be used to encourage repository use among other researchers.
There were several limitations to the search for and identification of literature for this study. While the search for relevant literature was extensive, it was not comprehensive. LISTA and LISS were chosen to provide depth in LIS literature, due to the central role academic libraries have taken in research data management and sharing. As data sharing studies have been published across many other disciplinary journals as well, searches were performed in Web of Science and Scopus to provide disciplinary breadth. However, these databases are not comprehensive, and data sharing literature across some disciplines may not have been identified. Google Scholar was searched, but was excluded as most of the results were included in the database searches or were gray literature, which was not included in this study. Database searches were performed in English, and only English language studies were included. The search terms focused on data sharing attitudes of researchers. Studies examining data management more broadly often investigate researcher data sharing and may have provided additional insight; however, these were purposefully excluded in order to focus on studies with detailed findings on researcher data sharing.
The overall objectives of this study were to identify how the term “data sharing” is defined and operationalized in the literature, how sharing data in public data repositories is addressed in the literature and how researchers' attitudes toward data sharing compare to their data sharing behavior.
The evaluation showed that studies could be separated into three categories: studies of research data sharing attitudes and practices, studies of influences on researchers' data sharing and studies of both researcher attitudes and influences on data sharing. Though heavily skewed toward the United States, the studies came from a wide array of countries, covered a variety of disciplines and employed a mix of qualitative and quantitative methods. Most studies did not explicitly define data sharing, and those that did generally used broad definitions or focused on particular methods of data sharing. Public data repositories as a method of data sharing were also rarely addressed explicitly and separately. When they were, reported public data repository use among researchers varied greatly between studies. Many studies reported a disconnect between researchers' attitudes toward data sharing and their data sharing behaviors.
As libraries continue to promote data sharing among researchers, it will be important for data management librarians to understand both how researchers' data sharing behavior is shaped and how library data management services can help mold this behavior. Many factors, both internal and external, influence researcher data sharing behaviors. By acting on these factors, librarians and policymakers can help shape the future of the Open Data environment. To promote sharing data publicly in data repositories, libraries need to understand researchers' attitudes and behaviors toward this method of sharing: why they share publicly, and why they do not.
This study explored influences on researcher data sharing, specifically via public repositories. It highlighted inconsistencies in how data sharing is defined and categorized, which limit the ability to make comparisons and draw broad conclusions across studies. Sharing data with collaborators is vastly different from sharing publicly, and researchers' attitudes toward the two differ as well. The wide spectrum of data sharing should be studied in meaningful segments that align with the goal of advancing science, as far as possible, through the sharing of scientific data.
Results of literature search
|Remove double reports||89||17||146||74||326|
|Data sharing focus||20||3||27||19||69|
Reported use of open, open access, online or public repositories/databases
|Study||Location(s)||Discipline(s)||% of respondents|
|Elsayed and Saleh (2018, p. 292)||Egypt, Jordan, Saudi Arabia||Health and medical sciences, pharmacology, pure sciences, agriculture sciences, engineering||3.26%|
|Saeed and Ali (2019, p. 297)||India||Life sciences, social sciences||5.11%|
|Aydinoǧlu et al. (2017, p. 278)||Turkey||Any||8.30%|
|Zhu (2019, p. 5)||United Kingdom||Any||21.00%|
|Huang et al. (2012, p. 401)||International||Biodiversity, biogeography, conservation||38.10%|
|Federer et al. (2015, p. 9)||USA||Health||39.26%|
Studies included in analysis
|Study||Year||Location(s)||Discipline(s)||Approach||Data collection method(s)||Participants (n)|
|Researcher attitudes, perceptions and practices related to data sharing and data management|
|Abele-Brehm et al.||2019||Germany||Psychology||Mixed||Survey||337|
|Allard and Aydinoǧlu||2012||Turkey||Environmental science||Qualitative||Interviews||12|
|Andreoli-Versbach and Mueller-Langer||2014||International||Economics and management||Quantitative||Online information||488|
|Aydinoǧlu et al.||2017||Turkey||Any||Quantitative||Survey||532|
|Bardyn et al.||2012||USA||Translational medicine||Qualitative||Focus group||8|
|Borghi and Van Gulick||2018||International||Neuroimaging||Quantitative||Survey||144|
|Bradić-Martinović and Zdravković||2014||Bosnia and Herzegovina, Croatia, Serbia||Social sciences||Quantitative||Survey||647|
|Cheah et al.||2015||Thailand||Medicine and public health||Qualitative||Interviews||15|
|Damalas et al.||2018||International||Life sciences||Quantitative||Survey||858|
|Denny et al.||2015||South Africa||Public health||Qualitative||Interviews, focus groups||32|
|Fry et al.||2009||United Kingdom||E-science (environmental science, bioinformatics, chemistry, quantitative social science)||Qualitative||Interviews||12|
|Hickson et al.||2016||Australia||Human resource management, industrial relations, organizational behavior||Qualitative||Interviews||24|
|Huang et al.||2012||International||Biodiversity, biogeography, conservation||Mixed||Survey||372|
|Jarolímková et al.||2018||Czech Republic||Any||Quantitative||Survey||1,434|
|Kurata et al.||2017||Japan||Natural sciences||Mixed||Interviews||23|
|Lurie et al.||2015||International||Optical coherence tomography||Quantitative||Survey||52|
|Majid et al.||2018||Singapore||Any||Quantitative||Survey||241|
|Mazor et al.||2017||USA||Health||Qualitative||Interviews||34|
|Melero and Navarro-Molina||2020||Spain||Food science and technology||Mixed||Focus group, survey||108*|
|Merson et al.||2015||Vietnam||Clinical research||Qualitative||Interviews, focus groups||48|
|Mozersky et al.||2020||USA||Anthropology, communications, psychology, public health, social work||Qualitative||Interviews||30**|
|Murillo||2014||USA||Physics and astronomy, biology, geography, geology, marine sciences, environmental sciences, and engineering||Qualitative||Focus groups||14|
|Nicholas et al.||2020||China, France, Malaysia, Poland, Spain, UK, USA||Sciences, social sciences (early career)||Quantitative||Survey||1,600|
|Nicholas et al.||2019||China, France, Malaysia, Poland, Spain, UK, USA||Sciences, social sciences (early career)||Qualitative||Interviews||116|
|Saeed and Ali||2019||India||Life sciences, social sciences||Quantitative||Questionnaire||352|
|Schopfel et al.||2018||France||Any||Quantitative||Survey||432|
|Stürmer et al.||2017||Germany||Social psychology||Quantitative||Survey||88|
|Tenopir et al.||2015||International||Any||Quantitative||Surveys||2,344***|
|Todorova et al.||2019||Bulgaria||Library and information science, computer science||Quantitative||Survey||40|
|Wu and Worrall||2019||USA||Earthquake engineering||Qualitative||Interviews||16|
|Wynholds et al.||2011||USA||Astronomy||Qualitative||Interviews||27|
|Factors and barriers influencing data sharing|
|Bezuidenhout||2019||Kenya, South Africa||Life sciences||Qualitative||Interviews||56|
|Ju and Kim||2019||USA||Biological sciences||Quantitative||Survey||577|
|Kim and Adler||2015||USA||Social sciences||Quantitative||Survey||361|
|Kim and Burns||2016||USA||Biological sciences||Quantitative||Survey||608|
|Kim and Kim||2015||USA||Health sciences||Quantitative||Survey||207|
|Kim and Nah||2018||International||Internet researchers||Quantitative||Survey||201|
|Kim and Stanton||2013||USA||STEM||Quantitative||Survey||1,317|
|Kim and Zhang||2015||USA||STEM||Quantitative||Survey||1,298|
|Laine||2017||Finland||Sociology, engineering, chemistry, physics, user-centered design||Qualitative||Interviews||4|
|Linek et al.||2017||Germany||Any||Quantitative||Survey||1,564|
|Luzi et al.||2013||Italy||Environmental science||Quantitative||Survey||523|
|Pardo Martínez and Cotte Poveda||2018||Colombia||Any||Quantitative||Survey||1,042|
|Scheliga and Friesike||2014||Germany, United Kingdom, Switzerland, South Africa||Any||Qualitative||Interviews||22|
|Zenk-Möltgen et al.||2018||International||Sociology, political science||Quantitative||Survey||1,829|
|Researcher attitudes and influences on sharing|
|Ali-Khan et al.||2017||Canada||Neurology||Qualitative||Interviews||25|
|Elsayed and Saleh||2018||Egypt, Jordan, Saudi Arabia||Health and medical sciences, pharmacology, pure sciences, agriculture sciences, engineering||Quantitative||Survey||337|
|Federer et al.||2015||USA||Health||Quantitative||Survey||135|
|Grubb and Easterbrook||2011||USA, Canada, United Kingdom, New Zealand||Biology, life sciences, chemistry, physics||Qualitative||Questionnaire||19|
|Hate et al.||2015||India||Public health||Qualitative||Interviews, focus groups||66|
|Jao et al.||2015||Kenya||Public health||Qualitative||Interviews||60|
|Polanin and Terzian||2019||International||Social sciences||Quantitative||Survey||247|
|Schmidt et al.||2016||International||Sciences||Quantitative||Survey||1,253|
|Spallek et al.||2019||International||Dentistry||Quantitative||Survey||52|
|Tenopir et al.||2011||International||Sciences, social sciences||Quantitative||Survey||1,329|
|Tenopir et al.||2018||International||Earth and planetary geophysics||Quantitative||Survey||1,372|
|Williams et al.||2019||USA||Agricultural sciences||Qualitative||Interviews||28|
|Data management needs|
|Chen and Wu||2017||China||Chemistry||Quantitative||Survey||119|
Note(s): *Focus group (n = 7); Survey (n = 101); **Excludes non-researcher participants; ***Two separate surveys in 2009/2010 (n = 1,329) and 2013/2014 (n = 1,015)
A distinction was made between studies where authors selected participants from multiple specific countries (n = 7) and studies with an international scope (n = 14).
Ali-Khan, S.E., Harris, L.W. and Gold, E.R. (2017), “Point of view: motivating participation in open science by examining researcher incentives”, eLife, Vol. 6, p. e29319, doi: 10.7554/eLife.29319.
Andreoli-Versbach, P. and Mueller-Langer, F. (2014), “Open access to data: an ideal professed but not practised”, Research Policy, Vol. 43 No. 9, pp. 1621-1633, doi: 10.1016/j.respol.2014.04.008.
Aydinoǧlu, A.U., Dogan, G. and Taskin, Z. (2017), “Research data management in Turkey: perceptions and practices”, Library Hi Tech, Vol. 35 No. 2, pp. 271-289, doi: 10.1108/LHT-11-2016-0134.
Bezuidenhout, L. (2019), “To share or not to share: incentivizing data sharing in life science communities”, Developing World Bioethics, Vol. 19 No. 1, pp. 18-24, doi: 10.1111/dewb.12183.
Bishoff, C. and Johnston, L. (2015), “Approaches to data sharing: an analysis of NSF data management plans from a large research university”, Journal of Librarianship and Scholarly Communication, Vol. 3 No. 2, pp. 1-27, doi: 10.7710/2162-3309.1231.
Borghi, J.A. and Van Gulick, A.E. (2018), “Data management and sharing in neuroimaging: practices and perceptions of MRI researchers”, PloS One, Vol. 13 No. 7, p. e0200562, doi: 10.1371/journal.pone.0200562.
Borgman, C.L. (2012), “The conundrum of sharing research data”, Journal of the American Society for Information Science and Technology, Vol. 63 No. 6, pp. 1059-1078, doi: 10.1002/asi.22634.
Bradić-Martinović, A. and Zdravković, A. (2014), “Researchers' interest in data service in Bosnia and Herzegovina, Croatia, and Serbia”, IASSIST Quarterly, Vol. 38 No. 2, p. 22.
Chawinga, W.D. and Zinn, S. (2019), “Global perspectives of research data sharing: a systematic literature review”, Library and Information Science Research, Vol. 41 No. 2, pp. 109-122, doi: 10.1016/j.lisr.2019.04.004.
Cheah, P.Y., Tangseefa, D., Somsaman, A., Chunsuttiwat, T., Nosten, F., Day, N.P.J., Bull, S. and Parker, M. (2015), “Perceived benefits, harms, and views about how to share data responsibly: a qualitative study of experiences with and attitudes toward data sharing among research staff and community representatives in Thailand”, Journal of Empirical Research on Human Research Ethics, Vol. 10 No. 3, pp. 278-289, doi: 10.1177/1556264615592388.
Christensen-Dalsgaard, B., van den Berg, M., Grim, R., Horstmann, W., Jansen, D., Pollard, T. and Roos, A. (2012), “Ten recommendations for libraries to get started with research data management”, available at: http://libereurope.eu/wp-content/uploads/The research data group 2012 v7 final.pdf.
Diekmann, F. (2012), “Data practices of agricultural scientists: results from an exploratory study”, Journal of Agricultural and Food Information, Vol. 13 No. 1, pp. 14-34, doi: 10.1080/10496505.2012.636005.
Elsayed, A.M. and Saleh, E. (2018), “Research data management and sharing among researchers in Arab universities: an exploratory study”, IFLA Journal, Vol. 44 No. 4, pp. 281-299, doi: 10.1177/0340035218785196.
Federer, L.M., Lu, Y.-L., Joubert, D.J., Welsh, J. and Brandys, B. (2015), “Biomedical data sharing and reuse: attitudes and practices of clinical and scientific research staff”, PloS One, Vol. 10 No. 6, p. e0129506, doi: 10.1371/journal.pone.0129506.
Grubb, A.M. and Easterbrook, S.M. (2011), “On the lack of consensus over the meaning of openness: an empirical study”, PloS One, Vol. 6 No. 8, doi: 10.1371/journal.pone.0023420.
Hall, N.F. (2013), “Environmental studies faculty attitudes towards sharing of research data”, Proceedings of the 13th ACM/IEEE-CS Joint Conference on Digital Libraries, pp. 383-384.
Hate, K., Meherally, S., Shah More, N., Jayaraman, A., Bull, S., Parker, M. and Osrin, D. (2015), “Sweat, skepticism, and uncharted territory: a qualitative study of opinions on data sharing among public health researchers and research participants in Mumbai, India”, Journal of Empirical Research on Human Research Ethics, Vol. 10 No. 3, pp. 239-250, doi: 10.1177/1556264615592383.
Hickson, S., Poulton, K.A., Connor, M., Richardson, J. and Wolski, M. (2016), “Modifying researchers' data management practices: a behavioural framework for library practitioners”, IFLA Journal, Vol. 42 No. 4, pp. 253-265, doi: 10.1177/0340035216673856.
Holdren, J.P. (2013), “Increasing access to the results of federally funded scientific research”, Office of Science and Technology Policy, available at: https://obamawhitehouse.archives.gov/sites/default/files/microsites/ostp/ostp_public_access_memo_2013.pdf.
Huang, X., Hawkins, B.A., Lei, F., Miller, G.L., Favret, C., Zhang, R. and Qiao, G. (2012), “Willing or unwilling to share primary biodiversity data: results and implications of an international survey”, Conservation Letters, Vol. 5 No. 5, pp. 399-406, doi: 10.1111/j.1755-263X.2012.00259.x.
Jao, I., Kombe, F., Mwalukore, S., Bull, S., Parker, M., Kamuya, D., Molyneux, S. and Marsh, V. (2015), “Research stakeholders' views on benefits and challenges for public health research data sharing in Kenya: the importance of trust and social relations”, PloS One, Vol. 10 No. 9, p. e0135545, doi: 10.1371/journal.pone.0135545.
Ju, B. and Kim, Y. (2019), “The formation of research ethics for data sharing by biological scientists: an empirical analysis”, Aslib Journal of Information Management, Vol. 71 No. 5, pp. 583-600, doi: 10.1108/AJIM-12-2018-0296.
Kim, Y. and Adler, M. (2015), “Social scientists' data sharing behaviors: investigating the roles of individual motivations, institutional pressures, and data repositories”, International Journal of Information Management, Vol. 35 No. 4, pp. 408-418, doi: 10.1016/j.ijinfomgt.2015.04.007.
Kim, Y. and Burns, C.S. (2016), “Norms of data sharing in biological sciences: the roles of metadata, data repository, and journal and funding requirements”, Journal of Information Science, Vol. 42 No. 2, pp. 230-245, doi: 10.1177/0165551515592098.
Kim, Y. and Kim, S. (2015), “Institutional, motivational, and resource factors influencing health scientists' data-sharing behaviours”, Journal of Scholarly Publishing, Vol. 46 No. 4, pp. 366-389, doi: 10.3138/jsp.46.4.05.
Kim, Y. and Nah, S. (2018), “Internet researchers' data sharing behaviors: an integration of data reuse experience, attitudinal beliefs, social norms, and resource factors”, Online Information Review, Vol. 42 No. 1, pp. 124-142, doi: 10.1108/OIR-10-2016-0313.
Kim, Y. and Stanton, J.M. (2013), “Institutional and individual influences on scientists' data sharing behaviors: a multilevel analysis”, Proceedings of the Association for Information Science and Technology, Vol. 50 No. 1, p. 1.
Kim, Y. and Zhang, P. (2015), “Understanding data sharing behaviors of STEM researchers: the roles of attitudes, norms, and data repositories”, Library and Information Science Research, Vol. 37 No. 3, pp. 189-200, doi: 10.1016/j.lisr.2015.04.006.
Kurata, K., Matsubayashi, M. and Mine, S. (2017), “Identifying the complex position of research data and data sharing among researchers in natural science”, Sage Open, Vol. 7 No. 3, p. 2158244017717301, doi: 10.1177/2158244017717301.
Laine, H. (2017), “Afraid of scooping—case study on researcher strategies against fear of scooping in the context of open science”, Data Science Journal, Vol. 16, doi: 10.5334/dsj-2017-029.
Lurie, K.L., Mistree, B.F.T. and Ellerbee, A.K. (2015), “Perspectives of the optical coherence tomography community on code and data sharing”, in Fujimoto, J.G., Izatt, J.A. and Tuchin, V.V. (Eds), p. 93122M, doi: 10.1117/12.2082412.
MacMillan, D. (2014), “Data sharing and discovery: what librarians need to know”, The Journal of Academic Librarianship, Vol. 40 No. 5, pp. 541-549, doi: 10.1016/j.acalib.2014.06.011.
Mazor, K.M., Richards, A., Gallagher, M., Arterburn, D.E., Raebel, M.A., Nowell, W.B., Curtis, J.R., Paolino, A.R. and Toh, S. (2017), “Stakeholders' views on data sharing in multicenter studies”, Journal of Comparative Effectiveness Research, Vol. 6 No. 6, pp. 537-547, doi: 10.2217/cer-2017-0009.
Merson, L., Phong, T.V., Nhan, L.N.T., Dung, N.T., Ngan, T.T.D., Kinh, N.V., Parker, M. and Bull, S. (2015), “Trust, respect, and reciprocity: informing culturally appropriate data-sharing practice in Vietnam”, Journal of Empirical Research on Human Research Ethics, Vol. 10 No. 3, pp. 251-263, doi: 10.1177/1556264615592387.
Mischo, W. and O'Donnell, M. (2014), “An analysis of data management plans in University of Illinois National Science Foundation grant proposals”, Journal of EScience Librarianship, Vol. 3 No. 1, p. 3, doi: 10.7191/jeslib.2014.1060.
Mozersky, J., Walsh, H., Parsons, M., McIntosh, T., Baldwin, K. and DuBois, J.M. (2020), “Are we ready to share qualitative research data? Knowledge and preparedness among qualitative researchers, IRB members, and data repository curators”, IASSIST Quarterly, Vol. 43 No. 4, pp. 1-23, doi: 10.29173/iq952.
Nielsen, M. (2011), “Definitions of open science? The open-science archives”, available at: https://lists.okfn.org/pipermail/open-science/2011-July/005607.html.
Open Knowledge Foundation (2014), “Open data – an introduction”, available at: http://webarchive.okfn.org/okfn.org/201404/opendata/.
Pardo Martínez, C. and Poveda, A. (2018), “Knowledge and perceptions of open science among researchers—a case study for Colombia”, Information, Vol. 9 No. 11, p. 292, doi: 10.3390/info9110292.
Perrier, L., Blondal, E. and MacDonald, H. (2020), “The views, perspectives, and experiences of academic researchers with data sharing and reuse: a meta-synthesis”, PloS One, Vol. 15 No. 2, p. e0229182, doi: 10.1371/journal.pone.0229182.
Polanin, J.R. and Terzian, M. (2019), “A data-sharing agreement helps to increase researchers' willingness to share primary data: results from a randomized controlled trial”, Journal of Clinical Epidemiology, Vol. 106, pp. 60-69, doi: 10.1016/j.jclinepi.2018.10.006.
Saeed, S. and Ali, P.M. (2019), “Research data management and data sharing among research scholars of life sciences and social sciences”, DESIDOC Journal of Library and Information Technology, Vol. 39 No. 6, pp. 290-299, doi: 10.14429/djlit.39.06.14997.
Sayogo, D.S. and Pardo, T.A. (2013), “Exploring the determinants of scientific data sharing: understanding the motivation to publish research data”, Government Information Quarterly, Vol. 30, pp. S19-S31, doi: 10.1016/j.giq.2012.06.011.
Scheliga, K. and Friesike, S. (2014), “Putting open science into practice: a social dilemma?”, First Monday, Vol. 19 No. 9, doi: 10.5210/fm.v19i9.5381.
Schmidt, B., Gemeinholzer, B. and Treloar, A. (2016), “Open data in global environmental research: the Belmont Forum's open data survey”, PloS One, Vol. 11 No. 1, p. e0146695, doi: 10.1371/journal.pone.0146695.
Spallek, H., Weinberg, S., Manz, M., Nanayakkara, S., Zhou, X. and Johnson, L. (2019), “Perceptions and attitudes toward data sharing among dental researchers”, JDR Clinical and Translational Research, Vol. 4 No. 1, pp. 68-75, doi: 10.1177/2380084418790451.
Tenopir, C., Allard, S., Douglass, K., Aydinoǧlu, A.U., Wu, L., Read, E., Manoff, M. and Frame, M. (2011), “Data sharing by scientists: practices and perceptions”, PloS One, Vol. 6 No. 6, p. e21101, doi: 10.1371/journal.pone.0021101.
Tenopir, C., Christian, L., Allard, S. and Borycz, J. (2018), “Research data sharing: practices and attitudes of geophysicists”, Earth and Space Science, Vol. 5 No. 12, pp. 891-902, doi: 10.1029/2018EA000461.
Tenopir, C., Dalton, E.D., Allard, S., Frame, M., Pjesivac, I., Birch, B., Pollock, D. and Dorsett, K. (2015), “Changes in data sharing and data reuse practices and perceptions among scientists worldwide”, PloS One, Vol. 10 No. 8, p. e0134826, doi: 10.1371/journal.pone.0134826.
Uzwyshyn, R. (2016), “Research data repositories: the what, when, why, and how”, Computers in Libraries, Vol. 36 No. 3, pp. 18-21.
Wilkinson, M.D., Dumontier, M., Aalbersberg, Ij. J., Appleton, G., Axton, M., Baak, A., Blomberg, N., Boiten, J.-W., da Silva Santos, L.B., Bourne, P.E., Bouwman, J., Brookes, A.J., Clark, T., Crosas, M., Dillo, I., Dumon, O., Edmunds, S., Evelo, C.T., Finkers, R. and Mons, B. (2016), “The FAIR guiding principles for scientific data management and stewardship”, Scientific Data, Vol. 3, p. 160018.
Wu, S. and Worrall, A. (2019), “Supporting successful data sharing practices in earthquake engineering”, Library Hi Tech, Vol. 37 No. 4, pp. 764-780, doi: 10.1108/LHT-03-2019-0058.
Wynholds, L., Fearon, D.S. Jr, Borgman, C.L. and Traweek, S. (2011), “When use cases are not useful: data practices, astronomy, and digital libraries”, Proceedings of the 11th Annual International ACM/IEEE Joint Conference on Digital Libraries, pp. 383-386.
Zenk-Möltgen, W., Akdeniz, E., Katsanidou, A., Nasshoven, V. and Balaban, E. (2018), “Factors influencing the data sharing behavior of researchers in sociology and political science”, Journal of Documentation, Vol. 74 No. 5, pp. 1053-1073, doi: 10.1108/JD-09-2017-0126.
Zhu, Y. (2019), “Open-access policy and data-sharing practice in UK academia”, Journal of Information Science, pp. 1-12, doi: 10.1177/0165551518823174.
The authors thank Lisa Federer, Erica DeFrain and Rasmus Thøgersen for providing valuable feedback on various drafts of this article. In addition, they thank the anonymous reviewers whose comments helped improve and clarify this manuscript.