
Social science data repositories in data deluge: A case study of ICPSR’s workflow and practices

Wei Jeng (School of Information Sciences, University of Pittsburgh, Pittsburgh, Pennsylvania, USA)
Daqing He (School of Computing and Information, University of Pittsburgh, Pittsburgh, Pennsylvania, USA)
Yu Chi (School of Computing and Information, University of Pittsburgh, Pittsburgh, Pennsylvania, USA)

The Electronic Library

ISSN: 0264-0473

Article publication date: 7 August 2017




With the recent surge of interest in the data deluge, research on data infrastructures is growing in importance. The Open Archival Information System (OAIS) model has been widely adopted as a framework for creating and maintaining digital repositories. Because OAIS is a reference model that requires customization for actual practice, this paper examines how the current practices in a data repository map to the OAIS environment and functional components.


The authors conducted two focus-group sessions and one individual interview with eight employees at the world’s largest social science data repository, the Inter-university Consortium for Political and Social Research (ICPSR). By examining the employees’ current actions (activities regarding their work responsibilities) and IT practices, the authors studied the barriers and challenges of archiving and curating qualitative data at ICPSR.


The authors observed that the OAIS model is robust and reliable in actual service processes for data curation and data archives. In addition, a data repository’s workflow resembles that of digital archives or even digital libraries. On the other hand, they found that the cost of preventing disclosure risk and a lack of agreement on standards for text data files are the most apparent obstacles for data curation professionals handling qualitative data; the maturation of data metrics seems to be a promising solution to several challenges in social science data sharing.


The authors evaluated the gap between a research data repository’s current practices and the adoption of the OAIS model. They also identified answers to questions such as how the current technological infrastructure in a leading data repository such as ICPSR supports its daily operations, what the ideal technologies in such data repositories would be and what challenges accompany these ideal technologies. Most importantly, they helped to prioritize challenges and barriers from the data curator’s perspective and contributed implications for data sharing and reuse in the social sciences.



The authors thank the iFellowship, guided by the Committee on Coherence at Scale (CoC) for Higher Education and sponsored by the Council on Library and Information Resources (CLIR) and the Andrew W. Mellon Foundation, as well as the Beta Phi Mu Honor Society, which provided research funding for this project. This study is also partially supported by the project titled Research on Knowledge Organization and Service Innovation in the Big Data Environments, funded by the National Natural Science Foundation of China (No. 71420107026). The authors also thank Drs Nora Mattern, Liz Lyon, Sheila Corrall, Jian Qin, Jung Sun Oh and Stephen Griffin for their invaluable comments and suggestions on this research project. Last but not least, the authors thank all participants and people who helped facilitate the field study at ICPSR for their valuable input and assistance.


Jeng, W., He, D. and Chi, Y. (2017), "Social science data repositories in data deluge: A case study of ICPSR’s workflow and practices", The Electronic Library, Vol. 35 No. 4, pp. 626-649.



Emerald Publishing Limited

Copyright © 2017, Emerald Publishing Limited
