This paper examines the nature and sufficiency of the descriptive information included in open datasets, and the nature of the comments and questions users write in relation to specific datasets. Open datasets are provided to facilitate civic engagement and government transparency. However, making data available does not guarantee its use. The paper examines the context-related information provided together with the datasets and identifies the challenges users encounter while using these resources.
The authors extracted the descriptive text provided together with (often at the top of) datasets (N = 216), as well as the questions and comments users posted in relation to those datasets. They then segmented the text descriptions and user comments into “idea units” and applied open coding with the constant comparison method. This allowed them to identify the thematic issues that descriptions focus on and the challenges users encounter.
Results of the analysis revealed that context-related descriptions are limited and normative. Users are expected to figure out how to use the data. Analysis of user comments and questions revealed four areas of challenge that users encounter: organization and accessibility of the data; clarity and completeness; usefulness and accuracy; and language (spelling and grammar). Data providers can do more to address these issues.
The purpose of the study is to understand the nature of open data provision and to suggest ways of making open data more accessible to “non-expert users”. It therefore does not aim to generalize about open data provision across countries, because such provision may differ by jurisdiction.
The study provides insight into ways of organizing open datasets so that the resources are accessible to the general public. It also suggests how open data providers could take users' perspectives into account, including by providing continuous support.
Research on open data often focuses on technological, policy and political perspectives. Arguably, this is the first study to analyze context-related information in open datasets. Datasets do not “speak for themselves” because they require context for analysis and interpretation. Examining the nature of context-related information in open datasets is an original idea.
This work was supported by the National Science Foundation [grant number IIS-1441561] and an SFU/SSHRC Small grant [number 632223] provided to the first author.
Gebre, E.H. and Morales, E. (2020), "How “accessible” is open data? Analysis of context-related information and users’ comments in open datasets", Information and Learning Sciences, Vol. 121 No. 1/2, pp. 19-36. https://doi.org/10.1108/ILS-08-2019-0086
Copyright © 2020, Emerald Publishing Limited