Mining GitHub for research and education: challenges and opportunities

Mohammad AlMarzouq (Department of Quantitative Methods and Information Systems, Kuwait University College of Business Administration, Shidadiya, Kuwait)
Abdullatif AlZaidan (Department of Quantitative Methods and Information Systems, Kuwait University College of Business Administration, Shidadiya, Kuwait)
Jehad AlDallal (Department of Information Science, Kuwait University College of Life Sciences, Shidadiya, Kuwait)

International Journal of Web Information Systems

ISSN: 1744-0084

Article publication date: 3 July 2020

Issue publication date: 8 October 2020

Abstract

Purpose

This study aims to highlight the challenges and opportunities of using GitHub as a data source in both research and programming education.

Design/methodology/approach

This study provides a general overview of the challenges and opportunities encountered when conducting empirical research using GitHub as a data source. The challenges and opportunities are framed using the input–process–output model of open-source software.

Findings

GitHub data accessed through the application programming interface (API) can have several limitations, which can be overcome by Web scraping and by using external data repositories such as GHArchive and GHTorrent. GitHub also has several idiosyncrasies that researchers need to be aware of to use the data effectively, and these idiosyncrasies can themselves represent opportunities for research. The challenges and opportunities are summarized for the licenses, community, development process and product of free/libre and open-source software communities hosted on GitHub.

Originality/value

This study provides a summary of GitHub-related challenges and opportunities that researchers can leverage to improve their empirical research. Furthermore, this summary can be a valuable resource for instructors who plan to use GitHub as a data source in their data-focused programming courses.

Keywords

Acknowledgements

This work was supported and funded by Kuwait University, Research Project No. (IQ04/16).

Citation

AlMarzouq, M., AlZaidan, A. and AlDallal, J. (2020), "Mining GitHub for research and education: challenges and opportunities", International Journal of Web Information Systems, Vol. 16 No. 4, pp. 451-473. https://doi.org/10.1108/IJWIS-03-2020-0016

Publisher: Emerald Publishing Limited

Copyright © 2020, Emerald Publishing Limited
