The output of academic literature has increased significantly due to digital technology, presenting researchers in every discipline, including materials science, with a challenge: it is impossible to manually read and extract knowledge from millions of published articles. The purpose of this study is to address this challenge by exploring knowledge extraction in materials science, as applied to digital scholarship. An overriding goal is to inform readers about the status of knowledge extraction in materials science.
The authors conducted a two-part analysis: a comparison of knowledge extraction methods applied to materials science scholarship across a sample of 22 articles, followed by a comparison of HIVE-4-MAT, an ontology-based knowledge extraction application, and MatScholar, a named entity recognition (NER) application. This paper covers contextual background and a review of three tiers of knowledge extraction (ontology-based, NER and relation extraction), followed by the research goals and approach.
The results indicate three key needs for researchers to consider in advancing knowledge extraction: the need for materials science focused corpora; the need for researchers to define the scope of the research being pursued; and the need to understand the tradeoffs among different knowledge extraction methods. This paper also points to future materials science research potential with relation extraction and increased availability of ontologies.
To the best of the authors’ knowledge, there are very few studies examining knowledge extraction in materials science. This work makes an important contribution to this underexplored research area.
The research reported on in this paper is supported, in part, by the US National Science Foundation, Office of Advanced Cyberinfrastructure (NSF/OAC: #1940239 and #1940199). The authors also acknowledge the support of Cyra Gallano and Evan Dubrunfaut, Drexel University, for their role as data evaluators.
Zhao, X., Greenberg, J., Meschke, V., Toberer, E. and Hu, X. (2021), "An exploratory analysis: extracting materials science knowledge from unstructured scholarly data", The Electronic Library, Vol. 39 No. 3, pp. 469-485. https://doi.org/10.1108/EL-11-2020-0320
Emerald Publishing Limited
Copyright © 2021, Emerald Publishing Limited