
A dependency-based machine learning approach to the identification of research topics: a case in COVID-19 studies

Haoran Zhu (Huazhong University of Science and Technology, Wuhan, China)
Lei Lei (Department of English, School of Foreign Languages, Shanghai Jiao Tong University, Shanghai, China)

Library Hi Tech

ISSN: 0737-8831

Article publication date: 24 August 2021

Issue publication date: 29 March 2022


Abstract

Purpose

Previous research on the automatic extraction of research topics has mostly relied on rule-based or topic modeling methods. These methods have been criticized for their limited rule sets, poor interpretability and heavy dependence on human judgment. This study aims to address these issues by proposing a new method that integrates machine learning models with linguistic features to identify research topics.

Design/methodology/approach

First, dependency relations were used to extract noun phrases from research article texts. Second, the extracted noun phrases were classified into topics and non-topics by machine learning models that used linguistic and bibliometric features. Lastly, a trend analysis was performed to identify hot research topics, i.e. topics with increasing popularity.
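The first and last steps above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the tokens are hand-written in place of real parser output, the dependency labels follow Universal Dependencies style as an assumption, the monthly counts are invented, and the names `extract_noun_phrases` and `trend_slope` are hypothetical. The classification step is left out, and a least-squares slope stands in for whatever trend measure the study used.

```python
# Hypothetical pre-parsed tokens: (index, text, dependency label, head index).
# A real pipeline would obtain these from a dependency parser.
SENTENCE = [
    (0, "Severe", "amod", 2),
    (1, "respiratory", "amod", 2),
    (2, "infection", "nsubj", 3),
    (3, "requires", "ROOT", 3),
    (4, "mechanical", "amod", 5),
    (5, "ventilation", "obj", 3),
]

# Assumed label sets; the paper's actual dependency rules are not given here.
MODIFIER_LABELS = {"amod", "compound"}
HEAD_LABELS = {"nsubj", "obj", "obl", "nmod"}


def extract_noun_phrases(tokens):
    """Join each nominal head with the modifiers that precede and depend on it."""
    phrases = []
    for idx, text, dep, _head in tokens:
        if dep not in HEAD_LABELS:
            continue
        modifiers = [
            t for t in tokens
            if t[3] == idx and t[2] in MODIFIER_LABELS and t[0] < idx
        ]
        words = [m[1] for m in sorted(modifiers)] + [text]
        phrases.append(" ".join(words).lower())
    return phrases


def trend_slope(counts):
    """Least-squares slope of counts over equally spaced time periods."""
    n = len(counts)
    if n < 2:
        return 0.0
    mean_x = (n - 1) / 2
    mean_y = sum(counts) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(counts))
    denominator = sum((x - mean_x) ** 2 for x in range(n))
    return numerator / denominator


# Invented monthly frequencies for two candidate topics.
monthly_counts = {
    "mechanical ventilation": [12, 15, 21, 30],
    "hand hygiene": [25, 18, 14, 9],
}

# A topic counts as "hot" in this sketch when its frequency slope is positive.
hot_topics = [t for t, c in monthly_counts.items() if trend_slope(c) > 0]

print(extract_noun_phrases(SENTENCE))
print(hot_topics)
```

In this sketch a phrase is only a candidate; deciding whether it is a genuine topic would be the job of the classifier described in the second step.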

Findings

The new method was tested on a large dataset of COVID-19 research articles and achieved satisfactory results in terms of F-measure, accuracy and AUC values. Hot topics in COVID-19 research were also detected based on the classification results.
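For readers unfamiliar with the three evaluation measures named above, the following Python sketch computes them for a binary topic/non-topic classifier. The labels and scores are invented for illustration and are not results from the study; the function names and the 0.5 decision threshold are assumptions.

```python
def f_measure(y_true, y_pred, beta=1.0):
    """F-beta score for the positive class (1 = topic, 0 = non-topic)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def accuracy(y_true, y_pred):
    """Share of predictions that match the true label."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


def auc(y_true, scores):
    """Probability that a random positive outscores a random negative (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Invented example: true labels and classifier scores for six noun phrases.
y_true = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.6, 0.7, 0.2, 0.8, 0.4]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

print(f_measure(y_true, y_pred), accuracy(y_true, y_pred), auc(y_true, scores))
```

F-measure and accuracy depend on the chosen threshold, whereas AUC is computed from the raw scores and so summarizes ranking quality across all thresholds.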

Originality/value

This study demonstrates that information retrieval methods can help researchers gain a better understanding of the latest trends in both COVID-19 and other research areas. The findings are significant to both researchers and policymakers.


Acknowledgements

This study was supported by an MOE (Ministry of Education of China) Foundation Project of Humanities and Social Sciences (Linguistic Complexity-based Research on Text Classification, Grant No. 21YJC740085).

Citation

Zhu, H. and Lei, L. (2022), "A dependency-based machine learning approach to the identification of research topics: a case in COVID-19 studies", Library Hi Tech, Vol. 40 No. 2, pp. 495-515. https://doi.org/10.1108/LHT-01-2021-0051

Publisher

Emerald Publishing Limited

Copyright © 2021, Emerald Publishing Limited
