Data mining topics in the discipline of library and information science: analysis of influential terms and Dirichlet multinomial regression topic model

Sukjin You (University of Wisconsin-Milwaukee, Milwaukee, Wisconsin, USA)
Soohyung Joo (University of Kentucky, Lexington, Kentucky, USA)
Marie Katsurai (Department of Intelligent Information Engineering and Sciences, Doshisha University, Kyotanabe, Japan)

Aslib Journal of Information Management

ISSN: 2050-3806

Article publication date: 19 December 2022

Issue publication date: 2 January 2024

Abstract

Purpose

The purpose of this study is to explore the extent to which data mining research is associated with the library and information science (LIS) discipline. The study aims to identify data mining-related subject terms and topics in representative LIS scholarly publications.

Design/methodology/approach

More than 38,000 bibliographic records were collected from a scholarly database, representing the fields of LIS and data mining, respectively. Several text mining techniques, including influential term analysis and Dirichlet multinomial regression topic modeling, were applied to investigate prevailing subject terms and research topics.

Findings

The findings of this study revealed the relationship between the LIS and data mining research domains. Various data mining method terms were observed in recent LIS publications, such as machine learning, artificial intelligence and neural networks. The topic modeling results identified prevailing data mining-related research topics in LIS, including machine learning, deep learning and big data, among others. In addition, this study investigated the trends of popular topics in LIS over the past decade.

Originality/value

This investigation is one of the few studies to empirically investigate the relationships between the LIS and data mining research domains. Multiple text mining techniques were employed to delineate the extent to which the two research domains are associated with each other, based on both term-level and topic-level analyses. Methodologically, the study identified influential terms in each domain using multiple feature selection indices. In addition, Dirichlet multinomial regression was applied to explore LIS topics in relation to data mining.

Citation

You, S., Joo, S. and Katsurai, M. (2024), "Data mining topics in the discipline of library and information science: analysis of influential terms and Dirichlet multinomial regression topic model", Aslib Journal of Information Management, Vol. 76 No. 1, pp. 65-85. https://doi.org/10.1108/AJIM-05-2022-0260

Publisher: Emerald Publishing Limited

Copyright © 2022, Emerald Publishing Limited
