Search results

1 – 10 of over 100000
Article
Publication date: 10 May 2022

Qiang Cao, Xian Cheng and Shaoyi Liao

Abstract

Purpose

How to extract useful information from a very large volume of literature is a great challenge for librarians. Topic modeling, a machine learning technique for uncovering latent thematic structures in large collections of documents, is a widespread approach in literature analysis, especially given the rapid growth of academic literature. This paper compares topic-modeling-based literature analysis using the full texts and the abstracts of articles.

Design/methodology/approach

The authors conduct a comparison study of topic modeling on full-text papers and their corresponding abstracts to assess how the type of document used as input influences topic modeling. In particular, the authors use the large volume of COVID-19 research literature as a case study for topic-modeling-based literature analysis. The authors illustrate the research topics, research trends and topic similarity of COVID-19 research by using Latent Dirichlet allocation (LDA) and a topic visualization method.
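The abstract does not say how topic similarity is measured, so the following is only a minimal sketch under an assumption: each topic is represented as a word-weight distribution, and two topic sets are compared by averaging, for every topic in one set, its cosine similarity to the closest topic in the other. The function names and the toy weights are hypothetical and are not taken from the study.

```python
import math


def cosine(p, q):
    """Cosine similarity between two sparse word-weight dicts."""
    dot = sum(w * q.get(term, 0.0) for term, w in p.items())
    norm_p = math.sqrt(sum(w * w for w in p.values()))
    norm_q = math.sqrt(sum(w * w for w in q.values()))
    if norm_p == 0 or norm_q == 0:
        return 0.0
    return dot / (norm_p * norm_q)


def mean_best_match(topics_a, topics_b):
    """Average, over topics in A, of the similarity to the closest topic in B."""
    if not topics_a or not topics_b:
        return 0.0
    best = [max(cosine(a, b) for b in topics_b) for a in topics_a]
    return sum(best) / len(best)


# Toy topic-word weights, made up for illustration (not results from the paper).
full_text_topics = [
    {"vaccine": 0.5, "trial": 0.3, "dose": 0.2},
    {"mask": 0.6, "transmission": 0.4},
]
abstract_topics = [
    {"vaccine": 0.6, "dose": 0.4},
    {"transmission": 0.5, "mask": 0.5},
]

score = mean_best_match(full_text_topics, abstract_topics)
print(f"mean best-match similarity: {score:.3f}")
```

In practice the word weights would come from a fitted LDA model for each input type; the measure is asymmetric, so it may be worth computing it in both directions.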

Findings

The authors found 14 research topics for COVID-19 research. The authors also found that the similarity between topics derived from full-text papers and those derived from the corresponding abstracts is higher when more documents are analyzed.

Originality/value

First, this study contributes to the literature analysis approach. The comparison study can help us understand the influence of the different types of documents on the results of topic modeling analysis. Second, the authors present an overview of COVID-19 research by summarizing 14 research topics for it. This automated literature analysis can help specialists in the health and medical domain or other people to quickly grasp the structured morphology of the current studies for COVID-19.

Details

Library Hi Tech, vol. 41 no. 2
Type: Research Article
ISSN: 0737-8831

Article
Publication date: 29 April 2022

Chih-Ming Chen, Szu-Yu Ho and Chung Chang

Abstract

Purpose

This study aims to develop a hierarchical topic analysis tool (HTAT) based on hierarchical Latent Dirichlet allocation (hLDA) to support digital humanities research that requires topic exploration on the Digital Humanities Platform for Mr. Lo Chia-Lun’s Writings (DHP-LCLW). HTAT assists humanities scholars in distant reading by analyzing hierarchical text topics: it classifies time-stamped texts into multiple historical eras, conducts hierarchical topic modeling (HTM) on the texts from each era and presents the results through visualization. A comparative network diagram is also provided to help humanities scholars compare the topics they wish to explore and track how the concept of a topic changes over time from a particular perspective. In addition, HTAT lets humanities scholars view the source texts, and thus has high potential to make topic exploration more effective by integrating the topic exploration functions of both distant reading and close reading.

Design/methodology/approach

This study adopts a counterbalanced experimental design to examine whether there are significant differences in the effectiveness of topic inquiry, the number of relevant topics inquired and the time spent on them when research participants alternately conducted text exploration using DHP-LCLW with HTAT or DHP-LCLW with the Single-layer Topic Analysis Tool (SLTAT). A technology acceptance questionnaire and semi-structured interviews were also conducted to understand the research participants' perceptions of and feelings toward using the two tools to assist topic inquiry.

Findings

The experimental results show that DHP-LCLW with HTAT, in comparison with DHP-LCLW with SLTAT, could better assist the research participants in grasping the topic context of the texts from two particular perspectives assigned by this study within a short period. In addition, the interviews revealed that DHP-LCLW with HTAT, in comparison with SLTAT, provided topic terms that better met research participants' expectations and needs, and effectively guided them to the corresponding texts for close reading. The analysis of the technology acceptance and interview data shows that the research participants have a strongly positive attitude toward using DHP-LCLW with HTAT to assist topic inquiry.

Research limitations/implications

In this study, the Jieba Chinese word segmentation system was used in the Mr. Lo Chia-Lun’s Writings Database to segment Mr. Lo Chia-Lun’s texts for topic modeling based on hLDA. Since Jieba is a lexicon-based word segmentation system, it cannot reliably identify new words that have not yet been added to the lexicon. In such cases, the correctness of word segmentation on the target texts will affect the results of hLDA topic modeling and the effectiveness of HTAT in assisting humanities scholars with topic inquiry.

Practical implications

An HTAT was developed to support digital humanities research in this study. With HTAT, DHP-LCLW provides humanities scholars with topic clues from different hierarchical perspectives for textual exploration, and with temporal and comparative network diagrams that help them track the evolution of the topics of specific perspectives over time and gain a more comprehensive understanding of the overall context of the texts.

Originality/value

In recent years, topic analysis technology that can automatically extract key topic information from a large amount of text has developed rapidly, but the topics generated by traditional topic analysis models such as LDA (Latent Dirichlet allocation) make it difficult for users to understand the differences between the topics of texts at different hierarchical levels. Thus, this study proposes HTAT, which uses hLDA to build a hierarchical topic tree without the need to define the number of topics in advance, enabling humanities scholars to quickly grasp the concept of textual topics and use different hierarchical perspectives for further textual exploration. It also combines temporal division with a comparative network diagram to assist humanities scholars in exploring topics and their changes across eras, which helps them discover more useful research clues or findings.

Details

Aslib Journal of Information Management, vol. 75 no. 1
Type: Research Article
ISSN: 2050-3806

Article
Publication date: 19 December 2022

Sukjin You, Soohyung Joo and Marie Katsurai

Abstract

Purpose

The purpose of this study is to explore the extent to which data mining research is associated with the library and information science (LIS) discipline. This study aims to identify data mining related subject terms and topics in representative LIS scholarly publications.

Design/methodology/approach

A large set of more than 38,000 bibliographic records was collected from a scholarly database, representing the fields of LIS and data mining, respectively. A range of text mining techniques, such as influential term analysis and Dirichlet multinomial regression topic modeling, was applied to investigate prevailing subject terms and research topics.

Findings

The findings of this study revealed the relationship between the LIS and data mining research domains. Various data mining method terms were observed in recent LIS publications, such as machine learning, artificial intelligence and neural networks. The topic modeling result identified prevailing data mining related research topics in LIS, such as machine learning, deep learning and big data, among others. In addition, this study investigated the trends of popular topics in LIS over the recent decade.

Originality/value

This investigation is one of only a few studies that empirically investigate the relationships between the LIS and data mining research domains. Multiple text mining techniques were employed to delineate the extent to which the two research domains are associated with each other, based on both term-level and topic-level analysis. Methodologically, the study identified influential terms in each domain using multiple feature selection indices. In addition, Dirichlet multinomial regression was applied to explore LIS topics in relation to data mining.

Details

Aslib Journal of Information Management, vol. 76 no. 1
Type: Research Article
ISSN: 2050-3806

Article
Publication date: 29 August 2022

Yue Yuan, Kan Liu and Yanli Wang

Abstract

Purpose

The purpose of this study is to analyze the topics of COVID-19 news articles in order to better capture the relationships among news topics and their evolution, helping to manage the infodemic from a quantified perspective.

Design/methodology/approach

To analyze COVID-19 news articles explicitly, this paper proposes a prism architecture. Based on epidemic-related news on China Daily and CNN, this paper identifies the topics of the two news agencies, elucidates the relationship between and amongst these topics, tracks topic changes as the epidemic progresses and presents the results visually and compellingly.

Findings

The analysis results show that CNN has a more concentrated distribution of topics than China Daily, with the former focusing on government-related information and the latter on medical information. In addition, the pandemic has had a big impact on the reporting preferences of both CNN and China Daily. The evolution analysis of news topics indicates that the dynamic changes of topics have a strong relationship with the course of the pandemic.

Originality/value

This paper offers novel perspectives for reviewing the topics of COVID-19 news articles and provides new understandings of news articles during the initial outbreak. The analysis results expand the scope of infodemic-related studies.

Details

Aslib Journal of Information Management, vol. 75 no. 2
Type: Research Article
ISSN: 2050-3806

Article
Publication date: 15 June 2021

Chao Yang, Cui Huang, Jun Su and Shutao Wang

Abstract

Purpose

The paper aims to explore whether topic analysis (identification of the core contents, trends and topic distribution in the target field) can be performed using a lower-cost and more easily applicable method that relies on a small dataset, and how this small dataset can be obtained based on the features of the publications.

Design/methodology/approach

The paper proposes a topic analysis method based on prolific and authoritative researchers (PARs). First, the authors identify PARs in a specific discipline by considering the number of publications and citations of authors. Based on the research publications of PARs (small dataset), the authors then construct a keyword co-occurrence network and perform a topic analysis. Finally, the authors compare the method with the traditional method.
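The steps described above can be sketched roughly as follows. This is an illustrative sketch only: the abstract does not give the selection rule, so the simple publication and citation thresholds, the function names `select_pars` and `cooccurrence`, and the toy records are all assumptions rather than the authors' algorithm.

```python
from collections import Counter
from itertools import combinations


def select_pars(author_stats, min_pubs, min_cites):
    """Pick authors meeting both (hypothetical) thresholds.

    author_stats maps author -> (publication count, citation count).
    """
    return {
        author
        for author, (pubs, cites) in author_stats.items()
        if pubs >= min_pubs and cites >= min_cites
    }


def cooccurrence(papers, pars):
    """Count keyword pairs that co-occur in papers with at least one PAR author."""
    edges = Counter()
    for paper in papers:
        if pars.intersection(paper["authors"]):
            # Sort so that each unordered pair has one canonical key.
            for pair in combinations(sorted(set(paper["keywords"])), 2):
                edges[pair] += 1
    return edges


# Toy data, made up for illustration.
author_stats = {"A": (12, 300), "B": (2, 15), "C": (9, 120)}
papers = [
    {"authors": ["A", "B"], "keywords": ["topic model", "LDA"]},
    {"authors": ["B"], "keywords": ["survey", "LDA"]},
    {"authors": ["C"], "keywords": ["LDA", "topic model", "bibliometrics"]},
]

pars = select_pars(author_stats, min_pubs=5, min_cites=100)
edges = cooccurrence(papers, pars)
print(sorted(pars))
print(edges.most_common(3))
```

The resulting edge weights could then feed a network analysis (for example density or community detection) and be compared against the same statistics computed on the full dataset.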

Findings

The authors found that using a small dataset (only 6.47% of the complete dataset in our experiment) for topic analysis yields relatively high-quality and reliable results. The comparison analysis reveals that the results of the proposed method are quite similar to those of traditional large-dataset analysis in terms of publication time distribution, research areas, core keywords and keyword network density.

Research limitations/implications

Expert opinions are needed in determining the parameters of the PARs identification algorithm. The proposed method may neglect the publications of junior researchers, and its biases should be discussed.

Practical implications

This paper offers a practical way to implement disciplinary analysis based on a small dataset, and to identify this dataset, by proposing a PARs-based topic analysis method. The proposed method presents a useful view of the data based on PARs that can produce results comparable to the traditional method, and thus can improve the effectiveness and reduce the cost of interdisciplinary topic analysis.

Originality/value

This paper proposes a PARs-based topic analysis method and verifies that topic analysis can be performed using a small dataset.

Details

Library Hi Tech, vol. 39 no. 4
Type: Research Article
ISSN: 0737-8831

Article
Publication date: 10 July 2023

Surabhi Singh, Shiwangi Singh, Alex Koohang, Anuj Sharma and Sanjay Dhir

Abstract

Purpose

The primary aim of this study is to detail the use of soft computing techniques in business and management research. Its objectives are as follows: to conduct a comprehensive scientometric analysis of publications in the field of soft computing, to explore the evolution of keywords, to identify key research themes and latent topics and to map the intellectual structure of soft computing in the business literature.

Design/methodology/approach

This research offers a comprehensive overview of the field by synthesising 43 years (1980–2022) of soft computing research from the Scopus database. It employs descriptive analysis, topic modelling (TM) and scientometric analysis.

Findings

This study's co-citation analysis identifies three primary categories of research in the field: the components, the techniques and the benefits of soft computing. Additionally, this study identifies 16 key study themes in the soft computing literature using TM, including decision-making under uncertainty, multi-criteria decision-making (MCDM), the application of deep learning in object detection and fault diagnosis, circular economy and sustainable development and a few others.

Practical implications

This analysis offers a valuable understanding of soft computing for researchers and industry experts and highlights potential areas for future research.

Originality/value

This study uses scientific mapping and performance indicators to analyse a large corpus of 4,512 articles in the field of soft computing. It makes significant contributions to the intellectual and conceptual framework of soft computing research by providing a comprehensive overview of the soft computing literature covering a period of four decades and identifying significant trends and topics to direct future research.

Details

Industrial Management & Data Systems, vol. 123 no. 8
Type: Research Article
ISSN: 0263-5577

Article
Publication date: 5 January 2023

Qingqing Zhou

Abstract

Purpose

With the rapid development of social media, the occurrence and evolution of emergency events are often accompanied by a massive volume of user expressions. Fine-grained analysis of users' expressions can provide accurate and reliable information for event processing. Hence, 2,003,814 expressions on a major malignant emergency event were mined from multiple dimensions in this paper.

Design/methodology/approach

This paper conducted a finer-grained analysis of users' online expressions in an emergency event. Specifically, the authors first selected a major emergency event as the research object and collected the event-related user expressions, which spanned nearly two years, to describe the dynamic evolution trend of the event. Then, users' expression preferences were identified by detecting anomic expressions, classifying sentiment tendencies and extracting topics in expressions. Finally, the authors measured the explicit and implicit impacts of different expression preferences and obtained relations between the differential expression preferences.

Findings

Experimental results showed that users have both short- and long-term attention to emergency events. Their enthusiasm for discussing the event will be quickly dispelled and easily aroused. Meanwhile, most users prefer to make rational and normative expressions of events, and the expression topics are diversified. In addition, compared with anomic negative expressions, anomic expressions in positive sentiments are more common. In conclusion, the integration of multi-dimensional analysis results of users' expression preferences (including discussion heat, preference impacts and preference relations) is an effective means to support emergency event processing.

Originality/value

To the best of the authors' knowledge, this is the first study to conduct an in-depth and fine-grained analysis of user expression in emergencies, so as to obtain detailed and multi-dimensional characteristics of users' online expressions for supporting event processing.

Details

Aslib Journal of Information Management, vol. 76 no. 2
Type: Research Article
ISSN: 2050-3806

Article
Publication date: 11 June 2013

Heather Lutz and Laura Birou

Abstract

Purpose

This paper aims to provide the results of a large‐scale survey of courses dedicated to the field of logistics in higher education. This research is unique because it represents the first large‐scale study of both undergraduate and graduate logistics courses.

Design/methodology/approach

Content analysis was performed on each syllabus to identify the actual course coverage: requirements, pedagogy and content emphasis. Content analysis is a descriptive approach to categorize data and the results may be limited by the categorizations used in analysis. This aggregated information was utilized to compare historical research findings in this area with the current skills identified as important for career success. These data provide input for gap analysis between offerings in higher education and those needs identified by practitioners.

Findings

Data gathering efforts yielded a sample of 118 logistics courses representing 77 schools and six different countries. The aggregate number of topics covered in undergraduate courses totalled 95, while graduate courses covered 81 different topics. The primary evaluation techniques include the traditional exams, projects and homework. Details regarding learning objectives and grading schema are provided along with a gap analysis between the coverage of logistics courses and the needs identified by practitioners.

Originality/value

The goal is to use these data as a means of continuous improvement in the quality and value of the educational experience. The findings are designed to foster information sharing and provide data for benchmarking efforts in the development of logistics courses and curricula in academia as well as training and development by professionals in the field of logistics.

Details

Supply Chain Management: An International Journal, vol. 18 no. 4
Type: Research Article
ISSN: 1359-8546

Article
Publication date: 26 November 2021

Soohyung Joo, Jennifer Hootman and Marie Katsurai

Abstract

Purpose

This study aims to explore the knowledge structure and research trends in the domain of digital humanities (DH) in the recent decade. The study identified prevailing topics and then analyzed the trends of these topics over time in the DH field.

Design/methodology/approach

Research bibliographic data in the area of DH were collected from scholarly databases. Multiple text mining techniques were used to identify prevailing research topics and trends, such as keyword co-occurrences, bigram analysis, structural topic models and bi-term topic models.
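Of the techniques listed, bigram analysis is the simplest to illustrate. The sketch below counts adjacent word pairs across a handful of made-up record titles; the tokenization rule, the function names and the sample records are assumptions for illustration and do not reproduce the study's pipeline (which would typically also remove stop words).

```python
import re
from collections import Counter


def bigrams(text):
    """Return adjacent lowercase word pairs from one text."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return list(zip(tokens, tokens[1:]))


def top_bigrams(records, n=3):
    """Count bigrams per record (never across record boundaries)."""
    counts = Counter()
    for text in records:
        counts.update(bigrams(text))
    return counts.most_common(n)


# Made-up bibliographic titles, for illustration only.
records = [
    "Linked open data for cultural heritage",
    "Cultural heritage and linked data",
    "Text mining of cultural heritage collections",
]

print(top_bigrams(records, n=1))
```

Keyword co-occurrence analysis follows the same counting pattern, except that pairs are drawn from each record's keyword list rather than from adjacent words.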

Findings

Term-level analysis revealed that cultural heritage, geographic information, semantic web, linked data and digital media were among the most popular topics in the recent decade. Structural topic models identified that linked open data, text mining, semantic web and ontology, text digitization and social network analysis received increased attention in the DH field.

Originality/value

This study applied existing text mining techniques to understand the research domain of DH. The study collected a large set of bibliographic text representing the area of DH from multiple academic databases and explored research trends based on structural topic models.

Details

Journal of Documentation, vol. 78 no. 4
Type: Research Article
ISSN: 0022-0418

Article
Publication date: 30 May 2023

Carla Bonato Marcolin, Eduardo Henrique Diniz, João Luiz Becker and Henrique Pontes Gonçalves de Oliveira

Abstract

Purpose

In a context where human–machine interaction is growing, understanding the limits between automated and human-based methods may leverage qualitative research. This paper aims to compare human and machine analyses, highlighting the challenges and opportunities of both approaches.

Design/methodology/approach

This study applied qualitative secondary analysis (QSA) with machine learning-based text mining on qualitative data from 25 interviews previously analyzed with traditional qualitative content analysis.

Findings

By analyzing both techniques' strengths and weaknesses, this study complements the results from the original research work. The previous human model failed to point to a particular aspect of the case, while the machine analysis did not recognize the sequence of time in the interviewee's discourse.

Originality/value

This study demonstrates that combining content analysis with text mining techniques improves the quality of the research output. Researchers may, therefore, better handle biases from humans and machines in traditional qualitative and quantitative research.

Details

Qualitative Research in Organizations and Management: An International Journal, vol. 18 no. 2
Type: Research Article
ISSN: 1746-5648
