Search results

1 – 10 of over 93000
Article
Publication date: 10 May 2022

Qiang Cao, Xian Cheng and Shaoyi Liao

Abstract

Purpose

Extracting useful information from a very large volume of literature is a great challenge for librarians. Topic modeling, a machine learning technique that uncovers latent thematic structures in large collections of documents, is a widespread approach in literature analysis, especially given the rapid growth of academic literature. This paper compares topic modeling based literature analysis using the full texts and the abstracts of articles.

Design/methodology/approach

The authors conduct a comparison study of topic modeling on full-text papers and their corresponding abstracts to assess how the type of document used as input influences the results. In particular, the authors use the large volume of COVID-19 research literature as a case study for topic modeling based literature analysis. The authors illustrate the research topics, research trends and topic similarity of COVID-19 research using latent Dirichlet allocation (LDA) and a topic visualization method.
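The comparison of topics derived from full texts and from abstracts can be sketched roughly as follows. This is only an illustration: the abstract does not state which similarity measure the authors used, so cosine similarity with best-match averaging is an assumption, and the topic-word weights below are hypothetical stand-ins for LDA output.

```python
import math

def cosine(p, q):
    """Cosine similarity between two sparse word-weight dicts."""
    dot = sum(w * q.get(term, 0.0) for term, w in p.items())
    norm_p = math.sqrt(sum(w * w for w in p.values()))
    norm_q = math.sqrt(sum(w * w for w in q.values()))
    return dot / (norm_p * norm_q) if norm_p and norm_q else 0.0

def best_match_similarity(topics_a, topics_b):
    """For each topic in topics_a, take its best cosine match in topics_b, then average."""
    best = [max(cosine(t, u) for u in topics_b) for t in topics_a]
    return sum(best) / len(best)

# Hypothetical topic-word weights (assumed, not taken from the paper).
full_text_topics = [
    {"vaccine": 0.5, "trial": 0.3, "dose": 0.2},
    {"mask": 0.6, "transmission": 0.4},
]
abstract_topics = [
    {"mask": 0.7, "transmission": 0.3},
    {"vaccine": 0.6, "trial": 0.4},
]

score = best_match_similarity(full_text_topics, abstract_topics)
print(round(score, 3))
```

A score close to 1 would indicate that the two inputs yield nearly the same topics. Best-match averaging is used here because topic indices from two separate runs are not aligned.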

Findings

The authors found 14 research topics for COVID-19 research. The authors also found that the topic similarity between full-text papers and their corresponding abstracts is higher when more documents are analyzed.

Originality/value

First, this study contributes to the literature analysis approach. The comparison study can help us understand the influence of the different types of documents on the results of topic modeling analysis. Second, the authors present an overview of COVID-19 research by summarizing its 14 research topics. This automated literature analysis can help specialists in the health and medical domain, and other readers, to quickly grasp the structure of current COVID-19 studies.

Details

Library Hi Tech, vol. 41 no. 2
Type: Research Article
ISSN: 0737-8831

Article
Publication date: 7 August 2017

Daniel Carnerud

Abstract

Purpose

The purpose of this paper is to explore and describe research presented in the International Journal of Quality & Reliability Management (IJQRM), thereby creating an increased understanding of how the areas of research have evolved through the years. An additional purpose is to show how text mining methodology can be used as a tool for exploration and description of research publications.

Design/methodology/approach

The study applies text mining methodologies to explore and describe the digital library of IJQRM from 1984 up to 2014. To structure and condense the data, k-means clustering and probabilistic topic modeling with latent Dirichlet allocation are applied. The data set consists of research paper abstracts.
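As a rough illustration of the clustering step, the sketch below runs plain k-means (Lloyd's algorithm) on term-count vectors. The sample abstracts, the whitespace tokenization and the function names are assumptions for illustration only. The study's actual features and software are not described in the abstract.

```python
import random
from collections import Counter

def vectorize(docs):
    """Turn whitespace-tokenized documents into term-count vectors over a shared vocabulary."""
    vocab = sorted({w for d in docs for w in d.lower().split()})
    vectors = []
    for d in docs:
        counts = Counter(d.lower().split())
        vectors.append([float(counts[w]) for w in vocab])
    return vectors

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(vectors, k, iterations=20, seed=0):
    """Lloyd's algorithm: assign each point to its nearest centroid, then recompute centroids."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, k)]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        labels = [min(range(k), key=lambda c: squared_distance(v, centroids[c])) for v in vectors]
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Hypothetical mini-abstracts (assumed, not from the IJQRM data set).
abstracts = [
    "six sigma lean improvement",
    "lean six sigma performance",
    "reliability failure maintenance",
    "maintenance reliability model",
]
labels = kmeans(vectorize(abstracts), k=2)
print(labels)
```

The cluster numbers themselves are arbitrary. Only the grouping of documents is meaningful.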

Findings

The results support the suggestion that trends, fads and fashions occur in research publications. Research on quality function deployment (QFD) and reliability management is noted to be on the downturn, whereas research on Six Sigma with a focus on lean, innovation, performance and improvement is on the rise. Furthermore, the study confirms IJQRM as a scientific journal with quality and reliability management as primary areas of coverage, accompanied by specific topics such as total quality management, service quality, process management, ISO, QFD and Six Sigma. The study also gives an insight into how text mining can be used as a way to efficiently explore and describe large quantities of research paper abstracts.

Research limitations/implications

The study focuses on abstracts of research papers, thus topics and categories that could be identified via other journal publications, such as book reviews; general reviews; secondary articles; editorials; guest editorials; awards for excellence (notifications); introductions or summaries from conferences; notes from the publisher; and articles without an abstract, are excluded.

Originality/value

There do not seem to be any prior text mining studies that apply cluster modeling and probabilistic topic modeling to research article abstracts in the IJQRM. This study therefore offers a unique perspective on the journal’s content.

Details

International Journal of Quality & Reliability Management, vol. 34 no. 7
Type: Research Article
ISSN: 0265-671X

Article
Publication date: 30 May 2018

Anna L. Neatrour, Elizabeth Callaway and Rebekah Cummings

Abstract

Purpose

This paper aims to determine whether the digital humanities technique of topic modeling would reveal interesting patterns in a corpus of library-themed literature focused on the future of libraries, and to pioneer a collaboration model in librarian-led digital humanities projects. By developing the project, librarians learned how to better support digital humanities by actually doing digital humanities, and gained insight into the variety of approaches taken by researchers and commenters to the idea of the future of libraries.

Design/methodology/approach

The researchers collected a corpus of over 150 texts (articles, blog posts, book chapters, websites, etc.) that all addressed the future of the library. They ran several instances of latent Dirichlet allocation style topic modeling on the corpus using the programming language R. Once they produced a run in which the topics were cohesive and discrete, they produced word-clouds of the words associated with each topic, visualized topics through time and examined in detail the top five documents associated with each topic.
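The step of examining the top documents for each topic can be sketched as below. The researchers worked in R. This Python version, its function name and the document-topic weights are hypothetical and only illustrate the ranking idea.

```python
def top_documents(doc_topic, titles, n=5):
    """Return, for each topic, the n document titles with the highest topic weight."""
    n_topics = len(doc_topic[0])
    ranking = {}
    for t in range(n_topics):
        order = sorted(range(len(doc_topic)), key=lambda d: doc_topic[d][t], reverse=True)
        ranking[t] = [titles[d] for d in order[:n]]
    return ranking

# Hypothetical document-topic weights (rows: documents, columns: topics).
titles = ["doc_a", "doc_b", "doc_c"]
doc_topic = [
    [0.9, 0.1],
    [0.2, 0.8],
    [0.6, 0.4],
]
ranking = top_documents(doc_topic, titles, n=2)
print(ranking)
```

Reading the highest-weighted documents for a topic is a common way to check whether the topic is cohesive before labeling it.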

Findings

The research project provided an effective way for librarians to gain practical experience in digital humanities and develop a greater understanding of collaborative workflows in digital humanities. By examining a corpus of library-themed literature, the researchers gained new insight into how the profession grapples with the idea of the future and an appreciation for topic modeling as a form of literature review.

Originality/value

Topic modeling a future-themed corpus of library literature is a unique research project and provides a way to support collaboration between library faculty and researchers from outside the library.

Details

Digital Library Perspectives, vol. 34 no. 3
Type: Research Article
ISSN: 2059-5816

Article
Publication date: 29 April 2021

Heng-Yang Lu, Yi Zhang and Yuntao Du

Abstract

Purpose

Topic models have been widely applied to discover important information from vast amounts of unstructured data. Traditional long-text topic models such as Latent Dirichlet Allocation may suffer from the sparsity problem when dealing with short texts, which mostly come from the Web. These models also suffer from a readability problem when displaying the discovered topics. The purpose of this paper is to propose a novel model called the Sense Unit based Phrase Topic Model (SenU-PTM) to address both the sparsity and readability problems.

Design/methodology/approach

SenU-PTM is a novel phrase-based short-text topic model under a two-phase framework. The first phase introduces a phrase-generation algorithm by exploiting word embeddings, which aims to generate phrases with the original corpus. The second phase introduces a new concept of sense unit, which consists of a set of semantically similar tokens for modeling topics with token vectors generated in the first phase. Finally, SenU-PTM infers topics based on the above two phases.

Findings

Experimental results on two real-world and publicly available datasets show the effectiveness of SenU-PTM from the perspectives of topical quality and document characterization. The results reveal that modeling topics on sense units can solve the sparsity of short texts and improve the readability of topics at the same time.

Originality/value

The originality of SenU-PTM lies in the new procedure of modeling topics on the proposed sense units with word embeddings for short-text topic discovery.

Details

Data Technologies and Applications, vol. 55 no. 5
Type: Research Article
ISSN: 2514-9288

Article
Publication date: 28 February 2022

Paritosh Pramanik and Rabin K. Jana

Abstract

Purpose

This paper aims to discuss the suitability of topic modeling as a review method, and to identify and compare the machine learning (ML) research trends in five primary business organization verticals.

Design/methodology/approach

This study presents a review framework of published research about adopting ML techniques in a business organization context. It identifies research trends and issues using topic modeling through the Latent Dirichlet allocation technique in conjunction with other text analysis techniques in five primary business verticals – human resources (HR), marketing, operations, strategy and finance.

Findings

The results show that ML adoption is highest in the marketing domain and lowest in the HR domain. The operations domain sees ML applied to the largest number of distinct research areas. The results also help to identify potential areas of ML application in the future.

Originality/value

This paper contributes to the existing literature by finding trends of ML applications in the business domain through the review of published research. Although there is a growth of research publications in ML in the business domain, literature review papers are scarce. Therefore, this study undertakes a thorough review of the current status of ML applications in business by analyzing research articles published in the past ten years in various journals.

Details

Measuring Business Excellence, vol. 27 no. 4
Type: Research Article
ISSN: 1368-3047

Article
Publication date: 14 July 2022

Shrawan Kumar Trivedi, Pradipta Patra, Amrinder Singh, Pijush Deka and Praveen Ranjan Srivastava

Abstract

Purpose

The COVID-19 pandemic has impacted 222 countries across the globe, with millions of people losing their lives. The threat from the virus may be assessed from the fact that most countries across the world have been forced to order partial or complete shutdown of their economies for a period of time to contain the spread of the virus. The fallout of this action manifested in loss of livelihood, migration of the labor force and severe impact on mental health due to the long duration of confinement to homes or residences.

Design/methodology/approach

The current study identifies the focus areas of the research conducted on the COVID-19 pandemic. Abstracts of papers on the subject were collated from the SCOPUS database for the period December 2019 to June 2020. The collected sample data (after preprocessing) was analyzed using Topic Modeling with Latent Dirichlet Allocation.
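The abstract does not say which preprocessing steps were applied before topic modeling. A minimal sketch of typical steps (lowercasing, tokenization, stop-word removal) might look like this. The stop-word list, the `min_length` threshold and the sample sentence are all assumptions.

```python
import re

# A tiny illustrative stop-word list; real pipelines use a much larger one.
STOP_WORDS = {"the", "of", "and", "a", "in", "to", "on", "for", "is", "with"}

def preprocess(text, min_length=3):
    """Lowercase, keep tokens that start with a letter (digits and hyphens allowed inside),
    then drop stop words and very short tokens."""
    tokens = re.findall(r"[a-z][a-z0-9-]*", text.lower())
    return [t for t in tokens if t not in STOP_WORDS and len(t) >= min_length]

tokens = preprocess("The impact of COVID-19 on mental health in the labor force.")
print(tokens)
```

Keeping hyphens and digits inside tokens preserves terms such as "covid-19", which would otherwise be split into fragments.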

Findings

Based on the research papers published within the mentioned timeframe, the study identifies the 10 most prominent topics that formed the area of interest for the COVID-19 pandemic research.

Originality/value

While similar studies exist, no other work has used topic modeling to comprehensively analyze the COVID-19 literature by considering diverse fields and domains.

Details

Journal of Modelling in Management, vol. 18 no. 4
Type: Research Article
ISSN: 1746-5664

Article
Publication date: 24 August 2018

Eunhye (Olivia) Park, Bongsug (Kevin) Chae and Junehee Kwon

Abstract

Purpose

The purpose of this study was to explore influences of review-related information on topical proportions and the pattern of word appearances in each topic (topical content) using structural topic model (STM).

Design/methodology/approach

STM-based topic modeling with covariates was applied to 173,607 Yelp.com reviews written in 2005-2016, in addition to traditional statistical analyses.

Findings

Differences in topic prevalence and topical content were found between certified green and non-certified restaurants. Customers’ recognition of sustainable food topics changed over time.

Research limitations/implications

This study demonstrates the application of STM for the systematic analysis of a large amount of text data.

Originality/value

Few studies in the hospitality literature have examined the influence of review-level metadata on topic and term estimation. Through topic modeling, customers’ natural responses toward green practices were identified.

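Research purpose

This study aims to use structural topic modeling (STM) to explore the influence of review-related content on topic composition and term composition.

Research design/methodology/approach

This paper uses 173,607 Yelp.com reviews written between 2005 and 2016 as its sample, and combines STM analysis with covariates to build the topic model.

Research findings

Differences in topic prevalence and topic content exist between certified green restaurants and non-certified restaurants. Consumers' interest in sustainable food topics changed over time.

Research limitations/implications

This study offers insight into STM-based methods for the systematic analysis of large-scale text data.

Research originality/value

Few articles in the hospitality management literature have examined the influence of review metadata on topic and term estimation. Through topic modeling, consumers' responses to green practices were organized and confirmed.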

Open Access
Article
Publication date: 13 February 2024

Nicola Cobelli and Silvia Blasi

Abstract

Purpose

This paper explores the Adoption of Technological Innovation (ATI) in the healthcare industry. It investigates how the literature has evolved and which innovation dimensions are emerging in healthcare industry adoption studies.

Design/methodology/approach

We followed a mixed-method approach combining bibliometric methods and topic modeling, with 57 papers being deeply analyzed.

Findings

Our results identify three latent topics. The first relates to digitalization in healthcare, with a specific focus on the COVID-19 pandemic. The second groups word combinations dealing with research models and their constructs. The third refers to healthcare systems and professionals and their resistance to ATI.

Research limitations/implications

The study’s sample selection focused on scientific journals included in the Academic Journal Guide and in the FT Research Rank. However, the paper identifies trends that offer managerial insights for stakeholders in the healthcare industry.

Practical implications

ATI has the potential to revolutionize the health service delivery system and to decentralize services traditionally provided in hospitals or medical centers. All this would contribute to a reduction in waiting lists and the provision of proximity services.

Originality/value

The originality of the paper lies in the combination of two methods: bibliometric analysis and topic modeling. This approach allowed us to understand the evolution of ATI in the healthcare industry.

Details

European Journal of Innovation Management, vol. 27 no. 9
Type: Research Article
ISSN: 1460-1060

Article
Publication date: 24 July 2020

Thanh-Tho Quan, Duc-Trung Mai and Thanh-Duy Tran

Abstract

Purpose

This paper proposes an approach to identify categorical influencers (i.e. influencers who are active in the targeted categories) in social media channels. Categorical influencers are important for media marketing, but detecting them automatically remains a challenge.

Design/methodology/approach

We deployed emerging deep learning approaches. Specifically, we used word embedding to encode semantic information of words occurring in the common microtext of social media and used a variational autoencoder (VAE) to approximate the topic modeling process, through which the active categories of influencers are automatically detected. We developed a system known as Categorical Influencer Detection (CID) to realize those ideas.

Findings

The approach of using VAE to simulate the Latent Dirichlet Allocation (LDA) process can effectively handle the task of topic modeling on the vast dataset of microtext on social media channels.

Research limitations/implications

This work makes two major contributions. The first is the detection of topics in microtexts using a deep learning approach. The second is the identification of categorical influencers in social media.

Practical implications

This work can help brands to do digital marketing on social media effectively by approaching appropriate influencers. A real case study is given to illustrate it.

Originality/value

In this paper, we discuss an approach to automatically identify the active categories of influencers by performing topic detection from the microtext related to the influencers in social media channels. To do so, we use deep learning to approximate the topic modeling process of the conventional approaches (such as LDA).

Details

Online Information Review, vol. 44 no. 5
Type: Research Article
ISSN: 1468-4527

Open Access
Article
Publication date: 27 March 2023

Peter Madzík, Lukáš Falát, Lukáš Copuš and Marco Valeri

Abstract

Purpose

This bibliometric study provides an overview of research related to digital transformation (DT) in the tourism industry from 2013 to 2022. The goals of the research are as follows: (1) to identify the development of academic papers related to DT in the tourism industry, (2) to analyze dominant research topics and the development of research interest and research impact over time and (3) to analyze the change in research topics during the pandemic.

Design/methodology/approach

In this study, the authors processed 3,683 papers retrieved from the Web of Science and Scopus. The authors performed different types of bibliometric analyses to identify the development of papers related to DT in the tourism industry. To reveal latent topics, the authors implemented topic modeling based on latent Dirichlet allocation with Gibbs sampling.
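For readers unfamiliar with the technique, latent Dirichlet allocation with Gibbs sampling can be sketched as a small collapsed Gibbs sampler. This is a teaching sketch, not the authors' implementation: the hyperparameters `alpha` and `beta`, the iteration count and the toy documents are all assumptions.

```python
import random

def lda_gibbs(docs, n_topics, alpha=0.1, beta=0.01, iterations=200, seed=0):
    """Collapsed Gibbs sampling for LDA on tokenized documents (lists of words)."""
    rng = random.Random(seed)
    vocab = sorted({w for d in docs for w in d})
    word_id = {w: i for i, w in enumerate(vocab)}
    V = len(vocab)
    doc_topic = [[0] * n_topics for _ in docs]        # tokens in doc d assigned to topic k
    topic_word = [[0] * V for _ in range(n_topics)]   # times word w is assigned to topic k
    topic_total = [0] * n_topics                      # tokens assigned to topic k overall
    assignments = []
    # Give every token a random initial topic and record the counts.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(n_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z_doc)
    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wid = word_id[w]
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][wid] -= 1
                topic_total[k] -= 1
                # Conditional probability of each topic for this token, up to a constant.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][wid] + beta) / (topic_total[t] + V * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][wid] += 1
                topic_total[k] += 1
    return vocab, doc_topic, topic_word

# Hypothetical toy documents (assumed, not from the study's corpus).
docs = [
    "smart tourism destination marketing".split(),
    "destination marketing social media".split(),
    "cultural heritage urban planning".split(),
    "urban planning cultural heritage city".split(),
]
vocab, doc_topic, topic_word = lda_gibbs(docs, n_topics=2)
print(doc_topic)
```

Each sweep resamples the topic of every token conditioned on all other assignments. After enough sweeps, the count tables approximate the document-topic and topic-word distributions.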

Findings

The authors identified eight topics related to DT in the tourism industry: City and urban planning, Social media, Data analytics, Sustainable and economic development, Technology-based experience and interaction, Cultural heritage, Digital destination marketing and Smart tourism management. The authors also identified seven topics related to DT in the tourism industry during the Covid-19 pandemic; the largest ones are smart analytics, marketing strategies and sustainability.

Originality/value

To identify research topics and their development over time, the authors applied a novel methodological approach – a smart literature review. This machine learning approach can analyze a huge number of documents. At the same time, it can also identify topics that would remain unrevealed by a standard bibliometric analysis.

Details

European Journal of Innovation Management, vol. 26 no. 7
Type: Research Article
ISSN: 1460-1060
