Search results

1 – 10 of over 125000
Article
Publication date: 10 May 2022

Qiang Cao, Xian Cheng and Shaoyi Liao

Abstract

Purpose

Extracting useful information from a very large volume of literature is a great challenge for librarians. Topic modeling, a machine learning technique that uncovers latent thematic structures in large collections of documents, is a widespread approach in literature analysis, especially with the rapid growth of academic literature. This paper compares topic modeling based literature analysis using the full texts and the abstracts of articles.

Design/methodology/approach

The authors conduct a comparison study of topic modeling on full-text papers and their corresponding abstracts to assess the influence of the type of document used as input for topic modeling. In particular, the authors use the large volume of COVID-19 research literature as a case study for topic modeling based literature analysis. The authors illustrate the research topics, research trends and topic similarity of COVID-19 research by using Latent Dirichlet allocation (LDA) and a topic visualization method.

Findings

The authors found 14 research topics for COVID-19 research. The authors also found that the similarity between topics derived from full-text papers and topics derived from the corresponding abstracts is higher when more documents are analyzed.
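
The abstract does not state how topic similarity was measured. As a minimal sketch, assuming each topic is represented as a set of word weights and that similarity is the cosine similarity averaged over best-matching topic pairs (both are assumptions, and the topic weights below are made-up illustration data), a full-text versus abstract comparison might look like this:

```python
import math


def cosine(p, q):
    """Cosine similarity between two sparse word-weight dicts."""
    dot = sum(w * q.get(term, 0.0) for term, w in p.items())
    norm_p = math.sqrt(sum(w * w for w in p.values()))
    norm_q = math.sqrt(sum(w * w for w in q.values()))
    return dot / (norm_p * norm_q) if norm_p and norm_q else 0.0


def mean_best_match(topics_a, topics_b):
    """Average, over topics in A, of the similarity to the closest topic in B."""
    best = [max(cosine(a, b) for b in topics_b) for a in topics_a]
    return sum(best) / len(best)


# Hypothetical topic-word weights, not taken from the paper.
full_text_topics = [
    {"vaccine": 0.5, "trial": 0.3, "dose": 0.2},
    {"lockdown": 0.6, "policy": 0.4},
]
abstract_topics = [
    {"vaccine": 0.6, "trial": 0.4},
    {"lockdown": 0.5, "policy": 0.3, "school": 0.2},
]

score = mean_best_match(full_text_topics, abstract_topics)
print(f"mean best-match similarity: {score:.3f}")
```

Repeating this calculation on topic models fitted to corpora of increasing size is one way to check whether the similarity rises with the number of documents, as the findings report.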

Originality/value

First, this study contributes to the literature analysis approach. The comparison study can help us understand how the type of document used affects the results of topic modeling analysis. Second, the authors present an overview of COVID-19 research by summarizing its 14 research topics. This automated literature analysis can help specialists in the health and medical domain, as well as other readers, quickly grasp the structure of current COVID-19 studies.

Details

Library Hi Tech, vol. 41 no. 2
Type: Research Article
ISSN: 0737-8831

Book part
Publication date: 23 February 2016

Gabe Ignatow, Nicholas Evangelopoulos and Konstantinos Zougris

Abstract

Purpose

The authors apply topic sentiment analysis (several relatively new text analysis methods) to the study of public opinion as expressed in social media by comparing reactions to the Trayvon Martin controversy in spring 2012 by commenters on the partisan news websites the Huffington Post and Daily Caller.

Methodology/approach

Topic sentiment analysis is a text analysis method that estimates the polarity of sentiments across units of text within large text corpora (Lin & He, 2009; Mei, Ling, Wondra, Su, & Zhai, 2007).

Findings

We apply topic sentiment analysis to public opinion as expressed in social media by comparing reactions to the Trayvon Martin controversy in spring 2012 by commenters on the partisan news websites the Huffington Post and Daily Caller. Based on studies that depict contemporary news media as an “outrage industry” that incentivizes media personalities to be controversial and polarizing (Berry & Sobieraj, 2014), we predict that high-profile commentators will be more polarizing than other news personalities and topics.

Originality/value

Results of the topic sentiment analysis support this prediction and in so doing provide partial validation of the application of topic sentiment analysis to online opinion.

Details

Communication and Information Technologies Annual
Type: Book
ISBN: 978-1-78560-785-1

Article
Publication date: 29 April 2022

Chih-Ming Chen, Szu-Yu Ho and Chung Chang

Abstract

Purpose

This study aims to develop a hierarchical topic analysis tool (HTAT) based on hierarchical Latent Dirichlet allocation (hLDA) to support digital humanities research that requires topic exploration on the Digital Humanities Platform for Mr. Lo Chia-Lun’s Writings (DHP-LCLW). HTAT assists humanities scholars in distant reading by analyzing hierarchical text topics: it classifies time-stamped texts into multiple historical eras, conducts hierarchical topic modeling (HTM) on the texts from each era and presents the results through visualization. A comparative network diagram is also provided to help humanities scholars compare the topics they wish to explore and track how the concept of a topic changes over time from a particular perspective. In addition, HTAT lets humanities scholars view the source texts, and so has high potential to make topic exploration more effective by integrating the topic exploration functions of distant reading and close reading.

Design/methodology/approach

This study adopts a counterbalanced experimental design to examine whether there are significant differences in the effectiveness of topic inquiry, the number of relevant topics inquired and the time spent on them when research participants alternately conduct text exploration using DHP-LCLW with HTAT or DHP-LCLW with the Single-layer Topic Analysis Tool (SLTAT). A technology acceptance questionnaire and semi-structured interviews were also used to understand the research participants' perceptions of and feelings toward using the two tools to assist topic inquiry.

Findings

The experimental results show that DHP-LCLW with HTAT, in comparison with DHP-LCLW with SLTAT, could better help the research participants grasp, within a short period, the topic context of the texts from the two particular perspectives assigned by this study. In addition, the interviews revealed that DHP-LCLW with HTAT, in comparison with SLTAT, provided topic terms that better met research participants' expectations and needs, and effectively guided them to the corresponding texts for close reading. The technology acceptance and interview data show that the research participants have a strongly positive tendency toward using DHP-LCLW with HTAT to assist topic inquiry.

Research limitations/implications

The Jieba Chinese word segmentation system was used in the Mr. Lo Chia-Lun’s Writings Database in this study to segment Mr. Lo Chia-Lun’s texts for hLDA-based topic modeling. Because Jieba is a lexicon-based word segmentation system, it cannot reliably identify new words that have not yet been collected in the lexicon. The correctness of word segmentation on the target texts therefore affects the results of hLDA topic modeling and the effectiveness of HTAT in assisting humanities scholars with topic inquiry.

Practical implications

An HTAT was developed in this study to support digital humanities research. With HTAT, DHP-LCLW provides humanities scholars with topic clues from different hierarchical perspectives for textual exploration, and with temporal and comparative network diagrams that help them track the evolution of the topics of specific perspectives over time and gain a more comprehensive understanding of the overall context of the texts.

Originality/value

In recent years, topic analysis technology that can automatically extract key topic information from a large amount of text has developed rapidly, but the topics generated by traditional topic analysis models such as LDA (Latent Dirichlet allocation) make it difficult for users to understand how topics differ across hierarchical levels of the texts. This study therefore proposes HTAT, which uses hLDA to build a hierarchical topic tree without the need to define the number of topics in advance, enabling humanities scholars to quickly grasp the concept of textual topics and use different hierarchical perspectives for further textual exploration. It also combines temporal division with a comparative network diagram to help humanities scholars explore topics and their changes in different eras, which helps them discover more useful research clues or findings.

Details

Aslib Journal of Information Management, vol. 75 no. 1
Type: Research Article
ISSN: 2050-3806

Article
Publication date: 19 December 2022

Sukjin You, Soohyung Joo and Marie Katsurai

Abstract

Purpose

The purpose of this study is to explore the extent to which data mining research is associated with the library and information science (LIS) discipline. This study aims to identify data mining related subject terms and topics in representative LIS scholarly publications.

Design/methodology/approach

A large set of more than 38,000 bibliographic records was collected from a scholarly database, representing the fields of LIS and data mining, respectively. A multitude of text mining techniques, such as influential term analysis and Dirichlet multinomial regression topic modeling, were applied to investigate prevailing subject terms and research topics.

Findings

The findings of this study revealed the relationship between the LIS and data mining research domains. Various data mining method terms were observed in recent LIS publications, such as machine learning, artificial intelligence and neural networks. The topic modeling result identified prevailing data mining related research topics in LIS, such as machine learning, deep learning and big data, among others. In addition, this study investigated the trends of popular topics in LIS over the recent decade.

Originality/value

This investigation is one of a few studies that empirically investigated the relationships between the LIS and data mining research domains. Multiple text mining techniques were employed to delineate the extent to which the two research domains are associated with each other, based on both term-level and topic-level analysis. Methodologically, the study identified influential terms in each domain using multiple feature selection indices. In addition, Dirichlet multinomial regression was applied to explore LIS topics in relation to data mining.

Details

Aslib Journal of Information Management, vol. 76 no. 1
Type: Research Article
ISSN: 2050-3806

Article
Publication date: 29 August 2022

Yue Yuan, Kan Liu and Yanli Wang

Abstract

Purpose

The purpose of this study is to analyze the topics of COVID-19 news articles to better capture the relationships among news topics and their evolution, helping to manage the infodemic from a quantified perspective.

Design/methodology/approach

To analyze COVID-19 news articles explicitly, this paper proposes a prism architecture. Based on epidemic-related news from China Daily and CNN, this paper identifies the topics of the two news agencies, elucidates the relationships among these topics, tracks topic changes as the epidemic progresses and presents the results visually.

Findings

The analysis results show that CNN has a more concentrated distribution of topics than China Daily, with the former focusing on government-related information and the latter on medical information. In addition, the pandemic has had a big impact on the reporting preferences of CNN and China Daily. The evolution analysis of news topics indicates that the dynamic changes of topics have a strong relationship with the course of the pandemic.

Originality/value

This paper offers novel perspectives for reviewing the topics of COVID-19 news articles and provides new understandings of news articles during the initial outbreak. The analysis results expand the scope of infodemic-related studies.

Details

Aslib Journal of Information Management, vol. 75 no. 2
Type: Research Article
ISSN: 2050-3806

Article
Publication date: 15 June 2021

Chao Yang, Cui Huang, Jun Su and Shutao Wang

Abstract

Purpose

The paper aims to explore whether topic analysis (identification of the core contents, trends and topic distribution in the target field) can be performed using a lower-cost and more easily applicable method that relies on a small dataset, and how this small dataset can be obtained based on the features of the publications.

Design/methodology/approach

The paper proposes a topic analysis method based on prolific and authoritative researchers (PARs). First, the authors identify PARs in a specific discipline by considering the number of publications and citations of authors. Based on the research publications of PARs (small dataset), the authors then construct a keyword co-occurrence network and perform a topic analysis. Finally, the authors compare the method with the traditional method.
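
The steps above can be sketched in a few lines. This is only an illustration under stated assumptions: the abstract does not give the PARs identification algorithm or its parameters, so the two thresholds (`min_papers`, `min_citations`), the function names and the sample records below are all hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical bibliographic records, for illustration only.
papers = [
    {"authors": ["A", "B"], "citations": 40,
     "keywords": ["lda", "covid-19", "text mining"]},
    {"authors": ["A"], "citations": 25,
     "keywords": ["lda", "text mining"]},
    {"authors": ["C"], "citations": 2,
     "keywords": ["survey"]},
]


def find_pars(papers, min_papers, min_citations):
    """Return authors meeting both (assumed) productivity and citation thresholds."""
    pubs, cites = Counter(), Counter()
    for paper in papers:
        for author in paper["authors"]:
            pubs[author] += 1
            cites[author] += paper["citations"]
    return {a for a in pubs
            if pubs[a] >= min_papers and cites[a] >= min_citations}


def cooccurrence(papers, pars):
    """Count keyword pairs that co-occur in papers with at least one PAR author."""
    edges = Counter()
    for paper in papers:
        if pars.intersection(paper["authors"]):
            for u, v in combinations(sorted(set(paper["keywords"])), 2):
                edges[(u, v)] += 1
    return edges


pars = find_pars(papers, min_papers=2, min_citations=50)
edges = cooccurrence(papers, pars)
print(pars)
print(edges.most_common(3))
```

The edge counts form the weighted keyword co-occurrence network; comparing its core keywords and density against a network built from the full dataset mirrors the final comparison step.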

Findings

The authors found that using a small dataset (only 6.47% of the complete dataset in our experiment) for topic analysis yields relatively high-quality and reliable results. The comparison analysis reveals that the results of the proposed method are quite similar to those of traditional large dataset analysis in terms of publication time distribution, research areas, core keywords and keyword network density.

Research limitations/implications

Expert opinions are needed in determining the parameters of the PARs identification algorithm. The proposed method may neglect the publications of junior researchers, and its biases should be discussed.

Practical implications

This paper gives a practical way to implement disciplinary analysis based on a small dataset, and to identify this dataset, by proposing a PARs-based topic analysis method. The proposed method presents a useful view of the data based on PARs that can produce results comparable to the traditional method, and thus will improve the effectiveness and reduce the cost of interdisciplinary topic analysis.

Originality/value

This paper proposes a PARs-based topic analysis method and verifies that topic analysis can be performed using a small dataset.

Details

Library Hi Tech, vol. 39 no. 4
Type: Research Article
ISSN: 0737-8831

Article
Publication date: 22 March 2024

Rachana Jaiswal, Shashank Gupta and Aviral Kumar Tiwari

Abstract

Purpose

Grounded in the stakeholder theory and signaling theory, this study aims to broaden the research agenda on environmental, social and governance (ESG) investing by uncovering public sentiments and key themes using Twitter data spanning from 2009 to 2022.

Design/methodology/approach

Using various machine learning models for text tonality analysis and topic modeling, this research scrutinizes 1,842,985 Twitter texts to extract prevalent ESG investing trends and gauge their sentiment.

Findings

Gibbs Sampling Dirichlet Multinomial Mixture emerges as the optimal topic modeling method, unveiling significant topics such as “Physical risk of climate change,” “Employee Health, Safety and well-being” and “Water management and Scarcity.” RoBERTa, an attention-based model, outperforms other machine learning models in sentiment analysis, revealing a predominantly positive shift in public sentiment toward ESG investing over the past five years.

Research limitations/implications

This study establishes a framework for sentiment analysis and topic modeling on alternative data, offering a foundation for future research. Prospective studies can enhance insights by incorporating data from additional social media platforms like LinkedIn and Facebook.

Practical implications

Leveraging unstructured data on ESG from platforms like Twitter provides a novel avenue to capture company-related information, supplementing traditional self-reported sustainability disclosures. This approach opens new possibilities for understanding a company’s ESG standing.

Social implications

By shedding light on public perceptions of ESG investing, this research uncovers influential factors that often elude traditional corporate reporting. The findings empower both investors and the general public, aiding managers in refining ESG and management strategies.

Originality/value

This study marks a groundbreaking contribution to scholarly exploration, to the best of the authors’ knowledge, by being the first to analyze unstructured Twitter data in the context of ESG investing, offering unique insights and advancing the understanding of this emerging field.

Details

Management Research Review, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 2040-8269

Article
Publication date: 10 July 2023

Surabhi Singh, Shiwangi Singh, Alex Koohang, Anuj Sharma and Sanjay Dhir

Abstract

Purpose

The primary aim of this study is to detail the use of soft computing techniques in business and management research. Its objectives are as follows: to conduct a comprehensive scientometric analysis of publications in the field of soft computing, to explore the evolution of keywords, to identify key research themes and latent topics and to map the intellectual structure of soft computing in the business literature.

Design/methodology/approach

This research offers a comprehensive overview of the field by synthesising 43 years (1980–2022) of soft computing research from the Scopus database. It employs descriptive analysis, topic modelling (TM) and scientometric analysis.

Findings

This study's co-citation analysis identifies three primary categories of research in the field: the components, the techniques and the benefits of soft computing. Additionally, this study identifies 16 key study themes in the soft computing literature using TM, including decision-making under uncertainty, multi-criteria decision-making (MCDM), the application of deep learning in object detection and fault diagnosis, circular economy and sustainable development and a few others.

Practical implications

This analysis offers a valuable understanding of soft computing for researchers and industry experts and highlights potential areas for future research.

Originality/value

This study uses scientific mapping and performance indicators to analyse a large corpus of 4,512 articles in the field of soft computing. It makes significant contributions to the intellectual and conceptual framework of soft computing research by providing a comprehensive overview of the soft computing literature covering a period of four decades and identifying significant trends and topics to direct future research.

Details

Industrial Management & Data Systems, vol. 123 no. 8
Type: Research Article
ISSN: 0263-5577

Article
Publication date: 22 December 2023

Rujing Xin and Yi Jing Lim

Abstract

Purpose

This study employs bibliometric analysis to map the research landscape of social media trending topics during the COVID-19 pandemic. The authors aim to offer a comprehensive review of the predominant research organisations and countries, key themes and favoured research methodologies pertinent to this subject.

Design/methodology/approach

The authors extracted data on social media trending topics from the Web of Science Core Collection database, spanning from 2009 to 2022. A total of 1,504 publications were subjected to bibliometric analysis, utilising the VOSviewer tool. The study's analytical process encompassed co-occurrence, co-authorship, citation analysis, field mapping, bibliographic coupling and co-citation analysis.

Findings

Interest in social media research, particularly on trending topics during the COVID-19 pandemic, remains high despite signs of the pandemic stabilising globally. The study predominantly addresses misinformation and public health communication, with notable focus on interactions between governments and the public. Recent studies have concentrated on analysing Twitter user data through text mining, sentiment analysis and topic modelling. The authors also identify key leading organisations, countries and journals that are central to this research area.

Originality/value

Diverging from the narrow focus of previous literature reviews on social media, which are often confined to particular fields or sectors, this study offers a broad view of social media's role, emphasising trending topics. The authors demonstrate a significant link between social media trends and public events, such as the COVID-19 pandemic. The paper discusses research priorities that emerged during the pandemic and outlines potential methodologies for future studies, advocating for a greater emphasis on qualitative approaches.

Peer review

The peer-review history for this article is available at: https://publons.com/publon/10.1108/OIR-05-2023-0194.

Details

Online Information Review, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 1468-4527

Article
Publication date: 5 January 2023

Qingqing Zhou

Abstract

Purpose

With the rapid development of social media, the occurrence and evolution of emergency events are often accompanied by a massive volume of user expressions. Fine-grained analysis of users' expressions can provide accurate and reliable information for event processing. Hence, 2,003,814 expressions on a major malignant emergency event were mined from multiple dimensions in this paper.

Design/methodology/approach

This paper conducted a finer-grained analysis of users' online expressions in an emergency event. Specifically, the authors first selected a major emergency event as the research object and collected event-related user expressions spanning nearly two years to describe the dynamic evolution trend of the event. Then, users' expression preferences were identified by detecting anomic expressions, classifying sentiment tendencies and extracting topics in expressions. Finally, the authors measured the explicit and implicit impacts of different expression preferences and obtained relations between the different expression preferences.

Findings

Experimental results showed that users have both short- and long-term attention to emergency events. Their enthusiasm for discussing the event will be quickly dispelled and easily aroused. Meanwhile, most users prefer to make rational and normative expressions of events, and the expression topics are diversified. In addition, compared with anomic negative expressions, anomic expressions in positive sentiments are more common. In conclusion, the integration of multi-dimensional analysis results of users' expression preferences (including discussion heat, preference impacts and preference relations) is an effective means to support emergency event processing.

Originality/value

To the best of the authors' knowledge, this is the first study to conduct an in-depth and fine-grained analysis of user expression in emergencies, so as to obtain detailed and multi-dimensional characteristics of users' online expressions for supporting event processing.

Details

Aslib Journal of Information Management, vol. 76 no. 2
Type: Research Article
ISSN: 2050-3806
