Search results

1 – 10 of over 122000
Article
Publication date: 23 September 2024

Bernardo Cerqueira de Lima, Renata Maria Abrantes Baracho, Thomas Mandl and Patricia Baracho Porto

Abstract

Purpose

Social media platforms that disseminate scientific information to the public during the COVID-19 pandemic highlighted the importance of the topic of scientific communication. Content creators in the field, as well as researchers who study the impact of scientific information online, are interested in how people react to these information resources and how they judge them. This study aims to devise a framework for extracting large social media datasets and find specific feedback to content delivery, enabling scientific content creators to gain insights into how the public perceives scientific information.

Design/methodology/approach

To collect public reactions to scientific information, the study focused on Twitter users who are doctors, researchers, science communicators or representatives of research institutes, and processed their replies for two years from the start of the pandemic. The study aimed to develop a solution, powered by topic modeling and enhanced by manual validation and other machine learning techniques such as word embeddings, that is capable of filtering massive social media datasets in search of documents related to reactions to scientific communication. The architecture developed in this paper can be replicated for finding documents related to any niche topic in social media data. As a final step of our framework, we also fine-tuned a large language model to perform the classification task with even higher accuracy, removing the need for further human validation after the first step.

Findings

We provided a framework that receives a large document dataset and, with a small degree of human validation at different stages, identifies the documents within the corpus that are relevant to a very underrepresented niche theme, with much higher precision than traditional state-of-the-art machine learning algorithms. Performance was improved even further by fine-tuning a large language model based on BERT, which would allow such a model to classify even larger unseen datasets in search of reactions to scientific communication without the need for further manual validation or topic modeling.

Research limitations/implications

The challenges of scientific communication are even greater given the rampant increase of misinformation on social media and the difficulty of competing in the saturated attention economy of the social media landscape. Our study aimed at creating a solution that could be used by scientific content creators to better locate and understand constructive feedback toward their content and how it is received, which can be hidden as a minor subject among hundreds of thousands of comments. By leveraging an ensemble of techniques ranging from heuristics to state-of-the-art machine learning algorithms, we created a framework that is able to detect texts related to very niche subjects in very large datasets, with just a small number of example texts related to the subject given as input.

Practical implications

With this tool, scientific content creators can sift through their social media following and quickly understand how to adapt their content to their current users’ needs and standards of content consumption.

Originality/value

This study aimed to find reactions to scientific communication in social media. We applied three methods with human intervention and compared their performance. This study shows, for the first time, the topics of interest that were discussed in Brazil during the COVID-19 pandemic.

Details

Data Technologies and Applications, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 2514-9288

Article
Publication date: 26 March 2024

Wondwesen Tafesse and Anders Wien

Abstract

Purpose

ChatGPT is a versatile technology with practical use cases spanning many professional disciplines including marketing. Because it is a recent innovation, however, there is a lack of academic insight into its tangible applications in the marketing realm. To address this gap, the current study explores ChatGPT’s application in marketing by mining social media data. Additionally, the study employs the stages-of-growth model to assess the current state of ChatGPT’s adoption in marketing organizations.

Design/methodology/approach

The study collected tweets related to ChatGPT and marketing using a web-scraping technique (N = 23,757). A topic model was trained on the tweet corpus using latent Dirichlet allocation to delineate ChatGPT’s major areas of applications in marketing.

Findings

The topic model produced seven latent topics that encapsulated ChatGPT’s major areas of applications in marketing including content marketing, digital marketing, search engine optimization, customer strategy, B2B marketing and prompt engineering. Further analyses reveal the popularity of and interest in these topics among marketing practitioners.

Originality/value

The findings contribute to the literature by offering empirical evidence of ChatGPT’s applications in marketing. They demonstrate the core use cases of ChatGPT in marketing. Further, the study applies the stages-of-growth model to situate ChatGPT’s current state of adoption in marketing organizations and anticipate its future trajectory.

Details

Marketing Intelligence & Planning, vol. 42 no. 4
Type: Research Article
ISSN: 0263-4503

Article
Publication date: 30 August 2023

Donghui Yang, Yan Wang, Zhaoyang Shi and Huimin Wang

Abstract

Purpose

Improving the diversity of recommendation information has become one of the latest research hotspots to solve information cocoons. Aiming to achieve both high accuracy and diversity in recommender systems, this paper proposes a hybrid method. This study aims to discuss the aforementioned method.

Design/methodology/approach

This paper integrates the latent Dirichlet allocation (LDA) model and the locality-sensitive hashing (LSH) algorithm to design a topic recommendation system. To measure the effectiveness of the method, this paper builds three-level categories of journal paper abstracts from the Web of Science platform as experimental data.
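
As a rough illustration of how hashing can group items by topical similarity, the sketch below applies random-hyperplane LSH to per-document topic vectors. The abstract does not say which LSH family the authors use, so the hyperplane scheme, the function names and the toy topic vectors are all assumptions made for illustration only.

```python
import random
from collections import defaultdict


def make_hyperplanes(num_bits, dim, seed=0):
    """Draw num_bits random hyperplanes (normal vectors) in dim dimensions."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def signature(vector, hyperplanes):
    """Hash a vector to a bit string: one bit per hyperplane side."""
    bits = []
    for plane in hyperplanes:
        dot = sum(v * p for v, p in zip(vector, plane))
        bits.append("1" if dot >= 0 else "0")
    return "".join(bits)


def bucket_documents(doc_topics, hyperplanes):
    """Group document ids by signature; same bucket = candidate neighbours."""
    buckets = defaultdict(list)
    for doc_id, vec in doc_topics.items():
        buckets[signature(vec, hyperplanes)].append(doc_id)
    return dict(buckets)


# Hypothetical topic distributions, e.g. as produced by an LDA model.
doc_topics = {
    "a": [0.8, 0.1, 0.1],
    "b": [0.8, 0.1, 0.1],
    "c": [0.1, 0.1, 0.8],
}
planes = make_hyperplanes(num_bits=8, dim=3)
buckets = bucket_documents(doc_topics, planes)
```

Documents with identical topic vectors always share a bucket. Vectors separated by a small angle share one with high probability, which is what makes the buckets usable as a pool of recommendation candidates.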

Findings

(1) The results illustrate that the diversity of recommended items is significantly enhanced by leveraging a hashing function to overcome information cocoons. (2) By integrating the topic model and the hashing algorithm, the diversity of recommender systems can be achieved without losing accuracy at a certain degree of refined topic levels.

Originality/value

The hybrid recommendation algorithm developed in this paper can overcome the dilemma of high accuracy and low diversity. The method could ameliorate the recommendation in business and service industries to address the problems of information overload and information cocoons.

Details

Aslib Journal of Information Management, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 2050-3806

Article
Publication date: 29 April 2021

Heng-Yang Lu, Yi Zhang and Yuntao Du

Abstract

Purpose

Topic models have been widely applied to discover important information from vast amounts of unstructured data. Traditional long-text topic models such as Latent Dirichlet Allocation may suffer from the sparsity problem when dealing with short texts, which mostly come from the Web. These models also suffer from a readability problem when displaying the discovered topics. The purpose of this paper is to propose a novel model called the Sense Unit based Phrase Topic Model (SenU-PTM) to address both the sparsity and readability problems.

Design/methodology/approach

SenU-PTM is a novel phrase-based short-text topic model under a two-phase framework. The first phase introduces a phrase-generation algorithm by exploiting word embeddings, which aims to generate phrases with the original corpus. The second phase introduces a new concept of sense unit, which consists of a set of semantically similar tokens for modeling topics with token vectors generated in the first phase. Finally, SenU-PTM infers topics based on the above two phases.

Findings

Experimental results on two real-world and publicly available datasets show the effectiveness of SenU-PTM from the perspectives of topical quality and document characterization. It reveals that modeling topics on sense units can solve the sparsity of short texts and improve the readability of topics at the same time.

Originality/value

The originality of SenU-PTM lies in the new procedure of modeling topics on the proposed sense units with word embeddings for short-text topic discovery.

Details

Data Technologies and Applications, vol. 55 no. 5
Type: Research Article
ISSN: 2514-9288

Article
Publication date: 4 June 2021

Lixue Zou, Xiwen Liu, Wray Buntine and Yanli Liu

Abstract

Purpose

Full text of a document is a rich source of information that can be used to provide meaningful topics. The purpose of this paper is to demonstrate how to use citation context (CC) in the full text to identify the cited topics and citing topics efficiently and effectively by employing automatic text analysis algorithms.

Design/methodology/approach

The authors present two novel topic models, Citation-Context-LDA (CC-LDA) and Citation-Context-Reference-LDA (CCRef-LDA). CC is leveraged to extract the citing text from the full text, which makes it possible to discover topics with accuracy. CC-LDA incorporates CC, citing text, and their latent relationship, while CCRef-LDA incorporates CC, citing text, their latent relationship and reference information in CC. Collapsed Gibbs sampling is used to achieve an approximate estimation. The capacity of CC-LDA to simultaneously learn cited topics and citing topics together with their links is investigated. Moreover, a topic influence measure method based on CC-LDA is proposed and applied to create links between the two-level topics. In addition, the capacity of CCRef-LDA to discover topic influential references is also investigated.
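
For readers unfamiliar with collapsed Gibbs sampling, the following is a minimal sketch for plain LDA, not for CC-LDA or CCRef-LDA, whose extra citation-context variables are not specified in the abstract. The hyperparameter values, the function name and the toy corpus are assumptions.

```python
import random


def lda_collapsed_gibbs(docs, num_topics, vocab_size, alpha=0.1, beta=0.01,
                        iterations=50, seed=0):
    """Collapsed Gibbs sampler for standard LDA.

    docs is a list of documents, each a list of integer word ids.
    Returns the document-topic and topic-word count matrices.
    """
    rng = random.Random(seed)
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [[0] * vocab_size for _ in range(num_topics)]
    topic_total = [0] * num_topics
    assignments = []

    # Random initial topic for every token.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assignments.append(z_doc)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = assignments[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Conditional p(z = t | all other assignments), unnormalised.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1
    return doc_topic, topic_word


# Toy corpus: word ids 0-1 and 2-3 tend to co-occur.
docs = [[0, 1, 0, 1], [2, 3, 2, 3], [0, 1, 1, 0]]
doc_topic, topic_word = lda_collapsed_gibbs(docs, num_topics=2, vocab_size=4)
```

The multinomial parameters are integrated out, so only the token-level topic assignments are sampled. The counts returned at the end give the approximate estimate.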

Findings

The results indicate CC-LDA and CCRef-LDA achieve improved or comparable performance in terms of both perplexity and symmetric Kullback–Leibler (sKL) divergence. Moreover, CC-LDA is effective in discovering the cited topics and citing topics with topic influence, and CCRef-LDA is able to find the cited topic influential references.
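
The symmetric Kullback–Leibler divergence mentioned above can be sketched as follows. The abstract does not give the exact definition used, so this sketch assumes the plain sum KL(p||q) + KL(q||p); some authors halve that sum. It also assumes both distributions are strictly positive wherever the other is, as smoothed topic distributions usually are.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions given as equal-length lists."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def symmetric_kl(p, q):
    """Symmetric KL divergence, here taken as KL(p||q) + KL(q||p)."""
    return kl_divergence(p, q) + kl_divergence(q, p)


# Two hypothetical topic-word distributions over a two-word vocabulary.
p = [0.5, 0.5]
q = [0.9, 0.1]
skl = symmetric_kl(p, q)
```

A larger value means the two topics are more distinct, and the measure is zero only when the distributions coincide.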

Originality/value

The automatic method provides novel knowledge for cited topics and citing topics discovery. Topic influence learnt by our model can link two-level topics and create a semantic topic network. The method can also use topic specificity as a feature to rank references.

Details

Library Hi Tech, vol. 39 no. 4
Type: Research Article
ISSN: 0737-8831

Open Access
Article
Publication date: 30 November 2021

Federico Barravecchia, Luca Mastrogiacomo and Fiorenzo Franceschini

Abstract

Purpose

Digital voice-of-customer (digital VoC) analysis is gaining much attention in the field of quality management. Digital VoC can be a great source of knowledge about customer needs, habits and expectations. To this end, the most popular approach is based on the application of text mining algorithms named topic modelling. These algorithms can identify latent topics discussed within digital VoC and categorise each source (e.g. each review) based on its content. This paper aims to propose a structured procedure for validating the results produced by topic modelling algorithms.

Design/methodology/approach

The proposed procedure compares, on random samples, the results produced by topic modelling algorithms with those generated by human evaluators. The use of specific metrics allows a comparison between the two approaches and provides a preliminary empirical validation.
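
The abstract does not name the metrics used. As one plausible way to compare algorithmic and human topic assignments on a random sample, the sketch below computes Cohen's kappa, a chance-corrected agreement score. The choice of kappa and the toy labels are assumptions, not the paper's procedure.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two label sequences."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and equal in length")
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    # Agreement expected if both raters labelled independently at random.
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical sample of six reviews: topic model output vs. human evaluator.
model_labels = ["price", "price", "app", "app", "car", "car"]
human_labels = ["price", "price", "app", "car", "car", "car"]
kappa = cohen_kappa(model_labels, human_labels)  # 0.75
```

In this toy sample the raw agreement is 5/6 and the chance agreement is 1/3, giving a kappa of 0.75.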

Findings

The proposed procedure can guide users of topic modelling algorithms in validating the obtained results. An application case study related to some car-sharing services supports the description.

Originality/value

Despite the vast success of topic modelling-based approaches, metrics and procedures to validate the obtained results are still lacking. This paper provides a first practical and structured validation procedure specifically employed for quality-related applications.

Details

International Journal of Quality & Reliability Management, vol. 39 no. 6
Type: Research Article
ISSN: 0265-671X

Article
Publication date: 5 September 2019

Nastaran Hajiheydari, Mojtaba Talafidaryani, SeyedHossein Khabiri and Masoud Salehi

Abstract

Purpose

Although the business model field of study has been a focus of attention for both researchers and practitioners within the past two decades, it still suffers from concern about its identity. Accordingly, this paper aims to clarify the intellectual structure of business model through identifying the research clusters and their sub-clusters, the prominent relations and the dominant research trends.

Design/methodology/approach

This paper uses some common text mining methods including co-word analysis, burst analysis, timeline analysis and topic modeling to analyze and mine the title, abstract and keywords of 14,081 research documents related to the domain of business model.
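
Co-word analysis rests on counting how often pairs of keywords appear in the same document. A minimal sketch of that counting step follows; the keyword lists are invented for illustration and are not taken from the study's 14,081 documents.

```python
from collections import Counter
from itertools import combinations


def coword_counts(keyword_lists):
    """Count, for each unordered keyword pair, the documents containing both."""
    counts = Counter()
    for keywords in keyword_lists:
        # Normalise case and deduplicate so each pair counts once per document.
        unique = sorted(set(k.lower() for k in keywords))
        counts.update(combinations(unique, 2))
    return counts


# Hypothetical keyword lists for three documents.
documents = [
    ["business model", "innovation", "sustainability"],
    ["Business Model", "innovation"],
    ["sustainability", "business model"],
]
pairs = coword_counts(documents)
# pairs[("business model", "innovation")] == 2
```

The resulting pair counts form the weighted edges of a co-word network, which can then be clustered to reveal research areas.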

Findings

The results revealed that the business model field of study consists of three main research areas including electronic business model, business model innovation and sustainable business model, each of which has some sub-areas and has been more evident in some particular industries. Additionally, from the time perspective, research issues in the domain of sustainable development are considered as the hot and emerging topics in this field. In addition, the results confirmed that information technology has been one of the most important drivers, influencing the appearance of different study topics in the various periods.

Originality/value

The contribution of this study is to quantitatively uncover the dominant knowledge structure and prominent research trends in the business model field of study, considering a broad range of scholarly publications and using some promising and reliable text mining techniques.

Details

foresight, vol. 21 no. 6
Type: Research Article
ISSN: 1463-6689

Book part
Publication date: 30 August 2019

Fulya Ozcan

Abstract

This chapter investigates the behavior of Reddit’s news subreddit users and the relationship between their sentiment and exchange rates. Using graphical models and natural language processing, hidden online communities among Reddit users are discovered. The data set used in this project is a mixture of text and categorical data from Reddit’s news subreddit. These data include the titles of the news pages, as well as a few user characteristics, in addition to users’ comments. This data set is an excellent resource to study user reaction to news since their comments are directly linked to the webpage contents. The model considered in this chapter is a hierarchical mixture model, a generative model that detects overlapping networks using the sentiment from the user generated content. The advantage of this model is that the communities (or groups) are assumed to follow a Chinese restaurant process, and therefore it can automatically detect and cluster the communities. The hidden variables and the hyperparameters for this model are obtained using Gibbs sampling.
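
The Chinese restaurant process lets the number of communities grow with the data instead of being fixed in advance. The sketch below simulates the prior alone, leaving out the chapter's full hierarchical mixture model. The concentration value and the function name are assumptions.

```python
import random


def chinese_restaurant_process(num_customers, concentration, seed=0):
    """Simulate table (community) assignments under a CRP prior.

    A customer joins an existing table with probability proportional to its
    size, or opens a new table with probability proportional to concentration.
    """
    rng = random.Random(seed)
    table_sizes = []
    seating = []
    for _ in range(num_customers):
        weights = table_sizes + [concentration]
        table = rng.choices(range(len(weights)), weights=weights)[0]
        if table == len(table_sizes):
            table_sizes.append(1)  # open a new table
        else:
            table_sizes[table] += 1
        seating.append(table)
    return seating, table_sizes


# Hypothetical run: 100 users, concentration parameter 1.0.
seating, table_sizes = chinese_restaurant_process(100, concentration=1.0)
```

Larger tables attract new customers more strongly, so a few large communities and many small ones tend to emerge without specifying their number beforehand.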

Details

Topics in Identification, Limited Dependent Variables, Partial Observability, Experimentation, and Flexible Modeling: Part A
Type: Book
ISBN: 978-1-78973-241-2

Article
Publication date: 22 March 2024

Rachana Jaiswal, Shashank Gupta and Aviral Kumar Tiwari

Abstract

Purpose

Grounded in the stakeholder theory and signaling theory, this study aims to broaden the research agenda on environmental, social and governance (ESG) investing by uncovering public sentiments and key themes using Twitter data spanning from 2009 to 2022.

Design/methodology/approach

Using various machine learning models for text tonality analysis and topic modeling, this research scrutinizes 1,842,985 Twitter texts to extract prevalent ESG investing trends and gauge their sentiment.

Findings

Gibbs Sampling Dirichlet Multinomial Mixture emerges as the optimal topic modeling method, unveiling significant topics such as “Physical risk of climate change,” “Employee Health, Safety and well-being” and “Water management and Scarcity.” RoBERTa, an attention-based model, outperforms other machine learning models in sentiment analysis, revealing a predominantly positive shift in public sentiment toward ESG investing over the past five years.

Research limitations/implications

This study establishes a framework for sentiment analysis and topic modeling on alternative data, offering a foundation for future research. Prospective studies can enhance insights by incorporating data from additional social media platforms like LinkedIn and Facebook.

Practical implications

Leveraging unstructured data on ESG from platforms like Twitter provides a novel avenue to capture company-related information, supplementing traditional self-reported sustainability disclosures. This approach opens new possibilities for understanding a company’s ESG standing.

Social implications

By shedding light on public perceptions of ESG investing, this research uncovers influential factors that often elude traditional corporate reporting. The findings empower both investors and the general public, aiding managers in refining ESG and management strategies.

Originality/value

This study marks a groundbreaking contribution to scholarly exploration, to the best of the authors’ knowledge, by being the first to analyze unstructured Twitter data in the context of ESG investing, offering unique insights and advancing the understanding of this emerging field.

Details

Management Research Review, vol. 47 no. 8
Type: Research Article
ISSN: 2040-8269

Article
Publication date: 9 October 2023

Xiaoguang Wang, Yue Cheng, Tao Lv and Rongjiang Cai

Abstract

Purpose

The authors hope to filter valuable information from online reviews, obtain objective and accurate information about the demands of auto consumers and help auto companies develop more reasonable production and marketing strategies for healthy and sustainable development. This paper aims to discuss the aforementioned objectives.

Design/methodology/approach

The authors collected review data from online automotive forums and generated a corpus after pre-processing. Then, the authors extracted consumer demands and topics using the LDA model. Finally, the authors used a trained Word2vec tool to extend the consumer demand topics.
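
Extending a topic with word embeddings typically means adding the words whose vectors lie closest to the topic's seed words. The sketch below shows that step with cosine similarity over a tiny hand-made embedding table. It does not train or load a real Word2vec model, and the vectors, words and function names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def expand_topic(seed_words, embeddings, top_n=2):
    """Return the top_n non-seed words most similar to any seed word."""
    seeds = [s for s in seed_words if s in embeddings]
    if not seeds:
        return []
    scores = {}
    for candidate, vec in embeddings.items():
        if candidate in seed_words:
            continue
        scores[candidate] = max(cosine(vec, embeddings[s]) for s in seeds)
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


# Toy 2-d vectors standing in for trained Word2vec embeddings.
embeddings = {
    "space": [1.0, 0.0],
    "trunk": [0.9, 0.1],
    "legroom": [0.8, 0.2],
    "engine": [0.0, 1.0],
}
extended = expand_topic(["space"], embeddings)  # ["trunk", "legroom"]
```

With a trained model the same nearest-neighbour step would run over the full vocabulary, with the LDA topic's top words as seeds.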

Findings

Different types of vehicle consumers have the same demands, such as “Space,” “Power Performance,” and “Brand Comparison,” and distinct demands, such as “Appearance,” “Safety,” “Service,” and “New Energy Features”; consumers who buy new energy vehicles are still accustomed to comparing them with the brands or models of fuel vehicles; new energy vehicle consumers pay more attention to services and service quality during the purchasing and using process.

Research limitations/implications

The development time of new energy vehicles is relatively short, with some models being available for only one year or even six months. The smaller amount of available data may impact the applicability of topic models. The sample size, especially for new energy vehicles, needs to be increased to improve the general applicability of topic models further.

Practical implications

First, this measure helps online review websites improve their existing review publication mechanisms, enhance the overall quality of online review content, increase user traffic and promote the healthy development of online review websites. Second, this allows for timely adjustments in future product production and sales plans and further enhances automotive companies' ability to leverage online reviews for Internet marketing.

Originality/value

The authors have improved the accuracy and stability of the fused topic model, providing a scientific and efficient research tool for multi-dimensional topic mining of online reviews. With the help of research results, consumers can more easily understand the discussion topics and thus filter out valuable reference information. As a result, automotive companies may gain information about consumer demands and product quality feedback and thus quickly adjust production and marketing strategies to increase sales and market share.

Details

Marketing Intelligence & Planning, vol. 41 no. 8
Type: Research Article
ISSN: 0263-4503
