Search results

1 – 10 of over 3000
Article
Publication date: 9 October 2019

Francisco Villarroel Ordenes and Shunyuan Zhang

Abstract

Purpose

The purpose of this paper is to describe and position the state-of-the-art of text and image mining methods in business research. By providing a detailed conceptual and technical review of both methods, it aims to increase their utilization in service research.

Design/methodology/approach

In the first stage, the authors review business literature in marketing, operations and management concerning the use of text and image mining methods. In the second stage, the authors identify and analyze empirical papers that used text and image mining methods in services journals and premier business journals. Finally, avenues for further research in services are provided.

Findings

The manuscript identifies seven text mining methods and describes the approaches, processes, techniques and algorithms involved in their implementation. Four of these methods are positioned similarly for image mining. There are 39 papers using text mining in service research, with a focus on measuring consumer sentiment, experiences and service quality. Because image mining has not yet been used in service journals, the authors review its application in marketing and management, and suggest ideas for further research in services.

Research limitations/implications

This manuscript focuses on the different methods and their implementation in service research, but it does not offer a complete review of business literature using text and image mining methods.

Practical implications

The results have a number of implications for the discipline that are presented and discussed. The authors provide research directions using text and image mining methods in service priority areas such as artificial intelligence, frontline employees, transformative consumer research and customer experience.

Originality/value

The manuscript provides an introduction to text and image mining methods to service researchers and practitioners interested in the analysis of unstructured data. This paper provides several suggestions concerning the use of new sources of data (e.g. customer reviews, social media images, employee reviews and emails), measurement of new constructs (beyond sentiment and valence) and the use of more recent methods (e.g. deep learning).

Details

Journal of Service Management, vol. 30 no. 5
Type: Research Article
ISSN: 1757-5818

Article
Publication date: 13 November 2017

Wu He, Xin Tian, Ran Tao, Weidong Zhang, Gongjun Yan and Vasudeva Akula

Abstract

Purpose

Online customer reviews can shed light on customers’ experiences, opinions, feelings and concerns. To gain valuable knowledge about customers, it is increasingly important for businesses to collect, monitor, analyze, summarize and visualize online customer reviews posted on social media platforms such as online forums. However, analyzing social media data is challenging because of its rapidly growing volume. The purpose of this paper is to present an approach that uses natural language preprocessing, text mining and sentiment analysis techniques to analyze online customer reviews of various hotels through a case study.

Design/methodology/approach

This paper presents a tested approach of using natural language preprocessing, text mining, and sentiment analysis techniques to analyze online textual content. The value of the proposed approach was demonstrated through a case study using online hotel reviews.

Findings

The study found that the overall review star rating correlates well with the sentiment scores for both the title and the full content of the online customer review. The case study also revealed that both extremely satisfied and extremely dissatisfied hotel customers share a common interest in five categories: food, location, rooms, service and staff.
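As a rough illustration of what "correlates with the sentiment scores" can mean in practice, the sketch below computes a Pearson correlation coefficient between star ratings and sentiment scores. The abstract does not say which correlation measure the authors used, so Pearson is an assumption here, and the `stars` and `sentiment` values are made-up numbers for illustration only, not data from the study.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Co-variation of the two sequences around their means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical star ratings and sentiment scores (illustrative only).
stars = [1, 2, 3, 4, 5]
sentiment = [-0.8, -0.3, 0.1, 0.4, 0.9]
r = pearson(stars, sentiment)
print(f"r = {r:.3f}")
```

A value of `r` close to 1 would indicate that higher star ratings tend to accompany higher sentiment scores.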

Originality/value

This study analyzed the online reviews from English-speaking hotel customers in China to understand their preferred hotel attributes, main concerns or demands. This study also provides a feasible approach and a case study as an example to help enterprises more effectively apply social media analytics in practice.

Details

Online Information Review, vol. 41 no. 7
Type: Research Article
ISSN: 1468-4527

Article
Publication date: 8 February 2016

Yoosin Kim, Rahul Dwivedi, Jie Zhang and Seung Ryul Jeong

Abstract

Purpose

The purpose of this paper is to mine competitive intelligence in social media to find the market insight by comparing consumer opinions and sales performance of a business and one of its competitors by analyzing the public social media data.

Design/methodology/approach

An exploratory test using a multiple case study approach was used to compare two competing smartphone manufacturers. Opinion mining and sentiment analysis are conducted first, followed by further validation of results using statistical analysis. A total of 229,948 tweets mentioning the iPhone6 or the GalaxyS5 have been collected for four months following the release of the iPhone6; these have been analyzed using natural language processing, lexicon-based sentiment analysis, and purchase intention classification.
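Lexicon-based sentiment analysis, one of the steps named above, can be sketched in a few lines. This is a minimal illustration under stated assumptions: the `LEXICON` entries, the tokenizer and the three-way labeling rule are hypothetical and are not taken from the paper, which does not describe its lexicon in this abstract.

```python
import re

# Hypothetical toy lexicon; a real study would use a published sentiment lexicon.
LEXICON = {"love": 1, "great": 1, "good": 1, "bad": -1, "hate": -1, "slow": -1}

def lexicon_score(text, lexicon=LEXICON):
    """Sum the polarity of every lexicon word that appears in the text."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return sum(lexicon.get(token, 0) for token in tokens)

def label(text):
    """Map the summed score to a coarse sentiment label."""
    score = lexicon_score(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(label("I love the great camera"))
print(label("bad battery, slow screen"))
```

Summing word polarities ignores negation and sarcasm, which is one reason such scores are usually validated against another signal, as the study does with statistical analysis.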

Findings

The analysis showed that social media data contain competitive intelligence. The volume of tweets revealed a significant gap between the market leader and one follower; the purchase intention data also reflected this gap, but to a less pronounced extent. In addition, the authors assessed whether social opinion could explain the sales performance gap between the competitors, and found that the social opinion gap was similar to the shipment gap.

Research limitations/implications

This study compared the social media opinion gap and the shipment gap between two rival smartphones. Through social media analytics, a business can capture consumers’ opinions not only toward its own product but also toward competitors’ products. Furthermore, the business can predict market sales performance and estimate the gap with competing products. As a result, decision makers can adjust market strategy rapidly and compensate for weaknesses relative to rivals.

Originality/value

This paper’s main contribution is to demonstrate the competitive intelligence available via consumer opinion mining of social media data. Researchers, business analysts, and practitioners can adopt this method of social media analysis to achieve their objectives and to implement practical procedures for data collection, spam elimination, machine learning classification, sentiment analysis, feature categorization, and result visualization.

Details

Online Information Review, vol. 40 no. 1
Type: Research Article
ISSN: 1468-4527

Book part
Publication date: 7 May 2019

Mu-Yen Chen, Min-Hsuan Fan, Ting-Hsuan Chen and Ren-Pao Hsieh

Abstract

Given the maturation of the internet and virtual communities, an important emerging issue in the humanities and social sciences is how to accurately analyze the vast quantity of documents on public and social network websites. Therefore, this chapter integrates political blogs and news articles to develop a public mood dynamic prediction model for the stock market, while referencing the behavioral finance perspective and online political community characteristics. The goal of this chapter is to apply a big data and opinion mining approach to a sentiment analysis for the relationship between political status and economic development in Taiwan. The proposed model is verified using experimental datasets collected from ChinaTimes.com, cnYES.com, Yahoo stock market news, and Google stock market news, covering the period from January 1, 2016 to June 30, 2017. The empirical results indicate the accuracy rate with which the proposed model forecasts stock prices.

Details

Politics and Technology in the Post-Truth Era
Type: Book
ISBN: 978-1-78756-984-3

Article
Publication date: 3 January 2018

Lei La, Shuyan Cao and Liangjuan Qin

Abstract

Purpose

As a foundational issue of social mining, sentiment classification has suffered from a lack of labeled data. To enhance classification accuracy with few labeled data, many semi-supervised algorithms have been proposed. These algorithms improve classification performance when labeled data are insufficient. However, many semi-supervised methods find it difficult to ensure precision and efficiency at the same time. This paper aims to present a novel method for using unlabeled data in a more accurate and more efficient way.

Design/methodology/approach

First, the authors designed a boosting-based method for unlabeled data selection. The improved boosting-based method can choose unlabeled data that have the same distribution as the labeled data. The authors then proposed a novel strategy for combining weak classifiers into strong classifiers in a more rational way. Finally, a semi-supervised sentiment classification algorithm is given.
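For readers unfamiliar with boosting, the sketch below shows one round of the textbook AdaBoost weight update, in which misclassified examples gain weight before the next weak classifier is trained. This is the standard algorithm only; it is not the authors' improved weighting mechanism or their unlabeled-data selection step, neither of which is specified in this abstract. The function name `adaboost_round` and the example values are illustrative assumptions.

```python
import math

def adaboost_round(weights, labels, predictions):
    """One textbook AdaBoost round for labels and predictions in {-1, +1}.

    Assumes the incoming weights sum to 1. Returns the weak classifier's
    vote weight (alpha) and the re-normalized example weights.
    """
    # Weighted error: total weight of the misclassified examples.
    error = sum(w for w, y, h in zip(weights, labels, predictions) if y != h)
    if not 0 < error < 1:
        raise ValueError("weighted error must lie strictly between 0 and 1")
    alpha = 0.5 * math.log((1 - error) / error)
    # Correct examples (y * h == 1) shrink; mistakes (y * h == -1) grow.
    updated = [w * math.exp(-alpha * y * h)
               for w, y, h in zip(weights, labels, predictions)]
    total = sum(updated)
    return alpha, [w / total for w in updated]

# Four equally weighted examples; the weak classifier gets the last one wrong.
alpha, new_weights = adaboost_round([0.25] * 4, [1, 1, -1, -1], [1, 1, -1, 1])
print(alpha, new_weights)
```

After the update, the misclassified example carries half of the total weight, which forces the next weak classifier to focus on it.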

Findings

Experimental results demonstrate that the novel algorithm can achieve high accuracy with low time consumption. It is helpful for building high-performance social network-related applications.

Research limitations/implications

The novel method needs a small labeled data set for semi-supervised learning. Future work may extend it to an unsupervised method.

Practical implications

The method can be used in text mining, image classification, audio processing and other fields related to unstructured data mining. It overcomes the problem of insufficient labeled data and achieves high precision with less computational time.

Social implications

Sentiment mining has wide applications in public opinion management, public security, market analysis, social network and related fields. Sentiment classification is the basis of sentiment mining.

Originality/value

To the authors’ knowledge, this is the first time transfer learning has been introduced to AdaBoost for semi-supervised learning. Moreover, the improved AdaBoost uses a totally new mechanism for weighting.

Details

Kybernetes, vol. 47 no. 3
Type: Research Article
ISSN: 0368-492X

Article
Publication date: 31 May 2018

Antonio Usai, Marco Pironti, Monika Mital and Chiraz Aouina Mejri

Abstract

Purpose

The aim of this work is to increase awareness of the potential of the technique of text mining to discover knowledge and further promote research collaboration between knowledge management and the information technology communities. Since its emergence, text mining has involved multidisciplinary studies, focused primarily on database technology, Web-based collaborative writing, text analysis, machine learning and knowledge discovery. However, owing to the large amount of research in this field, it is becoming increasingly difficult to identify existing studies and therefore suggest new topics.

Design/methodology/approach

This article offers a systematic review of 85 academic outputs (articles and books) focused on knowledge discovery derived from the text mining technique. The systematic review is conducted by applying “text mining at the term level, in which knowledge discovery takes place on a more focused collection of words and phrases that are extracted from and label each document” (Feldman et al., 1998, p. 1).

Findings

The results revealed that the keywords associated with the main labels, that is, knowledge discovery and text mining, can be categorized into two periods. From 1998 to 2009, the terms knowledge and text were always used. From 2010 to 2017, in addition to these terms, sentiment analysis, review manipulation, microblogging data and knowledgeable users were frequently used. The terms present in the first period are technical and engineering in nature, whereas a diverse range of fields such as business, marketing and finance emerged from 2010 to 2017 owing to a greater interest in the online environment.

Originality/value

This is a first comprehensive systematic review on knowledge discovery and text mining through the use of a text mining technique at term level, which offers to reduce redundant research and to avoid the possibility of missing relevant publications.

Details

Journal of Knowledge Management, vol. 22 no. 7
Type: Research Article
ISSN: 1367-3270

Article
Publication date: 14 November 2016

Konstantinos Domdouzis, Babak Akhgar, Simon Andrews, Helen Gibson and Laurence Hirsch

Abstract

Purpose

A number of crisis situations, such as natural disasters, have affected the planet over the past decade. The outcomes of such disasters are catastrophic for the infrastructures of modern societies. Furthermore, after large disasters, societies come face-to-face with important issues, such as the loss of human lives, missing people and a rise in the crime rate. On many occasions, they seem unprepared to face such issues. This paper aims to present an automated social media and crowdsourcing data mining system for the synchronization of the police and law enforcement agencies for the prevention of criminal activities during and after a large crisis situation.

Design/methodology/approach

The paper conducts qualitative research in the form of a review of the literature. This review focuses on the necessity of using social media and crowdsourcing data mining techniques in combination with advanced Web technologies to provide solutions to problems related to criminal activities during and after a crisis. The paper presents the ATHENA crisis management system, which uses a number of data mining techniques to collect and analyze crisis-related data from social media for the purpose of crime prevention.

Findings

Conclusions are drawn on the significance of social media and crowdsourcing data mining techniques for the resolution of problems related to large crisis situations, with emphasis on the ATHENA system.

Originality/value

The paper shows how the integrated use of social media and data mining algorithms can contribute to the resolution of problems that develop during and after a large crisis.

Details

Journal of Systems and Information Technology, vol. 18 no. 4
Type: Research Article
ISSN: 1328-7265

Article
Publication date: 27 August 2019

Barkha Bansal and Sangeet Srivastava

Abstract

Purpose

Vast volumes of rich online consumer-generated content (CGC) can be used effectively to gain important insights for decision-making, product improvement and brand management. Recently, many studies have proposed semi-supervised aspect-based sentiment classification of unstructured CGC. However, most of the existing CGC mining methods rely on explicitly detecting aspect-based sentiments and overlook the context of sentiment-bearing words. Therefore, this study aims to extract implicit context-sensitive sentiment and to handle slang as well as ambiguous, informal and special words used in CGC.

Design/methodology/approach

A novel text mining framework is proposed to detect and evaluate implicit semantic word relations and context. First, POS (part of speech) tagging is used for detecting aspect descriptions and sentiment-bearing words. Then, LDA (latent Dirichlet allocation) is used to group similar aspects together and to form an attribute. Semantically and contextually similar words are found using the skip-gram model for distributed word vectorisation. Finally, to find context-sensitive sentiment of each attribute, cosine similarity is used along with a set of positive and negative seed words.
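The final step above, scoring a word by its cosine similarity to positive and negative seed words, can be sketched as follows. This is an illustration under stated assumptions: the three-dimensional `VECTORS` are hand-made stand-ins for skip-gram embeddings, the seed lists are hypothetical, and the "mean positive similarity minus mean negative similarity" rule is one plausible reading of the approach rather than the authors' exact formula.

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

# Hand-made toy vectors standing in for trained skip-gram embeddings.
VECTORS = {
    "good": [0.9, 0.1, 0.0],
    "excellent": [0.8, 0.2, 0.1],
    "bad": [-0.9, 0.1, 0.0],
    "poor": [-0.8, 0.2, 0.1],
    "crisp": [0.7, 0.3, 0.2],
    "laggy": [-0.7, 0.2, 0.3],
}
POSITIVE_SEEDS = ["good", "excellent"]  # hypothetical seed words
NEGATIVE_SEEDS = ["bad", "poor"]        # hypothetical seed words

def seed_sentiment(word):
    """Mean similarity to positive seeds minus mean similarity to negative seeds."""
    vec = VECTORS[word]
    pos = sum(cosine(vec, VECTORS[s]) for s in POSITIVE_SEEDS) / len(POSITIVE_SEEDS)
    neg = sum(cosine(vec, VECTORS[s]) for s in NEGATIVE_SEEDS) / len(NEGATIVE_SEEDS)
    return pos - neg

for word in ("crisp", "laggy"):
    print(word, round(seed_sentiment(word), 3))
```

Because the score depends only on vector geometry, informal or domain-specific words can receive a sentiment value without appearing in any fixed lexicon, provided they occur in the embedding vocabulary.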

Findings

Experimental results using more than 400,000 Amazon mobile phone reviews showed that the proposed method efficiently found product attributes and corresponding context-aware sentiments. This method also outperforms the classification accuracy of the baseline model and state-of-the-art techniques using context-sensitive information on data sets from two different domains.

Practical implications

Extracted attributes can be easily classified into consumer issues and brand merits. A brand-based comparative study is presented to demonstrate the practical significance of the proposed approach.

Originality/value

This paper presents a novel method for context-sensitive attribute-based sentiment analysis of CGC, which is useful for both brand and product improvement.

Details

Kybernetes, vol. 50 no. 2
Type: Research Article
ISSN: 0368-492X

Article
Publication date: 13 September 2019

Collins Udanor and Chinatu C. Anyanwu

Abstract

Purpose

Hate speech has recently become a troubling development. It has different meanings to different people in different cultures. The anonymity and ubiquity of social media provide a breeding ground for hate speech and make combating it seem like a lost battle. However, what may constitute hate speech in a culturally or religiously neutral society may not be perceived as such in a polarized multi-cultural and multi-religious society like Nigeria. Defining hate speech, therefore, may be contextual. Hate speech in Nigeria may be perceived along ethnic, religious and political boundaries. The purpose of this paper is to check for the presence of hate speech on social media platforms like Twitter and to ask to what degree hate speech, where present, is permissible. It also intends to find out what monitoring mechanisms social media platforms like Facebook and Twitter have put in place to combat hate speech. Lexalytics is a term coined by the authors from the words lexical analytics for the purpose of opinion mining unstructured texts like tweets.

Design/methodology/approach

This research developed a Python program called the polarized opinions sentiment analyzer (POSA), adopting an ego social network analytics technique in which an individual’s behavior is mined and described. POSA uses a customized Python N-gram dictionary of local context-based terms that may be considered hate terms. It then used the Twitter API to stream tweets from popular and trending Nigerian Twitter handles in politics, ethnicity, religion, social activism, racism, etc., and filtered the tweets against the custom dictionary using unsupervised classification of the texts as either positive or negative sentiments. The outcome is visualized using tables, pie charts and word clouds. A similar implementation was carried out in R using RStudio; the two sets of results were compared, and a t-test was applied to determine whether they differed significantly. The research methodology can be classified as both qualitative and quantitative: qualitative in terms of data classification, and quantitative in terms of being able to identify the results as either negative or positive from the computation of text to vector.
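Filtering text against an N-gram dictionary can be sketched as below. This is not the POSA code, which is not reproduced in the abstract; the function names, the neutral placeholder dictionary and the rule "any dictionary match means negative" are assumptions made for illustration.

```python
import re

def ngrams(tokens, n):
    """All contiguous n-token phrases in a token list."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def matched_terms(text, dictionary, max_n=3):
    """Dictionary entries (1 to max_n words long) that occur in the text."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    found = set()
    for n in range(1, max_n + 1):
        for gram in ngrams(tokens, n):
            if gram in dictionary:
                found.add(gram)
    return found

def classify(text, dictionary):
    """Assumed rule: a text is 'negative' if it contains any dictionary term."""
    return "negative" if matched_terms(text, dictionary) else "positive"

# Neutral placeholder entries standing in for a custom, locally curated dictionary.
DICTIONARY = {"flagged", "flagged phrase", "another flagged phrase"}

print(matched_terms("This has Another Flagged phrase in it", DICTIONARY))
print(classify("Nothing to see here", DICTIONARY))
```

Storing the dictionary as a set keeps each lookup constant-time, so the cost per tweet grows only with the tweet's length and `max_n`.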

Findings

The findings from two sets of experiments on POSA and R are as follows. In the first experiment, POSA found that the Twitter handles analyzed contained between 33 and 55 percent hate content, while the R results showed hate content ranging from 38 to 62 percent. A t-test on the positive and negative scores for POSA and RStudio yielded p-values of 0.389 and 0.289, respectively, at an α value of 0.05, implying that there is no significant difference between the results from POSA and R. In the second experiment, performed on 11 local handles with 1,207 tweets, POSA classified 40 percent of the content as hate content, while R classified 51 percent. The accuracy of POSA was 87 percent for hate speech and 86 percent for free speech; the accuracy of R was 65 percent for hate speech and 74 percent for free speech. The study also reveals that neither Twitter nor Facebook has an automated monitoring system for hate speech, and that no benchmark is set to decide the level of hate content allowed in a text. Monitoring is instead done by humans, whose assessment is usually subjective and sometimes inconsistent.

Research limitations/implications

This study establishes that hate speech is on the increase on social media. It also shows that hate mongers can be identified through the contents of their messages. The POSA system can be used as a plug-in by Twitter to detect and stop hate speech on its platform. The study was limited to public Twitter handles only. N-grams are effective features for word-sense disambiguation, but when N-grams are used, the feature vector can take on enormous proportions, which in turn increases the sparsity of the feature vectors.

Practical implications

The findings of this study show that if urgent measures are not taken to combat hate speech there could be dire consequences, especially in highly polarized societies where religious and ethnic sentiments run high. On a daily basis, tempers flare on social media over comments made by participants. This study has also demonstrated that it is possible to implement a technology that can track and terminate hate speech in a micro-blog like Twitter. This can also be extended to other social media platforms.

Social implications

This study will help to promote a more positive society, ensuring the social media is positively utilized to the benefit of mankind.

Originality/value

The findings can be used by social media companies to monitor user behaviors, and pin hate crimes to specific persons. Governments and law enforcement bodies can also use the POSA application to track down hate peddlers.

Details

Data Technologies and Applications, vol. 53 no. 4
Type: Research Article
ISSN: 2514-9288

Article
Publication date: 29 October 2018

Shrawan Kumar Trivedi and Shubhamoy Dey

Abstract

Purpose

To be sustainable and competitive in the current business environment, it is useful to understand users’ sentiment towards products and services. This critical task can be achieved via natural language processing and machine learning classifiers. This paper aims to propose a novel probabilistic committee selection classifier (PCC) to analyse and classify the sentiment polarities of movie reviews.

Design/methodology/approach

An Indian movie review corpus is assembled for this study. Another publicly available movie review polarity corpus is also used to validate the results. The greedy stepwise search method is used to extract the features/words of the reviews. The performance of the proposed classifier is measured using different metrics, such as F-measure, false positive rate, receiver operating characteristic (ROC) curve and training time. Further, the proposed classifier is compared with other popular machine-learning classifiers, such as Bayesian, Naïve Bayes, Decision Tree (J48), Support Vector Machine and Random Forest.
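Two of the metrics named above, F-measure and false positive rate, follow directly from confusion-matrix counts. The sketch below uses the standard definitions; the function names and the example counts are illustrative and are not results from the paper.

```python
def f_measure(tp, fp, fn):
    """Harmonic mean of precision and recall (the F1 score)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def false_positive_rate(fp, tn):
    """Share of actual negatives that were wrongly predicted positive."""
    return fp / (fp + tn) if (fp + tn) else 0.0

# Hypothetical counts: 8 true positives, 2 false positives,
# 2 false negatives and 8 true negatives.
print(f_measure(tp=8, fp=2, fn=2))
print(false_positive_rate(fp=2, tn=8))
```

A low false positive rate means few negative reviews are mislabeled as positive, which is the sense in which the Findings describe the classifier as efficient at identifying positive sentiment.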

Findings

The results of this study show that the proposed classifier is good at predicting the positive or negative polarity of movie reviews. The accuracy and ROC curve value of the PCC are found to be the most suitable among all the classifiers tested in this study. The classifier is also found to be efficient at identifying positive sentiments of reviews, giving low false positive rates for both the Indian Movie Review and Review Polarity corpora used in this study. The training time of the proposed classifier is found to be slightly higher than that of Bayesian, Naïve Bayes and J48.

Research limitations/implications

Only movie review sentiments written in English are considered. In addition, the proposed committee selection classifier is prepared only using the committee of probabilistic classifiers; however, other classifier committees can also be built, tested and compared with the present experiment scenario.

Practical implications

In this paper, a novel probabilistic approach is proposed and used for classifying movie reviews, and is found to be highly effective in comparison with other state-of-the-art classifiers. This classifier may be tested for different applications and may provide new insights for developers and researchers.

Social implications

The proposed PCC may be used to classify different product reviews, and hence may be beneficial to organizations to justify users’ reviews about specific products or services. By using authentic positive and negative sentiments of users, the credibility of the specific product, service or event may be enhanced. PCC may also be applied to other applications, such as spam detection, blog mining, news mining and various other data-mining applications.

Originality/value

The constructed PCC is novel and was tested on Indian movie review data.
