Search results

1 – 10 of over 10000
Article
Publication date: 20 August 2018

Dharini Ramachandran and Parvathi Ramasubramanian

Abstract

Purpose

Social media can spread news of “what’s happening” around you to everybody. It provides a powerful platform that brings to light the latest news, trends and happenings around the world in “near instant” time. A microblog is a popular Web service that enables users to post small pieces of digital content, such as text, pictures, videos and links to external resources. The raw data from microblogs are indispensable for information extraction, offering a way to single out the physical events and popular topics prevalent in social media. This study aims to present and review the varied methods used for event detection from microblogs. An event is an activity or action with a clear finite duration in which the target entity plays a key role. Event detection helps in the timely understanding of people’s opinions and the actual condition of the detected events.

Design/methodology/approach

This paper presents a study of various approaches adopted for event detection from microblogs. The approaches are reviewed according to the techniques used, applications and the element detected (event or topic).

Findings

Various ideas explored, important observations inferred, corresponding outcomes and assessment of results from those approaches are discussed.

Originality/value

The approaches and techniques for event detection are studied in two categories: first, based on the kind of event being detected (physical occurrence or emerging/popular topic) and second, within each category, the approaches are further divided into supervised and unsupervised techniques.

Article
Publication date: 27 November 2020

Hoda Daou

Abstract

Purpose

Social media is characterized by its volume, its speed of generation and its easy and open access, all of which make it an important source of information that provides valuable insights. Content characteristics such as valence and emotions play an important role in the diffusion of information; in fact, emotions can shape the virality of topics in social media. The purpose of this research is to fill the gap in event detection applied to online content by incorporating sentiment, more specifically strong sentiment, as the main attribute in identifying relevant content.

Design/methodology/approach

The study proposes a methodology based on strong sentiment classification using machine learning and an advanced scoring technique.

Findings

The results show the following key findings: the proposed methodology is able to automatically capture trending topics and achieve better classification compared to state-of-the-art topic detection algorithms. In addition, the methodology is not context specific; it is able to successfully identify important events from various datasets within the context of politics, rallies, various news and real tragedies.

Originality/value

This study fills the gap in topic detection applied to online content by building on the assumption that important events trigger strong sentiment in society. In addition, classic topic detection algorithms require tuning of the number of topics to search for. This methodology scores the posts and thus does not require limiting the number of topics; it also allows ordering the topics by relevance based on the value of the score.

Peer review

The peer review history for this article is available at: https://publons.com/publon/10.1108/OIR-12-2019-0373

Details

Online Information Review, vol. 45 no. 1
Type: Research Article
ISSN: 1468-4527

Article
Publication date: 23 March 2021

Hendri Murfi

Abstract

Purpose

The aim of this research is to develop an eigenspace-based fuzzy c-means method for scalable topic detection.

Design/methodology/approach

The eigenspace-based fuzzy c-means (EFCM) method combines representation learning and clustering. The textual data are transformed into a lower-dimensional eigenspace using truncated singular value decomposition. Fuzzy c-means is performed on the eigenspace to identify the centroid of each cluster. The topics are obtained by transforming the centroids back into the nonnegative subspace of the original space. In this paper, we extend the EFCM method for scalability using two approaches, single-pass and online. We call the resulting topic detection methods spEFCM and oEFCM, respectively.
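The clustering step named above can be illustrated with a minimal sketch of plain fuzzy c-means. This is not the authors' EFCM implementation: it assumes the points have already been projected into a low-dimensional space (the truncated SVD step and the back-transformation to topics are omitted), and the function names, the toy points, the fuzzifier `m=2.0` and the fixed iteration count are all illustrative assumptions.

```python
import math


def _memberships(points, centroids, m, eps=1e-9):
    """Standard fuzzy c-means membership update: one row per point."""
    rows = []
    for x in points:
        # eps avoids division by zero when a point coincides with a centroid
        dists = [math.dist(x, c) + eps for c in centroids]
        rows.append([
            1.0 / sum((d_j / d_k) ** (2.0 / (m - 1.0)) for d_k in dists)
            for d_j in dists
        ])
    return rows


def fuzzy_c_means(points, init_centroids, m=2.0, n_iter=30):
    """Alternate membership and centroid updates for a fixed number of steps."""
    centroids = [list(c) for c in init_centroids]
    dim = len(points[0])
    for _ in range(n_iter):
        u = _memberships(points, centroids, m)
        for j in range(len(centroids)):
            # Each centroid is the mean of all points weighted by u_ij ** m
            w = [row[j] ** m for row in u]
            total = sum(w)
            centroids[j] = [
                sum(w_i * x[d] for w_i, x in zip(w, points)) / total
                for d in range(dim)
            ]
    return centroids, _memberships(points, centroids, m)


# Toy 2-D points standing in for documents already mapped to an eigenspace
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centroids, memberships = fuzzy_c_means(
    points, init_centroids=[(0.0, 0.0), (10.0, 10.0)]
)
```

Unlike hard k-means, every point keeps a degree of membership in every cluster, and each row of `memberships` sums to one.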

Findings

Our simulation shows that both oEFCM and spEFCM methods provide faster running times than EFCM for data sets that do not fit in memory. However, there is a decrease in the average coherence score. For both data sets that fit and do not fit into memory, the oEFCM method provides a tradeoff between running time and coherence score, which is better than spEFCM.

Originality/value

This research produces a scalable topic detection method. Besides being scalable, the developed method also provides a faster running time for the data set that fits in memory.

Details

Data Technologies and Applications, vol. 55 no. 4
Type: Research Article
ISSN: 2514-9288

Article
Publication date: 3 November 2020

Femi Emmanuel Ayo, Olusegun Folorunso, Friday Thomas Ibharalu and Idowu Ademola Osinuga

Abstract

Purpose

Hate speech is an expression of intense hatred. Twitter has become a popular analytical tool for the prediction and monitoring of abusive behaviors. Hate speech detection with social media data has received special research attention in recent studies; hence the need to design a generic metadata architecture and an efficient feature extraction technique to enhance hate speech detection.

Design/methodology/approach

This study proposes hybrid embeddings enhanced with a topic inference method and an improved cuckoo search neural network for hate speech detection in Twitter data. The proposed method uses a hybrid embeddings technique that includes Term Frequency-Inverse Document Frequency (TF-IDF) for word-level feature extraction and Long Short-Term Memory (LSTM), a variant of the recurrent neural network architecture, for sentence-level feature extraction. The extracted features from the hybrid embeddings then serve as input to the improved cuckoo search neural network, which predicts whether a tweet is hate speech, offensive language or neither.
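As an illustration of the word-level TF-IDF features mentioned above, here is a minimal sketch. The abstract does not state which TF-IDF variant the authors use, so the definitions here are assumptions: term frequency is the count divided by document length, and inverse document frequency is the natural log of the number of documents over the document frequency. The function name and the toy documents are hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs):
    """docs: list of token lists. Returns one {term: weight} dict per document."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term occurs
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))
    weights = []
    for tokens in docs:
        counts = Counter(tokens)
        weights.append({
            # Assumed variant: relative term frequency times log(N / df)
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Toy tokenized "tweets"
docs = [
    ["you", "are", "awful", "awful"],
    ["you", "are", "kind"],
    ["you", "have", "a", "nice", "day"],
]
weights = tf_idf(docs)
```

A term that occurs in every document ("you" here) receives weight zero, while a term concentrated in one document ("awful") receives the largest weight in that document.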

Findings

The proposed method showed better results when tested on the collected Twitter datasets compared to other related methods. In order to validate the performances of the proposed method, t-test and post hoc multiple comparisons were used to compare the significance and means of the proposed method with other related methods for hate speech detection. Furthermore, Paired Sample t-Test was also conducted to validate the performances of the proposed method with other related methods.

Research limitations/implications

Finally, the evaluation results showed that the proposed method outperforms other related methods with mean F1-score of 91.3.

Originality/value

The main novelty of this study is the use of an automatic topic spotting measure based on a naïve Bayes model to improve feature representation.

Details

International Journal of Intelligent Computing and Cybernetics, vol. 13 no. 4
Type: Research Article
ISSN: 1756-378X

Article
Publication date: 1 September 2006

Haichao Dong, Siu Cheung Hui and Yulan He

Abstract

Purpose

The purpose of this research is to study the characteristics of chat messages by analysing a collection of 33,121 sample messages gathered from 1,700 conversation sessions between 72 pairs of MSN Messenger users over a four-month period from June to September 2005. The primary objective of chat message characterization is to understand the properties of chat messages for effective message analysis, such as message topic detection.

Design/methodology/approach

From the study on chat message characteristics, an indicative term‐based categorization approach for chat topic detection is proposed. In the proposed approach, different techniques such as sessionalisation of chat messages and extraction of features from icon texts and URLs are incorporated for message pre‐processing. Naïve Bayes, Associative Classification, and Support Vector Machine are employed as classifiers for categorizing topics from chat sessions.
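One of the classifiers named above, Naïve Bayes, can be sketched for topic categorization as follows. This is a generic multinomial Naïve Bayes with add-one smoothing, not the authors' indicative term-based pipeline: sessionalisation, icon-text and URL features are omitted, and the function names and the toy training sessions are hypothetical.

```python
import math
from collections import Counter, defaultdict


def train_nb(labeled_docs):
    """labeled_docs: list of (tokens, label) pairs. Returns the fitted counts."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in labeled_docs:
        class_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict_nb(model, tokens):
    """Return the label with the highest log posterior for the given tokens."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        # Log prior plus add-one smoothed log likelihood of each known token
        score = math.log(class_counts[label] / total_docs)
        denom = sum(word_counts[label].values()) + len(vocab)
        for t in tokens:
            if t in vocab:  # tokens never seen in training are ignored
                score += math.log((word_counts[label][t] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy chat sessions, already tokenized and labeled with a topic
training = [
    ("goal match team".split(), "sports"),
    ("team win score".split(), "sports"),
    ("movie actor film".split(), "film"),
    ("film director scene".split(), "film"),
]
model = train_nb(training)
sports_guess = predict_nb(model, ["team", "score"])
film_guess = predict_nb(model, ["film", "actor"])
```

Feature selection, which the paper compares (indicative terms versus document frequency), would determine which tokens reach `train_nb`; the sketch simply uses all of them.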

Findings

The indicative term-based approach is superior to the traditional document frequency-based approach for feature selection in chat topic categorization.

Originality/value

This paper studies the characteristics of chat messages and proposes an indicative term‐based categorization approach for chat topic detection.

Details

Online Information Review, vol. 30 no. 5
Type: Research Article
ISSN: 1468-4527

Article
Publication date: 17 August 2018

Guillaume Gadek, Alexandre Pauchet, Nicolas Malandain, Laurent Vercouter, Khaled Khelif, Stéphan Brunessaux and Bruno Grilhères

Abstract

Purpose

Most of the existing literature on online social networks (OSNs) either focuses on community detection in graphs without considering the topic of the messages exchanged, or concentrates exclusively on the messages without taking into account the social links. The purpose of this paper is to characterise the semantic cohesion of such groups through the introduction of new measures.

Design/methodology/approach

A theoretical model for social links and salient topics on Twitter is proposed. Measures to evaluate the topical cohesiveness of a group are also introduced. Inspired by precision and recall, the proposed measures, called expertise and representativeness, assess how well a set of groups matches the topic distribution. An adapted measure is also introduced for cases where a topic similarity can be computed. Finally, a topic relevance measure is defined, similar to tf.idf (term frequency, inverse document frequency).

Findings

The measures yield interesting results, notably on a large tweet corpus: the metrics accurately describe the topics discussed in the tweets and make it possible to identify topic-focused groups. Combined with topological measures, they provide a global and concise view of the detected groups.

Originality/value

Many algorithms applied to OSNs detect communities that often lack meaning and internal semantic cohesion. This paper is among the first to quantify this aspect, and more precisely the topical cohesion and topical relevance of a group. Moreover, the proposed indicators can be exploited for social media monitoring, to investigate the impact of a group of people: for instance, they could be used for journalism, marketing and security purposes.

Details

Data Technologies and Applications, vol. 52 no. 4
Type: Research Article
ISSN: 2514-9288

Article
Publication date: 3 October 2019

ELyazid Akachar, Brahim Ouhbi and Bouchra Frikh

Abstract

Purpose

The purpose of this paper is to present an algorithm for detecting communities in social networks.

Design/methodology/approach

The majority of existing methods of community detection in social networks are based on structural information and neglect the content information. In this paper, the authors propose a novel approach that combines content and structure information to discover more meaningful communities in social networks. To integrate the content information into the process of community detection, the authors propose to exploit the texts involved in social networks to identify the users’ topics of interest. These topics are detected using statistical and semantic measures, which make it possible to divide the users into different groups so that each group represents a distinct topic. Then, the authors perform link analysis in each group to discover the users who are highly interconnected (communities).

Findings

To validate the performance of the approach, the authors carried out a set of experiments on four real life data sets, and they compared their method with classical methods that ignore the content information.

Originality/value

The experimental results demonstrate that the quality of community structure is improved when we take into account the content and structure information during the procedure of community detection.

Details

International Journal of Web Information Systems, vol. 16 no. 1
Type: Research Article
ISSN: 1744-0084

Article
Publication date: 24 July 2020

Thanh-Tho Quan, Duc-Trung Mai and Thanh-Duy Tran

Abstract

Purpose

This paper proposes an approach to identify categorical influencers (i.e. influencers who are active in the targeted categories) in social media channels. Categorical influencers are important for media marketing, but detecting them automatically remains a challenge.

Design/methodology/approach

We deployed emerging deep learning approaches. Specifically, we used word embedding to encode semantic information of words occurring in the common microtext of social media and used a variational autoencoder (VAE) to approximate the topic modeling process, through which the active categories of influencers are automatically detected. We developed a system known as Categorical Influencer Detection (CID) to realize those ideas.

Findings

The approach of using VAE to simulate the Latent Dirichlet Allocation (LDA) process can effectively handle the task of topic modeling on the vast dataset of microtext on social media channels.

Research limitations/implications

This work has two major contributions. The first one is the detection of topics on microtexts using deep learning approach. The second is the identification of categorical influencers in social media.

Practical implications

This work can help brands to do digital marketing on social media effectively by approaching appropriate influencers. A real case study is given to illustrate it.

Originality/value

In this paper, we discuss an approach to automatically identify the active categories of influencers by performing topic detection from the microtext related to the influencers in social media channels. To do so, we use deep learning to approximate the topic modeling process of the conventional approaches (such as LDA).

Details

Online Information Review, vol. 44 no. 5
Type: Research Article
ISSN: 1468-4527

Book part
Publication date: 30 August 2019

Fulya Ozcan

Abstract

This chapter investigates the behavior of Reddit’s news subreddit users and the relationship between their sentiment and exchange rates. Using graphical models and natural language processing, hidden online communities among Reddit users are discovered. The data set used in this project is a mixture of text and categorical data from Reddit’s news subreddit. These data include the titles of the news pages, as well as a few user characteristics, in addition to users’ comments. This data set is an excellent resource to study user reaction to news since their comments are directly linked to the webpage contents. The model considered in this chapter is a hierarchical mixture model, a generative model that detects overlapping networks using the sentiment from the user-generated content. The advantage of this model is that the communities (or groups) are assumed to follow a Chinese restaurant process, so the number of communities does not need to be fixed in advance and the model can automatically detect and cluster them. The hidden variables and the hyperparameters for this model are obtained using Gibbs sampling.
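To show why a Chinese restaurant process lets the number of groups emerge from the data, here is a minimal sketch that samples a partition from the process alone. It is not the chapter's hierarchical mixture model or its Gibbs sampler; the function name, the concentration parameter `alpha=1.0` and the seed are illustrative assumptions.

```python
import random


def chinese_restaurant_process(n_customers, alpha, seed=0):
    """Sample table assignments; alpha > 0 controls how often new tables open."""
    rng = random.Random(seed)
    assignments = []  # table index chosen by each customer, in arrival order
    table_sizes = []  # number of customers currently seated at each table
    for i in range(n_customers):
        # Existing table k is chosen with probability size_k / (i + alpha),
        # a new table with probability alpha / (i + alpha).
        r = rng.uniform(0, i + alpha)
        cumulative = 0.0
        chosen = len(table_sizes)  # default: open a new table
        for k, size in enumerate(table_sizes):
            cumulative += size
            if r < cumulative:
                chosen = k
                break
        if chosen == len(table_sizes):
            table_sizes.append(0)
        table_sizes[chosen] += 1
        assignments.append(chosen)
    return assignments, table_sizes


# Fifty "users" assigned to an unbounded number of "communities"
assignments, table_sizes = chinese_restaurant_process(50, alpha=1.0)
```

Larger tables attract new customers more strongly, so a few large groups and several small ones typically form without the number of groups being specified.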

Details

Topics in Identification, Limited Dependent Variables, Partial Observability, Experimentation, and Flexible Modeling: Part A
Type: Book
ISBN: 978-1-78973-241-2

Article
Publication date: 29 March 2013

Yuki Hattori and Akiyo Nadamoto

Abstract

Purpose

Information found on social media often does not appear in ordinary web pages. Nevertheless, it is difficult to extract such information from social media because such services contain so much information. Furthermore, various topics are mixed in social media communities. The authors designate such important and unique information related to social media as tip information. In this paper, they propose a method to extract tip information, classified by topic, from social networking services as a first step in extracting tip information from social media.

Design/methodology/approach

Themes of many kinds exist in a social media community because users write content freely. The authors therefore first detect the topics in the community and cluster the comments based on those topics. Subsequently, they extract tip information from each cluster. Here, tip information includes a user's experience and contains common important words.

Findings

The authors conducted an experiment to confirm that their proposed method can extract appropriate tip information from a community that a user specifies. The average precision is 69 per cent. Compared with a baseline that omits topic detection and clustering, the average precision of the authors' proposed method is 18 per cent greater.

Originality/value

The authors' approach to extracting tip information from social media has three points: topic detection and clustering of the social media content using the LDA method; extraction of an author's actual experiences; and creation of a tip keyword dictionary from user experiments.

Details

International Journal of Web Information Systems, vol. 9 no. 1
Type: Research Article
ISSN: 1744-0084
