Search results

1 – 10 of over 35000
Article
Publication date: 13 July 2021

Shubham Bharti, Arun Kumar Yadav, Mohit Kumar and Divakar Yadav

Abstract

Purpose

With the rise of social media platforms, the number of cyberbullying cases has grown. Every day, a large number of people, especially teenagers, become victims of cyber abuse. Cyberbullying can have a long-lasting impact on the victim's mind: the victim may develop social anxiety, engage in self-harm, go into depression or, in extreme cases, be driven to suicide. This paper aims to evaluate various techniques to automatically detect cyberbullying from tweets by using machine learning and deep learning approaches.

Design/methodology/approach

The authors applied machine learning algorithms and, after analyzing the experimental results, postulated that deep learning algorithms perform better for the task. Word-embedding techniques were used for word representation in model training. Pre-trained GloVe embeddings were used to generate the word embeddings; different versions of GloVe were used and their performance was compared. Bi-directional long short-term memory (BLSTM) was used for classification.

Findings

The dataset contains 35,787 labeled tweets. The GloVe840 word embedding technique along with BLSTM provided the best results on the dataset with an accuracy, precision and F1 measure of 92.60%, 96.60% and 94.20%, respectively.
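
The abstract reports precision and F1 but not recall. Assuming the F1 measure is the usual harmonic mean of precision and recall computed for the same class (an assumption, since the abstract does not say how the scores were averaged), the implied recall can be recovered by solving F1 = 2PR / (P + R) for R. A minimal sketch; the function names are illustrative only:

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def recall_from_f1(precision: float, f1: float) -> float:
    """Solve F1 = 2PR / (P + R) for R, given P and F1."""
    return f1 * precision / (2 * precision - f1)


# Precision and F1 as reported in the abstract (96.60% and 94.20%).
precision, f1 = 0.9660, 0.9420
implied_recall = recall_from_f1(precision, f1)
print(f"implied recall: {implied_recall:.4f}")
```

Under that assumption the implied recall is roughly 91.9%, somewhat below the reported precision.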

Research limitations/implications

If a word is not present in the pre-trained embedding (GloVe), it may be given a random vector representation that may not correspond to the actual meaning of the word. In other words, an out-of-vocabulary (OOV) word may not be represented suitably, which can affect the detection of cyberbullying tweets. The problem may be rectified through the use of character-level embedding of words.
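
To make the OOV limitation concrete, the sketch below shows one possible character-level fallback: a known word is looked up directly, while an unknown word is represented by the average of vectors for its character trigrams, so that related spellings share part of their representation. This is not the authors' method. The tiny `pretrained` table, the dimensionality and the hash-seeded n-gram vectors are all hypothetical stand-ins for a real GloVe table and learned n-gram embeddings.

```python
import random
import zlib

DIM = 8  # toy dimensionality; real pre-trained vectors are much larger

# Hypothetical stand-in for a pre-trained embedding table.
pretrained = {"bully": [0.1] * DIM, "friend": [0.2] * DIM}


def char_ngrams(word, n=3):
    """Character n-grams of the word, padded with boundary markers."""
    padded = f"<{word}>"
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


def ngram_vector(ngram):
    """Deterministic pseudo-random vector per n-gram (placeholder for a learned one)."""
    rng = random.Random(zlib.crc32(ngram.encode("utf-8")))
    return [rng.uniform(-1.0, 1.0) for _ in range(DIM)]


def embed(word):
    """Return the pre-trained vector, or an n-gram average for OOV words."""
    if word in pretrained:
        return pretrained[word]
    grams = char_ngrams(word)
    if not grams:  # only happens for the empty string
        return [0.0] * DIM
    vectors = [ngram_vector(g) for g in grams]
    return [sum(column) / len(vectors) for column in zip(*vectors)]


# "bullying" and "bullied" are both OOV here but share three trigrams.
shared = set(char_ngrams("bullying")) & set(char_ngrams("bullied"))
print(sorted(shared))
```

Unlike a purely random vector, the fallback is deterministic and gives morphologically related OOV words overlapping components.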

Practical implications

The findings of the work may inspire entrepreneurs to leverage the proposed approach to build deployable systems that detect cyberbullying in different contexts such as the workplace, school, etc., and may also draw the attention of lawmakers and policymakers to create systemic tools to tackle the ills of cyberbullying.

Social implications

Effective detection of cyberbullying may save victims from various psychological problems, which, in turn, may lead society to a healthier and more productive life.

Originality/value

The proposed method produced results that outperform the state-of-the-art approaches in detecting cyberbullying from tweets. It uses a large dataset, created by intelligently merging two publicly available datasets. Further, a comprehensive evaluation of the proposed methodology has been presented.

Details

Kybernetes, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 0368-492X

Article
Publication date: 24 July 2020

Thanh-Tho Quan, Duc-Trung Mai and Thanh-Duy Tran

Abstract

Purpose

This paper proposes an approach to identify categorical influencers (i.e. influencers who are active in the targeted categories) in social media channels. Categorical influencers are important for media marketing, but detecting them automatically remains a challenge.

Design/methodology/approach

We applied emerging deep learning approaches. Specifically, we used word embedding to encode the semantic information of words occurring in the common microtext of social media and used a variational autoencoder (VAE) to approximate the topic modeling process, through which the active categories of influencers are automatically detected. We developed a system known as Categorical Influencer Detection (CID) to realize those ideas.

Findings

The approach of using VAE to simulate the Latent Dirichlet Allocation (LDA) process can effectively handle the task of topic modeling on the vast dataset of microtext on social media channels.

Research limitations/implications

This work has two major contributions. The first is the detection of topics in microtexts using a deep learning approach. The second is the identification of categorical influencers in social media.

Practical implications

This work can help brands to do digital marketing on social media effectively by approaching appropriate influencers. A real case study is given to illustrate it.

Originality/value

In this paper, we discuss an approach to automatically identify the active categories of influencers by performing topic detection from the microtext related to the influencers in social media channels. To do so, we use deep learning to approximate the topic modeling process of the conventional approaches (such as LDA).

Details

Online Information Review, vol. 44 no. 5
Type: Research Article
ISSN: 1468-4527

Article
Publication date: 24 September 2020

Toshiki Tomihira, Atsushi Otsuka, Akihiro Yamashita and Tetsuji Satoh

Abstract

Purpose

Recently, with the standardization of Unicode and the penetration of social networking services, the use of emojis has become common. Emojis are highly effective in expressing emotions in sentences. Sentiment analysis in natural language processing typically requires emotions to be labeled manually for each sentence. By using the emojis in text posted on social media, the authors can predict sentiment without manual labeling. The purpose of this paper is to propose a new model that learns from sentences using emojis as labels, collecting English and Japanese tweets from Twitter as the corpus. The authors verify and compare multiple models based on attention long short-term memory (LSTM), convolutional neural networks (CNN) and Bidirectional Encoder Representations from Transformers (BERT).

Design/methodology/approach

Using the Twitter application programming interface, the authors collected tweets containing 2,661 kinds of emoji registered as Unicode characters, for a total of 6,149,410 tweets in Japanese. First, the authors visualized the vector space produced for the emojis by Word2Vec; they found that emojis and words with similar meanings are adjacent, which verifies that emojis can be used for sentiment analysis. Second, tweets containing emojis were used as input, and the models were trained and tested with the emoji as the label. The authors compared the BERT model with the conventional models [CNN, FastText and Attention bidirectional long short-term memory (BiLSTM)] that achieved high scores in the previous study.
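
The idea of using an emoji as the label for the tweet that contains it can be sketched as follows. This is only an illustration of the labeling step, not the authors' pipeline: the four-emoji label set is hypothetical (the study used 2,661 kinds), and emitting one training pair per distinct emoji in a tweet is an assumption.

```python
# Hypothetical label set; each entry is a single Unicode code point.
EMOJI_LABELS = {"😂", "😭", "😡", "🎉"}


def to_training_pairs(tweet):
    """Split a tweet into (text without emojis, emoji label) pairs."""
    # Distinct label emojis in order of first appearance.
    labels = list(dict.fromkeys(ch for ch in tweet if ch in EMOJI_LABELS))
    # Remove the label emojis from the text and normalize whitespace.
    text = "".join(ch for ch in tweet if ch not in EMOJI_LABELS)
    text = " ".join(text.split())
    return [(text, label) for label in labels]


print(to_training_pairs("I passed the exam 🎉🎉"))
print(to_training_pairs("no emoji here"))
```

Tweets without any emoji from the label set produce no training pairs, which is why no manual annotation is needed for the rest.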

Findings

By visualizing the vector space of Word2Vec, the authors found that emojis and words with similar meanings are adjacent, verifying that emojis can be used for sentiment analysis. The authors obtained a higher score with BERT models than with the conventional models, and the experiments demonstrate that the improvement holds in two languages. General emoji prediction is greatly influenced by context, and the score may be lowered by a misunderstanding of meaning. By using BERT, which is based on a bi-directional transformer, the authors can take the context into account.

Practical implications

Users can find emojis among the candidate outputs when typing a word with an input method editor (IME). Current IMEs consider only the most recently entered word, whereas this study shows that it is possible to recommend emojis considering the context of the entered sentence. Therefore, the research can be used to improve IME performance in the future.

Originality/value

In this paper, the authors focus on multilingual emoji prediction. This is the first attempt to compare emoji prediction between Japanese and English. It is also the first attempt to use the BERT model based on the transformer for predicting limited emojis, although the transformer is known to be effective for various NLP tasks. The authors found that a bidirectional transformer is suitable for emoji prediction.

Details

International Journal of Web Information Systems, vol. 16 no. 3
Type: Research Article
ISSN: 1744-0084

Content available
Article
Publication date: 14 August 2020

Paramita Ray and Amlan Chakrabarti

Abstract

Social networks have changed communication patterns significantly. Information available from different social networking sites can be utilized to analyze users' opinions. Hence, organizations would benefit from a platform that can analyze public sentiment in social media about their products and services and thereby add value to their business processes. Over the last few years, deep learning has become very popular in areas such as image classification and speech recognition. However, research on the use of deep learning methods in sentiment analysis is limited. It has been observed that in some cases the existing machine learning methods for sentiment analysis fail to extract some implicit aspects and might not be very useful. Therefore, we propose a deep learning approach for extracting aspects from text and analyzing the user sentiment corresponding to each aspect. A seven-layer deep convolutional neural network (CNN) is used to tag each aspect in the opinionated sentences. We have combined the deep learning approach with a set of rule-based approaches to improve the performance of both the aspect extraction method and the sentiment scoring method. We have also tried to improve the existing rule-based approach to aspect extraction by categorizing aspects into a predefined set of aspect categories using a clustering method, and we compared our proposed method with some state-of-the-art methods. The overall accuracy of our proposed method is 0.87, while that of the other state-of-the-art methods, the modified rule-based method and CNN, is 0.75 and 0.80, respectively. The overall accuracy of our proposed method is thus 7–12 percentage points higher than that of the state-of-the-art methods.

Details

Applied Computing and Informatics, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 2634-1964

Details

Using Subject Headings for Online Retrieval: Theory, Practice and Potential
Type: Book
ISBN: 978-0-12221-570-4

Article
Publication date: 12 January 2021

Hui Yuan, Yuanyuan Tang, Wei Xu and Raymond Yiu Keung Lau

Abstract

Purpose

Despite the extensive academic interest in social media sentiment for financial fields, multimodal data in the stock market has been neglected. The purpose of this paper is to explore the influence of multimodal social media data on stock performance, and investigate the underlying mechanism of two forms of social media data, i.e. text and pictures.

Design/methodology/approach

This research employs panel vector autoregressive models to quantify the effect of the sentiment derived from two modalities in social media, i.e. text information and picture information. Through the models, the authors examine the short-term and long-term associations between social media sentiment and stock performance, measured by three metrics. Specifically, the authors design an enhanced sentiment analysis method, integrating random walk and word embeddings through Global Vectors for Word Representation (GloVe), to construct a domain-specific lexicon and apply it to textual sentiment analysis. Secondly, the authors exploit a deep learning framework based on convolutional neural networks to analyze the sentiment in picture data.

Findings

The empirical results derived from vector autoregressive models reveal that both measures of the sentiment extracted from textual information and pictorial information in social media are significant leading indicators of stock performance. Moreover, pictorial information and textual information have similar relationships with stock performance.

Originality/value

To the best of the authors’ knowledge, this is the first study that incorporates multimodal social media data for sentiment analysis, which is valuable in understanding the pictures in social media data. The study offers significant implications for researchers and practitioners. It draws researchers’ attention to multimodal social media data, and its findings provide some managerial recommendations, e.g. watching not only words but also pictures in social media.

Details

Internet Research, vol. 31 no. 3
Type: Research Article
ISSN: 1066-2243

Article
Publication date: 7 October 2021

Juan Yang, Xu Du, Jui-Long Hung and Chih-hsiung Tu

Abstract

Purpose

Critical thinking is considered important in psychological science because it enables students to make effective decisions and optimizes their performance. To address the challenges of understanding students' critical thinking, this study analyzes online discussion data through an advanced multi-feature fusion modeling (MFFM) approach in order to understand students' critical thinking levels automatically and accurately.

Design/methodology/approach

An advanced MFFM approach is proposed in this study. Specifically, considering the time-series characteristic of discussion contents and the high correlations between adjacent words, a long short-term memory–convolutional neural network (LSTM-CNN) architecture is proposed to extract deep semantic features. These semantic features are then combined with linguistic and psychological knowledge generated by the LIWC2015 tool as the inputs of fully connected layers to automatically and accurately predict the students' critical thinking levels that are hidden in online discussion data.

Findings

A series of experiments with 7,691 posts from 94 students was conducted to verify the effectiveness of the proposed approach. The experimental results show that the proposed MFFM approach, which combines two types of textual features, outperforms baseline methods, and that semantic-based padding can further improve the prediction performance of MFFM. It achieves 0.8205 overall accuracy and a 0.6172 F1 score for the “high” category on the validation dataset. Furthermore, the semantic features extracted by LSTM-CNN are found to be more powerful for identifying self-introduction or off-topic discussions, while the linguistic and psychological features can better distinguish the discussion posts with the highest critical thinking level.

Originality/value

With the support of the proposed MFFM approach, online teachers can conveniently and effectively understand the interaction quality of online discussions, which can support instructional decision-making to better promote the student's knowledge construction process and improve learning performance.

Details

Data Technologies and Applications, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 2514-9288

Content available
Article
Publication date: 21 June 2021

Bufei Xing, Haonan Yin, Zhijun Yan and Jiachen Wang

Abstract

Purpose

The purpose of this paper is to propose a new approach to retrieve similar questions in online health communities to improve the efficiency of health information retrieval and sharing.

Design/methodology/approach

This paper proposes a hybrid approach that combines domain knowledge similarity and topic similarity to retrieve similar questions in online health communities. The domain knowledge similarity evaluates the domain distance between different questions, and the topic similarity measures the relationship between questions based on the extracted latent topics.
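
One simple way to combine two similarity signals of this kind is a weighted sum. The abstract does not give the authors' formulas, so the sketch below rests on assumptions: domain similarity is stood in for by Jaccard overlap of named entities, topic similarity by cosine similarity of topic distributions, and the mixing weight `alpha` is a hypothetical parameter.

```python
import math


def jaccard(a, b):
    """Overlap of two entity collections; 0.0 when both are empty."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 for a zero vector."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def hybrid_similarity(entities_a, entities_b, topics_a, topics_b, alpha=0.5):
    """Weighted sum of entity overlap and topic-distribution similarity."""
    return alpha * jaccard(entities_a, entities_b) + (1 - alpha) * cosine(topics_a, topics_b)


# Two questions that mention the same condition but different drugs.
score = hybrid_similarity(
    {"diabetes", "insulin"}, {"diabetes", "metformin"},
    [0.7, 0.2, 0.1], [0.6, 0.3, 0.1],
)
print(round(score, 3))
```

With `alpha` near 1 the ranking is driven by shared entities; near 0 it is driven by latent topics, which helps when two questions use different words for the same concern.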

Findings

The experiment results show that the proposed method outperforms the baseline methods.

Originality/value

This method overcomes the problem of word mismatch and considers the named entities included in questions, which most existing studies did not.

Details

International Journal of Crowd Science, vol. 5 no. 2
Type: Research Article
ISSN: 2398-7294

Article
Publication date: 13 March 2020

Jinwook Choi, Yongmoo Suh and Namchul Jung

Abstract

Purpose

The purpose of this study is to investigate the effectiveness of qualitative information extracted from firms’ annual reports in predicting corporate credit ratings. In practice, qualitative information such as published reports or management interviews is known to be an important source, in addition to quantitative information such as financial values, when assigning corporate credit ratings. Nevertheless, prior studies have rarely employed qualitative information in developing prediction models for corporate credit rating, which leaves room for further research.

Design/methodology/approach

This study adopted three document vectorization methods, Bag-Of-Words (BOW), Word to Vector (Word2Vec) and Document to Vector (Doc2Vec), to transform unstructured textual data into numeric vectors that Machine Learning (ML) algorithms can accept as input. For the experiments, we used the corpus of the Management’s Discussion and Analysis (MD&A) sections of 10-K financial reports as well as financial variables and corporate credit rating data.
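
Of the three vectorization methods, Bag-Of-Words is the simplest to show: each document becomes a vector of term counts over a shared vocabulary. The sketch below is a generic illustration, not the study's preprocessing; the two example sentences and the whitespace tokenizer are assumptions.

```python
from collections import Counter


def build_vocabulary(documents):
    """Map each distinct lowercase token to a column index (alphabetical order)."""
    tokens = sorted({tok for doc in documents for tok in doc.lower().split()})
    return {tok: index for index, tok in enumerate(tokens)}


def bow_vector(document, vocabulary):
    """Count how often each vocabulary token occurs in the document."""
    counts = Counter(document.lower().split())
    vector = [0] * len(vocabulary)
    for token, count in counts.items():
        if token in vocabulary:  # tokens outside the vocabulary are ignored
            vector[vocabulary[token]] = count
    return vector


docs = ["revenue increased and costs decreased", "costs increased"]
vocab = build_vocabulary(docs)
vectors = [bow_vector(doc, vocab) for doc in docs]
print(vocab)
print(vectors)
```

Vectors like these can be concatenated with financial variables and passed to a classifier, which is the general setup the abstract describes.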

Findings

Experimental results from a series of multi-class classification experiments show the predictive models trained by both financial variables and vectors extracted from MD&A data outperform the benchmark models trained only by traditional financial variables.

Originality/value

This study proposed a new approach for corporate credit rating prediction by using qualitative information extracted from MD&A documents as an input to ML-based prediction models. Also, this research adopted and compared three textual vectorization methods in the domain of corporate credit rating prediction and showed that BOW mostly outperformed Word2Vec and Doc2Vec.

Details

Data Technologies and Applications, vol. 54 no. 2
Type: Research Article
ISSN: 2514-9288

Article
Publication date: 18 October 2021

Saurabh Kumar

Abstract

Purpose

Decision-making in human beings is affected by emotions and sentiments. Affective computing takes this into account, intending to tailor decision support to the emotional states of people. However, the representation and classification of emotions is a very challenging task. The study used customized deep learning models to aid in the accurate classification of emotions and sentiments.

Design/methodology/approach

The study presents an affective computing model using both text and image data. Text-based affective computing was conducted on four standard datasets using three customized deep learning models, namely LSTM, GRU and CNN. The study used four variants of deep learning models, including the LSTM model, the LSTM model with GloVe embeddings, the Bi-directional LSTM model and the LSTM model with an attention layer.

Findings

The result suggests that the proposed method outperforms the earlier methods. For image-based affective computing, the data was extracted from Instagram, and Facial emotion recognition was carried out using three deep learning models, namely CNN, transfer learning with VGG-19 model and transfer learning with ResNet-18 model. The results suggest that the proposed methods for both text and image can be used for affective computing and aid in decision-making.

Originality/value

Earlier studies have used machine learning algorithms for affective computing, whereas the present study uses deep learning.

Details

Journal of Enterprise Information Management, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 1741-0398
