Search results

1 – 10 of over 2000
Open Access
Article
Publication date: 13 October 2022

Linzi Wang, Qiudan Li, Jingjun David Xu and Minjie Yuan

Mining user-concerned actionable and interpretable hot topics will help management departments fully grasp the latest events and make timely decisions. Existing topic models…

379

Abstract

Purpose

Mining user-concerned actionable and interpretable hot topics will help management departments fully grasp the latest events and make timely decisions. Existing topic models primarily integrate word embedding and matrix decomposition, which only generates keyword-based hot topics with weak interpretability, making it difficult to meet the specific needs of users. Mining phrase-based hot topics with syntactic dependency structure have been proven to model structure information effectively. A key challenge lies in the effective integration of the above information into the hot topic mining process.

Design/methodology/approach

This paper proposes the nonnegative matrix factorization (NMF)-based hot topic mining method, semantics syntax-assisted hot topic model (SSAHM), which combines semantic association and syntactic dependency structure. First, a semantic–syntactic component association matrix is constructed. Then, the matrix is used as a constraint condition to be incorporated into the block coordinate descent (BCD)-based matrix decomposition process. Finally, a hot topic information-driven phrase extraction algorithm is applied to describe hot topics.

Findings

The efficacy of the developed model is demonstrated on two real-world datasets, and the effects of dependency structure information on different topics are compared. The qualitative examples further explain the application of the method in real scenarios.

Originality/value

Most prior research focuses on keyword-based hot topics. Thus, the literature is advanced by mining phrase-based hot topics with syntactic dependency structure, which can effectively analyze the semantics. The development of syntactic dependency structure considering the combination of word order and part-of-speech (POS) is a step forward as word order, and POS are only separately utilized in the prior literature. Ignoring this synergy may miss important information, such as grammatical structure coherence and logical relations between syntactic components.

Details

Journal of Electronic Business & Digital Economics, vol. 1 no. 1/2
Type: Research Article
ISSN: 2754-4214

Keywords

Open Access
Article
Publication date: 14 August 2020

Paramita Ray and Amlan Chakrabarti

Social networks have changed the communication patterns significantly. Information available from different social networking sites can be well utilized for the analysis of users…

6418

Abstract

Social networks have changed the communication patterns significantly. Information available from different social networking sites can be well utilized for the analysis of users opinion. Hence, the organizations would benefit through the development of a platform, which can analyze public sentiments in the social media about their products and services to provide a value addition in their business process. Over the last few years, deep learning is very popular in the areas of image classification, speech recognition, etc. However, research on the use of deep learning method in sentiment analysis is limited. It has been observed that in some cases the existing machine learning methods for sentiment analysis fail to extract some implicit aspects and might not be very useful. Therefore, we propose a deep learning approach for aspect extraction from text and analysis of users sentiment corresponding to the aspect. A seven layer deep convolutional neural network (CNN) is used to tag each aspect in the opinionated sentences. We have combined deep learning approach with a set of rule-based approach to improve the performance of aspect extraction method as well as sentiment scoring method. We have also tried to improve the existing rule-based approach of aspect extraction by aspect categorization with a predefined set of aspect categories using clustering method and compared our proposed method with some of the state-of-the-art methods. It has been observed that the overall accuracy of our proposed method is 0.87 while that of the other state-of-the-art methods like modified rule-based method and CNN are 0.75 and 0.80 respectively. The overall accuracy of our proposed method shows an increment of 7–12% from that of the state-of-the-art methods.

Details

Applied Computing and Informatics, vol. 18 no. 1/2
Type: Research Article
ISSN: 2634-1964

Keywords

Open Access
Article
Publication date: 11 October 2023

Bachriah Fatwa Dhini, Abba Suganda Girsang, Unggul Utan Sufandi and Heny Kurniawati

The authors constructed an automatic essay scoring (AES) model in a discussion forum where the result was compared with scores given by human evaluators. This research proposes…

Abstract

Purpose

The authors constructed an automatic essay scoring (AES) model in a discussion forum where the result was compared with scores given by human evaluators. This research proposes essay scoring, which is conducted through two parameters, semantic and keyword similarities, using a SentenceTransformers pre-trained model that can construct the highest vector embedding. Combining these models is used to optimize the model with increasing accuracy.

Design/methodology/approach

The development of the model in the study is divided into seven stages: (1) data collection, (2) pre-processing data, (3) selected pre-trained SentenceTransformers model, (4) semantic similarity (sentence pair), (5) keyword similarity, (6) calculate final score and (7) evaluating model.

Findings

The multilingual paraphrase-multilingual-MiniLM-L12-v2 and distilbert-base-multilingual-cased-v1 models got the highest scores from comparisons of 11 pre-trained multilingual models of SentenceTransformers with Indonesian data (Dhini and Girsang, 2023). Both multilingual models were adopted in this study. A combination of two parameters is obtained by comparing the response of the keyword extraction responses with the rubric keywords. Based on the experimental results, proposing a combination can increase the evaluation results by 0.2.

Originality/value

This study uses discussion forum data from the general biology course in online learning at the open university for the 2020.2 and 2021.2 semesters. Forum discussion ratings are still manual. In this survey, the authors created a model that automatically calculates the value of discussion forums, which are essays based on the lecturer's answers moreover rubrics.

Details

Asian Association of Open Universities Journal, vol. 18 no. 3
Type: Research Article
ISSN: 1858-3431

Keywords

Open Access
Article
Publication date: 9 January 2024

Kazuyuki Motohashi and Chen Zhu

This study aims to assess the technological capability of Chinese internet platforms (BAT: Baidu, Alibaba, Tencent) compared to US ones (GAFA: Google, Amazon, Facebook, Apple)…

Abstract

Purpose

This study aims to assess the technological capability of Chinese internet platforms (BAT: Baidu, Alibaba, Tencent) compared to US ones (GAFA: Google, Amazon, Facebook, Apple). More specifically, this study explores Baidu’s technological catching-up process with Google by analyzing their patent textual information.

Design/methodology/approach

The authors retrieved 26,383 Google patents and 6,695 Baidu patents from PATSTAT 2019 Spring version. The collected patent documents were vectorized using the Word2Vec model first, and then K-means clustering was applied to visualize the technological space of two firms. Finally, novel indicators were proposed to capture the technological catching-up process between Baidu and Google.

Findings

The results show that Baidu follows a trend of US rather than Chinese technology which suggests Baidu is aggressively seeking to catch up with US players in the process of its technological development. At the same time, the impact index of Baidu patents increases over time, reflecting its upgrading of technological competitiveness.

Originality/value

This study proposed a new method to analyze technology mapping and evolution based on patent text information. As both US and China are crucial players in the internet industry, it is vital for policymakers in third countries to understand the technological capacity and competitiveness of both countries to develop strategic partnerships effectively.

Details

Asia Pacific Journal of Innovation and Entrepreneurship, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 2071-1395

Keywords

Open Access
Article
Publication date: 21 June 2021

Bufei Xing, Haonan Yin, Zhijun Yan and Jiachen Wang

The purpose of this paper is to propose a new approach to retrieve similar questions in online health communities to improve the efficiency of health information retrieval and…

Abstract

Purpose

The purpose of this paper is to propose a new approach to retrieve similar questions in online health communities to improve the efficiency of health information retrieval and sharing.

Design/methodology/approach

This paper proposes a hybrid approach to combining domain knowledge similarity and topic similarity to retrieve similar questions in online health communities. The domain knowledge similarity can evaluate the domain distance between different questions. And the topic similarity measures questions’ relationship base on the extracted latent topics.

Findings

The experiment results show that the proposed method outperforms the baseline methods.

Originality/value

This method conquers the problem of word mismatch and considers the named entities included in questions, which most of existing studies did not.

Details

International Journal of Crowd Science, vol. 5 no. 2
Type: Research Article
ISSN: 2398-7294

Keywords

Open Access
Article
Publication date: 19 December 2022

Baojun Ma, Jingxia He, Hui Yuan, Jian Zhang and Chi Zhang

Corporate social responsibility (CSR) is significant in the financial market. Despite plenty of existing research on CSR, few studies have quantified the fine-grained aspects of…

803

Abstract

Purpose

Corporate social responsibility (CSR) is significant in the financial market. Despite plenty of existing research on CSR, few studies have quantified the fine-grained aspects of CSR and examined how diverse CSR aspects are associated with firms' trade credit. Based on the released CSR reports, this paper strives to measure the CSR fulfillment of firms and examine the relationships between CSR and trade credit in terms of textual features presented in these reports.

Design/methodology/approach

This research proposes a natural language processing-based framework to extract the overall readability and the sentiment of fine-grained aspects from CSR reports, which can signal the performance of firms' CSR in diverse aspects. Furthermore, this paper explores how the textual features are associated with trade credit through partial dependence plots (PDPs), and PDPs can generate both linear and nonlinear relationships.

Findings

The study’s results reveal that the overall readability of the reports is positively associated with trade credit, while the performance of the fine-grained CSR aspects mentioned in the CSR reports matters differently. The performance of the environment has a positive impact on trade credit; the performance of creditors, suppliers and information disclosure, shows a U-shaped influence on trade credit; while the performance of the government and customers is negatively associated with trade credit.

Originality/value

This study expands the scope of research on CSR and trade credit by investigating fine-grained aspects covered in CSR reports. It also offers some managerial implications in the allocation of CSR resources and the presentation of CSR reports.

Details

Journal of Electronic Business & Digital Economics, vol. 2 no. 1
Type: Research Article
ISSN: 2754-4214

Keywords

Open Access
Article
Publication date: 31 July 2023

Daniel Šandor and Marina Bagić Babac

Sarcasm is a linguistic expression that usually carries the opposite meaning of what is being said by words, thus making it difficult for machines to discover the actual meaning…

2928

Abstract

Purpose

Sarcasm is a linguistic expression that usually carries the opposite meaning of what is being said by words, thus making it difficult for machines to discover the actual meaning. It is mainly distinguished by the inflection with which it is spoken, with an undercurrent of irony, and is largely dependent on context, which makes it a difficult task for computational analysis. Moreover, sarcasm expresses negative sentiments using positive words, allowing it to easily confuse sentiment analysis models. This paper aims to demonstrate the task of sarcasm detection using the approach of machine and deep learning.

Design/methodology/approach

For the purpose of sarcasm detection, machine and deep learning models were used on a data set consisting of 1.3 million social media comments, including both sarcastic and non-sarcastic comments. The data set was pre-processed using natural language processing methods, and additional features were extracted and analysed. Several machine learning models, including logistic regression, ridge regression, linear support vector and support vector machines, along with two deep learning models based on bidirectional long short-term memory and one bidirectional encoder representations from transformers (BERT)-based model, were implemented, evaluated and compared.

Findings

The performance of machine and deep learning models was compared in the task of sarcasm detection, and possible ways of improvement were discussed. Deep learning models showed more promise, performance-wise, for this type of task. Specifically, a state-of-the-art model in natural language processing, namely, BERT-based model, outperformed other machine and deep learning models.

Originality/value

This study compared the performance of the various machine and deep learning models in the task of sarcasm detection using the data set of 1.3 million comments from social media.

Details

Information Discovery and Delivery, vol. 52 no. 2
Type: Research Article
ISSN: 2398-6247

Keywords

Open Access
Article
Publication date: 12 June 2017

Lichao Zhu, Hangzhou Yang and Zhijun Yan

The purpose of this paper is to develop a new method to extract medical temporal information from online health communities.

Abstract

Purpose

The purpose of this paper is to develop a new method to extract medical temporal information from online health communities.

Design/methodology/approach

The authors trained a conditional random-filed model for the extraction of temporal expressions. The temporal relation identification is considered as a classification task and several support vector machine classifiers are built in the proposed method. For the model training, the authors extracted some high-level semantic features including co-reference relationship of medical concepts and the semantic similarity among words.

Findings

For the extraction of TIMEX, the authors find that well-formatted expressions are easy to recognize, and the main challenge is the relative TIMEX such as “three days after onset”. It also shows the same difficulty for normalization of absolute date or well-formatted duration, whereas frequency is easier to be normalized. For the identification of DocTimeRel, the result is fairly well, and the relation is difficult to identify when it involves a relative TIMEX or a hypothetical concept.

Originality/value

The authors proposed a new method to extract temporal information from the online clinical data and evaluated the usefulness of different level of syntactic features in this task.

Details

International Journal of Crowd Science, vol. 1 no. 2
Type: Research Article
ISSN: 2398-7294

Keywords

Open Access
Article
Publication date: 19 October 2020

Dipak Kumar Bhattacharyya

This study aims to help students to relate their theoretical knowledge in managing change in a crisis. It is more relevant in today’s pandemic situation and could be a morale…

718

Abstract

Purpose

This study aims to help students to relate their theoretical knowledge in managing change in a crisis. It is more relevant in today’s pandemic situation and could be a morale booster for many entrepreneurs who are struggling to sustain.

Design/methodology/approach

It is based on managing real-life change situation in organization, and it is presented in narrative form.

Findings

CERA India could successfully transform and sustain in Covid-19 pandemic situation with an inclusive approach, without losing their identity.

Research limitations/implications

This study is based on consulting experience and success story of one organization in pandemic situation. Important message is in a crisis, organizations can sustain partnering with people. But, this depends on the prevalent culture of the organization. Also, other organizations before replication need to ascertain the problem of their brand dilution, for shifting their focus to other product lines.

Practical implications

This story can be used in organizational change management classes, and students may be assigned to document their lessons. At the end of the story, some possible areas of investigation for students are listed for getting appropriate direction.

Social implications

In this pandemic situation, this study is socially relevant, as it shows how organizations can sustain with a human face.

Originality/value

This study is original and based on real-life experience in managing organizational transformation in a crisis situation. The name of the organization is imaginary, as organization did not like their name in public. This is one reason of not using their data for tabular presentation.

Details

Vilakshan - XIMB Journal of Management, vol. 17 no. 1/2
Type: Research Article
ISSN: 0973-1954

Keywords

Open Access
Article
Publication date: 19 July 2022

Shreyesh Doppalapudi, Tingyan Wang and Robin Qiu

Clinical notes typically contain medical jargons and specialized words and phrases that are complicated and technical to most people, which is one of the most challenging…

1061

Abstract

Purpose

Clinical notes typically contain medical jargons and specialized words and phrases that are complicated and technical to most people, which is one of the most challenging obstacles in health information dissemination to consumers by healthcare providers. The authors aim to investigate how to leverage machine learning techniques to transform clinical notes of interest into understandable expressions.

Design/methodology/approach

The authors propose a natural language processing pipeline that is capable of extracting relevant information from long unstructured clinical notes and simplifying lexicons by replacing medical jargons and technical terms. Particularly, the authors develop an unsupervised keywords matching method to extract relevant information from clinical notes. To automatically evaluate completeness of the extracted information, the authors perform a multi-label classification task on the relevant texts. To simplify lexicons in the relevant text, the authors identify complex words using a sequence labeler and leverage transformer models to generate candidate words for substitution. The authors validate the proposed pipeline using 58,167 discharge summaries from critical care services.

Findings

The results show that the proposed pipeline can identify relevant information with high completeness and simplify complex expressions in clinical notes so that the converted notes have a high level of readability but a low degree of meaning change.

Social implications

The proposed pipeline can help healthcare consumers well understand their medical information and therefore strengthen communications between healthcare providers and consumers for better care.

Originality/value

An innovative pipeline approach is developed to address the health literacy problem confronted by healthcare providers and consumers in the ongoing digital transformation process in the healthcare industry.

1 – 10 of over 2000