Search results

1 – 10 of 72
Article
Publication date: 22 April 2024

Ruoxi Zhang and Chenhan Ren

Abstract

Purpose

This study aims to construct a sentiment series generation method for danmu comments based on deep learning, and explore the features of sentiment series after clustering.

Design/methodology/approach

This study consisted of two main parts: danmu comment sentiment series generation and clustering. In the first part, the authors proposed a sentiment classification model based on BERT fine-tuning to quantify the sentiment polarity of danmu comments, and smoothed the resulting sentiment series using methods such as comprehensive weighting. In the second part, the shape-based distance (SBD) K-shape method was used to cluster the actual collected data.
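The SBD measure at the core of the K-shape step can be sketched as follows. This is a minimal illustration on invented toy series, not the authors' implementation (full K-shape additionally z-normalizes series before comparing shapes):

```python
import numpy as np

def sbd(x, y):
    # Shape-based distance: 1 minus the maximum normalized cross-correlation
    # over all shifts, so two series with the same shape at different
    # temporal offsets are treated as close.
    cc = np.correlate(x, y, mode="full")
    return 1.0 - cc.max() / (np.linalg.norm(x) * np.linalg.norm(y))

a = np.array([0.0, 1.0, 2.0, 1.0, 0.0])   # toy sentiment "curve"
b = np.array([1.0, 2.0, 1.0, 0.0, 0.0])   # same shape, shifted one step
print(sbd(a, b))  # ~0.0: identical shapes align under a shift
```

K-shape then alternates between assigning series to the nearest centroid under this distance and refining the centroids, which is what lets the method tolerate the temporal phase shifts mentioned under Originality/value.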

Findings

The filtered sentiment series or curves of the microfilms on the Bilibili website could be divided into four major categories. The first three types of sentiment curves each show a clearly stable time interval, while the fourth type shows a clear fluctuating trend overall. In addition, it was found that “disputed points” or “highlights” are likely to appear at the beginning and the climax of films, producing significant changes in the sentiment curves. The clustering results also show a significant difference in user participation, with the second type prevailing over the others.

Originality/value

The authors' sentiment classification model based on BERT fine-tuning outperformed the traditional sentiment lexicon method, providing a reference for using deep learning and transfer learning for danmu comment sentiment analysis. The BERT fine-tuning–SBD-K-shape algorithm can weaken the effects of irregular noise and temporal phase shifts in danmu text.

Details

The Electronic Library, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 0264-0473

Article
Publication date: 6 February 2024

Lin Xue and Feng Zhang

Abstract

Purpose

With the increasing number of Web services, correct and efficient classification of Web services is crucial to improve the efficiency of service discovery. However, existing Web service classification approaches ignore the class overlap in Web services, resulting in poor accuracy of classification in practice. This paper aims to provide an approach to address this issue.

Design/methodology/approach

This paper proposes a label confusion and prior correction-based Web service classification approach. First, functional semantic representations of Web service descriptions are obtained based on BERT. Then, label confusion learning techniques are used to enhance the model's ability to recognize and classify overlapping instances. Finally, the prediction results are corrected based on the label prior distribution to further improve classification effectiveness.
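The two ideas here, softening one-hot targets with label similarity and correcting predictions by the label prior, can be sketched roughly in numpy. The names, numbers and blending scheme are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def confused_target(one_hot, label_sims, alpha=4.0):
    # Label-confusion-style soft target: blend the one-hot label with a
    # similarity-derived distribution over labels, then renormalize, so
    # overlapping classes retain some probability mass in the target.
    return softmax(alpha * one_hot + softmax(label_sims))

def prior_correct(probs, prior):
    # Prior correction: divide out the label prior and renormalize, so
    # frequent classes are not unduly favoured at prediction time.
    adj = probs / prior
    return adj / adj.sum()

one_hot = np.array([1.0, 0.0, 0.0])
sims = np.array([2.0, 1.5, -1.0])   # class 1 is easily confused with class 0
target = confused_target(one_hot, sims)
print(target)                        # class 0 dominates, class 1 keeps mass

probs = np.array([0.45, 0.40, 0.15])
prior = np.array([0.70, 0.20, 0.10])
print(prior_correct(probs, prior))   # correction shifts mass to class 1
```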

Findings

Experiments based on the ProgrammableWeb data set show that the proposed model achieves improvements of 4.3%, 3.2% and 1% in Macro-F1 over ServeNet-BERT, BERT-DPCNN and CARL-NET, respectively.

Originality/value

This paper proposes a Web service classification approach that handles overlapping categories of Web services and improves the accuracy of Web service classification.

Details

International Journal of Web Information Systems, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 1744-0084

Article
Publication date: 18 May 2023

Rongen Yan, Depeng Dang, Hu Gao, Yan Wu and Wenhui Yu

Abstract

Purpose

Question answering (QA) answers questions posed by people in natural language. In QA, users' subjectivity means that the same question can be expressed in many different ways, which increases the difficulty of text retrieval. Therefore, the purpose of this paper is to explore a new query rewriting method for QA that integrates multiple related questions (RQs) to form an optimal question. Moreover, it is important to generate a new dataset pairing each original query (OQ) with multiple RQs.

Design/methodology/approach

This study collects a new dataset, SQuAD_extend, by crawling the QA community and uses a word graph to model the collected OQs. Beam search then finds the best path through the graph to obtain the optimal question. To represent the features of the question in depth, the pretrained BERT model is used to model the sentences.
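Beam search over a word graph can be sketched as below. The graph, edge weights and beam width are invented for illustration; the authors' scoring function will differ:

```python
# Toy word graph: nodes are words, edges carry (next_word, weight) pairs,
# where weights stand in for, e.g., co-occurrence scores.
graph = {
    "<s>": [("how", 0.9), ("what", 0.6)],
    "how": [("fix", 0.8), ("solve", 0.5)],
    "what": [("fix", 0.3)],
    "fix": [("error", 0.9), ("</s>", 0.1)],
    "solve": [("error", 0.7)],
    "error": [("</s>", 1.0)],
}

def beam_search(graph, start="<s>", end="</s>", width=2):
    beams = [([start], 1.0)]   # (path, cumulative score)
    done = []
    while beams:
        nxt = []
        for path, score in beams:
            for word, w in graph.get(path[-1], []):
                cand = (path + [word], score * w)
                (done if word == end else nxt).append(cand)
        # keep only the top-`width` partial paths
        beams = sorted(nxt, key=lambda c: c[1], reverse=True)[:width]
    best = max(done, key=lambda c: c[1])
    return " ".join(best[0][1:-1]), best[1]

question, score = beam_search(graph)
print(question, score)  # "how fix error" wins over the early-exit path
```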

Findings

The experimental results show three outstanding findings: (1) the quality of the answers is better after adding the RQs of the OQs; (2) the word graph used to model the question and choose the optimal path is conducive to finding the best question; and (3) BERT can deeply characterize the semantics of the question.

Originality/value

The proposed method can use word-graph to construct multiple questions and select the optimal path for rewriting the question, and the quality of answers is better than the baseline. In practice, the research results can help guide users to clarify their query intentions and finally achieve the best answer.

Details

Data Technologies and Applications, vol. 58 no. 1
Type: Research Article
ISSN: 2514-9288

Open Access
Article
Publication date: 31 July 2023

Daniel Šandor and Marina Bagić Babac

Abstract

Purpose

Sarcasm is a linguistic expression that usually carries the opposite meaning of what is being said by words, thus making it difficult for machines to discover the actual meaning. It is mainly distinguished by the inflection with which it is spoken, with an undercurrent of irony, and is largely dependent on context, which makes it a difficult task for computational analysis. Moreover, sarcasm expresses negative sentiments using positive words, allowing it to easily confuse sentiment analysis models. This paper aims to demonstrate the task of sarcasm detection using the approach of machine and deep learning.

Design/methodology/approach

For the purpose of sarcasm detection, machine and deep learning models were used on a data set consisting of 1.3 million social media comments, including both sarcastic and non-sarcastic comments. The data set was pre-processed using natural language processing methods, and additional features were extracted and analysed. Several machine learning models, including logistic regression, ridge regression, linear support vector classification and support vector machines, along with two deep learning models based on bidirectional long short-term memory and one bidirectional encoder representations from transformers (BERT)-based model, were implemented, evaluated and compared.
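As a sense of what the classical baselines here involve, a bag-of-words logistic regression can be trained in a few lines of numpy. The toy comments and labels are invented for illustration only:

```python
import numpy as np

# Toy corpus: 1 = sarcastic, 0 = literal (illustrative examples only)
docs = [
    ("oh great another monday", 1),
    ("wow i just love waiting in line", 1),
    ("yeah right that will totally work", 1),
    ("sure because that went so well last time", 1),
    ("the meeting is on monday", 0),
    ("i love this song", 0),
    ("the line was long today", 0),
    ("that approach worked well", 0),
]
vocab = sorted({w for text, _ in docs for w in text.split()})
X = np.array([[text.split().count(w) for w in vocab] for text, _ in docs], float)
y = np.array([label for _, label in docs], float)

# Logistic regression fitted by plain gradient descent on word counts
w, b = np.zeros(len(vocab)), 0.0
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
    grad = p - y
    w -= 0.5 * (X.T @ grad) / len(y)
    b -= 0.5 * grad.mean()

pred = (1.0 / (1.0 + np.exp(-(X @ w + b))) > 0.5).astype(int)
print((pred == y).mean())  # training accuracy on the toy set
```

Real pipelines replace raw counts with TF-IDF or contextual embeddings, which is where the BERT-based model's advantage on context-dependent sarcasm comes from.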

Findings

The performance of machine and deep learning models was compared in the task of sarcasm detection, and possible ways of improvement were discussed. Deep learning models showed more promise, performance-wise, for this type of task. Specifically, a state-of-the-art model in natural language processing, namely, BERT-based model, outperformed other machine and deep learning models.

Originality/value

This study compared the performance of the various machine and deep learning models in the task of sarcasm detection using the data set of 1.3 million comments from social media.

Details

Information Discovery and Delivery, vol. 52 no. 2
Type: Research Article
ISSN: 2398-6247

Article
Publication date: 17 May 2023

Tong Yang, Jie Wu and Junming Zhang

Abstract

Purpose

This study aims to establish a comprehensive satisfaction analysis framework by mining online restaurant reviews, which can not only accurately reveal consumer satisfaction but also identify factors leading to dissatisfaction and further quantify improvement opportunity levels.

Design/methodology/approach

Adopting deep learning, a Cross-Bidirectional Encoder Representations from Transformers (Cross-BERT) model is developed to measure customer satisfaction. An opinion mining technique is then used to extract consumers’ opinions and obtain dissatisfaction factors, and the opportunity algorithm is introduced to quantify attributes’ improvement opportunity levels. A total of 19,133 online reviews of 31 restaurants in Universal Beijing Resort are crawled to validate the framework.
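The opportunity algorithm referenced here is commonly computed as importance plus the unmet importance–satisfaction gap. A sketch with invented ratings (not the paper's data):

```python
def opportunity(importance, satisfaction):
    # Opportunity score in the Ulwick style: importance plus the gap
    # between importance and satisfaction, floored at zero so
    # over-served attributes are not rewarded.
    return importance + max(importance - satisfaction, 0.0)

# Illustrative attribute ratings on a 0-10 scale (hypothetical numbers)
attributes = {
    "Dish taste": (9.0, 5.0),
    "Waiters' attitude": (8.0, 6.0),
    "Decoration": (6.0, 7.0),
}
ranked = sorted(attributes, key=lambda a: opportunity(*attributes[a]), reverse=True)
print(ranked)  # ['Dish taste', "Waiters' attitude", 'Decoration']
```

High-importance, low-satisfaction attributes rise to the top, which is how the framework directs limited business resources.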

Findings

Results demonstrate the superiority of the Cross-BERT model over existing models such as the sentiment lexicon-based model and Naïve Bayes. More importantly, after effectively unveiling customer dissatisfaction factors (e.g. long queuing times and overly salty dishes), “Dish taste,” “Waiters’ attitude” and “Decoration” are identified as the three secondary attributes with the greatest improvement opportunities.

Practical implications

The proposed framework helps managers, especially in the restaurant industry, accurately understand customer satisfaction and reasons behind dissatisfaction, thereby generating efficient countermeasures. Especially, the improvement opportunity levels also benefit practitioners in efficiently allocating limited business resources.

Originality/value

This work contributes to hospitality and tourism literature by developing a comprehensive customer satisfaction analysis framework in the big data era. Moreover, to the best of the authors’ knowledge, this work is among the first to introduce the opportunity algorithm to quantify service improvement benefits. The proposed Cross-BERT model also advances the methodological literature on measuring customer satisfaction.

Details

International Journal of Contemporary Hospitality Management, vol. 36 no. 3
Type: Research Article
ISSN: 0959-6119

Article
Publication date: 5 May 2023

Ying Yu and Jing Ma

Abstract

Purpose

Tender documents, an essential data source for internet-based logistics tendering platforms, incorporate massive amounts of fine-grained data, ranging from information on tenderees to shipping locations and shipping items. Automated information extraction in this area is, however, under-researched, making the extraction process time- and effort-consuming. For Chinese logistics tender entities in particular, existing named entity recognition (NER) solutions are mostly unsuitable, as these entities involve domain-specific terminology and possess different semantic features.

Design/methodology/approach

To tackle this problem, a novel lattice long short-term memory (LSTM) model, combining a variant contextual feature representation and a conditional random field (CRF) layer, is proposed in this paper for identifying valuable entities from logistic tender documents. Instead of traditional word embedding, the proposed model uses the pretrained Bidirectional Encoder Representations from Transformers (BERT) model as input to augment the contextual feature representation. Subsequently, with the Lattice-LSTM model, the information of characters and words is effectively utilized to avoid error segmentation.
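The CRF layer's role at prediction time is Viterbi decoding over per-token tag scores and tag-transition scores. A compact numpy sketch with invented scores for three tags (O, B-ORG, I-ORG):

```python
import numpy as np

def viterbi(emissions, transitions):
    # emissions: (seq_len, n_tags) per-token tag scores (e.g. from LSTM)
    # transitions: (n_tags, n_tags) score of moving from tag i to tag j
    n, k = emissions.shape
    score = emissions[0].copy()
    back = np.zeros((n, k), dtype=int)
    for t in range(1, n):
        total = score[:, None] + transitions + emissions[t]
        back[t] = total.argmax(axis=0)   # best previous tag for each tag
        score = total.max(axis=0)
    tags = [int(score.argmax())]
    for t in range(n - 1, 0, -1):        # follow backpointers
        tags.append(int(back[t][tags[-1]]))
    return tags[::-1]

# Tags: O=0, B-ORG=1, I-ORG=2; the transition matrix forbids O -> I-ORG,
# which is exactly the kind of constraint a CRF layer enforces.
trans = np.array([[0.0, 0.0, -10.0],
                  [0.0, 0.0, 1.0],
                  [0.0, 0.0, 1.0]])
emis = np.array([[2.0, 1.0, 0.0],    # token 1: looks like O
                 [0.0, 2.0, 1.5],    # token 2: looks like B-ORG
                 [0.0, 0.0, 1.0]])   # token 3: I-ORG, valid only after B/I
print(viterbi(emis, trans))  # [0, 1, 2]
```

In the paper's architecture the emission scores come from the BERT-fed Lattice-LSTM rather than being hand-set as here.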

Findings

The proposed model is verified on a Chinese logistics tender named entity corpus. The results suggest that it outperforms other mainstream NER models on this corpus. The proposed model underpins the automatic extraction of logistics tender information, enabling logistics companies to perceive ever-changing market trends and make far-sighted logistics decisions.

Originality/value

(1) A practical model for logistics tender NER is proposed. By fine-tuning BERT on the downstream task with a small amount of data, the model achieves better performance than other existing models. This is the first study, to the best of the authors' knowledge, to extract named entities from Chinese logistics tender documents. (2) A real logistics tender corpus for practical use is constructed, and a program for online processing of real logistics tender documents is developed in this work. The authors believe that the model will facilitate logistics companies in converting unstructured documents to structured data and further perceiving ever-changing market trends to make far-sighted logistics decisions.

Details

Data Technologies and Applications, vol. 58 no. 1
Type: Research Article
ISSN: 2514-9288

Article
Publication date: 20 March 2024

Qiuying Chen, Ronghui Liu, Qingquan Jiang and Shangyue Xu

Abstract

Purpose

Tourists with different cultural backgrounds think and behave differently. Accurately capturing and correctly understanding cultural differences will help tourist destinations in product/service planning, marketing communication and attracting and retaining tourists. This research employs Hofstede's cultural dimensions theory to analyse the variations in destination image perceptions of Chinese-speaking and English-speaking tourists to Xiamen, a prominent tourist attraction in China.

Design/methodology/approach

The evaluation utilizes a two-stage approach incorporating LDA and BERT-BiLSTM models. By leveraging text mining, sentiment analysis and t-tests, this research investigates the variations in tourists' perceptions of Xiamen across cultures.
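The t-test stage can be sketched with Welch's t statistic, which compares the mean sentiment of two groups without assuming equal variances. The per-review scores below are invented for illustration:

```python
import math

def welch_t(a, b):
    # Welch's t statistic for two independent samples with unequal variances
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    va = sum((x - ma) ** 2 for x in a) / (len(a) - 1)   # sample variances
    vb = sum((x - mb) ** 2 for x in b) / (len(b) - 1)
    return (ma - mb) / math.sqrt(va / len(a) + vb / len(b))

# Hypothetical per-review sentiment scores for two language groups
zh = [0.8, 0.7, 0.9, 0.6, 0.75]
en = [0.5, 0.6, 0.4, 0.55, 0.45]
print(round(welch_t(zh, en), 3))
```

A large absolute t suggests the groups' mean sentiment genuinely differs; a p-value would additionally require the Welch–Satterthwaite degrees of freedom.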

Findings

The results reveal that cultural disparities significantly impact tourists' perceived image of Xiamen, particularly regarding their preferences for renowned tourist destinations and the factors influencing their travel experience.

Originality/value

This research pioneers applying natural language processing methods and machine learning techniques to affirm the substantial differences in the perceptions of tourist destinations among Chinese-speaking and English-speaking tourists based on Hofstede's cultural theory. The findings furnish theoretical insights for destination marketing organizations to target diverse cultural tourists through precise marketing strategies and illuminate the practical application of Hofstede's cultural theory in tourism and hospitality.

Details

Data Technologies and Applications, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 2514-9288

Article
Publication date: 29 August 2023

Hei-Chia Wang, Martinus Maslim and Hung-Yu Liu

Abstract

Purpose

A clickbait is a deceptive headline designed to boost ad revenue without presenting closely relevant content. Clickbait has numerous negative repercussions, such as making viewers feel tricked and unhappy, causing long-term confusion and even attracting cybercriminals. Automatic detection algorithms for clickbait have been developed to address this issue. However, existing detection technologies are limited by using only one semantic representation for the same term and by the scarcity of Chinese datasets. This study aims to overcome these limitations of automated clickbait detection on Chinese data.

Design/methodology/approach

This study combines news headlines and news content to train a model that captures the probable relationship between clickbait headlines and the underlying articles. In addition, part-of-speech elements are used to generate the most appropriate semantic representation for clickbait detection, improving detection performance.

Findings

This research successfully compiled a dataset containing up to 20,896 Chinese clickbait news articles. This collection contains news headlines, articles, categories and supplementary metadata. The suggested context-aware clickbait detection (CA-CD) model outperforms existing clickbait detection approaches on many criteria, demonstrating the proposed strategy's efficacy.

Originality/value

The originality of this study resides in the newly compiled Chinese clickbait dataset and contextual semantic representation-based clickbait detection approach employing transfer learning. This method can modify the semantic representation of each word based on context and assist the model in more precisely interpreting the original meaning of news articles.

Details

Data Technologies and Applications, vol. 58 no. 2
Type: Research Article
ISSN: 2514-9288

Article
Publication date: 29 March 2024

Sihao Li, Jiali Wang and Zhao Xu

Abstract

Purpose

The compliance checking of Building Information Modeling (BIM) models is crucial throughout the lifecycle of construction. The increasing amount and complexity of information carried by BIM models have made compliance checking more challenging, and manual methods are prone to errors. Therefore, this study aims to propose an integrative conceptual framework for automated compliance checking of BIM models, allowing for the identification of errors within BIM models.

Design/methodology/approach

This study first analyzes typical building standards in the fields of architecture and fire protection, and then develops an ontology of their elements. Based on this, a building standard corpus is built, and deep learning models are trained to automatically label the building standard texts. Neo4j is utilized for knowledge graph construction and storage, and a data extraction method based on Dynamo is designed to obtain checking data files. After that, a matching algorithm is devised to express the logical rules of knowledge graph triples, resulting in automated compliance checking for BIM models.
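The final matching step, comparing extracted model data against rule triples from the knowledge graph, can be sketched as below. The rules, property names and thresholds are hypothetical and not drawn from any actual standard:

```python
# Hypothetical rule triples derived from a building-standard knowledge
# graph: (element, property, comparison operator, required value)
rules = [
    ("FireDoor", "fire_resistance_min", ">=", 1.5),   # hours, illustrative
    ("Corridor", "clear_width_min", ">=", 1.2),       # metres, illustrative
]

# Hypothetical checking data extracted from the BIM model (e.g. via Dynamo)
model_data = {
    "FireDoor": {"fire_resistance_min": 1.0},
    "Corridor": {"clear_width_min": 1.4},
}

def check(rules, model_data):
    ops = {">=": lambda a, b: a >= b, "<=": lambda a, b: a <= b}
    report = []
    for element, prop, op, required in rules:
        actual = model_data.get(element, {}).get(prop)
        ok = actual is not None and ops[op](actual, required)
        report.append((element, prop, actual, required, "pass" if ok else "fail"))
    return report

for row in check(rules, model_data):
    print(row)  # the under-rated fire door fails, the corridor passes
```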

Findings

Case validation results showed that this theoretical framework can achieve the automatic construction of domain knowledge graphs and automatic checking of BIM model compliance. Compared with traditional methods, this method has a higher degree of automation and portability.

Originality/value

This study introduces knowledge graphs and natural language processing technology into the field of BIM model checking and completes the automated process of constructing domain knowledge graphs and checking BIM model data. Its functionality and usability are validated through two case studies on a self-developed BIM checking platform.

Details

Engineering, Construction and Architectural Management, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 0969-9988

Article
Publication date: 5 April 2024

Ayse Ocal and Kevin Crowston

Abstract

Purpose

Research on artificial intelligence (AI) and its potential effects on the workplace is increasing. How AI and the futures of work are framed in traditional media has been examined in prior studies, but current research has not gone far enough in examining how AI is framed on social media. This paper aims to fill this gap by examining how people frame the futures of work and intelligent machines when they post on social media.

Design/methodology/approach

We investigate public interpretations, assumptions and expectations, referring to framing expressed in social media conversations. We also coded the emotions and attitudes expressed in the text data. A corpus consisting of 998 unique Reddit post titles and their corresponding 16,611 comments was analyzed using computer-aided textual analysis comprising a BERTopic model and two BERT text classification models, one for emotion and the other for sentiment analysis, supported by human judgment.

Findings

Different interpretations, assumptions and expectations were found in the conversations. Three subframes were analyzed in detail under the overarching frame of the New World of Work: (1) general impacts of intelligent machines on society, (2) undertaking of tasks (augmentation and substitution) and (3) loss of jobs. The general attitude observed in conversations was slightly positive, and the most common emotion category was curiosity.

Originality/value

Findings from this research can uncover public needs and expectations regarding the future of work with intelligent machines. The findings may also help shape research directions about futures of work. Furthermore, firms, organizations or industries may employ framing methods to analyze customers’ or workers’ responses or even influence the responses. Another contribution of this work is the application of framing theory to interpreting how people conceptualize the future of work with intelligent machines.

Details

Information Technology & People, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 0959-3845
