Search results

1 – 10 of over 1000
Open Access
Article
Publication date: 19 December 2023

Qinxu Ding, Ding Ding, Yue Wang, Chong Guan and Bosheng Ding

The rapid rise of large language models (LLMs) has propelled them to the forefront of applications in natural language processing (NLP). This paper aims to present a comprehensive…

1483

Abstract

Purpose

The rapid rise of large language models (LLMs) has propelled them to the forefront of applications in natural language processing (NLP). This paper aims to present a comprehensive examination of the research landscape in LLMs, providing an overview of the prevailing themes and topics within this dynamic domain.

Design/methodology/approach

Drawing from an extensive corpus of 198 records published between 1996 to 2023 from the relevant academic database encompassing journal articles, books, book chapters, conference papers and selected working papers, this study delves deep into the multifaceted world of LLM research. In this study, the authors employed the BERTopic algorithm, a recent advancement in topic modeling, to conduct a comprehensive analysis of the data after it had been meticulously cleaned and preprocessed. BERTopic leverages the power of transformer-based language models like bidirectional encoder representations from transformers (BERT) to generate more meaningful and coherent topics. This approach facilitates the identification of hidden patterns within the data, enabling authors to uncover valuable insights that might otherwise have remained obscure. The analysis revealed four distinct clusters of topics in LLM research: “language and NLP”, “education and teaching”, “clinical and medical applications” and “speech and recognition techniques”. Each cluster embodies a unique aspect of LLM application and showcases the breadth of possibilities that LLM technology has to offer. In addition to presenting the research findings, this paper identifies key challenges and opportunities in the realm of LLMs. It underscores the necessity for further investigation in specific areas, including the paramount importance of addressing potential biases, transparency and explainability, data privacy and security, and responsible deployment of LLM technology.

Findings

The analysis revealed four distinct clusters of topics in LLM research: “language and NLP”, “education and teaching”, “clinical and medical applications” and “speech and recognition techniques”. Each cluster embodies a unique aspect of LLM application and showcases the breadth of possibilities that LLM technology has to offer. In addition to presenting the research findings, this paper identifies key challenges and opportunities in the realm of LLMs. It underscores the necessity for further investigation in specific areas, including the paramount importance of addressing potential biases, transparency and explainability, data privacy and security, and responsible deployment of LLM technology.

Practical implications

This classification offers practical guidance for researchers, developers, educators, and policymakers to focus efforts and resources. The study underscores the importance of addressing challenges in LLMs, including potential biases, transparency, data privacy, and responsible deployment. Policymakers can utilize this information to shape regulations, while developers can tailor technology development based on the diverse applications identified. The findings also emphasize the need for interdisciplinary collaboration and highlight ethical considerations, providing a roadmap for navigating the complex landscape of LLM research and applications.

Originality/value

This study stands out as the first to examine the evolution of LLMs across such a long time frame and across such diversified disciplines. It provides a unique perspective on the key areas of LLM research, highlighting the breadth and depth of LLM’s evolution.

Details

Journal of Electronic Business & Digital Economics, vol. 3 no. 1
Type: Research Article
ISSN: 2754-4214

Keywords

Open Access
Article
Publication date: 31 July 2023

Daniel Šandor and Marina Bagić Babac

Sarcasm is a linguistic expression that usually carries the opposite meaning of what is being said by words, thus making it difficult for machines to discover the actual meaning…

2941

Abstract

Purpose

Sarcasm is a linguistic expression that usually carries the opposite meaning of what is being said by words, thus making it difficult for machines to discover the actual meaning. It is mainly distinguished by the inflection with which it is spoken, with an undercurrent of irony, and is largely dependent on context, which makes it a difficult task for computational analysis. Moreover, sarcasm expresses negative sentiments using positive words, allowing it to easily confuse sentiment analysis models. This paper aims to demonstrate the task of sarcasm detection using the approach of machine and deep learning.

Design/methodology/approach

For the purpose of sarcasm detection, machine and deep learning models were used on a data set consisting of 1.3 million social media comments, including both sarcastic and non-sarcastic comments. The data set was pre-processed using natural language processing methods, and additional features were extracted and analysed. Several machine learning models, including logistic regression, ridge regression, linear support vector and support vector machines, along with two deep learning models based on bidirectional long short-term memory and one bidirectional encoder representations from transformers (BERT)-based model, were implemented, evaluated and compared.

Findings

The performance of machine and deep learning models was compared in the task of sarcasm detection, and possible ways of improvement were discussed. Deep learning models showed more promise, performance-wise, for this type of task. Specifically, a state-of-the-art model in natural language processing, namely, BERT-based model, outperformed other machine and deep learning models.

Originality/value

This study compared the performance of the various machine and deep learning models in the task of sarcasm detection using the data set of 1.3 million comments from social media.

Details

Information Discovery and Delivery, vol. 52 no. 2
Type: Research Article
ISSN: 2398-6247

Keywords

Open Access
Article
Publication date: 6 April 2023

Karlo Puh and Marina Bagić Babac

Predicting the stock market's prices has always been an interesting topic since its closely related to making money. Recently, the advances in natural language processing (NLP…

4394

Abstract

Purpose

Predicting the stock market's prices has always been an interesting topic since its closely related to making money. Recently, the advances in natural language processing (NLP) have opened new perspectives for solving this task. The purpose of this paper is to show a state-of-the-art natural language approach to using language in predicting the stock market.

Design/methodology/approach

In this paper, the conventional statistical models for time-series prediction are implemented as a benchmark. Then, for methodological comparison, various state-of-the-art natural language models ranging from the baseline convolutional and recurrent neural network models to the most advanced transformer-based models are developed, implemented and tested.

Findings

Experimental results show that there is a correlation between the textual information in the news headlines and stock price prediction. The model based on the GRU (gated recurrent unit) cell with one linear layer, which takes pairs of the historical prices and the sentiment score calculated using transformer-based models, achieved the best result.

Originality/value

This study provides an insight into how to use NLP to improve stock price prediction and shows that there is a correlation between news headlines and stock price prediction.

Details

American Journal of Business, vol. 38 no. 2
Type: Research Article
ISSN: 1935-5181

Keywords

Open Access
Article
Publication date: 23 January 2024

Luís Jacques de Sousa, João Poças Martins, Luís Sanhudo and João Santos Baptista

This study aims to review recent advances towards the implementation of ANN and NLP applications during the budgeting phase of the construction process. During this phase…

Abstract

Purpose

This study aims to review recent advances towards the implementation of ANN and NLP applications during the budgeting phase of the construction process. During this phase, construction companies must assess the scope of each task and map the client’s expectations to an internal database of tasks, resources and costs. Quantity surveyors carry out this assessment manually with little to no computer aid, within very austere time constraints, even though these results determine the company’s bid quality and are contractually binding.

Design/methodology/approach

This paper seeks to compile applications of machine learning (ML) and natural language processing in the architectural engineering and construction sector to find which methodologies can assist this assessment. The paper carries out a systematic literature review, following the preferred reporting items for systematic reviews and meta-analyses guidelines, to survey the main scientific contributions within the topic of text classification (TC) for budgeting in construction.

Findings

This work concludes that it is necessary to develop data sets that represent the variety of tasks in construction, achieve higher accuracy algorithms, widen the scope of their application and reduce the need for expert validation of the results. Although full automation is not within reach in the short term, TC algorithms can provide helpful support tools.

Originality/value

Given the increasing interest in ML for construction and recent developments, the findings disclosed in this paper contribute to the body of knowledge, provide a more automated perspective on budgeting in construction and break ground for further implementation of text-based ML in budgeting for construction.

Details

Construction Innovation , vol. 24 no. 7
Type: Research Article
ISSN: 1471-4175

Keywords

Open Access
Article
Publication date: 21 October 2021

Elena Barbierato, Iacopo Bernetti and Irene Capecchi

Wine packaged tours as a specific aspect of wine tourism have so far been neglected in research, for this reason, the purpose of this study is to study the key elements for the…

3748

Abstract

Purpose

Wine packaged tours as a specific aspect of wine tourism have so far been neglected in research, for this reason, the purpose of this study is to study the key elements for the success of the wine tour in Tuscany (Italy), evaluating the points of strength and weakness.

Design/methodology/approach

The study combines approaches of text mining, sentiment analysis and natural language processing, drawing on data from the TripAdvisor platform, obtaining through an automatic procedure 9,616 reviews from 600 tours in the years 2010–2020.

Findings

The authors identified six elements of successful wine tours expressed by research subjects: tour guide; logistical aspects; the quality of the wine; the quality of the food; complementary tourist and recreational activities; the landscape and historic villages. The key strength associated with success was the integration of the leading wine product with food, landscape and historic villages, while the main criticisms were concerned with the organization and planning of the tour. Furthermore, the tour guide also plays a fundamental role in consumer satisfaction.

Research limitations/implications

The limitations of the method were linked to the origin of the data used. The main one is that TripAdvisor does not allow you to have social and personal information about the tourist who wrote the review; therefore, the methods are substantially complementary to the traditional survey through questionnaires.

Practical implications

The proposed model can be used both by professionals to improve the quality of their products and by policymakers to promote the territorial development of quality wine-growing areas.

Social implications

The proposed model can be useful for policymakers to promote the territorial development of quality wine-growing areas.

Originality/value

The methodology we tested is easily transferable to many countries and to the authors’ knowledge, for the first time attempts to combine multidimensional scaling, sentiment analysis and natural language processing approaches.

Details

International Journal of Wine Business Research, vol. 34 no. 2
Type: Research Article
ISSN: 1751-1062

Keywords

Open Access
Article
Publication date: 17 July 2020

Imad Zeroual and Abdelhak Lakhouaja

Recently, more data-driven approaches are demanding multilingual parallel resources primarily in the cross-language studies. To meet these demands, building multilingual parallel…

2574

Abstract

Recently, more data-driven approaches are demanding multilingual parallel resources primarily in the cross-language studies. To meet these demands, building multilingual parallel corpora are becoming the focus of many Natural Language Processing (NLP) scientific groups. Unlike monolingual corpora, the number of available multilingual parallel corpora is limited. In this paper, the MulTed, a corpus of subtitles extracted from TEDx talks is introduced. It is multilingual, Part of Speech (PoS) tagged, and bilingually sentence-aligned with English as a pivot language. This corpus is designed for many NLP applications, where the sentence-alignment, the PoS tagging, and the size of corpora are influential such as statistical machine translation, language recognition, and bilingual dictionary generation. Currently, the corpus has subtitles that cover 1100 talks available in over 100 languages. The subtitles are classified based on a variety of topics such as Business, Education, and Sport. Regarding the PoS tagging, the Treetagger, a language-independent PoS tagger, is used; then, to make the PoS tagging maximally useful, a mapping process to a universal common tagset is performed. Finally, we believe that making the MulTed corpus available for a public use can be a significant contribution to the literature of NLP and corpus linguistics, especially for under-resourced languages.

Details

Applied Computing and Informatics, vol. 18 no. 1/2
Type: Research Article
ISSN: 2210-8327

Keywords

Open Access
Article
Publication date: 5 December 2023

Manuel J. Sánchez-Franco and Sierra Rey-Tienda

This research proposes to organise and distil this massive amount of data, making it easier to understand. Using data mining, machine learning techniques and visual approaches…

Abstract

Purpose

This research proposes to organise and distil this massive amount of data, making it easier to understand. Using data mining, machine learning techniques and visual approaches, researchers and managers can extract valuable insights (on guests' preferences) and convert them into strategic thinking based on exploration and predictive analysis. Consequently, this research aims to assist hotel managers in making informed decisions, thus improving the overall guest experience and increasing competitiveness.

Design/methodology/approach

This research employs natural language processing techniques, data visualisation proposals and machine learning methodologies to analyse unstructured guest service experience content. In particular, this research (1) applies data mining to evaluate the role and significance of critical terms and semantic structures in hotel assessments; (2) identifies salient tokens to depict guests' narratives based on term frequency and the information quantity they convey; and (3) tackles the challenge of managing extensive document repositories through automated identification of latent topics in reviews by using machine learning methods for semantic grouping and pattern visualisation.

Findings

This study’s findings (1) aim to identify critical features and topics that guests highlight during their hotel stays, (2) visually explore the relationships between these features and differences among diverse types of travellers through online hotel reviews and (3) determine predictive power. Their implications are crucial for the hospitality domain, as they provide real-time insights into guests' perceptions and business performance and are essential for making informed decisions and staying competitive.

Originality/value

This research seeks to minimise the cognitive processing costs of the enormous amount of content published by the user through a better organisation of hotel service reviews and their visualisation. Likewise, this research aims to propose a methodology and method available to tourism organisations to obtain truly useable knowledge in the design of the hotel offer and its value propositions.

Details

Management Decision, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 0025-1747

Keywords

Open Access
Article
Publication date: 5 May 2022

Konstantinos Solakis, Vicky Katsoni, Ali B. Mahmoud and Nicholas Grigoriou

This is a general review study aiming to specify the key customer-based factors and technologies that influence the value co-creation (VCC) process through artificial intelligence…

11296

Abstract

Purpose

This is a general review study aiming to specify the key customer-based factors and technologies that influence the value co-creation (VCC) process through artificial intelligence (AI) and automation in the hospitality and tourism industry.

Design/methodology/approach

The study uses a theory-based general literature review approach to explore key customer-based factors and technologies influencing VCC in the tourism industry. By reviewing the relevant literature, the authors conclude a theoretical framework postulating the determinants of VCC in the AI-driven tourism industry.

Findings

This paper identifies customers' perceptions, attitudes, trust, social influence, hedonic motivations, anthropomorphism and prior experience as customer-based factors to VCC through the use of AI. Service robots, AI-enabled self-service kiosks, chatbots, metaversal tourism and new reality, machine learning (ML) and natural language processing (NLP) are technologies that influence VCC.

Research limitations/implications

The results of this research inform a theoretical framework articulating the human and AI elements for future research set to expand the models predicting VCC in the tourism industry.

Originality/value

Few studies have examined consumer-related factors that influence their participation in the VCC process through automation and AI.

Details

Journal of Tourism Futures, vol. 10 no. 1
Type: Research Article
ISSN: 2055-5911

Keywords

Open Access
Article
Publication date: 23 November 2023

Reema Khaled AlRowais and Duaa Alsaeed

Automatically extracting stance information from natural language texts is a significant research problem with various applications, particularly after the recent explosion of…

238

Abstract

Purpose

Automatically extracting stance information from natural language texts is a significant research problem with various applications, particularly after the recent explosion of data on the internet via platforms like social media sites. Stance detection system helps determine whether the author agree, against or has a neutral opinion with the given target. Most of the research in stance detection focuses on the English language, while few research was conducted on the Arabic language.

Design/methodology/approach

This paper aimed to address stance detection on Arabic tweets by building and comparing different stance detection models using four transformers, namely: Araelectra, MARBERT, AraBERT and Qarib. Using different weights for these transformers, the authors performed extensive experiments fine-tuning the task of stance detection Arabic tweets with the four different transformers.

Findings

The results showed that the AraBERT model learned better than the other three models with a 70% F1 score followed by the Qarib model with a 68% F1 score.

Research limitations/implications

A limitation of this study is the imbalanced dataset and the limited availability of annotated datasets of SD in Arabic.

Originality/value

Provide comprehensive overview of the current resources for stance detection in the literature, including datasets and machine learning methods used. Therefore, the authors examined the models to analyze and comprehend the obtained findings in order to make recommendations for the best performance models for the stance detection task.

Details

Arab Gulf Journal of Scientific Research, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 1985-9899

Keywords

Open Access
Article
Publication date: 14 July 2022

Karlo Puh and Marina Bagić Babac

As the tourism industry becomes more vital for the success of many economies around the world, the importance of technology in tourism grows daily. Alongside increasing tourism…

6031

Abstract

Purpose

As the tourism industry becomes more vital for the success of many economies around the world, the importance of technology in tourism grows daily. Alongside increasing tourism importance and popularity, the amount of significant data grows, too. On daily basis, millions of people write their opinions, suggestions and views about accommodation, services, and much more on various websites. Well-processed and filtered data can provide a lot of useful information that can be used for making tourists' experiences much better and help us decide when selecting a hotel or a restaurant. Thus, the purpose of this study is to explore machine and deep learning models for predicting sentiment and rating from tourist reviews.

Design/methodology/approach

This paper used machine learning models such as Naïve Bayes, support vector machines (SVM), convolutional neural network (CNN), long short-term memory (LSTM) and bidirectional long short-term memory (BiLSTM) for extracting sentiment and ratings from tourist reviews. These models were trained to classify reviews into positive, negative, or neutral sentiment, and into one to five grades or stars. Data used for training the models were gathered from TripAdvisor, the world's largest travel platform. The models based on multinomial Naïve Bayes (MNB) and SVM were trained using the term frequency-inverse document frequency (TF-IDF) for word representations while deep learning models were trained using global vectors (GloVe) for word representation. The results from testing these models are presented, compared and discussed.

Findings

The performance of machine and learning models achieved high accuracy in predicting positive, negative, or neutral sentiments and ratings from tourist reviews. The optimal model architecture for both classification tasks was a deep learning model based on BiLSTM. The study’s results confirmed that deep learning models are more efficient and accurate than machine learning algorithms.

Practical implications

The proposed models allow for forecasting the number of tourist arrivals and expenditure, gaining insights into the tourists' profiles, improving overall customer experience, and upgrading marketing strategies. Different service sectors can use the implemented models to get insights into customer satisfaction with the products and services as well as to predict the opinions given a particular context.

Originality/value

This study developed and compared different machine learning models for classifying customer reviews as positive, negative, or neutral, as well as predicting ratings with one to five stars based on a TripAdvisor hotel reviews dataset that contains 20,491 unique hotel reviews.

Details

Journal of Hospitality and Tourism Insights, vol. 6 no. 3
Type: Research Article
ISSN: 2514-9792

Keywords

1 – 10 of over 1000