Search results

1 – 10 of 19
Open Access
Article
Publication date: 19 December 2023

Qinxu Ding, Ding Ding, Yue Wang, Chong Guan and Bosheng Ding

The rapid rise of large language models (LLMs) has propelled them to the forefront of applications in natural language processing (NLP). This paper aims to present a comprehensive…

1483

Abstract

Purpose

The rapid rise of large language models (LLMs) has propelled them to the forefront of applications in natural language processing (NLP). This paper aims to present a comprehensive examination of the research landscape in LLMs, providing an overview of the prevailing themes and topics within this dynamic domain.

Design/methodology/approach

Drawing from an extensive corpus of 198 records published between 1996 to 2023 from the relevant academic database encompassing journal articles, books, book chapters, conference papers and selected working papers, this study delves deep into the multifaceted world of LLM research. In this study, the authors employed the BERTopic algorithm, a recent advancement in topic modeling, to conduct a comprehensive analysis of the data after it had been meticulously cleaned and preprocessed. BERTopic leverages the power of transformer-based language models like bidirectional encoder representations from transformers (BERT) to generate more meaningful and coherent topics. This approach facilitates the identification of hidden patterns within the data, enabling authors to uncover valuable insights that might otherwise have remained obscure. The analysis revealed four distinct clusters of topics in LLM research: “language and NLP”, “education and teaching”, “clinical and medical applications” and “speech and recognition techniques”. Each cluster embodies a unique aspect of LLM application and showcases the breadth of possibilities that LLM technology has to offer. In addition to presenting the research findings, this paper identifies key challenges and opportunities in the realm of LLMs. It underscores the necessity for further investigation in specific areas, including the paramount importance of addressing potential biases, transparency and explainability, data privacy and security, and responsible deployment of LLM technology.

Findings

The analysis revealed four distinct clusters of topics in LLM research: “language and NLP”, “education and teaching”, “clinical and medical applications” and “speech and recognition techniques”. Each cluster embodies a unique aspect of LLM application and showcases the breadth of possibilities that LLM technology has to offer. In addition to presenting the research findings, this paper identifies key challenges and opportunities in the realm of LLMs. It underscores the necessity for further investigation in specific areas, including the paramount importance of addressing potential biases, transparency and explainability, data privacy and security, and responsible deployment of LLM technology.

Practical implications

This classification offers practical guidance for researchers, developers, educators, and policymakers to focus efforts and resources. The study underscores the importance of addressing challenges in LLMs, including potential biases, transparency, data privacy, and responsible deployment. Policymakers can utilize this information to shape regulations, while developers can tailor technology development based on the diverse applications identified. The findings also emphasize the need for interdisciplinary collaboration and highlight ethical considerations, providing a roadmap for navigating the complex landscape of LLM research and applications.

Originality/value

This study stands out as the first to examine the evolution of LLMs across such a long time frame and across such diversified disciplines. It provides a unique perspective on the key areas of LLM research, highlighting the breadth and depth of LLM’s evolution.

Details

Journal of Electronic Business & Digital Economics, vol. 3 no. 1
Type: Research Article
ISSN: 2754-4214

Keywords

Open Access
Article
Publication date: 19 July 2022

Shreyesh Doppalapudi, Tingyan Wang and Robin Qiu

Clinical notes typically contain medical jargons and specialized words and phrases that are complicated and technical to most people, which is one of the most challenging…

1062

Abstract

Purpose

Clinical notes typically contain medical jargons and specialized words and phrases that are complicated and technical to most people, which is one of the most challenging obstacles in health information dissemination to consumers by healthcare providers. The authors aim to investigate how to leverage machine learning techniques to transform clinical notes of interest into understandable expressions.

Design/methodology/approach

The authors propose a natural language processing pipeline that is capable of extracting relevant information from long unstructured clinical notes and simplifying lexicons by replacing medical jargons and technical terms. Particularly, the authors develop an unsupervised keywords matching method to extract relevant information from clinical notes. To automatically evaluate completeness of the extracted information, the authors perform a multi-label classification task on the relevant texts. To simplify lexicons in the relevant text, the authors identify complex words using a sequence labeler and leverage transformer models to generate candidate words for substitution. The authors validate the proposed pipeline using 58,167 discharge summaries from critical care services.

Findings

The results show that the proposed pipeline can identify relevant information with high completeness and simplify complex expressions in clinical notes so that the converted notes have a high level of readability but a low degree of meaning change.

Social implications

The proposed pipeline can help healthcare consumers well understand their medical information and therefore strengthen communications between healthcare providers and consumers for better care.

Originality/value

An innovative pipeline approach is developed to address the health literacy problem confronted by healthcare providers and consumers in the ongoing digital transformation process in the healthcare industry.

Open Access
Article
Publication date: 23 November 2023

Reema Khaled AlRowais and Duaa Alsaeed

Automatically extracting stance information from natural language texts is a significant research problem with various applications, particularly after the recent explosion of…

238

Abstract

Purpose

Automatically extracting stance information from natural language texts is a significant research problem with various applications, particularly after the recent explosion of data on the internet via platforms like social media sites. Stance detection system helps determine whether the author agree, against or has a neutral opinion with the given target. Most of the research in stance detection focuses on the English language, while few research was conducted on the Arabic language.

Design/methodology/approach

This paper aimed to address stance detection on Arabic tweets by building and comparing different stance detection models using four transformers, namely: Araelectra, MARBERT, AraBERT and Qarib. Using different weights for these transformers, the authors performed extensive experiments fine-tuning the task of stance detection Arabic tweets with the four different transformers.

Findings

The results showed that the AraBERT model learned better than the other three models with a 70% F1 score followed by the Qarib model with a 68% F1 score.

Research limitations/implications

A limitation of this study is the imbalanced dataset and the limited availability of annotated datasets of SD in Arabic.

Originality/value

Provide comprehensive overview of the current resources for stance detection in the literature, including datasets and machine learning methods used. Therefore, the authors examined the models to analyze and comprehend the obtained findings in order to make recommendations for the best performance models for the stance detection task.

Details

Arab Gulf Journal of Scientific Research, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 1985-9899

Keywords

Open Access
Article
Publication date: 16 November 2023

Bahareh Farhoudinia, Selcen Ozturkcan and Nihat Kasap

This paper aims to conduct an interdisciplinary systematic literature review (SLR) of fake news research and to advance the socio-technical understanding of digital information…

1284

Abstract

Purpose

This paper aims to conduct an interdisciplinary systematic literature review (SLR) of fake news research and to advance the socio-technical understanding of digital information practices and platforms in business and management studies.

Design/methodology/approach

The paper applies a focused, SLR method to analyze articles on fake news in business and management journals from 2010 to 2020.

Findings

The paper analyzes the definition, theoretical frameworks, methods and research gaps of fake news in the business and management domains. It also identifies some promising research opportunities for future scholars.

Practical implications

The paper offers practical implications for various stakeholders who are affected by or involved in fake news dissemination, such as brands, consumers and policymakers. It provides recommendations to cope with the challenges and risks of fake news.

Social implications

The paper discusses the social consequences and future threats of fake news, especially in relation to social networking and social media. It calls for more awareness and responsibility from online communities to prevent and combat fake news.

Originality/value

The paper contributes to the literature on information management by showing the importance and consequences of fake news sharing for societies. It is among the frontier systematic reviews in the field that covers studies from different disciplines and focuses on business and management studies.

Details

Aslib Journal of Information Management, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 2050-3806

Keywords

Open Access
Article
Publication date: 15 February 2022

Martin Nečaský, Petr Škoda, David Bernhauer, Jakub Klímek and Tomáš Skopal

Semantic retrieval and discovery of datasets published as open data remains a challenging task. The datasets inherently originate in the globally distributed web jungle, lacking…

1210

Abstract

Purpose

Semantic retrieval and discovery of datasets published as open data remains a challenging task. The datasets inherently originate in the globally distributed web jungle, lacking the luxury of centralized database administration, database schemes, shared attributes, vocabulary, structure and semantics. The existing dataset catalogs provide basic search functionality relying on keyword search in brief, incomplete or misleading textual metadata attached to the datasets. The search results are thus often insufficient. However, there exist many ways of improving the dataset discovery by employing content-based retrieval, machine learning tools, third-party (external) knowledge bases, countless feature extraction methods and description models and so forth.

Design/methodology/approach

In this paper, the authors propose a modular framework for rapid experimentation with methods for similarity-based dataset discovery. The framework consists of an extensible catalog of components prepared to form custom pipelines for dataset representation and discovery.

Findings

The study proposes several proof-of-concept pipelines including experimental evaluation, which showcase the usage of the framework.

Originality/value

To the best of authors’ knowledge, there is no similar formal framework for experimentation with various similarity methods in the context of dataset discovery. The framework has the ambition to establish a platform for reproducible and comparable research in the area of dataset discovery. The prototype implementation of the framework is available on GitHub.

Details

Data Technologies and Applications, vol. 56 no. 4
Type: Research Article
ISSN: 2514-9288

Keywords

Open Access
Article
Publication date: 29 June 2022

Ibtissam Touahri

This paper purposed a multi-facet sentiment analysis system.

Abstract

Purpose

This paper purposed a multi-facet sentiment analysis system.

Design/methodology/approach

Hence, This paper uses multidomain resources to build a sentiment analysis system. The manual lexicon based features that are extracted from the resources are fed into a machine learning classifier to compare their performance afterward. The manual lexicon is replaced with a custom BOW to deal with its time consuming construction. To help the system run faster and make the model interpretable, this will be performed by employing different existing and custom approaches such as term occurrence, information gain, principal component analysis, semantic clustering, and POS tagging filters.

Findings

The proposed system featured by lexicon extraction automation and characteristics size optimization proved its efficiency when applied to multidomain and benchmark datasets by reaching 93.59% accuracy which makes it competitive to the state-of-the-art systems.

Originality/value

The construction of a custom BOW. Optimizing features based on existing and custom feature selection and clustering approaches.

Details

Applied Computing and Informatics, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 2634-1964

Keywords

Open Access
Article
Publication date: 13 October 2022

Linzi Wang, Qiudan Li, Jingjun David Xu and Minjie Yuan

Mining user-concerned actionable and interpretable hot topics will help management departments fully grasp the latest events and make timely decisions. Existing topic models

379

Abstract

Purpose

Mining user-concerned actionable and interpretable hot topics will help management departments fully grasp the latest events and make timely decisions. Existing topic models primarily integrate word embedding and matrix decomposition, which only generates keyword-based hot topics with weak interpretability, making it difficult to meet the specific needs of users. Mining phrase-based hot topics with syntactic dependency structure have been proven to model structure information effectively. A key challenge lies in the effective integration of the above information into the hot topic mining process.

Design/methodology/approach

This paper proposes the nonnegative matrix factorization (NMF)-based hot topic mining method, semantics syntax-assisted hot topic model (SSAHM), which combines semantic association and syntactic dependency structure. First, a semantic–syntactic component association matrix is constructed. Then, the matrix is used as a constraint condition to be incorporated into the block coordinate descent (BCD)-based matrix decomposition process. Finally, a hot topic information-driven phrase extraction algorithm is applied to describe hot topics.

Findings

The efficacy of the developed model is demonstrated on two real-world datasets, and the effects of dependency structure information on different topics are compared. The qualitative examples further explain the application of the method in real scenarios.

Originality/value

Most prior research focuses on keyword-based hot topics. Thus, the literature is advanced by mining phrase-based hot topics with syntactic dependency structure, which can effectively analyze the semantics. The development of syntactic dependency structure considering the combination of word order and part-of-speech (POS) is a step forward as word order, and POS are only separately utilized in the prior literature. Ignoring this synergy may miss important information, such as grammatical structure coherence and logical relations between syntactic components.

Details

Journal of Electronic Business & Digital Economics, vol. 1 no. 1/2
Type: Research Article
ISSN: 2754-4214

Keywords

Open Access
Article
Publication date: 4 April 2023

Stanislav Ivanov and Mohammad Soliman

The paper aims to evaluate the ways ChatGPT is going to disrupt tourism education and research.

7721

Abstract

Purpose

The paper aims to evaluate the ways ChatGPT is going to disrupt tourism education and research.

Design/methodology/approach

This is a conceptual paper.

Findings

ChatGPT has the potential to revolutionize tourism education and research because it can do what students and researchers should do, namely, generate text (assignments and research papers). Universities will need to reevaluate their teaching and assessment strategies and incorporate generative language models in teaching. Publishers will need to be more receptive toward manuscripts that are partially generated by artificial intelligence. In the future, digital teachers and research assistants will take over many of the cognitive tasks of tourism educators and researchers.

Originality/value

To the authors’ best knowledge, this is one of the first academic papers that investigates the implications of ChatGPT to tourism education and research.

Details

Journal of Tourism Futures, vol. 9 no. 2
Type: Research Article
ISSN: 2055-5911

Keywords

Open Access
Article
Publication date: 4 October 2022

Dhong Fhel K. Gom-os and Kelvin Y. Yong

The goal of this study is to test the real-world use of an emotion recognition system.

1286

Abstract

Purpose

The goal of this study is to test the real-world use of an emotion recognition system.

Design/methodology/approach

The researchers chose an existing algorithm that displayed high accuracy and speed. Four emotions: happy, sadness, anger and surprise, are used from six of the universal emotions, associated by their own mood markers. The mood-matrix interface is then coded as a web application. Four guidance counselors and 10 students participated in the testing of the mood-matrix. Guidance counselors answered the technology acceptance model (TAM) to assess its usefulness, and the students answered the general comfort questionnaire (GCQ) to assess their comfort levels.

Findings

Results from TAM found that the mood-matrix has significant use for the guidance counselors and the GCQ finds that the students were comfortable during testing.

Originality/value

No study yet has tested an emotion recognition system applied to counseling or any mental health or psychological transactions.

Details

Applied Computing and Informatics, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 2634-1964

Keywords

Open Access
Article
Publication date: 4 January 2024

Ahmed A. Khalifa and Mariam A. Ibrahim

The study aims to evaluate PubMed publications on ChatGPT or artificial intelligence (AI) involvement in scientific or medical writing and investigate whether ChatGPT or AI was…

1034

Abstract

Purpose

The study aims to evaluate PubMed publications on ChatGPT or artificial intelligence (AI) involvement in scientific or medical writing and investigate whether ChatGPT or AI was used to create these articles or listed as authors.

Design/methodology/approach

This scoping review was conducted according to Preferred Reporting Items for Systematic Reviews and Meta-Analyses Extension for Scoping Reviews (PRISMA-ScR) guidelines. A PubMed database search was performed for articles published between January 1 and November 29, 2023, using appropriate search terms; both authors performed screening and selection independently.

Findings

From the initial search results of 127 articles, 41 were eligible for final analysis. Articles were published in 34 journals. Editorials were the most common article type, with 15 (36.6%) articles. Authors originated from 27 countries, and authors from the USA contributed the most, with 14 (34.1%) articles. The most discussed topic was AI tools and writing capabilities in 19 (46.3%) articles. AI or ChatGPT was involved in manuscript preparation in 31 (75.6%) articles. None of the articles listed AI or ChatGPT as an author, and in 19 (46.3%) articles, the authors acknowledged utilizing AI or ChatGPT.

Practical implications

Researchers worldwide are concerned with AI or ChatGPT involvement in scientific research, specifically the writing process. The authors believe that precise and mature regulations will be developed soon by journals, publishers and editors, which will pave the way for the best usage of these tools.

Originality/value

This scoping review expressed data published on using AI or ChatGPT in various scientific research and writing aspects, besides alluding to the advantages, disadvantages and implications of their usage.

Details

Arab Gulf Journal of Scientific Research, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 1985-9899

Keywords

1 – 10 of 19