Search results

1 – 10 of over 1000
Article
Publication date: 5 May 2023

Subhajit Panda and Navkiran Kaur

Abstract

Purpose

The purpose of this research paper is to explore the significance of language processing in library systems and evaluate the effectiveness of integrating artificial intelligence and generative pre-trained transformer (GPT) technology in modern libraries. Specifically, the paper focuses on SheetGPT, a Google Sheets and GPT plugin, and its impact on language processing in library systems.

Design/methodology/approach

This paper adopts a comprehensive analysis approach to evaluate the integration of SheetGPT in library systems. The authors outlined a user-friendly approach for installation and use of SheetGPT using its “beginner plan”, appropriate for personal/student use or extended experimentation. The study includes a quantitative analysis to provide a thorough understanding of the benefits and limitations of SheetGPT in library systems.

Findings

The findings of this research paper suggest that SheetGPT is a highly effective language-processing tool for library systems. Additionally, ChatGPT’s integration with Google Sheets and its easy accessibility through Google Marketplace make it an efficient and user-friendly tool for library professionals. Overall, this study highlights the potential of SheetGPT to enhance language processing in library systems.

Originality/value

This research paper contributes to the existing literature by providing a comprehensive analysis of the effectiveness of SheetGPT in library systems. The study’s approach is unique in that it evaluates SheetGPT’s impact on language processing and provides insights into its benefits and limitations. The study’s findings are original and provide a valuable resource for library professionals and researchers interested in exploring the potential of SheetGPT to enhance language processing in library systems.

Details

Library Hi Tech News, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 0741-9058

Article
Publication date: 31 October 2023

Hong Zhou, Binwei Gao, Shilong Tang, Bing Li and Shuyu Wang

Abstract

Purpose

The number of construction dispute cases has maintained a high growth trend in recent years. The effective exploration and management of construction contract risk can directly promote the overall performance of the project life cycle. Missing clauses may cause a contract to fail to match standard contracts. If the contract, modified by the owner, omits key clauses, potential disputes may lead to contractors paying substantial compensation. To date, the identification of missing clauses in construction project contracts has relied heavily on manual review, which is inefficient and highly restricted by personnel experience. The existing intelligent means only work for contract query and storage. It is urgent to raise the level of intelligence in contract clause management. Therefore, this paper aims to propose an intelligent method to detect missing clauses in construction project contracts based on Natural Language Processing (NLP) and deep learning technology.

Design/methodology/approach

A complete classification scheme of contract clauses is designed based on NLP. First, construction contract texts are pre-processed and converted from unstructured natural language into structured digital vector form. Following the initial categorization, a multi-label classification of long-text construction contract clauses is designed to preliminarily identify whether clause labels are missing. After the multi-label missing-clause detection, the authors implement a clause similarity algorithm that integrates the image-detection-inspired MatchPyramid model with BERT to identify missing substantial content in the contract clauses.
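
As a rough illustration of the multi-label step described above, the sketch below flags clause labels whose predicted score falls under a threshold. The label names, the scores and the 0.5 threshold are hypothetical and not taken from the paper; the classifier that produces the scores is assumed to exist upstream and is not reproduced here.

```python
# Hypothetical clause labels that a standard contract is expected to contain.
REQUIRED_LABELS = ["payment", "delay_damages", "dispute_resolution", "termination"]


def missing_clause_labels(scores, threshold=0.5):
    """Return required labels whose predicted score is below the threshold.

    scores: dict mapping label -> probability from some multi-label classifier
            (assumed; the paper's own model is not reproduced here).
    A label absent from `scores` is treated as having probability 0.0.
    """
    return [label for label in REQUIRED_LABELS if scores.get(label, 0.0) < threshold]


# Made-up scores for one contract, for illustration only.
example_scores = {"payment": 0.97, "delay_damages": 0.12, "termination": 0.88}
print(missing_clause_labels(example_scores))
```

In this toy run, the low-scoring and the absent labels are both reported as missing; the similarity-matching stage described in the abstract would then examine the content of the clauses that are present.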

Findings

A total of 1,322 construction project contracts were tested. Results showed that the accuracy of multi-label classification could reach 93%, the accuracy of similarity matching could reach 83%, and the recall rate and mean F1 of both could exceed 0.7. The experimental results verify, to some extent, the feasibility of intelligently detecting contract risk through the NLP-based method.

Originality/value

NLP is adept at recognizing textual content and has shown promising results in some contract processing applications. However, the approaches most commonly used for risk detection in construction contract clauses are rule-based, and they encounter challenges when handling intricate and lengthy engineering contracts. This paper introduces an NLP technique based on deep learning which reduces manual intervention and can autonomously identify and tag types of contractual deficiencies, aligning with the evolving complexities anticipated in future construction contracts. Moreover, this method achieves the recognition of extended contract clause texts. Finally, the approach is versatile: users simply need to adjust parameters such as segmentation based on language categories to detect omissions in contract clauses in diverse languages.

Details

Engineering, Construction and Architectural Management, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 0969-9988

Article
Publication date: 1 April 2024

Xiaoxian Yang, Zhifeng Wang, Qi Wang, Ke Wei, Kaiqi Zhang and Jiangang Shi

Abstract

Purpose

This study aims to adopt a systematic review approach to examine the existing literature on law and LLMs. It involves analyzing and synthesizing relevant research papers, reports and scholarly articles that discuss the use of LLMs in the legal domain. The review encompasses various aspects, including an analysis of LLMs, legal natural language processing (NLP), model tuning techniques, data processing strategies and frameworks for addressing the challenges associated with legal question-and-answer (Q&A) systems. Additionally, the study explores potential applications and services that can benefit from the integration of LLMs in the field of intelligent justice.

Design/methodology/approach

This paper surveys the state-of-the-art research on law LLMs and their application in the field of intelligent justice. The study aims to identify the challenges associated with developing Q&A systems based on LLMs and explores potential directions for future research and development. The ultimate goal is to contribute to the advancement of intelligent justice by effectively leveraging LLMs.

Findings

To effectively apply a law LLM, systematic research on LLM, legal NLP and model adjustment technology is required.

Originality/value

This study contributes to the field of intelligent justice by providing a comprehensive review of the current state of research on law LLMs.

Details

International Journal of Web Information Systems, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 1744-0084

Article
Publication date: 2 January 2024

Tiara Kusumaningtiyas, Prasetyo Adi Nugroho and Nurul Aida Noor Azizi

Abstract

Purpose

The purpose of this paper is to explore the use of artificial intelligence (AI) in libraries, especially university libraries, which are faced with users from various countries who have different languages and cultures. Seamless M4T, which is being developed, has great potential for helping university librarians maximize library services by providing ease of communication.

Design/methodology/approach

The paper analyzes the possibility of developing Seamless M4T using natural language processing techniques, and how language models can be trained into smarter AI tools that break down language barriers between librarians and users.

Findings

The AI-based application Seamless M4T, with its advanced communication capabilities, can help university librarians provide maximum service to users who are hampered by language and cultural barriers. Seamless M4T has an automatic speech recognition feature for dozens of languages, so it can translate speech-to-text, text-to-speech or both text and speech. It can also translate and transcribe text and speech in real time, without significant delays, converting written words into spoken form.

Originality/value

This paper emphasizes the use of AI in university libraries to improve services, especially in communication due to language differences between librarians and users. Advantages in using AI in libraries can support the collaboration and scholarly communication process.

Details

Library Hi Tech News, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 0741-9058

Article
Publication date: 25 July 2023

Aida Khakimova, Oleg Zolotarev and Sanjay Kaushal

Abstract

Purpose

Effective communication is crucial in the medical field, where different stakeholders use various terminologies, such as ICD, SNOMED CT, UMLS and MeSH, to describe and classify healthcare concepts. However, the problem of polysemy can make natural language processing difficult. This study explores the contextual meanings of the term “pattern” in the biomedical literature, compares them to existing definitions, annotates a corpus for use in machine learning and proposes new definitions of terms such as “Syndrome, feature” and “pattern recognition.”

Design/methodology/approach

The Entrez API was used to retrieve articles from PubMed for the study, which assembled a corpus of 398 articles using a search query for the ambiguous term “pattern” in the titles or abstracts. The Python NLTK library was used to extract the terms and their contexts, and an expert check was carried out. To understand the various meanings of the term, the contextual environment was analyzed by extracting the words surrounding the term. The expert determined the appropriate size of the context for analysis to gain a more nuanced understanding of the different meanings of the term “pattern.”
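
The context-extraction step described above can be sketched in plain Python. This is only an assumption about the general shape of the procedure: the abstract does not state which NLTK calls were used, so the function name `context_windows`, the simple regex tokenizer and the window size below are illustrative choices, not the authors' code.

```python
import re


def context_windows(text, term, size=3):
    """Collect the words to the left and right of each occurrence of `term`.

    Uses a simple lowercase alphabetic tokenizer (an assumption; the study used
    NLTK). Returns a list of (left_words, right_words) pairs, one per occurrence.
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    windows = []
    for i, tok in enumerate(tokens):
        if tok == term:
            left = tokens[max(0, i - size):i]
            right = tokens[i + 1:i + 1 + size]
            windows.append((left, right))
    return windows


# Made-up sentences, for illustration only.
sample = "The sleep pattern of patients changed. A clear pattern recognition step followed."
for left, right in context_windows(sample, "pattern", size=2):
    print(left, "| pattern |", right)
```

Varying `size` mirrors the expert's choice of context width: a wider window captures more of the surrounding phrase at the cost of more noise.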

Findings

The study found that the categories of meanings of the term “pattern” are broader in biomedical publications than in common definitions, and new categories have been emerging from the term's use in the biomedical field. The study highlights the importance of annotated corpora in advancing natural language processing techniques and provides valuable insights into the nuances of biomedical language.

Originality/value

The study's findings demonstrate the importance of exploring contextual meanings and proposing new definitions of terms in the biomedical field to improve natural language processing techniques.

Details

Kybernetes, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 0368-492X

Article
Publication date: 28 March 2023

Yupeng Lin and Zhonggen Yu

Abstract

Purpose

The application of artificial intelligence chatbots is an emerging trend in educational technology studies for its multi-faceted advantages. However, the existing studies rarely take a perspective of educational technology application to evaluate the application of chatbots to educational contexts. This study aims to bridge the research gap by taking an educational perspective to review the existing literature on artificial intelligence chatbots.

Design/methodology/approach

This study combines bibliometric analysis and citation network analysis: a bibliometric analysis through visualization of keywords, authors, organizations and countries, and a citation network analysis based on literature clustering.

Findings

Educational applications of chatbots are still rising in post-COVID-19 learning environments. Popular research issues on this topic include technological advancements, students’ perception of chatbots and effectiveness of chatbots in different educational contexts. Originating from similar technological and theoretical foundations, chatbots are primarily applied to language education, educational services (such as information counseling and automated grading), health-care education and medical training. Diversifying application contexts demonstrate specific purposes for using chatbots in education but are confronted with some common challenges. Multi-faceted factors can influence the effectiveness and acceptance of chatbots in education. This study provides an extended framework to facilitate extending artificial intelligence chatbot applications in education.

Research limitations/implications

The authors acknowledge that this study is subject to some limitations. First, the literature search was based on the core collection of Web of Science, which did not include some existing studies. Second, this bibliometric analysis only included studies published in English. Third, due to limited technological expertise, the authors could not comprehensively interpret the implications of some studies reporting technological advancements. However, this study intended to establish its research significance by summarizing and evaluating the effectiveness of artificial intelligence chatbots from an educational perspective.

Originality/value

This study identifies the publication trends of artificial intelligence chatbots in educational contexts. It bridges the research gap caused by the previous neglect of educational contexts as an interconnected whole with characteristics of its own. It identifies the major application contexts of artificial intelligence chatbots in education and encourages further extension of these applications. It also proposes an extended framework, covering three critical components of technological integration in education, for future researchers and instructors to consider when they apply artificial intelligence chatbots to new educational contexts.

Article
Publication date: 3 January 2023

Saleem Raja A., Sundaravadivazhagan Balasubaramanian, Pradeepa Ganesan, Justin Rajasekaran and Karthikeyan R.

Abstract

Purpose

The internet has completely merged into contemporary life. People are addicted to using internet services for everyday activities. Consequently, an abundance of information about people and organizations is available online, which encourages the proliferation of cybercrimes. Cybercriminals often use malicious links for large-scale cyberattacks, which are disseminated via email, SMS and social media. Recognizing malicious links online can be exceedingly challenging. The purpose of this paper is to present a strong security system that can detect malicious links in cyberspace using natural language processing techniques.

Design/methodology/approach

Researchers have recommended a variety of approaches, including blacklisting and rules-based machine/deep learning, for automatically recognizing malicious links. However, these approaches generally necessitate the generation of a set of features to generalize the detection process. Most of the features are generated by processing URLs and the content of the web page, as well as some external features such as the ranking of the web page and domain name system information. This process of feature extraction and selection typically takes more time and demands a high level of expertise in the domain. Sometimes the generated features may not leverage the full potential of the data set. In addition, the majority of the currently deployed systems make use of a single classifier for the classification of malicious links. However, prediction accuracy may vary widely depending on the data set and the classifier used.

Findings

To address the issue of generating feature sets, the proposed method uses natural language processing techniques (term frequency and inverse document frequency) to vectorize URLs. To build a robust system for the classification of malicious links, the proposed system implements a weighted soft voting classifier, an ensemble classifier that combines the predictions of base classifiers. The ability or skill of each classifier serves as the basis for the weight that is assigned to it.
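
The two ideas named above, TF-IDF vectorization of URLs and weighted soft voting, can be sketched in a few lines of standard-library Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the tokenizer, the `tf * log(N / df)` weighting variant, the example URLs, the base-classifier probabilities and the weights are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize_url(url):
    """Split a URL into lowercase alphanumeric tokens."""
    return [tok for tok in re.split(r"[^a-z0-9]+", url.lower()) if tok]


def tfidf_vectors(urls):
    """One sparse TF-IDF dict per URL, using tf * log(N / df) (one common variant)."""
    docs = [Counter(tokenize_url(u)) for u in urls]
    n_docs = len(docs)
    # Document frequency: number of URLs in which each token appears.
    doc_freq = Counter(tok for doc in docs for tok in doc)
    vectors = []
    for doc in docs:
        length = sum(doc.values())
        vectors.append({tok: (cnt / length) * math.log(n_docs / doc_freq[tok])
                        for tok, cnt in doc.items()})
    return vectors


def weighted_soft_vote(probabilities, weights):
    """Combine per-classifier class probabilities with a weighted average.

    probabilities: one list of class probabilities per base classifier.
    weights: one non-negative weight per base classifier.
    Returns (predicted_class_index, averaged_probabilities).
    """
    total = sum(weights)
    n_classes = len(probabilities[0])
    averaged = [sum(w * p[c] for w, p in zip(weights, probabilities)) / total
                for c in range(n_classes)]
    return max(range(n_classes), key=lambda c: averaged[c]), averaged


# Hypothetical URLs, for illustration only.
urls = ["http://example.com/login", "http://example.com/verify-account-login"]
vectors = tfidf_vectors(urls)

# Hypothetical outputs of three base classifiers for one URL: [P(benign), P(malicious)].
base_probabilities = [[0.6, 0.4], [0.3, 0.7], [0.2, 0.8]]
base_weights = [0.5, 0.3, 0.2]
label, averaged = weighted_soft_vote(base_probabilities, base_weights)
print(label, averaged)
```

Tokens shared by every URL receive a TF-IDF weight of zero, so only distinguishing tokens carry signal. In the voting step, the first classifier favours the benign class, but the weighted average still selects the malicious class because the other two classifiers together outweigh it.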

Originality/value

The proposed method performs better when the optimal weights are assigned. Its performance was assessed using two different data sets (D1 and D2) and compared against base machine learning classifiers and previous research results. The resulting accuracy shows that the proposed method is superior to the existing methods, offering 91.4% and 98.8% accuracy for data sets D1 and D2, respectively.

Details

International Journal of Pervasive Computing and Communications, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 1742-7371

Article
Publication date: 25 October 2022

Victor Diogho Heuer de Carvalho and Ana Paula Cabral Seixas Costa

Abstract

Purpose

This article presents two Brazilian Portuguese corpora collected from different media concerning public security issues in a specific location. The primary motivation is supporting analyses, so security authorities can make appropriate decisions about their actions.

Design/methodology/approach

The corpora were obtained through web scraping from a newspaper's website and tweets from a Brazilian metropolitan region. Natural language processing was applied, covering text cleaning, lemmatization, summarization, part-of-speech tagging and dependency parsing, named entity recognition, and topic modeling.
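
The first stage of such a pipeline, text cleaning, might look like the sketch below. It is a hedged illustration only: the abstract does not list the authors' cleaning rules, so the regular expressions, the function name `clean_text` and the sample post are assumptions.

```python
import re


def clean_text(text):
    """Normalize a social media post before further NLP steps.

    Removes URLs and @mentions, drops the '#' of hashtags but keeps the word,
    strips punctuation and emoji, collapses whitespace and lowercases.
    """
    text = re.sub(r"https?://\S+", " ", text)   # URLs
    text = re.sub(r"@\w+", " ", text)           # user mentions
    text = text.replace("#", " ")               # keep hashtag words
    text = re.sub(r"[^\w\s]", " ", text)        # punctuation, emoji, symbols
    return re.sub(r"\s+", " ", text).strip().lower()


# Made-up post in Brazilian Portuguese, for illustration only.
post = "Assalto na Av. Brasil! https://t.co/abc @pm_oficial #segurança 😡"
print(clean_text(post))
```

Accented letters survive because Python's `\w` is Unicode-aware for `str` patterns. Later stages such as lemmatization or named entity recognition would normally rely on a dedicated NLP library rather than regular expressions.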

Findings

Several results were obtained with this methodology, including: an example of automated summarization; dependency parsing; the most common topics in each corpus; and the extraction of forty named entities and the most common slogans, highlighting those linked to public security.

Research limitations/implications

Some critical tasks were identified for the research perspective, related to the applied methodology: the treatment of noise from obtaining news on their source websites, passing through textual elements quite present in social network posts such as abbreviations, emojis/emoticons, and even writing errors; the treatment of subjectivity, to eliminate noise from irony and sarcasm; the search for authentic news of issues within the target domain. All these tasks aim to improve the process to enable interested authorities to perform accurate analyses.

Practical implications

The corpora dedicated to the public security domain enable several analyses, such as mining public opinion on security actions in a given location; understanding criminals' behaviors reported in the news or even on social networks and drawing a timeline of their attitudes; detecting movements that may cause damage to public property and people's welfare through texts from social networks; extracting the history and repercussions of police actions, crossing news with records on social networks; among many other possibilities.

Originality/value

The work on behalf of the corpora reported in this text represents one of the first initiatives to create textual bases in Portuguese, dedicated to Brazil's specific public security domain.

Details

Library Hi Tech, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 0737-8831

Article
Publication date: 28 March 2023

Antonijo Marijić and Marina Bagić Babac

Abstract

Purpose

Genre classification of songs based on lyrics is a challenging task even for humans; however, state-of-the-art natural language processing has recently offered advanced solutions to this task. The purpose of this study is to advance the understanding and application of natural language processing and deep learning in the domain of music genre classification, while also contributing to the broader themes of global knowledge and communication, and sustainable preservation of cultural heritage.

Design/methodology/approach

The main contribution of this study is the development and evaluation of various machine and deep learning models for song genre classification. Additionally, we investigated the effect of different word embeddings, including Global Vectors for Word Representation (GloVe) and Word2Vec, on the classification performance. The tested models range from benchmarks such as logistic regression, support vector machine and random forest, to more complex neural network architectures and transformer-based models, such as recurrent neural network, long short-term memory, bidirectional long short-term memory and bidirectional encoder representations from transformers (BERT).

Findings

The authors conducted experiments on both English and multilingual data sets for genre classification. The results show that the BERT model achieved the best accuracy on the English data set, whereas cross-lingual language model pretraining based on RoBERTa (XLM-RoBERTa) performed the best on the multilingual data set. This study found that songs in the metal genre were the most accurately labeled, as their text style and topics were the most distinct from other genres. In contrast, songs from the pop and rock genres were more challenging to differentiate. This study also compared the impact of different word embeddings on the classification task and found that models with GloVe word embeddings outperformed those using Word2Vec and a learned embedding layer.

Originality/value

This study presents the implementation, testing and comparison of various machine and deep learning models for genre classification. The results demonstrate that transformer models, including BERT, robustly optimized BERT pretraining approach (RoBERTa), distilled bidirectional encoder representations from transformers (DistilBERT), bidirectional and auto-regressive transformers (BART) and XLM-RoBERTa, outperformed other models.

Details

Global Knowledge, Memory and Communication, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 2514-9342

Article
Publication date: 12 September 2023

Wenjing Wu, Caifeng Wen, Qi Yuan, Qiulan Chen and Yunzhong Cao

Abstract

Purpose

Learning from safety accidents and sharing safety knowledge has become an important part of accident prevention and improving construction safety management. Because unstructured data in the construction industry is difficult to reuse, the knowledge it contains cannot easily be used directly for safety analysis. The purpose of this paper is to explore how to build a construction safety knowledge representation model and a safety accident graph through deep learning methods, extract construction safety knowledge entities through a BERT-BiLSTM-CRF model and propose a data management model of data–knowledge–services.

Design/methodology/approach

The ontology model of knowledge representation of construction safety accidents is constructed by integrating entity relation and logic evolution. Then, the database of safety incidents in the architecture, engineering and construction (AEC) industry is established based on the collected construction safety incident reports and related dispute cases. The construction method of construction safety accident knowledge graph is studied, and the precision of BERT-BiLSTM-CRF algorithm in information extraction is verified through comparative experiments. Finally, a safety accident report is used as an example to construct the AEC domain construction safety accident knowledge graph (AEC-KG), which provides visual query knowledge service and verifies the operability of knowledge management.

Findings

The experimental results show that the combined BERT-BiLSTM-CRF algorithm has a precision of 84.52%, a recall of 92.35%, and an F1 value of 88.26% in named entity recognition from the AEC domain database. The construction safety knowledge representation model and safety incident knowledge graph realize knowledge visualization.
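
The reported F1 value can be checked against the reported precision and recall, since F1 is the harmonic mean of the two. The short sketch below does that arithmetic; the function name `f1_score` is an illustrative choice, and only the three figures come from the abstract.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both on the same scale)."""
    return 2 * precision * recall / (precision + recall)


# Figures reported in the abstract, in percent.
reported_precision = 84.52
reported_recall = 92.35

print(round(f1_score(reported_precision, reported_recall), 2))  # 88.26
```

The computed value agrees with the reported F1 of 88.26%, so the three figures are mutually consistent.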

Originality/value

The proposed framework provides a new knowledge management approach to improve the safety management of practitioners and also enriches the application scenarios of knowledge graph. On the one hand, it innovatively proposes a data application method and knowledge management method of safety accident report that integrates entity relationship and matter evolution logic. On the other hand, the legal adjudication dimension is innovatively added to the knowledge graph in the construction safety field as the basis for the postincident disposal measures of safety accidents, which provides reference for safety managers' decision-making in all aspects.

Details

Engineering, Construction and Architectural Management, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 0969-9988
