Search results

1 – 10 of over 1000
Article
Publication date: 31 October 2023

Hong Zhou, Binwei Gao, Shilong Tang, Bing Li and Shuyu Wang


Abstract

Purpose

The number of construction dispute cases has maintained a high growth trend in recent years. Effective exploration and management of construction contract risk can directly promote the overall performance of the project life cycle. The omission of clauses may cause a contract to fail to match standard contracts; if a contract modified by the owner omits key clauses, potential disputes may lead to contractors paying substantial compensation. To date, the identification of missing clauses in construction project contracts has relied heavily on manual review, which is inefficient and highly dependent on personnel experience, while existing intelligent tools support only contract query and storage. It is therefore urgent to raise the level of intelligence in contract clause management. This paper aims to propose an intelligent method for detecting missing clauses in construction project contracts based on natural language processing (NLP) and deep learning technology.

Design/methodology/approach

A complete classification scheme for contract clauses is designed based on NLP. First, construction contract texts are pre-processed and converted from unstructured natural language into structured digital vector form. Following this initial categorization, a multi-label classifier for long construction contract texts is designed to preliminarily identify whether clause labels are missing. After the multi-label missing-clause detection, the authors implement a clause similarity algorithm that integrates an image-matching approach, the MatchPyramid model, with BERT to identify missing substantive content within the contract clauses.
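As a concrete illustration of the multi-label detection step, the sketch below shows how a BERT-style classifier might flag clause categories whose predicted probability falls below a threshold. It is a minimal sketch only: the checkpoint, label set and threshold are illustrative assumptions, not the authors' released artifacts, and in practice the classifier head would first be fine-tuned on labeled contract clauses.

```python
# Hedged sketch of multi-label clause detection with Hugging Face Transformers.
# The checkpoint, label set and threshold are assumptions for illustration only.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

CLAUSE_LABELS = ["payment", "duration", "quality", "safety", "dispute"]  # assumed label set

tokenizer = AutoTokenizer.from_pretrained("bert-base-chinese")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-chinese",
    num_labels=len(CLAUSE_LABELS),
    problem_type="multi_label_classification",
)

def detect_missing_clauses(contract_text: str, threshold: float = 0.5) -> list[str]:
    """Return clause labels whose predicted probability falls below the threshold."""
    inputs = tokenizer(contract_text, truncation=True, max_length=512, return_tensors="pt")
    with torch.no_grad():
        probs = torch.sigmoid(model(**inputs).logits).squeeze(0)
    return [label for label, p in zip(CLAUSE_LABELS, probs) if p < threshold]

print(detect_missing_clauses("甲方应按月支付工程款"))  # toy input; an untrained head gives arbitrary output
```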

Findings

A total of 1,322 construction project contracts were tested. Results showed that the accuracy of the multi-label classification reached 93%, the accuracy of similarity matching reached 83%, and both the recall and mean F1 scores exceeded 0.7. The experimental results verify, to some extent, the feasibility of intelligently detecting contract risk with the NLP-based method.

Originality/value

NLP is adept at recognizing textual content and has shown promising results in some contract processing applications. However, most existing approaches to risk detection in construction contract clauses are rule-based and encounter challenges when handling intricate and lengthy engineering contracts. This paper introduces an NLP technique based on deep learning that reduces manual intervention and can autonomously identify and tag types of contractual deficiencies, aligning with the evolving complexities anticipated in future construction contracts. Moreover, the method can handle extended contract clause texts. Finally, the approach is versatile: users simply need to adjust parameters such as segmentation according to the language in question to detect omissions in contract clauses of diverse languages.

Details

Engineering, Construction and Architectural Management, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 0969-9988


Article
Publication date: 13 August 2024

Samia Nawaz Yousafzai, Hooria Shahbaz, Armughan Ali, Amreen Qamar, Inzamam Mashood Nasir, Sara Tehsin and Robertas Damaševičius


Abstract

Purpose

The objective is to develop a more effective model that simplifies and accelerates the news classification process using advanced text mining and deep learning (DL) techniques. A distributed framework utilizing Bidirectional Encoder Representations from Transformers (BERT) was developed to classify news headlines. This approach leverages various text mining and DL techniques on a distributed infrastructure, aiming to offer an alternative to traditional news classification methods.

Design/methodology/approach

This study focuses on the classification of distinct types of news by analyzing tweets from various news channels. It addresses the limitations of using benchmark datasets for news classification, which often result in models that are impractical for real-world applications.
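As an illustration of the inference side of such a BERT-based headline classifier, the sketch below runs a text-classification pipeline over a few headlines. The checkpoint name and example headlines are placeholders rather than the authors' distributed framework or data.

```python
# Illustrative sketch of BERT-based headline classification via the Transformers pipeline.
# "bert-base-uncased" is a placeholder; an untuned model emits generic LABEL_0/LABEL_1
# outputs, so a checkpoint fine-tuned on news categories would be substituted in practice.
from transformers import pipeline

classifier = pipeline("text-classification", model="bert-base-uncased")

headlines = [
    "Central bank raises interest rates amid inflation concerns",
    "Local team clinches championship title in overtime thriller",
]
for headline, result in zip(headlines, classifier(headlines)):
    print(f"{result['label']} ({result['score']:.3f}) :: {headline}")
```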

Findings

The framework’s effectiveness was evaluated on a newly proposed dataset and two additional benchmark datasets from the Kaggle repository, assessing the performance of each text mining and classification method across these datasets. The results of this study demonstrate that the proposed strategy significantly outperforms other approaches in terms of accuracy and execution time. This indicates that the distributed framework, coupled with the use of BERT for text analysis, provides a robust solution for analyzing large volumes of data efficiently. The findings also highlight the value of the newly released corpus for further research in news classification and emotion classification, suggesting its potential to facilitate advancements in these areas.

Originality/value

This research introduces an innovative distributed framework for news classification that addresses the shortcomings of models trained on benchmark datasets. By utilizing cutting-edge techniques and a novel dataset, the study offers significant improvements in accuracy and processing speed. The release of the corpus represents a valuable contribution to the field, enabling further exploration into news and emotion classification. This work sets a new standard for the analysis of news data, offering practical implications for the development of more effective and efficient news classification systems.

Details

International Journal of Intelligent Computing and Cybernetics, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 1756-378X


Article
Publication date: 28 March 2023

Antonijo Marijić and Marina Bagić Babac


Abstract

Purpose

Genre classification of songs based on lyrics is a challenging task even for humans; however, state-of-the-art natural language processing has recently offered advanced solutions to it. The purpose of this study is to advance the understanding and application of natural language processing and deep learning in the domain of music genre classification, while also contributing to the broader themes of global knowledge and communication and the sustainable preservation of cultural heritage.

Design/methodology/approach

The main contribution of this study is the development and evaluation of various machine and deep learning models for song genre classification. Additionally, we investigated the effect of different word embeddings, including Global Vectors for Word Representation (GloVe) and Word2Vec, on the classification performance. The tested models range from benchmarks such as logistic regression, support vector machine and random forest, to more complex neural network architectures and transformer-based models, such as recurrent neural network, long short-term memory, bidirectional long short-term memory and bidirectional encoder representations from transformers (BERT).
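To make the embedding comparison concrete, the toy sketch below pairs averaged Word2Vec lyric vectors with a logistic regression baseline, one of the simpler configurations in the range described above. The corpus, genres and hyperparameters are invented for illustration and do not reflect the study's data.

```python
# Toy baseline: averaged Word2Vec lyric vectors fed to logistic regression.
# All lyrics, labels and hyperparameters are invented for illustration.
import numpy as np
from gensim.models import Word2Vec
from sklearn.linear_model import LogisticRegression

lyrics = [
    ("we ride the storm of steel and fire", "metal"),
    ("dancing all night under neon lights", "pop"),
    ("highway runs forever with my guitar", "rock"),
    ("shadows fall the battle never ends", "metal"),
]
tokenized = [text.split() for text, _ in lyrics]
labels = [genre for _, genre in lyrics]

w2v = Word2Vec(tokenized, vector_size=50, min_count=1, seed=42)

def doc_vector(tokens):
    # Average the word vectors of all in-vocabulary tokens into one document vector.
    return np.mean([w2v.wv[t] for t in tokens if t in w2v.wv], axis=0)

X = np.vstack([doc_vector(t) for t in tokenized])
clf = LogisticRegression(max_iter=1000).fit(X, labels)
print(clf.predict(X[:1]))
```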

Findings

The authors conducted experiments on both English and multilingual data sets for genre classification. The results show that the BERT model achieved the best accuracy on the English data set, whereas cross-lingual language model pretraining based on RoBERTa (XLM-RoBERTa) performed best on the multilingual data set. The study found that songs in the metal genre were labeled most accurately, as their text style and topics are the most distinct from other genres. In contrast, songs from the pop and rock genres were more challenging to differentiate. The study also compared the impact of different word embeddings on the classification task and found that models with GloVe word embeddings outperformed those with Word2Vec and a learned embedding layer.

Originality/value

This study presents the implementation, testing and comparison of various machine and deep learning models for genre classification. The results demonstrate that transformer models, including BERT, the robustly optimized BERT pretraining approach (RoBERTa), distilled BERT (DistilBERT), bidirectional and auto-regressive transformers (BART) and XLM-RoBERTa, outperformed the other models.

Details

Global Knowledge, Memory and Communication, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 2514-9342


Article
Publication date: 21 March 2024

Thamaraiselvan Natarajan, P. Pragha, Krantiraditya Dhalmahapatra and Deepak Ramanan Veera Raghavan


Abstract

Purpose

The metaverse, which is now revolutionizing how brands strategize their business needs, necessitates understanding individual opinions. Sentiment analysis deciphers emotions and uncovers a deeper understanding of user opinions and trends within this digital realm. Further, sentiments signify the underlying factor that triggers one’s intent to use technology like the metaverse. Positive sentiments often correlate with positive user experiences, while negative sentiments may signify issues or frustrations. Brands may consider these sentiments and implement them on their metaverse platforms for a seamless user experience.

Design/methodology/approach

The current study adopts machine learning sentiment analysis techniques using support vector machine (SVM), Doc2Vec, recurrent neural network (RNN) and convolutional neural network (CNN) models to explore the sentiment of individuals toward the metaverse in a user-generated context. Topics were first discovered using topic modeling, and sentiment analysis was performed subsequently.
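A minimal sketch of the classical branch of such a pipeline, pairing TF-IDF features with a linear SVM for sentiment polarity, is shown below; the example tweets and labels are invented for illustration, and the study's corpus is not reproduced here.

```python
# Minimal TF-IDF + linear SVM sentiment sketch on invented metaverse-related tweets.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

tweets = [
    "the metaverse concerts feel incredibly immersive",
    "worried about how my data is handled in these virtual worlds",
    "virtual land prices look like a bubble to me",
    "loved exploring the new metaverse gallery today",
]
sentiment = ["positive", "negative", "negative", "positive"]

model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LinearSVC())
model.fit(tweets, sentiment)
print(model.predict(["cyber security in the metaverse still scares me"]))
```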

Findings

The results revealed that the users had a positive notion about the experience and orientation of the metaverse while having a negative attitude towards the economy, data and cyber security. The accuracy of each model was analyzed, and the CNN was found to provide the best accuracy, averaging 89%, compared with the other models.

Research limitations/implications

Analyzing sentiment can reveal how the general public perceives the metaverse. Positive sentiment may suggest enthusiasm and readiness for adoption, while negative sentiment might indicate skepticism or concerns. Given the positive user notions about the metaverse’s experience and orientation, developers should continue to focus on creating innovative and immersive virtual environments. At the same time, users' concerns about data, cybersecurity and the economy are critical. The negative attitude toward the metaverse’s economy suggests a need for innovation in economic models within the metaverse. Also, developers and platform operators should prioritize robust data security measures. Implementing strong encryption and two-factor authentication and educating users about cybersecurity best practices can address these concerns and enhance user trust.

Social implications

In terms of societal dynamics, the metaverse could revolutionize communication and relationships by altering traditional notions of proximity and the presence of its users. Further, virtual economies might emerge, with virtual assets having real-world value, presenting both opportunities and challenges for industries and regulators.

Originality/value

The current study contributes to research as it is the first of its kind to explore the sentiments of individuals toward the metaverse using deep learning techniques and evaluate the accuracy of these models.

Details

Kybernetes, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 0368-492X


Article
Publication date: 28 May 2024

Zhi Yang, Sai Xie and Yuanhan Gu


Abstract

Purpose

The purpose of this study is to investigate the technology-focused and technology-supported dilemmas that firms have encountered and their digital orientation from a nuanced perspective to answer the following research questions: What digital orientations do companies take in launching digital initiatives? How does the choice between a proactive digital orientation (Pro-DO) and a reactive digital orientation (Rea-DO) influence firm value?

Design/methodology/approach

The authors adopted machine learning and a quantitative research approach, using observations of China’s listed companies from 2010 to 2020, and applied statistical techniques and regression analysis to examine the effect of the alternative digital orientations on firm value.
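A stylized version of this regression design, with firm value regressed on proactive and reactive digital-orientation indicators plus controls, might look like the sketch below; the variable names and simulated data are assumptions for illustration only, not the study's dataset.

```python
# Stylized firm-value regression: Tobin's Q on Pro-DO/Rea-DO dummies plus controls.
# All variables and the simulated data are illustrative assumptions.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "pro_do": rng.integers(0, 2, n),        # proactive digital orientation dummy
    "rea_do": rng.integers(0, 2, n),        # reactive digital orientation dummy
    "size": rng.normal(10, 1, n),           # firm size control (log assets)
    "leverage": rng.uniform(0.1, 0.7, n),   # leverage control
})
# Simulate a positive Pro-DO effect and no Rea-DO effect, mirroring the stated finding.
df["tobins_q"] = 1.5 + 0.3 * df["pro_do"] - 0.05 * df["leverage"] + rng.normal(0, 0.5, n)

model = smf.ols("tobins_q ~ pro_do + rea_do + size + leverage", data=df).fit()
print(model.summary().tables[1])
```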

Findings

The findings of this study indicate that firms with a Pro-DO exhibit a positive effect on firm value. In contrast, firms with a Rea-DO do not demonstrate the same positive relationship with firm value. Additionally, this study reveals that firms with better corporate governance practices and lower financing constraints are more responsive to the positive effects of Pro-DO on firm value.

Originality/value

We elucidate two primary perspectives of digital orientation: Pro-DO and Rea-DO. Additionally, we empirically showcase their nuanced influences on firm value, thereby enriching knowledge in the fields of strategic orientation and digital transformation. Moreover, our findings underscore the importance of corporate governance and financing constraints as moderators.

Details

Management Decision, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 0025-1747


Article
Publication date: 7 May 2024

Xueyuan Wang and Meixia Sun


Abstract

Purpose

The COVID-19 pandemic has profoundly impacted small and medium-sized enterprises (SMEs), inherently vulnerable entities, prompting a pivotal question of how to enhance SMEs’ organizational resilience (OR) to withstand discontinuous crises. Although digital innovation (DI) is widely acknowledged as a critical antecedent to OR, limited studies have analyzed the configurational effects of DI on OR, particularly stage-based analysis.

Design/methodology/approach

Underpinned by the dynamic capabilities view, this study introduces a multi-stage dynamic capabilities framework for OR. Employing latent Dirichlet allocation (LDA), the authors further deconstruct digital product innovation (DPI), digital services innovation (DSI) and digital process innovation (DCI) into six dimensions. Fuzzy-set qualitative comparative analysis (fsQCA) is then used to explore the configurational effects of the six DI dimensions on OR at different stages, drawing on data from 94 Chinese SMEs.
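A minimal sketch of an LDA step of this kind is given below, using scikit-learn on a handful of invented firm descriptions; the documents and topic count are illustrative assumptions, and the fsQCA stage is not shown.

```python
# Minimal LDA sketch for surfacing digital-innovation themes from short firm texts.
# Documents and topic count are invented for illustration.
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

docs = [
    "iterative product updates released through the app platform",
    "predictive maintenance services offered via cloud analytics",
    "distributed process coordination across remote factory lines",
    "integrated workflow automation linking suppliers and logistics",
]
vectorizer = CountVectorizer(stop_words="english")
X = vectorizer.fit_transform(docs)

lda = LatentDirichletAllocation(n_components=3, random_state=0).fit(X)
terms = vectorizer.get_feature_names_out()
for i, topic in enumerate(lda.components_):
    top = [terms[j] for j in topic.argsort()[-4:][::-1]]  # four highest-weighted terms
    print(f"topic {i}: {top}")
```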

Findings

First, OR improvement hinges not on a singular DI but on the interactions among various DIs. Second, multiple equivalent configurations emerge at different stages. Before the crisis, absorptive capability primarily advanced through iterative DPI and predictive DSI. During the crisis, response capability is principally augmented by the iterative DPI, distributed DCI, and integrated DCI. After the crisis, recovery capability is predominantly fortified by the iterative DPI, expanded DPI and experiential DSI. Third, iterative DPI consistently assumes a supportive role in fortifying OR.

Originality/value

This study contributes to the extant literature on DI and OR, offering practical guidance for SMEs to systematically enhance OR by configuring DI across distinct stages.

Details

European Journal of Innovation Management, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 1460-1060


Open Access
Article
Publication date: 8 August 2024

Mateo Hitl, Nikola Greb and Marina Bagić Babac



Abstract

Purpose

The purpose of this study is to investigate how expressing gratitude and forgiveness on social media platforms relates to the overall sentiment of users, aiming to understand the impact of these expressions on social media interactions and individual well-being.

Design/methodology/approach

The hypothesis posits that users who frequently express gratitude or forgiveness will exhibit more positive sentiment in all posts during the observed period, compared to those who express these emotions less often. To test the hypothesis, sentiment analysis and statistical inference will be used. Additionally, topic modelling algorithms will be used to identify and assess the correlation between expressing gratitude and forgiveness and various topics.
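As a sketch of the statistical-inference step, the snippet below compares mean post sentiment between frequent and infrequent expressers of gratitude or forgiveness with a Welch t-test; the sentiment scores are simulated stand-ins for model output, not the study's data.

```python
# Welch t-test comparing mean sentiment between two user groups.
# The sentiment scores are simulated placeholders for model output.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
frequent_expressers = rng.normal(0.35, 0.2, 200)    # assumed mean sentiment of posts
infrequent_expressers = rng.normal(0.10, 0.2, 200)

t_stat, p_value = stats.ttest_ind(frequent_expressers, infrequent_expressers, equal_var=False)
print(f"Welch t = {t_stat:.2f}, p = {p_value:.4f}")
```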

Findings

This research paper explores the relationship between expressing gratitude and forgiveness in X (formerly known as Twitter) posts and the overall sentiment of user posts. The findings suggest correlations between expressing these emotions and the overall tone of social media content. The findings of this study can inform future research on how expressing gratitude and forgiveness can affect online sentiment and communication.

Originality/value

The authors have demonstrated that social media users who frequently express gratitude or forgiveness over an extended period of time exhibit a more positive sentiment compared to those who express these emotions less often. Additionally, the authors observed that BERTopic modelling performs better than latent Dirichlet allocation and Top2Vec when analysing short messages from social media. This research, through the application of innovative techniques and the confirmation of previous theoretical findings, paves the way for further studies in the fields of positive psychology and machine learning.

Details

Global Knowledge, Memory and Communication, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 2514-9342


Article
Publication date: 10 November 2023

Abby Yaqing Zhang and Joseph H. Zhang



Abstract

Purpose

Environmental, social and governance (ESG) factors have become increasingly important in investment decisions, leading to a surge in ESG investing and the rise of sustainable investment assets. Nevertheless, challenges in ESG disclosure abound, such as difficulty quantifying unstructured data, a lack of guidelines and limited comparability. ESG rating agencies play a crucial role in assessing corporate ESG performance, but concerns over their credibility and reliability persist. To address these issues, researchers are increasingly utilizing machine learning (ML) tools to enhance ESG reporting and evaluation. By leveraging ML, accounting practitioners and researchers gain deeper insights into the relationship between ESG practices and financial performance, offering a more data-driven understanding of ESG impacts on business communities.

Design/methodology/approach

The authors review the current research on ESG disclosure and ESG performance disagreement, followed by a review of current ESG research using ML tools in three areas: connecting ML with ESG disclosures, integrating ML with ESG rating disagreement and employing ML with ESG in other settings. By comparing the ML applications across these studies, the authors summarize their strengths and weaknesses.

Findings

The practice of ESG reporting and assurance is on the rise but still in its technical infancy. ML methods offer advantages over traditional approaches in accounting, efficiently handling large volumes of unstructured data and capturing complex patterns. They excel in prediction accuracy, making them well suited to tasks such as fraud detection and financial forecasting, and their adaptability and ability to model feature interactions allow them to address diverse and evolving accounting problems, surpassing traditional methods in accuracy and insight.

Originality/value

The authors broadly review accounting research that applies ML methods to ESG-related issues. By emphasizing the advantages of ML over traditional methods, the authors offer suggestions for future research on ML applications in ESG-related fields.

Details

Asian Review of Accounting, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 1321-7348


Article
Publication date: 23 September 2024

Bernardo Cerqueira de Lima, Renata Maria Abrantes Baracho, Thomas Mandl and Patricia Baracho Porto


Abstract

Purpose

The dissemination of scientific information to the public on social media platforms during the COVID-19 pandemic highlighted the importance of scientific communication. Content creators in the field, as well as researchers who study the impact of scientific information online, are interested in how people react to these information resources and how they judge them. This study aims to devise a framework for extracting large social media datasets and finding specific feedback on content delivery, enabling scientific content creators to gain insights into how the public perceives scientific information.

Design/methodology/approach

To collect public reactions to scientific information, the study focused on Twitter users who are doctors, researchers, science communicators or representatives of research institutes, and processed the replies to their posts for two years from the start of the pandemic. The study aimed at developing a solution, powered by topic modeling enhanced with manual validation and other machine learning techniques such as word embeddings, that is capable of filtering massive social media datasets in search of documents related to reactions to scientific communication. The architecture developed in this paper can be replicated for finding documents related to any niche topic in social media data. As a final step of the framework, a large language model was fine-tuned to perform the classification task with even greater accuracy, forgoing the need for further human validation after the first step.
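A hedged sketch of the embedding-based filtering idea is shown below: each reply is scored by its similarity to a few seed examples of reactions to scientific communication, and only the closest ones are kept. The model name, seed texts and threshold are assumptions, not the authors' configuration.

```python
# Embedding-similarity filter: keep replies close to a few seed examples.
# Model, seeds and threshold are illustrative assumptions.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

seed_examples = [
    "Thank you for explaining the vaccine trial results so clearly",
    "This thread helped me understand how the study was designed",
]
replies = [
    "Great explanation, finally understood the efficacy numbers",
    "What time does the game start tonight?",
]

seed_emb = model.encode(seed_examples, convert_to_tensor=True)
reply_emb = model.encode(replies, convert_to_tensor=True)
scores = util.cos_sim(reply_emb, seed_emb).max(dim=1).values  # best match per reply

for reply, score in zip(replies, scores):
    s = float(score)
    print(f"{s:.2f} keep={s > 0.4} :: {reply}")  # 0.4 is an assumed cut-off
```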

Findings

We provide a framework that receives a large document dataset and, with a small degree of human validation at different stages, identifies the documents in the corpus that are relevant to a highly underrepresented niche theme, with much higher precision than traditional state-of-the-art machine learning algorithms. Performance was improved even further by fine-tuning a large language model based on BERT, which would allow such a model to classify even larger unseen datasets in search of reactions to scientific communication without the need for further manual validation or topic modeling.

Research limitations/implications

The challenges of scientific communication are heightened by the rampant increase of misinformation on social media and the difficulty of competing in the saturated attention economy of the social media landscape. Our study aimed at creating a solution that scientific content creators could use to locate and understand constructive feedback on their content and how it is received, feedback that can be hidden as a minor subject among hundreds of thousands of comments. By leveraging an ensemble of techniques ranging from heuristics to state-of-the-art machine learning algorithms, we created a framework that detects texts related to very niche subjects in very large datasets, given just a small number of example texts related to the subject as input.

Practical implications

With this tool, scientific content creators can sift through their social media following and quickly understand how to adapt their content to their current users’ needs and standards of content consumption.

Originality/value

This study aimed to find reactions to scientific communication on social media. We applied three methods with human intervention and compared their performance. The study also shows, for the first time, the topics of interest discussed in Brazil during the COVID-19 pandemic.

Details

Data Technologies and Applications, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 2514-9288


Article
Publication date: 14 May 2024

Xuemei Tang, Jun Wang and Qi Su


Abstract

Purpose

Recent trends have shown the integration of Chinese word segmentation (CWS) and part-of-speech (POS) tagging to enhance syntactic and semantic parsing. However, the potential utility of hierarchical and structural information in these tasks remains underexplored. This study aims to leverage multiple external knowledge sources (e.g. syntactic and semantic features, lexicons) through various modules for the joint task.

Design/methodology/approach

We introduce a novel learning framework for the joint CWS and POS tagging task, utilizing graph convolutional networks (GCNs) to encode syntactic structure and semantic features. The framework also incorporates a pre-defined lexicon through a lexicon attention module. We evaluate our model on a range of public corpora, including CTB5, PKU and UD, the novel ZX dataset and the comprehensive CTB9 dataset.
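A minimal sketch of the graph-convolution idea, in which token representations are updated by aggregating over a syntactic adjacency matrix before sequence labeling, is given below; the dimensions, toy adjacency and tag count are illustrative assumptions rather than the authors' architecture.

```python
# Minimal GCN layer over token representations with a toy syntactic adjacency.
# Dimensions, adjacency and tag count are illustrative assumptions.
import torch
import torch.nn as nn

class GCNLayer(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.linear = nn.Linear(dim, dim)

    def forward(self, h: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # Normalize the adjacency by node degree, then propagate and transform.
        deg = adj.sum(dim=-1, keepdim=True).clamp(min=1.0)
        return torch.relu(self.linear((adj / deg) @ h))

seq_len, dim = 6, 32
h = torch.randn(seq_len, dim)                    # encoder outputs for one sentence
adj = torch.eye(seq_len)                         # self-loops
adj[0, 1] = adj[1, 0] = 1.0                      # a toy syntactic arc

gcn = GCNLayer(dim)
tag_logits = nn.Linear(dim, 8)(gcn(h, adj))      # e.g. 8 joint CWS-POS tags
print(tag_logits.shape)
```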

Findings

Experimental results on these benchmark corpora demonstrate the effectiveness of our model in improving the performance of the joint task. Notably, we find that syntax information significantly enhances performance, while lexicon information helps mitigate the issue of out-of-vocabulary (OOV) words.

Originality/value

This study introduces a comprehensive approach to the joint CWS and POS tagging task by combining multiple features. Moreover, the proposed framework offers potential adaptability to other sequence labeling tasks, such as named entity recognition (NER).

Details

Aslib Journal of Information Management, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 2050-3806

