Search results

1 – 10 of over 1000
Open Access
Article
Publication date: 11 October 2023

Bachriah Fatwa Dhini, Abba Suganda Girsang, Unggul Utan Sufandi and Heny Kurniawati

Abstract

Purpose

The authors constructed an automatic essay scoring (AES) model for a discussion forum and compared its results with scores given by human evaluators. This research proposes essay scoring conducted through two parameters, semantic similarity and keyword similarity, using pre-trained SentenceTransformers models that produce the best vector embeddings. The two measures are combined to optimize the model and increase its accuracy.

Design/methodology/approach

The development of the model in this study is divided into seven stages: (1) data collection, (2) data pre-processing, (3) selection of a pre-trained SentenceTransformers model, (4) semantic similarity (sentence pair), (5) keyword similarity, (6) final score calculation and (7) model evaluation.
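As an illustration of stages (4)–(6), here is a minimal sketch that pairs SentenceTransformers semantic similarity with a simple rubric-keyword overlap; the Indonesian example texts, the rubric and the equal weighting are assumptions rather than the authors' exact configuration (the model name comes from the findings below).

```python
# A minimal sketch of stages (4)-(6); texts, rubric and weights are assumed.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")

student_answer = "Fotosintesis mengubah energi cahaya menjadi energi kimia."
reference_answer = "Fotosintesis adalah proses tumbuhan mengubah cahaya menjadi energi kimia."
rubric_keywords = {"fotosintesis", "cahaya", "energi kimia"}  # hypothetical rubric

# Stage (4): semantic similarity of the sentence pair.
emb = model.encode([student_answer, reference_answer], convert_to_tensor=True)
semantic_sim = util.cos_sim(emb[0], emb[1]).item()

# Stage (5): keyword similarity as the fraction of rubric keywords present.
found = sum(kw in student_answer.lower() for kw in rubric_keywords)
keyword_sim = found / len(rubric_keywords)

# Stage (6): final score as a weighted combination (equal weights assumed).
final_score = 0.5 * semantic_sim + 0.5 * keyword_sim
print(f"semantic={semantic_sim:.3f} keyword={keyword_sim:.3f} final={final_score:.3f}")
```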

Findings

The multilingual paraphrase-multilingual-MiniLM-L12-v2 and distilbert-base-multilingual-cased-v1 models achieved the highest scores in a comparison of 11 pre-trained multilingual SentenceTransformers models on Indonesian data (Dhini and Girsang, 2023), and both were adopted in this study. The combination of the two parameters is obtained by comparing keywords extracted from the response with the rubric keywords. Based on the experimental results, the proposed combination can increase the evaluation results by 0.2.

Originality/value

This study uses discussion forum data from the general biology course in online learning at the open university for the 2020.2 and 2021.2 semesters. Forum discussions are still rated manually. In this study, the authors created a model that automatically scores discussion forum posts, which are essays, based on the lecturer's answers and rubrics.

Details

Asian Association of Open Universities Journal, vol. 18 no. 3
Type: Research Article
ISSN: 1858-3431

Article
Publication date: 23 August 2023

Mohamed Madani Hafidi, Meriem Djezzar, Mounir Hemam, Fatima Zahra Amara and Moufida Maimour

Abstract

Purpose

This paper aims to offer a comprehensive examination of the various solutions currently available for addressing the challenge of semantic interoperability in cyber physical systems (CPS). CPS are a new generation of systems composed of physical assets with computation capabilities, connected with software systems in a network and exchanging data collected from the physical asset, models (physics-based, data-driven, etc.) and services (reconfiguration, monitoring, etc.). The physical asset and its software system are connected and exchange data to be interpreted in a certain context. The heterogeneous nature of the collected data, together with the different types of models, raises interoperability problems. Modeling the digital space of the CPS and integrating information models that support cyber physical interoperability are both required.

Design/methodology/approach

This paper aims to identify the most relevant points in the development of semantic models and machine learning solutions to the interoperability problem, and how these solutions are implemented in CPS. The research analyzes recent papers related to the topic of semantic interoperability in Industry 4.0 (I4.0) systems.

Findings

Semantic models are key enabling technologies that provide a common understanding of data, and they can be used to solve interoperability problems in industry by using a common vocabulary when defining these models.
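As a minimal illustration of that "common vocabulary" idea, the sketch below uses rdflib to express a sensor reading in a shared namespace so that the physical asset's gateway and the software system describe data with the same terms; the namespace and property names are invented for the example, not drawn from the surveyed papers.

```python
# A minimal sketch of a shared CPS vocabulary; all names here are assumptions.
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF, XSD

CPS = Namespace("http://example.org/cps#")  # hypothetical common vocabulary
g = Graph()
g.bind("cps", CPS)

# Both sides of the CPS describe the reading with the same terms, so no
# ad hoc translation layer is needed when the data is exchanged.
g.add((CPS.sensor42, RDF.type, CPS.TemperatureSensor))
g.add((CPS.sensor42, CPS.hasValue, Literal(73.2, datatype=XSD.float)))
g.add((CPS.sensor42, CPS.hasUnit, CPS.DegreeCelsius))

for s, p, o in g.triples((None, CPS.hasValue, None)):
    print(f"{s} -> {float(o)}")
```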

Originality/value

This paper provides an overview of the different available solutions to the semantic interoperability problem in CPS.

Details

International Journal of Web Information Systems, vol. 19 no. 3/4
Type: Research Article
ISSN: 1744-0084

Article
Publication date: 31 October 2023

Hong Zhou, Binwei Gao, Shilong Tang, Bing Li and Shuyu Wang

Abstract

Purpose

The number of construction dispute cases has maintained a high growth trend in recent years. Effective exploration and management of construction contract risk can directly promote the overall performance of the project life cycle. Missing clauses may cause a contract to fail to match standard contracts, and if a contract modified by the owner omits key clauses, potential disputes may lead to contractors paying substantial compensation. The identification of missing clauses in construction project contracts has therefore relied heavily on manual review, which is inefficient and highly restricted by personnel experience, while existing intelligent tools support only contract query and storage. It is urgent to raise the level of intelligence in contract clause management. This paper therefore aims to propose an intelligent method for detecting missing clauses in construction project contracts based on natural language processing (NLP) and deep learning technology.

Design/methodology/approach

A complete classification scheme for contract clauses is designed based on NLP. First, construction contract texts are pre-processed and converted from unstructured natural language into structured digital vector form. Following the initial categorization, a multi-label classification of long-text construction contract clauses is designed to preliminarily identify whether clause labels are missing. After the multi-label missing-clause detection, the authors implement a clause similarity algorithm by creatively integrating an image-detection approach, the MatchPyramid model, with BERT to identify missing substantial content in the contract clauses.
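A rough sketch of that clause-similarity step, under stated assumptions: BERT token embeddings for two clauses form a MatchPyramid-style interaction matrix, which a small CNN scores. The checkpoint, pooling size and untrained scoring head are placeholders, not the authors' exact architecture.

```python
# A sketch of BERT token embeddings feeding a MatchPyramid-style scorer.
import torch
import torch.nn as nn
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-chinese")  # assumed checkpoint
bert = AutoModel.from_pretrained("bert-base-chinese")

def token_embeddings(text: str) -> torch.Tensor:
    enc = tokenizer(text, return_tensors="pt", truncation=True, max_length=128)
    with torch.no_grad():
        return bert(**enc).last_hidden_state[0]           # (seq_len, hidden)

class MatchCNN(nn.Module):
    """Scores an interaction matrix the way MatchPyramid treats an image."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(1, 8, kernel_size=3, padding=1)
        self.pool = nn.AdaptiveMaxPool2d((8, 8))          # handles variable lengths
        self.fc = nn.Linear(8 * 8 * 8, 1)

    def forward(self, m: torch.Tensor) -> torch.Tensor:
        x = self.pool(torch.relu(self.conv(m.unsqueeze(0).unsqueeze(0))))
        return torch.sigmoid(self.fc(x.flatten()))

a = token_embeddings("clause text A")                     # placeholder clauses
b = token_embeddings("clause text B")
interaction = torch.cosine_similarity(a.unsqueeze(1), b.unsqueeze(0), dim=-1)
score = MatchCNN()(interaction)                           # untrained; shapes only
print(interaction.shape, float(score))
```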

Findings

1,322 construction project contracts were tested. The results showed that the accuracy of the multi-label classification reaches 93%, the accuracy of similarity matching reaches 83% and the recall rate and mean F1 of both exceed 0.7. The experimental results verify, to some extent, the feasibility of intelligently detecting contract risk through the NLP-based method.

Originality/value

NLP is adept at recognizing textual content and has shown promising results in some contract processing applications. However, most existing approaches to risk detection in construction contract clauses are rule-based and encounter challenges when handling intricate and lengthy engineering contracts. This paper introduces an NLP technique based on deep learning that reduces manual intervention and can autonomously identify and tag types of contractual deficiencies, aligning with the evolving complexities anticipated in future construction contracts. Moreover, the method handles extended contract clause texts. Finally, the approach is versatile: users simply need to adjust parameters such as segmentation based on language category to detect omissions in contract clauses in diverse languages.

Details

Engineering, Construction and Architectural Management, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 0969-9988

Book part
Publication date: 25 October 2023

Md Sakib Ullah Sourav, Huidong Wang, Mohammad Raziuddin Chowdhury and Rejwan Bin Sulaiman

Abstract

One of the most neglected sources of energy loss is streetlights that generate too much light in areas where it is not required. Energy waste has enormous economic and environmental effects. In addition, due to their conventional manual operation, streetlights are frequently seen turned ‘ON’ during the day and ‘OFF’ in the evening, which is regrettable even in the twenty-first century. Resolving these issues requires automated streetlight control. This study aims to develop a novel streetlight control method that combines a smart transport monitoring system powered by computer vision with a closed circuit television (CCTV) camera, allowing the light-emitting diode (LED) streetlight to light up automatically at the appropriate brightness when semantic image segmentation of the CCTV video stream detects pedestrians or vehicles, and to dim in their absence. The model also distinguishes daylight from nighttime, making it feasible to automate turning the streetlight ‘ON’ and ‘OFF’ to save energy costs. Under this approach, geo-location sensor data could be utilised to make more informed streetlight management decisions. To complete these tasks, we train a U-net model with ResNet-34 as its backbone, and the validity of the models is assessed with standard evaluation metrics. The suggested concept is straightforward, economical, energy-efficient, long-lasting and more resilient than conventional alternatives.
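A minimal sketch of the segmentation setup the chapter describes, using the segmentation-models-pytorch package to build a U-Net with a ResNet-34 encoder; the class list, input size and the dimming heuristic at the end are illustrative assumptions.

```python
# A minimal U-Net/ResNet-34 sketch; classes and thresholds are assumptions.
import torch
import segmentation_models_pytorch as smp

model = smp.Unet(
    encoder_name="resnet34",        # the backbone named in the chapter
    encoder_weights="imagenet",     # ImageNet pre-training is an assumption
    in_channels=3,
    classes=3,                      # e.g. background / pedestrian / vehicle (assumed)
)

frame = torch.randn(1, 3, 256, 256)          # stand-in for a CCTV frame
mask = model(frame).argmax(dim=1)            # per-pixel class predictions
# A controller could then dim the LED when no pedestrian/vehicle pixels appear.
activity = (mask > 0).float().mean().item()
print(f"active-pixel fraction: {activity:.3f}")
```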

Details

Technology and Talent Strategies for Sustainable Smart Cities
Type: Book
ISBN: 978-1-83753-023-6

Article
Publication date: 20 September 2022

Jinzhu Zhang, Yue Liu, Linqi Jiang and Jialu Shi

Abstract

Purpose

This paper aims to propose a method for better discovering topic evolution paths and semantic relationships from the perspective of patent entity extraction and semantic representation. On the one hand, the paper identifies entities that have the same semantics but different expressions, enabling accurate discovery of topic evolution paths. On the other hand, it reveals the semantic relationships behind topic evolution, for a better understanding of what drives it.

Design/methodology/approach

Firstly, a Bi-LSTM-CRF (bidirectional long short-term memory with conditional random field) model is designed for patent entity extraction, and a representation learning method is constructed for patent entity representation. Secondly, a method based on knowledge outflow and inflow is proposed for discovering topic evolution paths by identifying and computing semantic common entities among topics. Finally, multiple semantic relationships among patent entities are pre-designed for a specific domain, and the semantic relationship among topics is then identified through the proportion of each type of semantic relationship belonging to each topic.
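A minimal sketch of the Bi-LSTM-CRF tagger named in the first step, using the pytorch-crf package; the vocabulary size, tag set and layer dimensions are placeholders rather than the paper's configuration.

```python
# A minimal Bi-LSTM-CRF sketch for entity tagging; sizes are assumptions.
import torch
import torch.nn as nn
from torchcrf import CRF

class BiLSTMCRF(nn.Module):
    def __init__(self, vocab_size=5000, embed_dim=100, hidden=128, num_tags=9):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden // 2, bidirectional=True,
                            batch_first=True)
        self.proj = nn.Linear(hidden, num_tags)        # emission scores per tag
        self.crf = CRF(num_tags, batch_first=True)

    def loss(self, tokens, tags, mask):
        emissions = self.proj(self.lstm(self.embed(tokens))[0])
        return -self.crf(emissions, tags, mask=mask)   # negative log-likelihood

    def decode(self, tokens, mask):
        emissions = self.proj(self.lstm(self.embed(tokens))[0])
        return self.crf.decode(emissions, mask=mask)   # best tag sequence

model = BiLSTMCRF()
tokens = torch.randint(0, 5000, (2, 12))               # toy batch of token ids
mask = torch.ones(2, 12, dtype=torch.bool)
print(model.decode(tokens, mask))                      # untrained; shapes only
```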

Findings

In the field of UAV (unmanned aerial vehicle), this method identifies semantic common entities that have the same semantics but different expressions. In addition, it discovers topic evolution paths better than a traditional method does. Finally, it identifies different semantic relationships among topics, giving a detailed description for understanding and interpreting topic evolution. These results prove that the proposed method is effective and useful. At the same time, this is a preliminary study, and the method still needs to be investigated on other datasets using multiple emerging deep learning methods.

Originality/value

This work provides a new perspective for topic evolution analysis by considering semantic representation of patent entities. The authors design a method for discovering topic evolution paths by considering knowledge flow computed by semantic common entities, which can be easily extended to other patent mining-related tasks. This work is the first attempt to reveal semantic relationships among topics for a precise and detailed description of topic evolution.

Details

Aslib Journal of Information Management, vol. 75 no. 3
Type: Research Article
ISSN: 2050-3806

Article
Publication date: 14 June 2022

Gitaek Lee, Seonghyeon Moon and Seokho Chi

Abstract

Purpose

Contractors must check the provisions that may cause disputes in the specifications to manage project risks when bidding for a construction project. However, since a specification is mainly written with reference to many national standards, determining which standard each section of the specification derives from, and whether its content is appropriate for the local site, is a labor-intensive task. To develop an automatic reference-section identification model that helps complete the specification review within the short bidding period, the authors propose a framework that integrates rules and machine learning algorithms.

Design/methodology/approach

The study begins by collecting 7,795 sections from construction specifications and national standards from different countries. The collected sections were then searched for similar section pairs using syntactic rules generated from construction domain knowledge. Finally, to improve the reliability and expandability of the section pairing, the authors built a deep structured semantic model that increases the cosine similarity between documents dealing with the same topic by learning human-labeled similarity information.
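A minimal sketch of the deep structured semantic model idea: twin encoders trained so that human-labeled same-topic section pairs receive a higher cosine similarity. The bag-of-words input, layer sizes and loss are illustrative assumptions, not the authors' exact setup.

```python
# A minimal two-tower DSSM-style sketch; all dimensions are assumptions.
import torch
import torch.nn as nn

class Tower(nn.Module):
    def __init__(self, vocab=30000, dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(vocab, 300), nn.Tanh(),
            nn.Linear(300, dim), nn.Tanh(),
        )

    def forward(self, bow):                        # bag-of-words vector
        return nn.functional.normalize(self.net(bow), dim=-1)

tower = Tower()                                    # shared weights for both sides
spec = torch.rand(4, 30000)                        # toy specification sections
ref = torch.rand(4, 30000)                         # toy national-standard sections
labels = torch.tensor([1.0, 0.0, 1.0, 0.0])        # human-labeled same-topic pairs

cosine = (tower(spec) * tower(ref)).sum(dim=-1)    # cosine via normalized dot product
loss = nn.functional.binary_cross_entropy((cosine + 1) / 2, labels)
loss.backward()                                    # one training step of many
print(cosine.detach())
```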

Findings

The integrated model developed in this study achieved performance of 0.812, 0.898 and 0.923 on NDCG@1, NDCG@5 and NDCG@10, respectively, confirming that it can adequately select the document candidates that require comparative clause analysis for practitioners.

Originality/value

The results contribute to more efficient and objective identification of potential disputes within the specifications by automatically providing practitioners with the reference section most relevant to the analysis target section.

Details

Engineering, Construction and Architectural Management, vol. 30 no. 9
Type: Research Article
ISSN: 0969-9988

Article
Publication date: 20 July 2023

Elaheh Hosseini, Kimiya Taghizadeh Milani and Mohammad Shaker Sabetnasab

Abstract

Purpose

This research aimed to visualize and analyze the co-word network and thematic clusters of the intellectual structure in the field of linked data during 1900–2021.

Design/methodology/approach

This applied research employed a descriptive and analytical method, scientometric indicators, co-word techniques, and social network analysis. VOSviewer, SPSS, Python programming, and UCINet software were used for data analysis and network structure visualization.
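As a minimal illustration of the co-word technique, the sketch below counts keyword co-occurrences and applies SciPy hierarchical clustering; the toy records and the co-occurrence-to-distance transform are assumptions standing in for the WOS data and the study's actual settings.

```python
# A minimal co-word + hierarchical clustering sketch; records are toy data.
from itertools import combinations
from collections import Counter
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

records = [                                   # hypothetical WOS keyword lists
    ["ontology", "semantic", "linked data"],
    ["ontology", "semantic", "RDF"],
    ["linked data", "RDF", "SPARQL"],
]
pairs = Counter()
for kws in records:
    pairs.update(combinations(sorted(set(kws)), 2))   # co-word counting

terms = sorted({kw for kws in records for kw in kws})
idx = {t: i for i, t in enumerate(terms)}
co = np.zeros((len(terms), len(terms)))
for (a, b), n in pairs.items():
    co[idx[a], idx[b]] = co[idx[b], idx[a]] = n

# Turn co-occurrence counts into distances and cluster hierarchically.
dist = 1.0 / (1.0 + co)
condensed = dist[np.triu_indices(len(terms), k=1)]
clusters = fcluster(linkage(condensed, method="average"),
                    t=2, criterion="maxclust")
print(dict(zip(terms, clusters)))
```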

Findings

The top ranks of the Web of Science (WOS) subject categorization belonged to various fields of computer science, and the USA was the most prolific country. The keyword "ontology" had the highest frequency of co-occurrence, and "ontology" and "semantic" formed the most frequent co-word pair. In terms of network structure, nine major topic clusters were identified based on co-occurrence, and 29 thematic clusters were identified based on hierarchical clustering. A comparison of the two clustering techniques indicated that three clusters, namely semantic bioinformatics, knowledge representation and semantic tools, were common to both. The most mature and mainstream thematic clusters were natural language processing techniques to boost modeling and visualization, context-aware knowledge discovery, probabilistic latent semantic analysis (PLSA), semantic tools, latent semantic indexing, web ontology language (OWL) syntax and ontology-based deep learning.

Originality/value

This study adopted various techniques, such as co-word analysis, social network analysis, network structure visualization and hierarchical clustering, to present a suitable, visual, methodical and comprehensive perspective on linked data.

Details

Library Hi Tech, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 0737-8831

Article
Publication date: 14 November 2023

Shaodan Sun, Jun Deng and Xugong Qin

Abstract

Purpose

This paper aims to amplify the retrieval and utilization of historical newspapers through semantic organization from a fine-grained knowledge element perspective. This endeavor seeks to unlock the latent value embedded within newspaper contents while furnishing methodological guidance for research in the humanities domain.

Design/methodology/approach

Following the semantic organization process and the knowledge element concept, this study proposes a holistic framework comprising four pivotal stages: knowledge element description, extraction, association and application. Initially, a semantic description model dedicated to knowledge elements is devised. Subsequently, harnessing advanced deep learning techniques, the study delves into entity recognition and relationship extraction; these techniques identify entities within the historical newspaper contents and capture the interdependencies among them. Finally, an online platform based on Flask is developed to enable the recognition of entities and relationships within historical newspapers.
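A minimal sketch of the Flask platform idea: a single endpoint that returns entities recognized in submitted newspaper text. The Hugging Face pipeline and checkpoint are placeholders; in practice the authors' fine-tuned entity model would be loaded instead.

```python
# A minimal Flask NER endpoint sketch; the checkpoint is a placeholder.
from flask import Flask, request, jsonify
from transformers import pipeline

app = Flask(__name__)
# Placeholder: the authors' fine-tuned entity model would be loaded here.
ner = pipeline("ner", model="bert-base-chinese", aggregation_strategy="simple")

@app.route("/entities", methods=["POST"])
def entities():
    text = request.get_json(force=True).get("text", "")
    results = ner(text)
    # Cast numpy floats so the response is JSON-serializable.
    return jsonify([{**r, "score": float(r["score"])} for r in results])

if __name__ == "__main__":
    app.run(port=5000)  # POST {"text": "..."} to /entities to get entity spans
```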

Findings

This article utilized the Shengjing Times·Changchun Compilation as the dataset for describing, extracting, associating and applying newspaper contents. For knowledge element extraction, BERT + BS consistently outperforms Bi-LSTM, CRF++ and even BERT in terms of recall and F1 scores, making it a favorable choice for entity recognition in this context. Particularly noteworthy is the Bi-LSTM-Pro model, which achieves the highest scores across all metrics, notably an exceptional F1 score in knowledge element relationship recognition.

Originality/value

Historical newspapers transcend their status as mere artifacts, serving as invaluable reservoirs of societal and historical memory. Semantic organization from a fine-grained knowledge element perspective can facilitate semantic retrieval, semantic association, information visualization and knowledge discovery services for historical newspapers. In practice, it can empower researchers to unearth profound insights within the historical and cultural context, broadening the landscape of digital humanities research and practical applications.

Details

Aslib Journal of Information Management, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 2050-3806

Article
Publication date: 30 January 2023

Zhongbao Liu and Wenjuan Zhao

Abstract

Purpose

In recent years, Chinese sentiment analysis has made great progress, but the characteristics of the language itself and the requirements of downstream tasks have not been explored thoroughly. It is not practical to directly migrate achievements from English sentiment analysis to Chinese because of the huge differences between the two languages.

Design/methodology/approach

In view of the particularity of Chinese text and the requirements of sentiment analysis, a Chinese sentiment analysis model integrating multi-granularity semantic features is proposed in this paper. The model introduces radical and part-of-speech features on top of character and word features, applying bidirectional long short-term memory, an attention mechanism and a recurrent convolutional neural network.
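A minimal sketch of the multi-granularity fusion described above: character, word, radical and part-of-speech embeddings are concatenated and fed to a Bi-LSTM with additive attention. All vocabulary sizes and dimensions are illustrative assumptions, and the paper's recurrent convolutional component is omitted for brevity.

```python
# A minimal multi-granularity Bi-LSTM + attention sketch; sizes are assumed.
import torch
import torch.nn as nn

class MultiGranularitySA(nn.Module):
    def __init__(self, n_char=6000, n_word=30000, n_rad=300, n_pos=30,
                 dim=64, hidden=128, classes=2):
        super().__init__()
        self.embeds = nn.ModuleList(
            nn.Embedding(n, dim) for n in (n_char, n_word, n_rad, n_pos))
        self.lstm = nn.LSTM(4 * dim, hidden // 2, bidirectional=True,
                            batch_first=True)
        self.att = nn.Linear(hidden, 1)           # additive attention weights
        self.out = nn.Linear(hidden, classes)

    def forward(self, char, word, rad, pos):
        # Concatenate the four granularities per time step before the LSTM.
        x = torch.cat([e(t) for e, t in
                       zip(self.embeds, (char, word, rad, pos))], dim=-1)
        h, _ = self.lstm(x)                       # (batch, seq, hidden)
        w = torch.softmax(self.att(h), dim=1)     # attention over time steps
        return self.out((w * h).sum(dim=1))       # weighted sentence vector

model = MultiGranularitySA()
toy = [torch.randint(0, n, (2, 20)) for n in (6000, 30000, 300, 30)]
print(model(*toy).shape)                          # torch.Size([2, 2])
```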

Findings

Comparative experiments showed that the F1 values of the model reach 88.28 and 84.80 per cent on a man-made dataset and the NLPECC dataset, respectively. An ablation experiment verified the effectiveness of the attention mechanism and of the part-of-speech, radical, character and word features in Chinese sentiment analysis. The performance of the proposed model exceeds that of existing models to some extent.

Originality/value

The academic contributions of this paper are as follows. First, in view of the particularity of Chinese texts and the requirements of sentiment analysis, the paper focuses on solving the deficiencies of Chinese sentiment analysis in the big data context. Second, it borrows ideas from frontier theories and methods across disciplines such as information science, linguistics and artificial intelligence, making it innovative and comprehensive. Finally, it deeply integrates multi-granularity semantic features, namely character, word, radical and part of speech, further complementing the theoretical framework and method system of Chinese sentiment analysis.

Details

Data Technologies and Applications, vol. 57 no. 4
Type: Research Article
ISSN: 2514-9288

Article
Publication date: 14 July 2022

Nishad A. and Sajimon Abraham

Abstract

Purpose

A wide range of technologies is currently available to address the challenges posed by pandemic situations. As such diseases transmit through person-to-person contact or by other means, the World Health Organization has recommended tracking and tracing the locations of people who are infected or who have contacted patients as one of the standard operating procedures, and has also outlined protocols for incident management. Government agencies use different inputs, such as smartphone signals and details from the respondent, to prepare a patient's travel log. Every event in the trace, such as stay points, revisited locations and meeting points, is important. The traditional contact-tracing system requires many trained staff and tools, and when the patient count spirals, time-bound tracing of primary and secondary contacts may not be possible; there are chances of human error as well. In this context, the purpose of this paper is to propose an algorithm called SemTraClus-Tracer, an efficient approach for computing the movement of individuals and analysing the possibility of pandemic spread and the vulnerability of locations.

Design/methodology/approach

Pandemic situations push the world into existential crises. By exploring the daily mobility and activities of the general public, the system identifies multiple levels of contacts with respect to an infected person and extracts semantic information by considering vital factors that can induce virus spread. It grades geographic locations according to a measure called weightage of participation, so that vulnerable locations can be easily identified. The paper gives directions on the advantages of using spatio-temporal aggregate queries for extracting general characteristics of social mobility. The system also generates further information by combing through the medical reports of the patients.
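One ingredient of this kind of trajectory analysis, stay-point detection, can be sketched as follows; the distance and time thresholds and the toy trajectory are assumptions, as the abstract does not state the algorithm's parameters.

```python
# A minimal stay-point detection sketch; thresholds and data are assumptions.
from dataclasses import dataclass
from math import radians, sin, cos, asin, sqrt

@dataclass
class Point:
    lat: float
    lon: float
    t: float          # seconds since the start of the trace

def haversine_m(a: Point, b: Point) -> float:
    """Great-circle distance in metres."""
    dlat, dlon = radians(b.lat - a.lat), radians(b.lon - a.lon)
    h = (sin(dlat / 2) ** 2
         + cos(radians(a.lat)) * cos(radians(b.lat)) * sin(dlon / 2) ** 2)
    return 2 * 6371000 * asin(sqrt(h))

def stay_points(traj, dist_m=200.0, min_s=1800.0):
    """Sub-sequences that remain within dist_m for at least min_s seconds."""
    stays, i = [], 0
    while i < len(traj):
        j = i + 1
        while j < len(traj) and haversine_m(traj[i], traj[j]) <= dist_m:
            j += 1
        if traj[j - 1].t - traj[i].t >= min_s:
            stays.append((traj[i], traj[j - 1]))   # start/end of the stay
        i = j
    return stays

traj = [Point(9.93, 76.26, 0), Point(9.931, 76.261, 1200),
        Point(9.9305, 76.2605, 2400), Point(10.0, 76.3, 3600)]
print(len(stay_points(traj)))   # 1 stay of ~40 min near the first point
```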

Findings

The context of movement is identified as important; hence, the existing SemTraClus algorithm is modified to account for four factors: stay point, contact presence, stay time of primary contacts and waypoint severity. The priority level can be reconfigured according to the interests of the authority. This approach reduces the overwhelming task of contact tracing. Different functionalities provided by the system are also explained. As a real data set was not available, experiments were conducted with similar data, and results are shown for different types of journeys in different geographical locations. The proposed method efficiently handles movement computation and activity analysis by incorporating relevant trajectory semantics, and the use of cluster-based aggregate queries in the model does away with the computational headache of processing the entire mobility data.

Research limitations/implications

As patient trajectories were not available, the authors used standard data sets for experimentation, which serve the purpose.

Originality/value

This paper proposes a framework infrastructure that allows the emergency response team to obtain multiple kinds of information from a patient's tracked mobility details and supports various pandemic mitigation activities, such as predicting hotspots, identifying stay locations, suggesting possible locations of primary and secondary contacts, creating clusters of hotspots and identifying nearby medical assistance. The system provides an efficient way of analysing activity by computing people's mobility and identifying features of the geographical locations they travelled through. While formulating the framework, the authors reviewed many different implementation plans and protocols and concluded that the core strategies followed are more or less the same; for the sake of a reference model, the Indian scenario is adopted for defining the concepts.

Details

International Journal of Pervasive Computing and Communications, vol. 19 no. 4
Type: Research Article
ISSN: 1742-7371
