Search results

1 – 10 of over 2000
Article
Publication date: 28 December 2023

Na Xu, Yanxiang Liang, Chaoran Guo, Bo Meng, Xueqing Zhou, Yuting Hu and Bo Zhang

Safety management plays an important part in coal mine construction. Due to complex data, the implementation of the construction safety knowledge scattered in standards poses a…

Abstract

Purpose

Safety management plays an important part in coal mine construction. Due to complex data, the implementation of the construction safety knowledge scattered in standards poses a challenge. This paper aims to develop a knowledge extraction model to automatically and efficiently extract domain knowledge from unstructured texts.

Design/methodology/approach

A bidirectional encoder representations from transformers (BERT)-bidirectional long short-term memory (BiLSTM)-conditional random field (CRF) method based on a pre-trained language model was applied in this paper to carry out knowledge entity recognition in the field of coal mine construction safety. Firstly, 80 safety standards for coal mine construction were collected, sorted and annotated as a descriptive corpus. Then, the BERT pre-trained language model was used to obtain dynamic word vectors. Finally, the BiLSTM-CRF model decoded the optimal tag sequence for each entity.
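
To make the described architecture concrete, here is a minimal sketch of a BERT-BiLSTM-CRF tagger. It assumes PyTorch, the Hugging Face transformers library and the pytorch-crf package; the checkpoint name, tag count and hidden size are illustrative placeholders, not the authors' actual configuration.

```python
# Minimal BERT-BiLSTM-CRF sketch (illustrative, not the authors' exact implementation).
# Assumes torch, transformers and pytorch-crf are installed; "bert-base-chinese" is a placeholder checkpoint.
import torch.nn as nn
from transformers import AutoModel
from torchcrf import CRF


class BertBiLstmCrf(nn.Module):
    def __init__(self, num_tags: int, bert_name: str = "bert-base-chinese", lstm_hidden: int = 256):
        super().__init__()
        self.bert = AutoModel.from_pretrained(bert_name)           # dynamic word vectors
        self.lstm = nn.LSTM(self.bert.config.hidden_size, lstm_hidden,
                            batch_first=True, bidirectional=True)  # contextual encoding
        self.fc = nn.Linear(2 * lstm_hidden, num_tags)              # per-token tag scores
        self.crf = CRF(num_tags, batch_first=True)                  # optimal tag sequence

    def forward(self, input_ids, attention_mask, tags=None):
        hidden = self.bert(input_ids=input_ids, attention_mask=attention_mask).last_hidden_state
        emissions = self.fc(self.lstm(hidden)[0])
        mask = attention_mask.bool()
        if tags is not None:                                        # training: negative log-likelihood
            return -self.crf(emissions, tags, mask=mask)
        return self.crf.decode(emissions, mask=mask)                # inference: best tag sequence
```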

Findings

Accordingly, 11,933 entities and 2,051 relationships were identified in the standard specification texts, and a language model suitable for coal mine construction safety management was proposed. The experiments showed that the F1 value exceeded 60% for each of the nine entity types, such as safety management, and that the model identified and extracted entities more accurately than conventional methods.

Originality/value

This work enabled domain knowledge queries and built a Q&A platform from the entities and relationships identified in the standard specifications applicable to coal mines. This paper proposed a systematic framework for texts in coal mine construction safety to improve the efficiency and accuracy of domain-specific entity extraction. In addition, the pre-trained language model was introduced into coal mine construction safety management to realize dynamic entity recognition, which provides technical support and a theoretical reference for the optimization of safety management platforms.

Details

Engineering, Construction and Architectural Management, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 0969-9988

Keywords

Article
Publication date: 2 May 2023

Giovanna Aracri, Antonietta Folino and Stefano Silvestri

The purpose of this paper is to propose a methodology for the enrichment and tailoring of a knowledge organization system (KOS), in order to support the information extraction…

Abstract

Purpose

The purpose of this paper is to propose a methodology for the enrichment and tailoring of a knowledge organization system (KOS), in order to support the information extraction (IE) task for the analysis of documents in the tourism domain. In particular, the KOS is used to develop a named entity recognition (NER) system.

Design/methodology/approach

A method to improve and customize an available thesaurus by leveraging documents related to tourism in Italy is first presented. Then, the obtained thesaurus is used to create an annotated NER corpus, exploiting distant supervision, deep learning and light human supervision.
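
As a rough illustration of the distant-supervision step, the sketch below projects thesaurus terms onto a text as BIO tags; the term list and the entity label are hypothetical placeholders, and the real pipeline additionally involves deep learning and light human supervision.

```python
# Distant-supervision sketch: project thesaurus terms onto text as BIO tags.
# The thesaurus entries and the "ACCOMMODATION" label are illustrative placeholders.
thesaurus = {"agriturismo": "ACCOMMODATION", "bed and breakfast": "ACCOMMODATION"}

def bio_annotate(text: str, lexicon: dict[str, str]) -> list[tuple[str, str]]:
    tokens = text.split()
    labels = ["O"] * len(tokens)
    lowered = [t.lower().strip(".,;:") for t in tokens]
    for term, label in lexicon.items():
        term_toks = term.split()
        for i in range(len(lowered) - len(term_toks) + 1):
            if lowered[i:i + len(term_toks)] == term_toks:
                labels[i] = f"B-{label}"                     # first token of the matched term
                for j in range(i + 1, i + len(term_toks)):
                    labels[j] = f"I-{label}"                 # continuation tokens
    return list(zip(tokens, labels))

print(bio_annotate("We stayed in an agriturismo near Siena.", thesaurus))
```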

Findings

The study shows that a customized KOS can effectively support IE tasks when applied to documents belonging to the same domains and types used for its construction. Moreover, the proposed methodology greatly supports and eases the annotation task, allowing a corpus to be annotated with a fraction of the effort required for manual annotation.

Originality/value

The paper explores an alternative use of a KOS, proposing an innovative NER corpus annotation methodology. Moreover, the KOS and the annotated NER data set will be made publicly available.

Details

Journal of Documentation, vol. 79 no. 6
Type: Research Article
ISSN: 0022-0418

Keywords

Article
Publication date: 14 November 2023

Shaodan Sun, Jun Deng and Xugong Qin

This paper aims to amplify the retrieval and utilization of historical newspapers through the application of semantic organization, all from the vantage point of a fine-grained…

Abstract

Purpose

This paper aims to amplify the retrieval and utilization of historical newspapers through semantic organization from a fine-grained knowledge element perspective. This endeavor seeks to unlock the latent value embedded within newspaper contents while furnishing methodological guidance for research in the humanities domain.

Design/methodology/approach

According to the semantic organization process and the knowledge element concept, this study proposes a holistic framework comprising four pivotal stages: knowledge element description, extraction, association and application. Initially, a semantic description model dedicated to knowledge elements is devised. Subsequently, harnessing advanced deep learning techniques, the study addresses entity recognition and relationship extraction, identifying entities within the historical newspaper contents and capturing the interdependencies among them. Finally, an online platform based on Flask is developed to enable the recognition of entities and relationships within historical newspapers.
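
To make the platform stage more tangible, here is a minimal Flask endpoint of the kind described; the route name and the extract_entities/extract_relations functions are hypothetical stand-ins for the trained deep learning models.

```python
# Minimal Flask sketch for an entity/relation recognition service.
# extract_entities() and extract_relations() are hypothetical stand-ins for the trained models.
from flask import Flask, jsonify, request

app = Flask(__name__)

def extract_entities(text: str) -> list[dict]:
    # Placeholder: in the real system a deep learning tagger would run here.
    return []

def extract_relations(text: str, entities: list[dict]) -> list[dict]:
    # Placeholder: in the real system a relation-extraction model would run here.
    return []

@app.route("/analyze", methods=["POST"])
def analyze():
    text = request.get_json(force=True).get("text", "")
    entities = extract_entities(text)
    relations = extract_relations(text, entities)
    return jsonify({"entities": entities, "relations": relations})

if __name__ == "__main__":
    app.run(debug=True)
```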

Findings

This article used the Shengjing Times·Changchun Compilation as the dataset for describing, extracting, associating and applying newspaper contents. Regarding knowledge element extraction, the BERT + BS model consistently outperforms Bi-LSTM, CRF++ and even BERT in terms of recall and F1 scores, making it a favorable choice for entity recognition in this context. Particularly noteworthy is the Bi-LSTM-Pro model, which achieves the highest scores across all metrics, including an exceptional F1 score in knowledge element relationship recognition.

Originality/value

Historical newspapers transcend their status as mere artifacts, evolving into invaluable reservoirs safeguarding societal and historical memory. Semantic organization from a fine-grained knowledge element perspective can facilitate semantic retrieval, semantic association, information visualization and knowledge discovery services for historical newspapers. In practice, it can empower researchers to unearth profound insights within the historical and cultural context, broadening the landscape of digital humanities research and practical applications.

Details

Aslib Journal of Information Management, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 2050-3806

Keywords

Article
Publication date: 12 September 2023

Wenjing Wu, Caifeng Wen, Qi Yuan, Qiulan Chen and Yunzhong Cao

Learning from safety accidents and sharing safety knowledge has become an important part of accident prevention and improving construction safety management. Considering the…

Abstract

Purpose

Learning from safety accidents and sharing safety knowledge has become an important part of accident prevention and improving construction safety management. Because unstructured data in the construction industry are difficult to reuse, the knowledge they contain cannot be used directly for safety analysis. The purpose of this paper is to explore the construction of a construction safety knowledge representation model and a safety accident knowledge graph through deep learning methods, to extract construction safety knowledge entities through a BERT-BiLSTM-CRF model and to propose a data–knowledge–services data management model.

Design/methodology/approach

The ontology model for knowledge representation of construction safety accidents is constructed by integrating entity relations and logic evolution. Then, a database of safety incidents in the architecture, engineering and construction (AEC) industry is established based on the collected construction safety incident reports and related dispute cases. The construction method for the construction safety accident knowledge graph is studied, and the precision of the BERT-BiLSTM-CRF algorithm in information extraction is verified through comparative experiments. Finally, a safety accident report is used as an example to construct the AEC domain construction safety accident knowledge graph (AEC-KG), which provides a visual query knowledge service and verifies the operability of knowledge management.
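
A toy sketch of the graph-construction step follows, using networkx to assemble extracted (head, relation, tail) triples into a queryable graph; the triples shown are invented examples, not data from the authors' AEC-KG.

```python
# Toy knowledge-graph construction from extracted triples (illustrative data only).
import networkx as nx

triples = [
    ("scaffold collapse", "caused_by", "improper fastening"),   # hypothetical accident triples
    ("scaffold collapse", "resulted_in", "worker injury"),
    ("worker injury", "handled_by", "emergency response plan"),
]

kg = nx.MultiDiGraph()
for head, relation, tail in triples:
    kg.add_edge(head, tail, relation=relation)

# Simple visual-query stand-in: list everything linked to a given entity.
for _, tail, data in kg.out_edges("scaffold collapse", data=True):
    print(f"scaffold collapse --{data['relation']}--> {tail}")
```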

Findings

The experimental results show that the combined BERT-BiLSTM-CRF algorithm has a precision of 84.52%, a recall of 92.35%, and an F1 value of 88.26% in named entity recognition from the AEC domain database. The construction safety knowledge representation model and safety incident knowledge graph realize knowledge visualization.

Originality/value

The proposed framework provides a new knowledge management approach to improve the safety management of practitioners and also enriches the application scenarios of knowledge graphs. On the one hand, it innovatively proposes a data application method and a knowledge management method for safety accident reports that integrate entity relationships and matter evolution logic. On the other hand, the legal adjudication dimension is innovatively added to the knowledge graph in the construction safety field as the basis for post-incident disposal measures, which provides a reference for safety managers' decision-making in all aspects.

Details

Engineering, Construction and Architectural Management, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 0969-9988

Keywords

Article
Publication date: 13 December 2022

Chengxi Yan, Xuemei Tang, Hao Yang and Jun Wang

The majority of existing studies about named entity recognition (NER) concentrate on the prediction enhancement of deep neural network (DNN)-based models themselves, but the…

Abstract

Purpose

The majority of existing studies on named entity recognition (NER) concentrate on enhancing the prediction of deep neural network (DNN)-based models themselves, but the issues of training-corpus scarcity and annotation quality control are not fully solved, especially for Chinese ancient corpora. Therefore, designing a new integrated solution for Chinese historical NER, including automatic entity extraction and man-machine cooperative annotation, is quite valuable for improving the effectiveness of Chinese historical NER and fostering the development of low-resource information extraction.

Design/methodology/approach

The research provides a systematic approach for Chinese historical NER with a three-stage framework. In addition to the stage of basic preprocessing, the authors create, retrain and yield a high-performance NER model using only limited labeled resources during the stage of augmented deep active learning (ADAL), which entails three steps: DNN-based NER modeling, hybrid pool-based sampling (HPS) based on active learning (AL) and NER-oriented data augmentation (DA). ADAL is thought to have the capacity to keep the performance of the DNN as high as possible under the few-shot constraint. Then, to realize machine-aided quality control in crowdsourcing settings, the authors design a stage of globally optimized automatic label consolidation (GALC). The core of GALC is a newly designed label consolidation model called simulated annealing-based automatic label aggregation (“SA-ALC”), which incorporates the factors of worker reliability and global label estimation. The model can assure the annotation quality of data from a crowdsourcing annotation system.
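
As a simplified illustration of the pool-based sampling idea in ADAL, the snippet below selects the sentences for which the current model's predictions are least confident; it is a generic uncertainty-sampling sketch, not the authors' hybrid HPS strategy, and the probabilities are toy values.

```python
# Generic pool-based uncertainty sampling (simplified stand-in for the authors' HPS).
import numpy as np

def select_for_annotation(token_probs: list[np.ndarray], budget: int) -> list[int]:
    """token_probs[i] has shape (n_tokens, n_tags): per-token tag probabilities
    predicted by the current NER model for unlabeled sentence i."""
    # Sentence-level confidence = mean of the most probable tag's probability per token.
    confidences = [probs.max(axis=1).mean() for probs in token_probs]
    # Pick the `budget` least-confident sentences for human annotation.
    return list(np.argsort(confidences)[:budget])

# Example with two toy sentences (three tags each):
pool = [np.array([[0.9, 0.05, 0.05], [0.8, 0.1, 0.1]]),
        np.array([[0.4, 0.35, 0.25], [0.5, 0.3, 0.2]])]
print(select_for_annotation(pool, budget=1))   # -> [1], the more uncertain sentence
```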

Findings

Extensive experiments on two types of Chinese classical historical datasets show that the authors’ solution can effectively reduce the corpus dependency of a DNN-based NER model and alleviate the problem of label quality. Moreover, the results also show the superior performance of the authors’ pipeline approaches (i.e. HPS + DA and SA-ALC) compared to equivalent baselines in each stage.

Originality/value

The study sheds new light on the automatic extraction of Chinese historical entities through an all-technological-process integration. The solution helps to effectively reduce the annotation cost and control the labeling quality for the NER task. It can be further applied to similar information extraction tasks and other low-resource fields in both theoretical and practical ways.

Details

Aslib Journal of Information Management, vol. 75 no. 3
Type: Research Article
ISSN: 2050-3806

Keywords

Article
Publication date: 10 March 2022

Jayaram Boga and Dhilip Kumar V.

For achieving the profitable human activity recognition (HAR) method, this paper solves the HAR problem under wireless body area network (WBAN) using a developed ensemble learning…


Abstract

Purpose

The purpose of this study is to solve the human activity recognition (HAR) problem under a wireless body area network (WBAN) using a developed ensemble learning approach, so as to achieve a profitable HAR method. Three data sets are used for HAR in WBAN, namely, human activity recognition using smartphones, wireless sensor data mining and Kaggle. The proposed model undergoes four phases, namely, “pre-processing, feature extraction, feature selection and classification.” Here, the data are preprocessed by artifact removal and median filtering techniques. Then, features are extracted by techniques such as “t-distributed stochastic neighbor embedding,” “short-time Fourier transform” and statistical approaches. Weighted optimal feature selection is the next step, selecting the important features based on the data variance of each class. This new feature selection is achieved by the hybrid coyote Jaya optimization (HCJO) algorithm. Finally, a meta-heuristic-based ensemble learning approach is used as a new recognition approach with three classifiers, namely, “support vector machine (SVM), deep neural network (DNN) and fuzzy classifiers.” Experimental analysis is performed.
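
To sketch the final ensemble-classification phase in code, the example below combines an SVM and a feed-forward network with soft voting in scikit-learn; the fuzzy classifier and the HCJO-based parameter tuning from the paper are omitted, and the feature matrix is a random placeholder.

```python
# Simplified ensemble-classification sketch (SVM + neural network with soft voting).
# The fuzzy classifier and HCJO-tuned parameters from the paper are not reproduced here.
import numpy as np
from sklearn.ensemble import VotingClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 12))            # placeholder feature vectors (e.g. t-SNE/STFT features)
y = rng.integers(0, 4, size=200)          # placeholder activity labels

ensemble = VotingClassifier(
    estimators=[("svm", SVC(probability=True)),
                ("dnn", MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=500))],
    voting="soft",                        # average predicted class probabilities
)
ensemble.fit(X, y)
print(ensemble.predict(X[:5]))
```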

Design/methodology/approach

The proposed HCJO algorithm was developed for optimizing the membership function of fuzzy, iteration limit of SVM and hidden neuron count of DNN for getting superior classified outcomes and to enhance the performance of ensemble classification.

Findings

The accuracy of the enhanced HAR model was considerably higher than that of conventional models: 6.66% higher than the fuzzy classifier, 4.34% higher than DNN, 4.34% higher than SVM, 7.86% higher than the ensemble and 6.66% higher than the Improved Sea Lion optimization algorithm-Attention Pyramid-Convolutional Neural Network (AP-CNN).

Originality/value

The suggested HAR model for WBAN using the HCJO algorithm is accurate and improves the effectiveness of recognition.

Details

International Journal of Pervasive Computing and Communications, vol. 19 no. 4
Type: Research Article
ISSN: 1742-7371

Keywords

Article
Publication date: 8 June 2022

Guo Chen, Jiabin Peng, Tianxiang Xu and Lu Xiao

“Problem-solving” is the most crucial insight of scientific research. This study focuses on constructing the “problem-solving” knowledge graph of scientific domains by…

Abstract

Purpose

“Problem-solving” is the most crucial insight of scientific research. This study focuses on constructing the “problem-solving” knowledge graph of scientific domains by extracting four entity relation types: problem-solving, problem hierarchy, solution hierarchy and association.

Design/methodology/approach

This paper presents a low-cost method for identifying these relationships in scientific papers based on word analogy. The problem-solving and hierarchical relations are represented as offset vectors of the head and tail entities and then classified by referencing a small set of predefined entity relations.
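
A bare-bones version of the offset-vector idea appears below: each candidate entity pair is represented by the difference of its head and tail embeddings and labelled by its nearest seed relation; the embeddings and seed offsets are invented placeholders, not the paper's data.

```python
# Offset-vector relation classification sketch (placeholder embeddings and seed offsets).
import numpy as np

def classify_relation(head_vec: np.ndarray, tail_vec: np.ndarray,
                      seed_offsets: dict[str, np.ndarray]) -> str:
    """Label a (head, tail) pair by cosine similarity of its offset to seed relation offsets."""
    offset = tail_vec - head_vec
    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    return max(seed_offsets, key=lambda rel: cos(offset, seed_offsets[rel]))

# Seed offsets would normally be averaged over a small set of predefined entity-relation pairs.
seeds = {"problem-solving": np.array([1.0, 0.0]), "problem hierarchy": np.array([0.0, 1.0])}
print(classify_relation(np.array([0.2, 0.1]), np.array([1.1, 0.2]), seeds))  # -> "problem-solving"
```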

Findings

This paper presents an experiment with artificial intelligence papers from the Web of Science that achieved good performance: the F1 scores for the entity relation types problem hierarchy, problem-solving and solution hierarchy were 0.823, 0.815 and 0.748, respectively. This paper used computer vision as an example to demonstrate the application of the extracted relations in constructing domain knowledge graphs and revealing historical research trends.

Originality/value

This paper uses an approach that is highly efficient and has a good generalization ability. Instead of relying on a large-scale manually annotated corpus, it only requires a small set of entity relations that can be easily extracted from external knowledge resources.

Details

Aslib Journal of Information Management, vol. 75 no. 3
Type: Research Article
ISSN: 2050-3806

Keywords

Article
Publication date: 25 April 2024

Aasif Ahmad Mir, Nina Smirnova, Ramalingam Jeyshankar and Phillip Mayr

This study aims to highlight the growth and development of Indo-German collaborative research over the past three decades. Moreover, this study encompasses an in-depth examination…

Abstract

Purpose

This study aims to highlight the growth and development of Indo-German collaborative research over the past three decades. Moreover, this study encompasses an in-depth examination of funding acknowledgements to gain valuable insights into the financial support that underpins these collaborative endeavours. Together with this paper, the authors provide an openly accessible data set of Indo-German research papers for further and reproducible research activities (the “Indo-German Literature Dataset”).

Design/methodology/approach

The data were retrieved from the Web of Science (WoS) database from 1990 until 30 November 2022. A total of 36,999 records were retrieved for the query used. Acknowledged entities were extracted using a named entity recognition (NER) model specifically trained for this task. The analysis examined interrelations between the extracted entities and scientific domains, lengths of acknowledgement texts, numbers of authors and affiliations, numbers of citations and the gender of the first author, as well as collaboration patterns between Indian and German funders.
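
As a rough sketch of the acknowledged-entity extraction step, the snippet below runs a Flair sequence tagger over an acknowledgement sentence; the use of Flair, the model path and the example text are assumptions for illustration, since the authors' tagger was trained specifically for this task.

```python
# Acknowledged-entity extraction sketch with Flair (placeholder model path and example text).
from flair.data import Sentence
from flair.models import SequenceTagger

# Placeholder path: the actual model was trained specifically for acknowledgement entities.
tagger = SequenceTagger.load("path/to/acknowledgement-ner-model.pt")

sentence = Sentence("This work was supported by the Deutsche Forschungsgemeinschaft and DST India.")
tagger.predict(sentence)
for span in sentence.get_spans("ner"):
    print(span)   # prints each span together with its predicted entity label
```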

Findings

The study reveals a consistent and increasing growth in the publication trend over the years. It brings to light that Physics, Chemistry, Materials Science, Astronomy and Astrophysics and Engineering prominently dominate Indo-German collaborative research. The USA, followed by England and France, is the most active collaborator in Indian and German research. Research was funded largely by major German and Indian funding agencies, international corporations and German and American universities. Associations between the first author’s gender and the acknowledged entity were observed. Additionally, relations between entity, entity type and scientific domain were discovered.

Practical implications

The study paves the way for enhanced collaboration, optimized resource utilization and societal advantages by offering a profound comprehension of the intricacies inherent in research partnerships between India and Germany. Implementation of the insights gleaned from this study holds the promise of cultivating a more resilient and influential collaborative research ecosystem between the two nations.

Originality/value

The study offers a deeper understanding of the composition of the Indo-German collaborative research landscape of the past 30 years and its significance in advancing scientific knowledge and fostering international partnerships. Furthermore, the authors provide an open version of the original WoS data set. The Indo-German Literature Dataset consists of 22,844 papers from OpenAlex and is available for related studies such as literature studies and scientometrics.

Details

Global Knowledge, Memory and Communication, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 2514-9342

Keywords

Article
Publication date: 25 January 2023

Ashutosh Kumar and Aakanksha Sharaff

The purpose of this study was to design a multitask learning model so that biomedical entities can be extracted from biomedical texts without any ambiguity.

Abstract

Purpose

The purpose of this study was to design a multitask learning model so that biomedical entities can be extracted from biomedical texts without any ambiguity.

Design/methodology/approach

In the proposed automated bio entity extraction (ABEE) model, a multitask learning model has been introduced as a combination of single-task learning models. The model used Bidirectional Encoder Representations from Transformers (BERT) to train the single-task learning models and then combined their outputs to identify the variety of entities in biomedical text.
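
A simplified sketch of combining single-task model outputs is given below using Hugging Face pipelines; the three checkpoint paths are hypothetical fine-tuned models for gene/protein, chemical and disease entities, not the authors' released weights.

```python
# Sketch: merge outputs of three single-task biomedical NER models (checkpoint paths are placeholders).
from transformers import pipeline

CHECKPOINTS = {
    "GENE_PROTEIN": "path/to/gene-protein-ner",   # hypothetical fine-tuned BERT checkpoints
    "CHEMICAL": "path/to/chemical-ner",
    "DISEASE": "path/to/disease-ner",
}

def extract_bio_entities(text: str) -> list[dict]:
    results = []
    for entity_type, checkpoint in CHECKPOINTS.items():
        ner = pipeline("ner", model=checkpoint, aggregation_strategy="simple")
        for ent in ner(text):
            results.append({"type": entity_type, "text": ent["word"],
                            "start": ent["start"], "end": ent["end"]})
    return results

print(extract_bio_entities("Mutations in BRCA1 increase sensitivity to cisplatin in breast cancer."))
```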

Findings

The proposed ABEE model targeted unique gene/protein, chemical and disease entities in biomedical text. The findings are particularly important for biomedical research such as drug discovery and clinical trials. This research not only reduces researcher effort but also lowers the cost of new drug discoveries and new treatments.

Research limitations/implications

As such, there are no limitations to the model, but the research team plans to test the model with gigabytes of data and to establish a knowledge graph so that researchers can easily estimate the entities of similar groups.

Practical implications

As far as practical implications are concerned, the ABEE model will be helpful in various natural language processing tasks. In information extraction (IE), it plays an important role in biomedical named entity recognition and biomedical relation extraction, and it also supports information retrieval tasks such as literature-based knowledge discovery.

Social implications

During the COVID-19 pandemic, demand for this type of work increased because of the rise in clinical trials at that time. If this type of research had been introduced earlier, it would have reduced the time and effort needed for new drug discoveries in this area.

Originality/value

In this work, a novel multitask learning model was proposed that is capable of extracting biomedical entities from biomedical text without any ambiguity. The proposed model achieved state-of-the-art performance in terms of precision, recall and F1 score.

Details

Data Technologies and Applications, vol. 57 no. 2
Type: Research Article
ISSN: 2514-9288

Keywords

Open Access
Article
Publication date: 23 October 2023

Rebecca Maughan and Aideen O'Dochartaigh

This study examines how accounting tools and techniques are used to create and support membership and reporting boundaries for a multi-entity sustainability scheme. It also…


Abstract

Purpose

This study examines how accounting tools and techniques are used to create and support membership and reporting boundaries for a multi-entity sustainability scheme. It also considers whether boundary setting for this initiative helps to connect corporate activity with planetary boundaries and the UN Sustainable Development Goals (SDGs).

Design/methodology/approach

A case study of a national agrifood sustainability scheme was conducted, analysing extensive documentary data and multi-entity sustainability reports. The concept of partial organising is used to frame the analysis.

Findings

Accounting, in the form of planning, verification, target setting, annual review and reporting, can be used to create a membership and a reporting boundary. Accounting tools and techniques support the scheme's standard-setting and monitoring elements. The study demonstrates that the scheme offers innovation in how sustainability reporting is managed. However, it does not currently provide a cumulative assessment of the effect of the sector's activity on ecological carrying capacity or connect this activity to global sustainability indicators.

Research limitations/implications

Future research can build on this study's insights to further develop our understanding of multi-entity sustainability reporting and accounting's role in organising for sustainability. The authors identify several research avenues including: boundary setting in ecologically significant sectors, integrating global sustainability indicators at sectoral and organisational levels, sustainability controls in multi-entity settings and the potential of multi-entity reporting to provide substantive disclosure.

Originality/value

This paper provides insight into accounting's role in boundary setting for a multi-entity sustainability initiative. It adds to our understanding of the potential of a multi-entity reporting boundary to support connected measurement between corporate activity and global sustainability indicators. It builds on work on partial organising and provides insight into how accounting can support this form of organising for sustainability.

Details

Accounting, Auditing & Accountability Journal, vol. 36 no. 9
Type: Research Article
ISSN: 0951-3574

Keywords
