Search results

1 – 10 of over 2000
Article
Publication date: 8 July 2022

Chuanming Yu, Zhengang Zhang, Lu An and Gang Li

In recent years, knowledge graph completion has gained increasing research focus and shown significant improvements. However, most existing models only use the structures of…

Abstract

Purpose

In recent years, knowledge graph completion has gained increasing research focus and shown significant improvements. However, most existing models use only the structure of knowledge graph triples when obtaining entity and relationship representations, ignoring the integration of entity descriptions with the knowledge graph's network structure. This paper aims to investigate how to leverage both the entity descriptions and the network structure to enhance knowledge graph completion with high generalization ability across different datasets.

Design/methodology/approach

The authors propose an entity-description augmented knowledge graph completion model (EDA-KGC), which incorporates the entity description and network structure. It consists of three modules, i.e. representation initialization, deep interaction and reasoning. The representation initialization module utilizes entity descriptions to obtain the pre-trained representation of entities. The deep interaction module acquires the features of the deep interaction between entities and relationships. The reasoning component performs matrix manipulations with the deep interaction feature vector and entity representation matrix, thus obtaining the probability distribution of target entities. The authors conduct extensive experiments on the FB15K, WN18, FB15K-237 and WN18RR data sets to validate the effectiveness of the proposed model.
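
As a concrete reading of the reasoning module, the step reduces to a matrix product between the deep-interaction feature vector and the entity representation matrix, followed by a softmax over all entities. The following minimal sketch (not the authors' code; names, dimensions and random values are purely illustrative) shows that step in Python:

```python
# Minimal sketch of the reasoning step of an EDA-KGC-style model: a
# deep-interaction feature vector is matched against all entity embeddings
# to yield a probability distribution over candidate target entities.
import numpy as np

rng = np.random.default_rng(0)
num_entities, dim = 1000, 200

# Entity representations; in the paper these are initialised from
# entity descriptions (e.g. with a pre-trained language model).
entity_matrix = rng.normal(size=(num_entities, dim))

# Deep-interaction feature vector for a (head entity, relation) query,
# produced by the model's interaction module.
interaction_vec = rng.normal(size=(dim,))

# Reasoning: score every entity, then normalise with softmax.
scores = entity_matrix @ interaction_vec          # (num_entities,)
probs = np.exp(scores - scores.max())
probs /= probs.sum()

top_k = np.argsort(-probs)[:10]                   # most likely target entities
print(top_k, probs[top_k])
```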

Findings

The experiments demonstrate that the proposed model outperforms both the traditional structure-based knowledge graph completion models and the entity-description-enhanced knowledge graph completion models. The experiments also suggest that the model remains feasible in scenarios such as sparse data, dynamic entities and limited training epochs. The study shows that integrating entity descriptions with the network structure can significantly improve performance on the knowledge graph completion task.

Originality/value

The research provides a useful reference for completing missing information in knowledge graphs and for improving the effectiveness of knowledge graph applications in information retrieval, question answering and other fields.

Details

Aslib Journal of Information Management, vol. 75 no. 3
Type: Research Article
ISSN: 2050-3806


Article
Publication date: 30 June 2023

Ruan Wang, Jun Deng, Xinhui Guan and Yuming He

With the development of data mining technology, diverse and broader domain knowledge can be extracted automatically. However, the research on applying knowledge mapping and data…


Abstract

Purpose

With the development of data mining technology, diverse and broader domain knowledge can be extracted automatically. However, the research on applying knowledge mapping and data visualization techniques to genealogical data is limited. This paper aims to fill this research gap by providing a systematic framework and process guidance for practitioners seeking to uncover hidden knowledge from genealogy.

Design/methodology/approach

Based on a literature review of genealogy's current knowledge reasoning research, the authors constructed an integrated framework for knowledge inference and visualization application using a knowledge graph. Additionally, the authors applied this framework in a case study using “Manchu Clan Genealogy” as the data source.

Findings

The case study shows that the proposed framework can effectively decompose and reconstruct genealogy and demonstrates the process of reasoning over, discovering and visualizing implicit genealogical information on the web. The framework enhances the effective utilization of Manchu genealogy resources by highlighting the intricate relationships among people, places and time entities.

Originality/value

This study proposed a framework for genealogy knowledge reasoning and visual analysis utilizing a knowledge graph, including five dimensions: the target layer, the resource layer, the data layer, the inference layer, and the application layer. It helps to gather the scattered genealogy information and establish a data network with semantic correlations while establishing reasoning rules to enable inference discovery and visualization of hidden relationships.
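
To make the inference layer concrete, one simple form of genealogical reasoning is a rule that derives new relationships from stored triples, for example inferring grandparent links from parent links. The sketch below is a hypothetical illustration with invented names and a single invented rule, not code or data from the study:

```python
# Hypothetical sketch of the kind of rule-based inference an inference
# layer might perform over genealogical triples; data and rule are invented.
triples = {
    ("Aisin", "parent_of", "Boro"),
    ("Boro", "parent_of", "Cering"),
    ("Boro", "parent_of", "Dorgon"),
}

# Rule: parent_of(x, y) and parent_of(y, z)  =>  grandparent_of(x, z)
inferred = {
    (x, "grandparent_of", z)
    for (x, p1, y1) in triples if p1 == "parent_of"
    for (y2, p2, z) in triples if p2 == "parent_of" and y1 == y2
}

print(sorted(inferred))
# [('Aisin', 'grandparent_of', 'Cering'), ('Aisin', 'grandparent_of', 'Dorgon')]
```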

Details

Library Hi Tech, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 0737-8831


Article
Publication date: 12 September 2023

Wenjing Wu, Caifeng Wen, Qi Yuan, Qiulan Chen and Yunzhong Cao

Learning from safety accidents and sharing safety knowledge has become an important part of accident prevention and improving construction safety management. Considering the…

Abstract

Purpose

Learning from safety accidents and sharing safety knowledge have become an important part of accident prevention and of improving construction safety management. Because unstructured data are difficult to reuse in the construction industry, the knowledge they contain is hard to use directly for safety analysis. The purpose of this paper is to explore the construction of a construction safety knowledge representation model and a safety accident knowledge graph through deep learning methods, to extract construction safety knowledge entities with a BERT-BiLSTM-CRF model and to propose a data–knowledge–services data management model.

Design/methodology/approach

The ontology model for knowledge representation of construction safety accidents is constructed by integrating entity relations and logic evolution. Then, a database of safety incidents in the architecture, engineering and construction (AEC) industry is established from the collected construction safety incident reports and related dispute cases. The construction method of the construction safety accident knowledge graph is studied, and the precision of the BERT-BiLSTM-CRF algorithm in information extraction is verified through comparative experiments. Finally, a safety accident report is used as an example to construct the AEC domain construction safety accident knowledge graph (AEC-KG), which provides a visual query knowledge service and verifies the operability of the knowledge management approach.
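
To illustrate the tagging stage of the BERT-BiLSTM-CRF extractor, the sketch below uses random tensors in place of BERT token embeddings and replaces CRF decoding with a simple argmax; the tag count and dimensions are assumptions, not the authors' configuration:

```python
# Illustrative sketch of the BiLSTM tagging stage of a BERT-BiLSTM-CRF
# entity extractor; random tensors stand in for BERT token embeddings,
# and the CRF decoding layer used in the paper is replaced here by a
# simple argmax for brevity.
import torch
import torch.nn as nn

num_tags, bert_dim, hidden = 7, 768, 128   # e.g. BIO tags for 3 entity types

class BiLSTMTagger(nn.Module):
    def __init__(self):
        super().__init__()
        self.lstm = nn.LSTM(bert_dim, hidden, batch_first=True, bidirectional=True)
        self.emissions = nn.Linear(2 * hidden, num_tags)

    def forward(self, token_embeddings):
        out, _ = self.lstm(token_embeddings)       # (batch, seq, 2*hidden)
        return self.emissions(out)                 # (batch, seq, num_tags)

tokens = torch.randn(1, 32, bert_dim)              # stand-in for BERT output
tagger = BiLSTMTagger()
emission_scores = tagger(tokens)
predicted_tags = emission_scores.argmax(dim=-1)    # a CRF would decode jointly
print(predicted_tags.shape)                        # torch.Size([1, 32])
```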

Findings

The experimental results show that the combined BERT-BiLSTM-CRF algorithm has a precision of 84.52%, a recall of 92.35%, and an F1 value of 88.26% in named entity recognition from the AEC domain database. The construction safety knowledge representation model and safety incident knowledge graph realize knowledge visualization.

Originality/value

The proposed framework provides a new knowledge management approach to improve practitioners' safety management and also enriches the application scenarios of knowledge graphs. On the one hand, it innovatively proposes a data application and knowledge management method for safety accident reports that integrates entity relationships and matter evolution logic. On the other hand, the legal adjudication dimension is innovatively added to the knowledge graph in the construction safety field as the basis for post-incident disposal measures, which provides a reference for safety managers' decision-making in all aspects.

Details

Engineering, Construction and Architectural Management, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 0969-9988


Article
Publication date: 1 November 2021

Maren Parnas Gulnes, Ahmet Soylu and Dumitru Roman

Neuroscience data are spread across a variety of sources, typically provisioned through ad-hoc and non-standard approaches and formats and often have no connection to the related…

Abstract

Purpose

Neuroscience data are spread across a variety of sources, are typically provisioned through ad hoc, non-standard approaches and formats and often have no connection to related data sources. This makes it difficult for researchers to understand, integrate and reuse brain-related data. The aim of this study is to show that a graph-based approach offers an effective means of representing, analysing and accessing brain-related data, which are highly interconnected, evolve over time and are often needed in combination.

Design/methodology/approach

The authors present an approach for organising brain-related data in a graph model. The approach is exemplified in the case of a unique data set of quantitative neuroanatomical data about the murine basal ganglia, a group of nuclei in the brain essential for processing information related to movement. Specifically, the murine basal ganglia data set is modelled as a graph, integrated with relevant data from third-party repositories, published through a web-based user interface and API, and analysed from exploratory and confirmatory perspectives using popular graph algorithms to extract new insights.
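
As a minimal illustration of the graph-based modelling idea (with invented nodes and attributes, not the actual basal ganglia data set), brain-related records can be loaded into a property graph and analysed with standard graph algorithms:

```python
# Toy sketch of representing brain-related records as a graph and running
# standard graph algorithms over it; node names and attributes are invented.
import networkx as nx

G = nx.DiGraph()
G.add_node("striatum", kind="region")
G.add_node("globus_pallidus", kind="region")
G.add_node("neuron_type_A", kind="cell_type", count=1200)
G.add_edge("neuron_type_A", "striatum", relation="located_in")
G.add_edge("striatum", "globus_pallidus", relation="projects_to")

# Exploratory analysis: which nodes are most central in the network?
print(nx.degree_centrality(G))
print(nx.pagerank(G))
```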

Findings

The evaluation of the graph model and the results of the graph data analysis and usability study of the user interface suggest that graph-based data management in the neuroscience domain is a promising approach, since it enables integration of various disparate data sources and improves understanding and usability of data.

Originality/value

The study provides a practical and generic approach for representing, integrating, analysing and provisioning brain-related data and a set of software tools to support the proposed approach.

Details

Data Technologies and Applications, vol. 56 no. 3
Type: Research Article
ISSN: 2514-9288


Article
Publication date: 16 August 2023

Anish Khobragade, Shashikant Ghumbre and Vinod Pachghare

MITRE and the National Security Agency cooperatively developed and maintain the D3FEND knowledge graph (KG). It provides concepts as entities from the cybersecurity…

Abstract

Purpose

MITRE and the National Security Agency cooperatively developed and maintain the D3FEND knowledge graph (KG). It provides concepts from the cybersecurity countermeasure domain, such as dynamic, emulated and file analysis, as entities, and links those entities with relationships such as analyze, may_contains and encrypt. A fundamental challenge for collaborative designers is to encode knowledge and efficiently interrelate the cyber-domain facts generated daily. At present, however, the designers manually update the graph contents with new or missing facts to enrich the knowledge. This paper aims to propose an automated approach that predicts the missing facts through a link prediction task, leveraging embeddings as representation learning.

Design/methodology/approach

D3FEND is available in the resource description framework (RDF) format. In the preprocessing step, the facts in RDF format are converted to subject–predicate–object triples, yielding 5,967 entities and 98 relationship types. Distance-based (translational), bilinear and convolutional embedding models are progressively applied to learn the embeddings of entities and relations. This study then performs a link prediction task to infer missing facts using the learned embeddings.
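
For intuition, a translational (TransE-style) model scores a candidate triple by how close the head embedding plus the relation embedding lies to the tail embedding, and missing tails are ranked by that score. The sketch below uses random embeddings purely for illustration; in the study the embeddings are learned from the D3FEND triples:

```python
# Sketch of translational (TransE-style) link prediction: score(h, r, t) is
# the negative distance between h + r and t, and missing tails are ranked
# by that score. Embeddings here are random stand-ins for learned ones.
import numpy as np

rng = np.random.default_rng(1)
num_entities, num_relations, dim = 5967, 98, 100

ent = rng.normal(size=(num_entities, dim))
rel = rng.normal(size=(num_relations, dim))

def score_tails(h_idx, r_idx):
    """Score every entity as a candidate tail for the query (h, r, ?)."""
    translated = ent[h_idx] + rel[r_idx]
    return -np.linalg.norm(ent - translated, axis=1)   # higher = more plausible

ranking = np.argsort(-score_tails(h_idx=0, r_idx=3))
print(ranking[:10])   # top-10 predicted tail entities for the query
```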

Findings

Experimental results show that the translational model performs well on high-rank results, whereas the bilinear model is superior in capturing the latent semantics of complex relationship types. However, the convolutional model performs best overall, recovering 44% of the true facts and achieving a 3% improvement in results compared to the other models.

Research limitations/implications

Despite the success of embedding models in enriching D3FEND through link prediction under a supervised learning setup, the approach has some limitations, such as not capturing the diversity and hierarchy of relations. The average node degree of the D3FEND KG is 16.85, with 12% of entities having a node degree of less than 2; in particular, many entities or relations have few or no observed links. This results in sparsity and data imbalance, which affect model performance even after increasing the embedding vector size. Moreover, KG embedding models consider only existing entities and relations and may not incorporate external or contextual information such as textual descriptions, temporal dynamics or domain knowledge, which could enhance link prediction performance.

Practical implications

Link prediction in the D3FEND KG can benefit cybersecurity countermeasure strategies in several ways: it can help to identify gaps or weaknesses in existing defensive methods and suggest possible ways to improve or augment them; it can help to compare and contrast different defensive methods and understand their trade-offs and synergies; it can help to discover novel or emerging defensive methods by inferring new relations from existing data or external sources; and it can help to generate recommendations or guidance for selecting or deploying appropriate defensive methods based on the characteristics and objectives of the system or network.

Originality/value

The representation learning approach helps reduce incompleteness by using link prediction to infer possible missing facts from the existing entities and relations of D3FEND.

Details

International Journal of Web Information Systems, vol. 19 no. 3/4
Type: Research Article
ISSN: 1744-0084


Article
Publication date: 11 June 2021

Wei Du, Qiang Yan, Wenping Zhang and Jian Ma

Patent trade recommendations necessitate recommendation interpretability in addition to recommendation accuracy because of patent transaction risks and the technological…

Abstract

Purpose

Patent trade recommendations necessitate recommendation interpretability in addition to recommendation accuracy because of patent transaction risks and the technological complexity of patents. This study designs an interpretable knowledge-aware patent recommendation model (IKPRM) for patent trading. IKPRM first creates a patent knowledge graph (PKG) for patent trade recommendations and then leverages paths in the PKG to achieve recommendation interpretability.

Design/methodology/approach

First, we construct a PKG to integrate online company behaviors and patent information using natural language processing techniques. Second, a bidirectional long short-term memory network (BiLSTM) with an attention mechanism is utilized to establish the connecting paths of a company–patent pair in the PKG. Finally, the prediction score of a company–patent pair is calculated by assigning different weights to its connecting paths. The semantic relationships in the connecting paths help explain why a candidate patent is recommended.
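
A rough sketch of this path-based scoring idea is shown below: each connecting path is encoded with a BiLSTM, an attention layer weights the paths, and a sigmoid yields the company–patent match score. All embeddings, sizes and the pooling choice are illustrative assumptions rather than the published IKPRM architecture:

```python
# Illustrative sketch of path-based scoring in the spirit of IKPRM:
# each company-patent connecting path is encoded with a BiLSTM, attention
# weights the paths, and a sigmoid produces the match score.
import torch
import torch.nn as nn

emb_dim, hidden = 64, 64

class PathScorer(nn.Module):
    def __init__(self):
        super().__init__()
        self.lstm = nn.LSTM(emb_dim, hidden, batch_first=True, bidirectional=True)
        self.attn = nn.Linear(2 * hidden, 1)
        self.out = nn.Linear(2 * hidden, 1)

    def forward(self, paths):                      # (num_paths, path_len, emb_dim)
        encoded, _ = self.lstm(paths)
        path_repr = encoded[:, -1, :]               # last state per path
        weights = torch.softmax(self.attn(path_repr), dim=0)   # attention over paths
        pooled = (weights * path_repr).sum(dim=0)   # weighted combination
        return torch.sigmoid(self.out(pooled))      # company-patent match score

paths = torch.randn(5, 8, emb_dim)                  # 5 connecting paths of length 8
print(PathScorer()(paths))
```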

Findings

Experiments on a real dataset from a patent trading platform verify that IKPRM significantly outperforms baseline methods in terms of hit ratio and normalized discounted cumulative gain (nDCG). An online user study further verified the interpretability of the recommendations.

Originality/value

Meta-path-based recommendation can achieve a certain degree of explainability but suffers from low flexibility when reasoning over heterogeneous information. To bridge this gap, we propose IKPRM, which explains recommendations through full paths in the knowledge graph. IKPRM demonstrates good performance and transparency and is a solid foundation for integrating interpretable artificial intelligence into complex tasks such as intelligent recommendations.

Details

Internet Research, vol. 32 no. 2
Type: Research Article
ISSN: 1066-2243


Article
Publication date: 20 July 2023

Elaheh Hosseini, Kimiya Taghizadeh Milani and Mohammad Shaker Sabetnasab

This research aimed to visualize and analyze the co-word network and thematic clusters of the intellectual structure in the field of linked data during 1900–2021.

Abstract

Purpose

This research aimed to visualize and analyze the co-word network and thematic clusters of the intellectual structure in the field of linked data during 1900–2021.

Design/methodology/approach

This applied research employed a descriptive and analytical method, scientometric indicators, co-word techniques, and social network analysis. VOSviewer, SPSS, Python programming, and UCINet software were used for data analysis and network structure visualization.
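
The core co-word step can be illustrated as counting how often keyword pairs co-occur on the same record and turning the counts into a weighted network; the keyword lists below are invented examples, whereas the study applied VOSviewer, SPSS, Python and UCINet to Web of Science records:

```python
# Minimal sketch of the co-word step: count how often keyword pairs
# co-occur on the same record and build a co-word network from the counts.
from collections import Counter
from itertools import combinations
import networkx as nx

records = [
    ["linked data", "ontology", "semantic web"],
    ["ontology", "semantic web", "knowledge representation"],
    ["linked data", "ontology"],
]

pair_counts = Counter()
for keywords in records:
    for a, b in combinations(sorted(set(keywords)), 2):
        pair_counts[(a, b)] += 1

G = nx.Graph()
for (a, b), weight in pair_counts.items():
    G.add_edge(a, b, weight=weight)

print(pair_counts.most_common(3))   # most frequent co-word pairs
```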

Findings

The top ranks of the Web of Science (WOS) subject categorization belonged to various fields of computer science. In addition, the USA was the most prolific country. The keyword "ontology" had the highest frequency of co-occurrence, and "ontology" and "semantic" formed the most frequent co-word pair. In terms of the network structure, nine major topic clusters were identified based on co-occurrence, and 29 thematic clusters were identified based on hierarchical clustering. Comparing the two clustering techniques indicated that three clusters, namely semantic bioinformatics, knowledge representation and semantic tools, were common to both. The most mature and mainstream thematic clusters were natural language processing techniques to boost modeling and visualization, context-aware knowledge discovery, probabilistic latent semantic analysis (PLSA), semantic tools, latent semantic indexing, web ontology language (OWL) syntax, and ontology-based deep learning.

Originality/value

This study adopted various techniques, such as co-word analysis, social network analysis, network structure visualization and hierarchical clustering, to provide a suitable, visual, methodical and comprehensive perspective on linked data.

Details

Library Hi Tech, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 0737-8831


Article
Publication date: 14 May 2018

Daifeng Li, Andrew Madden, Chaochun Liu, Ying Ding, Liwei Qian and Enguo Zhou

Internet technology allows millions of people to find high quality medical resources online, with the result that personal healthcare and medical services have become one of the…

Abstract

Purpose

Internet technology allows millions of people to find high quality medical resources online, with the result that personal healthcare and medical services have become one of the fastest growing markets in China. Data relating to healthcare search behavior may provide insights that could lead to better provision of healthcare services. However, discrepancies often arise between terminologies derived from professional medical domain knowledge and the more colloquial terms that users adopt when searching for information about ailments. This can make it difficult to match healthcare queries with doctors’ keywords in online medical searches. The paper aims to discuss these issues.

Design/methodology/approach

To help address this problem, the authors propose transfer learning using a latent factor graph (TLLFG), which can learn the descriptions of ailments used in internet searches and match them to the most appropriate formal medical keywords.
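
The sketch below is not the TLLFG model; it only illustrates the underlying matching task of placing lay symptom descriptions and formal medical keywords in a shared latent space and ranking keywords by similarity. The vectors are hand-made for illustration, whereas TLLFG learns the shared layer from Q&A data and medical domain knowledge:

```python
# Toy illustration of matching lay symptom descriptions to formal medical
# keywords in a shared latent space; the 3-dimensional vectors are
# hand-crafted (dimensions: head-related, gut-related, cardiovascular).
import numpy as np

lay_terms = {
    "pounding headache with nausea": np.array([0.9, 0.3, 0.0]),
    "stomach cramps after eating":   np.array([0.1, 0.9, 0.0]),
}
medical_keywords = {
    "migraine":        np.array([1.0, 0.2, 0.0]),
    "gastroenteritis": np.array([0.0, 1.0, 0.1]),
    "hypertension":    np.array([0.1, 0.0, 1.0]),
}

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Rank medical keywords by similarity to each lay description.
for phrase, v in lay_terms.items():
    best = max(medical_keywords, key=lambda k: cosine(v, medical_keywords[k]))
    print(f"{phrase!r} -> {best}")
```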

Findings

Experiments show that the TLLFG outperforms competing algorithms by incorporating both medical domain knowledge and patient-doctor Q&A data from online services into a unified latent layer that bridges the gap between lay enquiries and professionally expressed information sources, enabling more accurate analysis of online users' symptom descriptions. The authors conclude with a brief discussion of some of the ways in which the model may support online applications and connect with offline medical services.

Practical implications

The authors used an online medical search application to verify the proposed model. The model can bridge users' long-tail descriptions with doctors' formal medical keywords. Online experiments show that TLLFG can significantly improve the search experience of both users and medical service providers compared with traditional machine learning methods. The research provides a helpful example of the use of domain knowledge to optimize search and recommendation experiences.

Originality/value

The authors use transfer learning to map online users’ long-tail queries onto medical domain knowledge, significantly improving the relevance of queries and keywords in a search system reliant on sponsored links.

Details

Industrial Management & Data Systems, vol. 118 no. 4
Type: Research Article
ISSN: 0263-5577


Article
Publication date: 28 April 2020

Siham Eddamiri, Asmaa Benghabrit and Elmoukhtar Zemmouri

The purpose of this paper is to present a generic pipeline for Resource Description Framework (RDF) graph mining to provide a comprehensive review of each step in the knowledge…

Abstract

Purpose

The purpose of this paper is to present a generic pipeline for Resource Description Framework (RDF) graph mining, providing a comprehensive overview of each step in the knowledge discovery from data (KDD) process. The authors also investigate different approaches and combinations for extracting feature vectors from RDF graphs for the clustering and theme identification tasks.

Design/methodology/approach

The proposed methodology comprises four steps. First, the authors generate several graph substructures (Walks, Set of Walks, Walks with backward and Set of Walks with backward). Second, the authors build neural language models to extract numerical vectors of the generated sequences by using word embedding techniques (Word2Vec and Doc2Vec) combined with term frequency-inverse document frequency (TF-IDF). Third, the authors use the well-known K-means algorithm to cluster the RDF graph. Finally, the authors extract the most relevant rdf:type from the grouped vertices to describe the semantics of each theme by generating the labels.
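
A toy sketch of the pipeline's overall flavour is given below: random walks are generated over a small invented graph, each vertex's walks are treated as a document, the documents are vectorised with TF-IDF (standing in for the Word2Vec/Doc2Vec plus TF-IDF combination used in the paper) and clustered with K-means:

```python
# Toy sketch: generate walks over a small invented RDF-like graph, turn each
# vertex's walks into a "document", vectorise with TF-IDF and cluster with
# K-means; TF-IDF alone stands in for the paper's embedding combinations.
import random
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

graph = {                      # adjacency list of a tiny RDF-like graph
    "personA": ["knows_personB", "type_Person"],
    "personB": ["knows_personA", "type_Person"],
    "paper1": ["author_personA", "type_Paper"],
    "paper2": ["author_personB", "type_Paper"],
}

random.seed(0)

def walks(start, num=10, depth=3):
    out = []
    for _ in range(num):
        node, seq = start, [start]
        for _ in range(depth):
            step = random.choice(graph.get(node, [node]))
            seq.append(step)
            node = step.split("_", 1)[-1]          # hop to the target vertex
        out.append(" ".join(seq))
    return " ".join(out)

vertices = list(graph)
docs = [walks(v) for v in vertices]                 # one walk "document" per vertex
X = TfidfVectorizer().fit_transform(docs)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
print(dict(zip(vertices, labels)))
```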

Findings

The experimental evaluation on state-of-the-art data sets (AIFB, BGS and Conference) shows that combining Set of Walks with backward with TF-IDF and Doc2Vec gives excellent results. In fact, the clustering results reach more than 97% and 90% in terms of purity and F-measure, respectively. Concerning theme identification, the results show that with the same combination, the purity and F-measure criteria reach more than 90% for all the considered data sets.

Originality/value

The originality of this paper lies in two aspects: first, a new machine learning pipeline for RDF data is presented; second, an efficient process to identify and extract relevant graph substructures from an RDF graph is proposed. The proposed techniques were combined with different neural language models to improve the accuracy and relevance of the obtained feature vectors that will be fed to the clustering mechanism.

Details

International Journal of Web Information Systems, vol. 16 no. 2
Type: Research Article
ISSN: 1744-0084

