Search results

1 – 10 of 14
Open Access
Article
Publication date: 5 April 2024

Miquel Centelles and Núria Ferran-Ferrer

Abstract

Purpose

This study develops a comprehensive framework for assessing knowledge organization systems (KOSs), including the taxonomy of Wikipedia and the ontologies of Wikidata, with a specific focus on enhancing management and retrieval from a non-binary gender perspective.

Design/methodology/approach

This study employs heuristic and inspection methods to assess Wikipedia’s KOS, checking its compliance with international standards. It evaluates the efficiency of retrieving non-masculine gender-related articles through the Catalan Wikipedia category scheme and identifies its limitations. Additionally, a novel assessment of Wikidata ontologies examines their structure and their coverage of gender-related properties, comparing them with Wikipedia’s taxonomy to identify advantages and possible enhancements.
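
As an illustration of the kind of ontology inspection described above, the following Python sketch queries the public Wikidata SPARQL endpoint for the most frequent values of the sex-or-gender property (P21). It is a minimal sketch of the general technique, not the authors’ evaluation instrument; the query shape and the User-Agent string are our own.

# Minimal sketch: inspect the value distribution of Wikidata's
# sex-or-gender property (P21) via the public SPARQL endpoint.
# Illustrative only; aggregating over all of Wikidata may time out,
# in which case the pattern can be restricted to a smaller class.
import requests

ENDPOINT = "https://query.wikidata.org/sparql"
QUERY = """
SELECT ?gender ?genderLabel (COUNT(?person) AS ?count) WHERE {
  ?person wdt:P21 ?gender .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
GROUP BY ?gender ?genderLabel
ORDER BY DESC(?count)
LIMIT 20
"""

response = requests.get(
    ENDPOINT,
    params={"query": QUERY, "format": "json"},
    headers={"User-Agent": "kos-evaluation-sketch/0.1"},
)
for row in response.json()["results"]["bindings"]:
    print(row["genderLabel"]["value"], row["count"]["value"])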

Findings

This study evaluates Wikipedia’s taxonomy and Wikidata’s ontologies, establishing evaluation criteria for gender-based categorization and exploring their structural effectiveness. The evaluation process suggests that Wikidata ontologies may offer a viable solution to address Wikipedia’s categorization challenges.

Originality/value

The assessment of Wikipedia categories (taxonomy) based on KOS standards leads to the conclusion that there is ample room for improvement, not only in matters concerning gender identity but also in the overall KOS to enhance search and retrieval for users. These findings bear relevance for the design of tools to support information retrieval on knowledge-rich websites, as they assist users in exploring topics and concepts.

Article
Publication date: 3 October 2023

Haklae Kim

Abstract

Purpose

Despite ongoing research into archival metadata standards, digital archives are unable to effectively represent records in their appropriate contexts. This study aims to propose a knowledge graph that depicts the diverse relationships between heterogeneous digital archive entities.

Design/methodology/approach

This study introduces and describes a method for applying knowledge graphs to digital archives in a step-by-step manner. It examines archival metadata standards, such as Records in Context Ontology (RiC-O), for characterising digital records; explains the process of data refinement, enrichment and reconciliation with examples; and demonstrates the use of knowledge graphs constructed using semantic queries.
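
To make the RiC-O step concrete, here is a minimal rdflib sketch describing one record. The rico:Record, rico:Agent, rico:title and rico:hasCreator terms come from the published ontology (and should be checked against the current release); the archive URIs are hypothetical placeholders, not identifiers from 97imf.kr.

# Minimal sketch: describing a digital record with RiC-O in rdflib.
# Ontology terms follow the published RiC-O vocabulary; the base URI
# and resource paths below are hypothetical placeholders.
from rdflib import Graph, Literal, Namespace, RDF

RICO = Namespace("https://www.ica.org/standards/RiC/ontology#")
EX = Namespace("https://example.org/archive/")  # placeholder base URI

g = Graph()
g.bind("rico", RICO)

record = EX["record/0001"]
creator = EX["agent/press-office"]

g.add((record, RDF.type, RICO.Record))
g.add((record, RICO.title, Literal("Press briefing transcript, 1997")))
g.add((record, RICO.hasCreator, creator))
g.add((creator, RDF.type, RICO.Agent))

print(g.serialize(format="turtle"))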

Findings

This study represented the 97imf.kr archive as a knowledge graph, enabling meaningful exploration of relationships within the archive’s records. This approach facilitated comprehensive descriptions of records across the different record entities. Applying archival ontologies together with general-purpose vocabularies to digital records is advised to enhance metadata coherence and semantic search.

Originality/value

Most digital archives in service in Korea make limited use of archival metadata standards. The contribution of this study is a practical application of knowledge graph technology for linking and exploring digital records. The study details the process of collecting raw archival data, preprocessing and enriching it, and demonstrates how to build a knowledge graph connected to external data. In particular, the knowledge graph built from the RiC-O, Wikidata and Schema.org vocabularies, and the semantic queries it supports, can supplement keyword search in conventional digital archives.

Details

The Electronic Library, vol. 42 no. 1
Type: Research Article
ISSN: 0264-0473

Article
Publication date: 9 November 2023

Gustavo Candela, Nele Gabriëls, Sally Chambers, Milena Dobreva, Sarah Ames, Meghan Ferriter, Neil Fitzgerald, Victor Harbo, Katrine Hofmann, Olga Holownia, Alba Irollo, Mahendra Mahey, Eileen Manchester, Thuy-An Pham, Abigail Potter and Ellen Van Keer

Abstract

Purpose

The purpose of this study is to offer a checklist that can be used both for creating and for evaluating digital collections suitable for computational use, collections that are also sometimes referred to as data sets within the collections as data movement.

Design/methodology/approach

The checklist was built by synthesising and analysing the results of relevant research literature, articles and studies, together with the issues and needs identified in an observational study. The checklist was then tested and applied, both as a tool for assessing a selection of digital collections made available by galleries, libraries, archives and museums (GLAM) institutions as a proof of concept and as a supporting tool for creating collections as data.
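
A checklist of this kind lends itself to a very simple programmatic assessment. The sketch below is ours, with generic criteria standing in for the article’s actual checklist items.

# Minimal sketch: scoring a digital collection against a publication
# checklist. The criteria are generic illustrations, not the checklist
# items defined in the article.
CHECKLIST = [
    "open licence stated",
    "machine-readable format provided",
    "provenance documented",
    "suggested citation provided",
    "terms of computational use described",
]

def assess(collection: str, satisfied: set[str]) -> float:
    """Return the share of checklist criteria the collection meets."""
    met = sum(1 for criterion in CHECKLIST if criterion in satisfied)
    print(f"{collection}: {met}/{len(CHECKLIST)} criteria met")
    return met / len(CHECKLIST)

assess("Example GLAM collection",
       {"open licence stated", "machine-readable format provided"})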

Findings

Over the past few years, there has been growing interest in making digital collections published by GLAM organisations available for computational use. Building on previous work, the authors defined a methodology for constructing a checklist for the publication of collections as data. The authors’ evaluation yielded several example applications that may encourage other institutions to publish their digital collections for computational use.

Originality/value

While some work exists on making digital collections available for computational use, with particular attention to data quality, planning and experimentation, to the best of the authors’ knowledge none of it provides an easy-to-follow and robust checklist for publishing collection data sets in GLAM institutions. This checklist is intended to encourage small- and medium-sized institutions to adopt collections as data principles in their daily workflows, following best practices and guidelines.

Details

Global Knowledge, Memory and Communication, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 2514-9342

Article
Publication date: 3 February 2023

Huyen Nguyen, Haihua Chen, Jiangping Chen, Kate Kargozari and Junhua Ding

Abstract

Purpose

This study aims to evaluate a method of building a biomedical knowledge graph (KG).

Design/methodology/approach

This research first constructs a COVID-19 KG from the COVID-19 Open Research Dataset, covering information across six categories (i.e. disease, drug, gene, species, therapy and symptom). The construction used open-source tools to extract entities, relations and triples. The COVID-19 KG is then evaluated on three data-quality dimensions (correctness, relatedness and comprehensiveness) using a semiautomatic approach. Finally, this study assesses the applicability of the KG by building a question answering (Q&A) system. Five queries regarding COVID-19 genomes, symptoms, transmission and therapeutics were submitted to the system, and the results were analyzed.
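
The article identifies its extraction tools only as open source, so the following Python sketch illustrates the general technique with spaCy, using sentence-level co-occurrence as a crude stand-in for typed relation extraction.

# Minimal sketch: entity and triple extraction from text. Requires
# spaCy and its small English model (python -m spacy download
# en_core_web_sm); a biomedical model would be a better fit. Sentence
# co-occurrence is a crude proxy for a typed relation.
import spacy

nlp = spacy.load("en_core_web_sm")

def cooccurrence_triples(text: str):
    """Yield (head, 'co-occurs_with', tail) per sentence."""
    doc = nlp(text)
    for sent in doc.sents:
        ents = [e.text for e in sent.ents]
        for i, head in enumerate(ents):
            for tail in ents[i + 1:]:
                yield (head, "co-occurs_with", tail)

sample = "Remdesivir was evaluated as a therapy for SARS-CoV-2 infection."
for triple in cooccurrence_triples(sample):
    print(triple)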

Findings

With current extraction tools, the quality of the KG is moderate and difficult to improve unless more effort is invested in the tools for entity extraction, relation extraction and other steps. This study finds that comprehensiveness and relatedness correlate positively with data size. Furthermore, the results indicate that Q&A systems built on larger-scale KGs perform better than those built on smaller ones for most queries, demonstrating the importance of relatedness and comprehensiveness for a useful KG.

Originality/value

The KG construction process, data-quality-based and application-based evaluations discussed in this paper provide valuable references for KG researchers and practitioners to build high-quality domain-specific knowledge discovery systems.

Details

Information Discovery and Delivery, vol. 51 no. 4
Type: Research Article
ISSN: 2398-6247

Open Access
Article
Publication date: 30 March 2023

Sofia Baroncini, Bruno Sartini, Marieke Van Erp, Francesca Tomasi and Aldo Gangemi

Abstract

Purpose

In the last few years, the amount of Linked Open Data (LOD) describing artworks, whether in general-purpose or domain-specific Knowledge Graphs (KGs), has gradually increased. This provides (art-)historians and Cultural Heritage professionals with a wealth of information to explore. Specifically, structured data about iconographical and iconological (icon) aspects, i.e. information about the subjects, concepts and meanings of artworks, is extremely valuable for state-of-the-art computational tools, e.g. content recognition through computer vision. Nevertheless, a data quality evaluation for art domains, fundamental for data reuse, is still missing. The purpose of this study is to fill this gap with an overview of art-historical data quality in current KGs, with a focus on icon aspects.

Design/methodology/approach

This study’s analyses are based on established KG evaluation methodologies, adapted to the domain by addressing requirements from art historians’ theories. The authors first select several KGs according to Semantic Web principles. Then, the authors evaluate (1) their structures’ suitability to describe icon information through quantitative and qualitative assessment and (2) their content, qualitatively assessed in terms of correctness and completeness.
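
As a simplified illustration of the completeness dimension, the sketch below computes the share of artworks carrying at least one iconographical subject statement; this single ratio is our reduction of the authors’ qualitative assessment, and the sample records are invented.

# Minimal sketch: content completeness as the share of artworks with
# at least one iconographical subject statement. The records are
# invented; the single ratio simplifies the article's qualitative
# assessment.
artworks = {
    "artwork/1": {"subjects": ["Annunciation"]},
    "artwork/2": {"subjects": []},  # no icon statement recorded
    "artwork/3": {"subjects": ["Saint Jerome", "lion"]},
}

with_subjects = sum(1 for a in artworks.values() if a["subjects"])
completeness = with_subjects / len(artworks)
print(f"Iconographical completeness: {completeness:.0%}")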

Findings

This study’s results reveal several issues on the current expression of icon information in KGs. The content evaluation shows that these domain-specific statements are generally correct but often not complete. The incompleteness is confirmed by the structure evaluation, which highlights the unsuitability of the KG schemas to describe icon information with the required granularity.

Originality/value

The main contribution of this work is an overview of the actual landscape of the icon information expressed in LOD. Therefore, it is valuable to cultural institutions by providing them a first domain-specific data quality evaluation. Since this study’s results suggest that the selected domain information is underrepresented in Semantic Web datasets, the authors highlight the need for the creation and fostering of such information to provide a more thorough art-historical dimension to LOD.

Details

Journal of Documentation, vol. 79 no. 7
Type: Research Article
ISSN: 0022-0418

Article
Publication date: 13 October 2023

Judit Gárdos, Julia Egyed-Gergely, Anna Horváth, Balázs Pataki, Roza Vajda and András Micsik

Abstract

Purpose

The present study is about generating metadata to enhance thematic transparency and facilitate research on interview collections at the Research Documentation Centre, Centre for Social Sciences (TK KDK) in Budapest. It explores the use of artificial intelligence (AI) in producing, managing and processing social science data and its potential to generate useful metadata to describe the contents of such archives on a large scale.

Design/methodology/approach

The authors combined manual and automated/semi-automated methods of metadata development and curation. The authors developed a suitable domain-oriented taxonomy to classify a large text corpus of semi-structured interviews. To this end, the authors adapted the European Language Social Science Thesaurus (ELSST) to produce a concise, hierarchical structure of topics relevant in social sciences. The authors identified and tested the most promising natural language processing (NLP) tools supporting the Hungarian language. The results of manual and machine coding will be presented in a user interface.
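
The multi-label assignment problem described here can be sketched with scikit-learn; the toy English texts and labels below stand in for the Hungarian interview corpus and the ELSST-derived taxonomy, and the pipeline is a generic baseline rather than the tools the authors tested.

# Minimal sketch: multi-label topic assignment with a generic
# TF-IDF + one-vs-rest baseline. Toy English data stands in for the
# Hungarian interviews and the ELSST-derived labels.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MultiLabelBinarizer

texts = [
    "interview about workplace conditions and wages",
    "memories of rural migration to the capital",
    "education and family life in the 1960s",
    "trade union activity and workplace disputes",
]
labels = [["labour"], ["migration"], ["education", "family"], ["labour"]]

mlb = MultiLabelBinarizer()
y = mlb.fit_transform(labels)

model = make_pipeline(
    TfidfVectorizer(),
    OneVsRestClassifier(LogisticRegression(max_iter=1000)),
)
model.fit(texts, y)

pred = model.predict(["leaving the village for factory work"])
print(mlb.inverse_transform(pred))  # relevant labels, possibly none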

Findings

The study describes how an international social scientific taxonomy can be adapted to a specific local setting and tailored to be used by automated NLP tools. The authors show the potential and limitations of existing and new NLP methods for thematic assignment. The current possibilities of multi-label classification in social scientific metadata assignment are discussed, i.e. the problem of automated selection of relevant labels from a large pool.

Originality/value

Interview materials have not yet been used for building manually annotated training datasets for automated indexing of scientifically relevant topics in a data repository. Comparing various automated-indexing methods, this study shows a possible implementation of a researcher tool supporting custom visualizations and the faceted search of interview collections.

Article
Publication date: 18 March 2024

Shiv Shakti Ghosh and Sunil Kumar Chatterjee

Abstract

Purpose

This study presents a review-based research framework intended to guide memory institutions in digital storytelling projects built on digitized ancient travel records. It also aims to inform research and policymaking related to the design and delivery of services based on memory institutions’ collections of historical records.

Design/methodology/approach

The research framework was synthesized from a review of existing studies in the domain, accompanied by a short survey collecting the opinions of selected experts. The review focused primarily on studies demonstrating the use of semantic web technologies and on those with the potential to inform policymaking related to digital storytelling.

Findings

The core tasks behind digital storytelling vary with project goals, so a two-part framework is proposed that separately covers generic fundamental tasks with diverse applicability and tasks specific to digital storytelling. The review also found that studies demonstrating the use of travel records for digital storytelling were fewer than studies using digital storytelling for tourism in general.

Originality/value

The research framework can guide memory institutions in exposing their travel-related holdings to a wider audience through semantic web technologies and opens up avenues for future empirical research, adding to the novelty of the presented work. Moreover, while reviews of digital storytelling and digital humanities in general exist, reviews of digital storytelling initiatives focusing specifically on tourism and travel literature are scarce.

Details

Global Knowledge, Memory and Communication, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 2514-9342

Article
Publication date: 16 April 2024

Shuyuan Xu, Jun Wang, Xiangyu Wang, Wenchi Shou and Tuan Ngo

Abstract

Purpose

This paper covers the development of a novel defect model for concrete highway bridges. The proposed defect model is intended to facilitate the identification of a bridge’s condition information (i.e. defects), to improve the efficiency and accuracy of bridge inspections by supporting practitioners, and eventually machines, with digitalised expert knowledge, and ultimately to automate the process.

Design/methodology/approach

The research design consists of three major phases: (1) categorise common defects with regard to physical entities (i.e. bridge elements), (2) establish internal relationships among those defects and (3) relate defects to their properties and potential causes. A mixed-method approach, comprising a comprehensive literature review, focus groups and case studies, was employed to develop and validate the proposed defect model.
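
A minimal data-model sketch of those three phases follows: defects tied to bridge elements, related to one another, and carrying properties and causes. The class layout and the example defects are ours, not the article’s schema.

# Minimal sketch: a defect model with element, properties, causes and
# inter-defect relationships. Names and values are illustrative.
from dataclasses import dataclass, field

@dataclass
class Defect:
    name: str
    element: str  # the physical entity, e.g. a concrete girder
    properties: dict[str, str] = field(default_factory=dict)
    causes: list[str] = field(default_factory=list)
    related: list["Defect"] = field(default_factory=list)

crack = Defect(
    name="transverse crack",
    element="concrete girder",
    properties={"width": "0.3 mm", "orientation": "transverse"},
    causes=["shrinkage", "overloading"],
)
spalling = Defect(
    name="spalling",
    element="concrete girder",
    causes=["rebar corrosion"],
    related=[crack],  # cracking can precede spalling
)
print(spalling.name, "related to", [d.name for d in spalling.related])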

Findings

The data collected through the literature review and focus groups were analysed, and the extracted knowledge was used to form the defect model. The model was then validated and further calibrated through case studies, for which inspection reports of nearly 300 bridges in China were collected and analysed. The study uncovered the relationships between defects and a variety of inspection-related elements and represented them in an accessible, digitalised and user-friendly knowledge model.

Originality/value

The contribution of this paper is a defect model that can assist inexperienced practitioners, and in the near future machines, in conducting inspection tasks. First, the proposed defect model can standardise the data collection process of bridge inspection, including the identification of defects and the documentation of their vital properties, paving the way for automation in subsequent stages (e.g. condition evaluation). Second, by capturing rich experience and expert knowledge that have long been held within the industrial sector, it can considerably improve inspection efficiency and accuracy.

Details

Engineering, Construction and Architectural Management, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 0969-9988

Open Access
Article
Publication date: 2 April 2024

Koraljka Golub, Osma Suominen, Ahmed Taiye Mohammed, Harriet Aagaard and Olof Osterman

Abstract

Purpose

In order to estimate the value of semi-automated subject indexing in operative library catalogues, the study aimed to investigate five different automated implementations of an open source software package on a large set of Swedish union catalogue metadata records, with Dewey Decimal Classification (DDC) as the target classification system. It also aimed to contribute to the body of research on aboutness and related challenges in automated subject indexing and evaluation.

Design/methodology/approach

On a sample of over 230,000 records with close to 12,000 distinct DDC classes, Annif, an open-source tool developed by the National Library of Finland, was applied in the following implementations: a lexical algorithm, a support vector classifier, fastText, Omikuji Bonsai and an ensemble approach combining the former four. A qualitative study involving two senior catalogue librarians and three students of library and information studies was also conducted on a sample of 60 records to investigate the value of the automatically assigned classes and the inter-rater agreement on them.
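
For readers who want to try such a setup, Annif exposes a REST API once running; the sketch below posts a text for subject suggestions. The project id "ddc-ensemble" is a hypothetical placeholder, and the endpoint path and response fields follow Annif’s documented API but should be verified against the installed version.

# Minimal sketch: requesting subject suggestions from a locally
# running Annif instance. The project id is a placeholder; verify the
# API path and response shape against your Annif version.
import requests

URL = "http://localhost:5000/v1/projects/ddc-ensemble/suggest"

resp = requests.post(URL, data={"text": "En bok om svensk arkitekturhistoria",
                                "limit": 5})
for hit in resp.json()["results"]:
    notation = hit.get("notation") or hit["uri"]
    print(f'{notation}  {hit["label"]}  {hit["score"]:.3f}')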

Findings

The best results were achieved using the ensemble approach that achieved 66.82% accuracy on the three-digit DDC classification task. The qualitative study confirmed earlier studies reporting low inter-rater agreement but also pointed to the potential value of automatically assigned classes as additional access points in information retrieval.

Originality/value

The paper presents an extensive study of automated classification in an operative library catalogue, accompanied by a qualitative study of automated classes. It demonstrates the value of applying semi-automated indexing in operative information retrieval systems.

Details

Journal of Documentation, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 0022-0418

Article
Publication date: 29 March 2024

Sihao Li, Jiali Wang and Zhao Xu

Abstract

Purpose

The compliance checking of Building Information Modeling (BIM) models is crucial throughout the lifecycle of construction. The increasing amount and complexity of information carried by BIM models have made compliance checking more challenging, and manual methods are prone to errors. Therefore, this study aims to propose an integrative conceptual framework for automated compliance checking of BIM models, allowing for the identification of errors within BIM models.

Design/methodology/approach

This study first analyses typical building standards in the fields of architecture and fire protection and develops an ontology of their elements. On this basis, a building standard corpus is built, and deep learning models are trained to label the building standard texts automatically. Neo4j is used for knowledge graph construction and storage, and a data extraction method based on Dynamo is designed to obtain checking data files. Finally, a matching algorithm is devised to express the logical rules of the knowledge graph triples, resulting in automated compliance checking for BIM models.
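
A minimal sketch of the rule-matching step might look as follows, assuming rule triples from the standards are already stored in Neo4j and element data extracted via Dynamo arrive as dictionaries. The node labels, property names and Cypher pattern are illustrative, not the article’s actual graph schema.

# Minimal sketch: checking one extracted BIM element against rule
# triples stored in Neo4j. The graph schema below is illustrative.
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687",
                              auth=("neo4j", "password"))

CYPHER = """
MATCH (r:Rule)-[:CONSTRAINS]->(t:ElementType {name: $etype})
RETURN r.property AS prop, r.operator AS op, r.threshold AS threshold
"""

def check(element: dict) -> list[str]:
    """Return violation messages for one BIM element."""
    violations = []
    with driver.session() as session:
        for rec in session.run(CYPHER, etype=element["type"]):
            value = element.get(rec["prop"])
            if rec["op"] == ">=" and (value is None or value < rec["threshold"]):
                violations.append(
                    f'{element["id"]}: {rec["prop"]} must be >= '
                    f'{rec["threshold"]}, got {value}')
    return violations

door = {"id": "door-01", "type": "FireDoor", "width": 0.8}
print(check(door))  # e.g. a clear-width rule from the fire code
driver.close()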

Findings

Case validation results showed that this theoretical framework can achieve the automatic construction of domain knowledge graphs and automatic checking of BIM model compliance. Compared with traditional methods, this method has a higher degree of automation and portability.

Originality/value

This study introduces knowledge graphs and natural language processing technology into the field of BIM model checking and automates the process of constructing domain knowledge graphs and checking BIM model data. Its functionality and usability were validated through two case studies on a self-developed BIM checking platform.

Details

Engineering, Construction and Architectural Management, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 0969-9988
