Search results

1 – 10 of over 4000
Article
Publication date: 21 May 2018

Shutian Ma, Yingyi Zhang and Chengzhi Zhang

The purpose of this paper is to classify Chinese word semantic relations, which are synonyms, antonyms, hyponyms and meronymys.

Abstract

Purpose

The purpose of this paper is to classify Chinese word semantic relations, which are synonyms, antonyms, hyponyms and meronymys.

Design/methodology/approach

Basically, four simple methods are applied, ontology-based, dictionary-based, pattern-based and morpho-syntactic method. The authors make good use of search engine to build lexical and semantic resources for dictionary-based and pattern-based methods. To improve classification performance with more external resources, they also classify the given word pairs in Chinese and in English at the same time by using machine translation.

Findings

Experimental results show that the approach achieved an average F1 score of 50.87 per cent, an average accuracy of 70.36 per cent and an average recall of 40.05 per cent over all classification tasks. Synonym and antonym classification achieved high accuracy, i.e. above 90 per cent. Moreover, dictionary-based and pattern-based approaches work effectively on final data set.

Originality/value

For many natural language processing (NLP) tasks, the step of distinguishing word semantic relation can help to improve system performance, such as information extraction and knowledge graph generation. Currently, common methods for this task rely on large corpora for training or dictionaries and thesauri for inference, where limitation lies in freely data access and keeping built lexical resources up-date. This paper builds a primary system for classifying Chinese word semantic relations by seeking new ways to obtain the external resources efficiently.

Details

Information Discovery and Delivery, vol. 46 no. 2
Type: Research Article
ISSN: 2398-6247

Keywords

Article
Publication date: 10 March 2014

Junping Qiu and Wen Lou

– The purpose of this study is to construct a Chinese information science resource ontology and to explore a new method for semiautomatic ontology construction.

2418

Abstract

Purpose

The purpose of this study is to construct a Chinese information science resource ontology and to explore a new method for semiautomatic ontology construction.

Design/methodology/approach

More than 8,290 articles indexed in the Chinese Social Science Citation Index (CSSCI), covering the years 2001 to 2010, were included in this study. Statistical analysis, co-occurrence analysis, and semantic similarity methods were applied to the selected articles. The ontology was built using existing construction principles and methods, as well as categories and hierarchy definitions based on CSSCI indexing fields.

Findings

Seven categories were found to be relevant for the Chinese information science resource ontology, which, in this study, consists of a three-tier architecture, 78,291 instances, and 182,109 pairs of semantic relations. These results indicate the following: further improvements are required in ontology construction methods; resource ontology is a breakthrough concept in ontology studies; the combination of semantic similarities and co-occurrence analysis can quantitatively describe relationships between concepts.

Originality/value

This study pioneers the resource ontology concept. It is one of the first to combine informetric methods with semantic similarity to reveal deep relationships in textual data.

Details

Aslib Journal of Information Management, vol. 66 no. 2
Type: Research Article
ISSN: 2050-3806

Keywords

Article
Publication date: 14 June 2019

Nora Madi, Rawan Al-Matham and Hend Al-Khalifa

The purpose of this paper is to provide an overall review of grammar checking and relation extraction (RE) literature, their techniques and the open challenges associated with…

Abstract

Purpose

The purpose of this paper is to provide an overall review of grammar checking and relation extraction (RE) literature, their techniques and the open challenges associated with them; and, finally, suggest future directions.

Design/methodology/approach

The review on grammar checking and RE was carried out using the following protocol: we prepared research questions, planed for searching strategy, addressed paper selection criteria to distinguish relevant works, extracted data from these works, and finally, analyzed and synthesized the data.

Findings

The output of error detection models could be used for creating a profile of a certain writer. Such profiles can be used for author identification, native language identification or even the level of education, to name a few. The automatic extraction of relations could be used to build or complete electronic lexical thesauri and knowledge bases.

Originality/value

Grammar checking is the process of detecting and sometimes correcting erroneous words in the text, while RE is the process of detecting and categorizing predefined relationships between entities or words that were identified in the text. The authors found that the most obvious challenge is the lack of data sets, especially for low-resource languages. Also, the lack of unified evaluation methods hinders the ability to compare results.

Details

Data Technologies and Applications, vol. 53 no. 3
Type: Research Article
ISSN: 2514-9288

Keywords

Article
Publication date: 1 June 2015

Quang-Minh Nguyen and Tuan-Dung Cao

The purpose of this paper is to propose an automatic method to generate semantic annotations of football transfer in the news. The current automatic news integration systems on…

Abstract

Purpose

The purpose of this paper is to propose an automatic method to generate semantic annotations of football transfer in the news. The current automatic news integration systems on the Web are constantly faced with the challenge of diversity, heterogeneity of sources. The approaches for information representation and storage based on syntax have some certain limitations in news searching, sorting, organizing and linking it appropriately. The models of semantic representation are promising to be the key to solving these problems.

Design/methodology/approach

The approach of the author leverages Semantic Web technologies to improve the performance of detection of hidden annotations in the news. The paper proposes an automatic method to generate semantic annotations based on named entity recognition and rule-based information extraction. The authors have built a domain ontology and knowledge base integrated with the knowledge and information management (KIM) platform to implement the former task (named entity recognition). The semantic extraction rules are constructed based on defined language models and the developed ontology.

Findings

The proposed method is implemented as a part of the sport news semantic annotations-generating prototype BKAnnotation. This component is a part of the sport integration system based on Web Semantics BKSport. The semantic annotations generated are used for improving features of news searching – sorting – association. The experiments on the news data from SkySport (2014) channel showed positive results. The precisions achieved in both cases, with and without integration of the pronoun recognition method, are both over 80 per cent. In particular, the latter helps increase the recall value to around 10 per cent.

Originality/value

This is one of the initial proposals in automatic creation of semantic data about news, football news in particular and sport news in general. The combination of ontology, knowledge base and patterns of language model allows detection of not only entities with corresponding types but also semantic triples. At the same time, the authors propose a pronoun recognition method using extraction rules to improve the relation recognition process.

Details

International Journal of Pervasive Computing and Communications, vol. 11 no. 2
Type: Research Article
ISSN: 1742-7371

Keywords

Article
Publication date: 9 February 2015

M. Cristina Pattuelli and Matthew Miller

The purpose of this paper is to describe a novel approach to the development and semantic enhancement of a social network to support the analysis and interpretation of digital…

1016

Abstract

Purpose

The purpose of this paper is to describe a novel approach to the development and semantic enhancement of a social network to support the analysis and interpretation of digital oral history data from jazz archives and special collections.

Design/methodology/approach

A multi-method approach was applied including automated named entity recognition and extraction to create a social network, and crowdsourcing techniques to semantically enhance the data through the classification of relations and the integration of contextual information. Linked open data standards provided the knowledge representation technique for the data set underlying the network.

Findings

The study described here identifies the challenges and opportunities of a combination of a machine and a human-driven approach to the development of social networks from textual documents. The creation, visualization and enrichment of a social network are presented within a real-world scenario. The data set from which the network is based is accessible via an application programming interface and, thus, shareable with the knowledge management community for reuse and mash-ups.

Originality/value

This paper presents original methods to address the issue of detecting and representing semantic relationships from text. Another element of novelty is in that it applies semantic web technologies to the construction and enhancement of the network and underlying data set, making the data readable across platforms and linkable with external data sets. This approach has the potential to make social networks dynamic and open to integration with external data sources.

Details

Journal of Knowledge Management, vol. 19 no. 1
Type: Research Article
ISSN: 1367-3270

Keywords

Article
Publication date: 29 March 2024

Sihao Li, Jiali Wang and Zhao Xu

The compliance checking of Building Information Modeling (BIM) models is crucial throughout the lifecycle of construction. The increasing amount and complexity of information…

Abstract

Purpose

The compliance checking of Building Information Modeling (BIM) models is crucial throughout the lifecycle of construction. The increasing amount and complexity of information carried by BIM models have made compliance checking more challenging, and manual methods are prone to errors. Therefore, this study aims to propose an integrative conceptual framework for automated compliance checking of BIM models, allowing for the identification of errors within BIM models.

Design/methodology/approach

This study first analyzed the typical building standards in the field of architecture and fire protection, and then the ontology of these elements is developed. Based on this, a building standard corpus is built, and deep learning models are trained to automatically label the building standard texts. The Neo4j is utilized for knowledge graph construction and storage, and a data extraction method based on the Dynamo is designed to obtain checking data files. After that, a matching algorithm is devised to express the logical rules of knowledge graph triples, resulting in automated compliance checking for BIM models.

Findings

Case validation results showed that this theoretical framework can achieve the automatic construction of domain knowledge graphs and automatic checking of BIM model compliance. Compared with traditional methods, this method has a higher degree of automation and portability.

Originality/value

This study introduces knowledge graphs and natural language processing technology into the field of BIM model checking and completes the automated process of constructing domain knowledge graphs and checking BIM model data. The validation of its functionality and usability through two case studies on a self-developed BIM checking platform.

Details

Engineering, Construction and Architectural Management, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 0969-9988

Keywords

Article
Publication date: 1 February 1973

JEAN‐CLAUDE GARDIN

In this presentation I shall be concerned with only one aspect of information science and its relation with linguistics: namely document analysis (for a broader survey of…

720

Abstract

In this presentation I shall be concerned with only one aspect of information science and its relation with linguistics: namely document analysis (for a broader survey of ‘Linguistics and information science’, see the recent article published under that title by C. Montgomery (1972); also M. Kay and K. Sparck Jones (1971), and the report prepared for F.I.D. by the same authors, forthcoming; M. Coyaud (1972), in French, deals with many separate facets of the same subject).

Details

Journal of Documentation, vol. 29 no. 2
Type: Research Article
ISSN: 0022-0418

Article
Publication date: 19 December 2019

Sergio Evangelista Silva, Luciana Paula Reis, June Marques Fernandes and Alana Deusilan Sester Pereira

The purpose of this paper is to introduce a multi-level framework for semantic modeling (MFSM) based on four signification levels: objects, classes of entities, instances and…

Abstract

Purpose

The purpose of this paper is to introduce a multi-level framework for semantic modeling (MFSM) based on four signification levels: objects, classes of entities, instances and domains. In addition, four fundamental propositions of the signification process underpin these levels, namely, classification, decomposition, instantiation and contextualization.

Design/methodology/approach

The deductive approach guided the design of this modeling framework. The authors empirically validated the MFSM in two ways. First, the authors identified the signification processes used in articles that deal with semantic modeling. The authors then applied the MFSM to model the semantic context of the literature about lean manufacturing, a field of management science.

Findings

The MFSM presents a highly consistent approach about the signification process, integrates the semantic modeling literature in a new and comprehensive view; and permits the modeling of any semantic context, thus facilitating the development of knowledge organization systems based on semantic search.

Research limitations/implications

The use of MFSM is manual and, thus, requires a considerable effort of the team that decides to model a semantic context. In this paper, the modeling was generated by specialists, and in the future should be applicated to lay users.

Practical implications

The MFSM opens up avenues to a new form of classification of documents, as well as for the development of tools based on the semantic search, and to investigate how users do their searches.

Social implications

The MFSM can be used to model archives semantically in public or private settings. In future, it can be incorporated to search engines for more efficient searches of users.

Originality/value

The MFSM provides a new and comprehensive approach about the elementary levels and activities in the process of signification. In addition, this new framework presents a new form to model semantically any context classifying its objects.

Article
Publication date: 25 October 2021

Jinju Chen and Shiyan Ou

The purpose of this paper is to semantically annotate the content of digital images with the use of Semantic Web technologies and thus facilitate retrieval, integration and…

Abstract

Purpose

The purpose of this paper is to semantically annotate the content of digital images with the use of Semantic Web technologies and thus facilitate retrieval, integration and knowledge discovery.

Design/Methodology/Approach

After a review and comparison of the existing semantic annotation models for images and a deep analysis of the characteristics of the content of images, a multi-dimensional and hierarchical general semantic annotation framework for digital images was proposed. On this basis, taking histories images, advertising images and biomedical images as examples, by integrating the characteristics of images in these specific domains with related domain knowledge, the general semantic annotation framework for digital images was customized to form a domain annotation ontology for the images in a specific domain. The application of semantic annotation of digital images, such as semantic retrieval, visual analysis and semantic reuse, were also explored.

Findings

The results showed that the semantic annotation framework for digital images constructed in this paper provided a solution for the semantic organization of the content of images. On this basis, deep knowledge services such as semantic retrieval, visual analysis can be provided.

Originality/Value

The semantic annotation framework for digital images can reveal the fine-grained semantics in a multi-dimensional and hierarchical way, which can thus meet the demand for enrichment and retrieval of digital images.

Details

The Electronic Library , vol. 39 no. 6
Type: Research Article
ISSN: 0264-0473

Keywords

Article
Publication date: 15 August 2016

Ioana Barbantan, Mihaela Porumb, Camelia Lemnaru and Rodica Potolea

Improving healthcare services by developing assistive technologies includes both the health aid devices and the analysis of the data collected by them. The acquired data modeled…

Abstract

Purpose

Improving healthcare services by developing assistive technologies includes both the health aid devices and the analysis of the data collected by them. The acquired data modeled as a knowledge base give more insight into each patient’s health status and needs. Therefore, the ultimate goal of a health-care system is obtaining recommendations provided by an assistive decision support system using such knowledge base, benefiting the patients, the physicians and the healthcare industry. This paper aims to define the knowledge flow for a medical assistive decision support system by structuring raw medical data and leveraging the knowledge contained in the data proposing solutions for efficient data search, medical investigation or diagnosis and medication prediction and relationship identification.

Design/methodology/approach

The solution this paper proposes for implementing a medical assistive decision support system can analyze any type of unstructured medical documents which are processed by applying Natural Language Processing (NLP) tasks followed by semantic analysis, leading to the medical concept identification, thus imposing a structure on the input documents. The structured information is filtered and classified such that custom decisions regarding patients’ health status can be made. The current research focuses on identifying the relationships between medical concepts as defined by the REMed (Relation Extraction from Medical documents) solution that aims at finding the patterns that lead to the classification of concept pairs into concept-to-concept relations.

Findings

This paper proposed the REMed solution expressed as a multi-class classification problem tackled using the support vector machine classifier. Experimentally, this paper determined the most appropriate setup for the multi-class classification problem which is a combination of lexical, context, syntactic and grammatical features, as each feature category is good at representing particular relations, but not all. The best results we obtained are expressed as F1-measure of 74.9 per cent which is 1.4 per cent better than the results reported by similar systems.

Research limitations/implications

The difficulty to discriminate between TrIP and TrAP relations revolves around the hierarchical relationship between the two classes as TrIP is a particular type (an instance) of TrAP. The intuition behind this behavior was that the classifier cannot discern the correct relations because of the bias toward the majority classes. The analysis was conducted by using only sentences from electronic health record that contain at least two medical concepts. This limitation was introduced by the availability of the annotated data with reported results, as relations were defined at sentence level.

Originality/value

The originality of the proposed solution lies in the methodology to extract valuable information from the medical records via semantic searches; concept-to-concept relation identification; and recommendations for diagnosis, treatment and further investigations. The REMed solution introduces a learning-based approach for the automatic discovery of relations between medical concepts. We propose an original list of features: lexical – 3, context – 6, grammatical – 4 and syntactic – 4. The similarity feature introduced in this paper has a significant influence on the classification, and, to the best of the authors’ knowledge, it has not been used as feature in similar solutions.

Details

International Journal of Web Information Systems, vol. 12 no. 3
Type: Research Article
ISSN: 1744-0084

Keywords

1 – 10 of over 4000