Search results

1 – 2 of 2
Article
Publication date: 21 September 2012

Jorge Martinez‐Gil and José F. Aldana‐Montes

Abstract

Purpose

Semantic similarity measures are very important in many computer‐related fields. Applications such as data integration, query expansion, tag refactoring and text clustering have all relied on semantic similarity measures. Despite their usefulness in these applications, measuring the similarity between two text expressions remains a key challenge. This paper aims to address this issue.

Design/methodology/approach

In this article, the authors propose an optimization environment to improve existing techniques that use the notion of co‐occurrence and the information available on the web to measure similarity between terms.
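As a rough illustration of what a co‐occurrence measure over web page counts can look like, the sketch below computes a Jaccard coefficient from search engine hit counts. It is a generic example of this family of measures, not the authors' technique; the function name and the counts used are hypothetical.

```python
def web_jaccard(hits_x, hits_y, hits_xy):
    """Jaccard coefficient from page counts.

    hits_x  -- pages returned for term x alone
    hits_y  -- pages returned for term y alone
    hits_xy -- pages returned for the query "x AND y"
    """
    union = hits_x + hits_y - hits_xy
    if union <= 0:
        # No pages at all: treat the terms as unrelated.
        return 0.0
    return hits_xy / union


# Hypothetical page counts, for illustration only.
score = web_jaccard(hits_x=1000, hits_y=500, hits_xy=250)
print(score)
```

A score near 1 means the two terms almost always appear on the same pages; a score near 0 means they rarely co‐occur.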

Findings

The experimental results using the Miller and Charles and Gracia and Mena benchmark datasets show that the proposed approach is able to outperform classic probabilistic web‐based algorithms by a wide margin.

Originality/value

This paper presents two main contributions. The authors propose a novel technique that beats classic probabilistic techniques for measuring semantic similarity between terms. Rather than relying on a single search engine to compute web page counts, the new technique uses a smart combination of several popular web search engines. The approach is evaluated on the Miller and Charles and Gracia and Mena benchmark datasets and compared with existing probabilistic web extraction techniques.
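The abstract does not say how the search engines are combined. One simple possibility, shown here only as an assumption, is a weighted average of the similarity scores obtained from each engine. The engine names, scores and weights below are hypothetical.

```python
def combine_engine_scores(scores, weights):
    """Weighted average of per-engine similarity scores.

    scores  -- dict mapping engine name to a similarity score
    weights -- dict mapping engine name to a non-negative weight
    """
    total_weight = sum(weights[engine] for engine in scores)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    weighted_sum = sum(scores[engine] * weights[engine] for engine in scores)
    return weighted_sum / total_weight


# Hypothetical engines, scores and weights, for illustration only.
scores = {"engine_a": 0.62, "engine_b": 0.48, "engine_c": 0.70}
weights = {"engine_a": 0.5, "engine_b": 0.2, "engine_c": 0.3}
combined = combine_engine_scores(scores, weights)
print(round(combined, 3))
```

In an optimization setting the weights would be tuned against a benchmark dataset rather than fixed by hand.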

Details

Online Information Review, vol. 36 no. 5
Type: Research Article
ISSN: 1468-4527

Article
Publication date: 8 May 2017

Amed Leiva-Mederos, Jose A. Senso, Yusniel Hidalgo-Delgado and Pedro Hipola

Abstract

Purpose

Information from Current Research Information Systems (CRIS) is stored in different formats, in platforms that are not compatible, or even in independent networks. It would be helpful to have a well-defined methodology that allows data to be managed and processed from a single site, so as to take advantage of the capacity to link dispersed data found in different systems, platforms, sources and/or formats. Based on functionalities and materials of the VLIR project, the purpose of this paper is to present a model that provides for interoperability by means of semantic alignment techniques and metadata crosswalks, and facilitates the fusion of information stored in diverse sources.

Design/methodology/approach

After reviewing the state of the art regarding the diverse mechanisms for achieving semantic interoperability, the paper analyzes the following: the specific coverage of the data sets (type of data, thematic coverage and geographic coverage); the technical specifications needed to retrieve and analyze a distribution of the data set (format, protocol, etc.); the conditions of reuse (copyright and licenses); and the “dimensions” included in the data set as well as the semantics of these dimensions (the syntax and the taxonomies of reference). The semantic interoperability framework presented here implements semantic alignment and a metadata crosswalk to convert information from three different systems (ABCD, Moodle and DSpace) and integrate all the databases in a single RDF file.
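A metadata crosswalk of this kind can be pictured as a lookup table from each system's field names to a shared vocabulary, followed by serialization to RDF. The minimal sketch below maps fields to Dublin Core terms and emits N-Triples lines. The field names chosen for each system, the `CROSSWALK` table and the example URI are all hypothetical and do not come from the paper.

```python
DCTERMS = "http://purl.org/dc/terms/"

# Hypothetical field-to-term mappings for the three source systems.
CROSSWALK = {
    "dspace": {"dc.title": "title", "dc.contributor.author": "creator"},
    "moodle": {"fullname": "title", "teacher": "creator"},
    "abcd": {"v245": "title", "v100": "creator"},
}


def escape_literal(value):
    """Escape backslashes, quotes and newlines for an N-Triples literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def record_to_ntriples(system, subject_uri, record):
    """Convert one source record into a list of N-Triples lines."""
    mapping = CROSSWALK[system]
    lines = []
    for field, value in record.items():
        term = mapping.get(field)
        if term is None:
            continue  # fields without a mapping are skipped
        lines.append(f'<{subject_uri}> <{DCTERMS}{term}> "{escape_literal(value)}" .')
    return lines


# Hypothetical record, for illustration only.
for line in record_to_ntriples(
    "moodle", "http://example.org/course/1", {"fullname": "Intro", "shortname": "I1"}
):
    print(line)
```

Concatenating the lines produced for every record from every system would yield a single RDF file in N-Triples syntax.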

Findings

The paper also includes an evaluation that compares, by means of recall and precision, the proposed model with identical queries made on Open Archives Initiative and SQL, in order to estimate its efficiency. The results were satisfactory because semantic interoperability facilitates the exact retrieval of information.
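For reference, precision and recall for a single query can be computed as in the sketch below, using the standard definitions. The document identifiers are hypothetical and do not reproduce the paper's data.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    retrieved -- identifiers returned by the system
    relevant  -- identifiers judged relevant for the query
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical result sets, for illustration only.
precision, recall = precision_recall(["a", "b", "c", "d"], ["a", "b", "e"])
print(precision, recall)
```

Running the same query set against each system and comparing the resulting pairs gives the kind of side-by-side evaluation described above.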

Originality/value

The proposed model enhances management of the syntactic and semantic interoperability of the designed CRIS system. In a real-world setting it achieves very positive results.
