Pedro Hípola, José A. Senso, Amed Leiva-Mederos and Sandor Domínguez-Velasco
Abstract
Purpose
The purpose of this paper is to review the latest advances in ontology-based text summarization systems, with emphasis on socio-cognitive methodologies and structural discourse models.
Design/methodology/approach
The paper analyzes the main literature in this field and presents the structure and features of Texminer, a software tool that facilitates the summarization of texts on Port and Coastal Engineering. Texminer combines several techniques: socio-cognitive user models, Natural Language Processing, disambiguation and ontologies. After processing a corpus, the system was evaluated with reference to clustering evaluation experiments conducted by Arco (2008) and Hennig et al. (2008). The results were checked with a support vector machine, ROUGE metrics, the F-measure, and precision and recall.
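The overlap metrics named above can be illustrated with a minimal sketch. It computes precision, recall and the F-measure over clipped unigram overlap between a candidate summary and a reference summary, in the style of ROUGE-1. The function name, the whitespace tokenization and the example sentences are assumptions made for illustration; this is not Texminer's evaluation code, and it does not reproduce the experiments reported in the paper.

```python
from collections import Counter


def unigram_overlap_scores(candidate: str, reference: str) -> dict:
    """Precision, recall and F-measure over clipped unigram overlap.

    Hypothetical helper: tokens are lowercased and split on whitespace,
    which is a simplification of what real ROUGE toolkits do.
    """
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())

    # Counter intersection keeps the minimum count of each shared token,
    # so a word repeated in the candidate is not credited more often
    # than it occurs in the reference.
    overlap = sum((cand_counts & ref_counts).values())

    cand_total = sum(cand_counts.values())
    ref_total = sum(ref_counts.values())

    precision = overlap / cand_total if cand_total else 0.0
    recall = overlap / ref_total if ref_total else 0.0
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0

    return {"precision": precision, "recall": recall, "f_measure": f_measure}


if __name__ == "__main__":
    # Invented sentences, used only to exercise the function.
    reference = "the breakwater protects the harbour from waves"
    candidate = "the breakwater protects the harbour"
    print(unigram_overlap_scores(candidate, reference))
```

Here precision is the share of candidate tokens that also occur in the reference, recall is the share of reference tokens recovered by the candidate, and the F-measure is their harmonic mean.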
Findings
The experiment illustrates the superiority of abstracts obtained through the assistance of ontology-based techniques.
Originality/value
The authors were able to corroborate that the summaries obtained using Texminer are more efficient than those produced by other systems whose summarization models do not use ontologies. Thanks to ontologies, main sentences can be selected with a broad rhetorical structure, especially for a specific knowledge domain.
Amed Leiva-Mederos, José A. Senso, Sandor Domínguez-Velasco and Pedro Hípola
Abstract
Purpose
The purpose of this paper is to propose a tool that generates authority files to be integrated with linked data by means of learning rules. AUTHORIS is software developed to enhance authority control and information exchange among bibliographic and non-bibliographic entities.
Design/methodology/approach
The article analyzes different methods previously developed for authority control as well as IFLA and ALA standards for managing bibliographic records. Semantic Web technologies are also evaluated. AUTHORIS relies on Drupal and incorporates the protocols of Dublin Core, SIOC, SKOS and FOAF. The tool has also taken into account the obsolescence of MARC and its substitution by FRBR and RDA. Its effectiveness was evaluated by applying a learning test proposed by RDA. Over 80 percent of the actions were carried out correctly.
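As a rough illustration of what an authority record exposed as linked data might look like, the sketch below serializes one personal-name heading as N-Triples using the SKOS and FOAF vocabularies mentioned above. The function names, the example.org URI and the choice of properties (`skos:prefLabel` for the authorized form, `skos:altLabel` for variant forms) are assumptions made for this example; the abstract does not describe AUTHORIS's actual data model or output.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"
FOAF = "http://xmlns.com/foaf/0.1/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def escape_literal(text: str) -> str:
    """Escape backslashes and double quotes for an N-Triples string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def authority_triples(uri: str, preferred: str, variants: list) -> list:
    """Return N-Triples lines for one hypothetical personal-name authority record."""
    lines = [
        # Type the entity as a person.
        f"<{uri}> <{RDF_TYPE}> <{FOAF}Person> .",
        # The authorized (preferred) form of the name.
        f'<{uri}> <{SKOS}prefLabel> "{escape_literal(preferred)}" .',
    ]
    # Each variant form becomes an alternative label on the same entity.
    for variant in variants:
        lines.append(f'<{uri}> <{SKOS}altLabel> "{escape_literal(variant)}" .')
    return lines


if __name__ == "__main__":
    # The URI is a placeholder, not a real AUTHORIS identifier.
    for line in authority_triples(
        "http://example.org/authority/hipola-pedro",
        "Hípola, Pedro",
        ["Pedro Hípola", "Hípola, P."],
    ):
        print(line)
```

Grouping variant forms under a single URI is what allows different spellings of one name to resolve to the same entity.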
Findings
The use of learning rules and the facilities of linked data make it easier for information organizations to reuse products for authority control and distribute them in a fair and efficient manner.
Research limitations/implications
The ISAD-G records presented the most errors, followed by EAD. The remaining formats (MARC 21, Dublin Core, FRAD, RDF, OWL, XBRL and FOAF) showed fewer than 20 errors in total.
Practical implications
AUTHORIS offers institutions the means of sharing data with a high level of stability, helping to detect records that are duplicated and contributing to lexical disambiguation and data enrichment.
Originality/value
The software combines the facilities of linked data, the power of the algorithms for converting bibliographic data, and the precision of learning rules.