Search results

1 – 10 of 197
Article
Publication date: 11 November 2014

S. Thenmalar and T.V. Geetha

Abstract

Purpose

The purpose of this paper is to improve concept-based search by incorporating structural ontological information such as concepts and relations. Generally, semantic-based information retrieval aims to identify relevant information based on the meanings of the query terms or on the context of the terms, and its performance is evaluated with the standard measures of precision and recall. Higher precision means that more of the retrieved documents are (meaningfully) relevant, while lower recall means less coverage of the concepts.

Design/methodology/approach

In this paper, the authors enhance the existing ontology-based indexing proposed by Kohler et al. by incorporating sibling information into the index. The index designed by Kohler et al. contains only super- and sub-concepts from the ontology. In addition, in our approach, we focus on two tasks, query expansion and ranking of the expanded queries, to improve the efficiency of the ontology-based search. Both tasks make use of ontological concepts and the relations existing between those concepts to obtain semantically more relevant search results for a given query.
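
The expansion idea described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the tourism hierarchy `PARENT`, the relation weights in `WEIGHTS`, and the function names are all hypothetical, and the paper does not state how expanded terms are actually weighted or ranked.

```python
# Hypothetical tourism is-a hierarchy: child concept -> parent concept.
PARENT = {
    "monument": "attraction",
    "beach": "attraction",
    "temple": "monument",
    "fort": "monument",
    "shore temple": "temple",
}

# Illustrative weights only (assumption); the source gives no such values.
WEIGHTS = {"self": 1.0, "sub": 0.8, "super": 0.7, "sibling": 0.5}


def related_concepts(concept, parent=PARENT):
    """Return the super-concept, sub-concepts and siblings of a concept."""
    sup = parent.get(concept)
    subs = sorted(c for c, p in parent.items() if p == concept)
    sibs = sorted(
        c for c, p in parent.items()
        if sup is not None and p == sup and c != concept
    )
    return {"super": [sup] if sup is not None else [], "sub": subs, "sibling": sibs}


def expand_query(concept, parent=PARENT, weights=WEIGHTS):
    """Expand one query concept into a weighted, ranked list of concepts."""
    known = set(parent) | set(parent.values())
    if concept not in known:
        # Concept missing from the ontology: keep the plain term only.
        return [(concept, weights["self"])]
    expanded = {concept: weights["self"]}
    for relation, concepts in related_concepts(concept, parent).items():
        for c in concepts:
            expanded.setdefault(c, weights[relation])
    # Highest weight first; ties broken alphabetically.
    return sorted(expanded.items(), key=lambda kv: (-kv[1], kv[0]))
```

With this toy hierarchy, expanding "temple" adds its sub-concept "shore temple", its super-concept "monument" and its sibling "fort", while a term absent from the ontology is returned unchanged.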

Findings

The proposed ontology-based indexing technique is investigated by analysing the coverage of concepts that are populated in the index. Here, we introduce a new measure, called the index enhancement measure, to estimate the coverage of ontological concepts being indexed. We have evaluated the ontology-based search for the tourism domain with tourism documents and a tourism-specific ontology. Search results obtained with the ontology “with and without query expansion” are compared to estimate the efficiency of the proposed query expansion task. The ranking is compared with the ORank system to evaluate the performance of our ontology-based search. From these analyses, the ontology-based search results show better recall than the other concept-based search systems. The mean average precision of the ontology-based search is 0.79 and its recall is 0.65; the ORank system has a mean average precision of 0.62 and a recall of 0.51; and the concept-based search has a mean average precision of 0.56 and a recall of 0.42.
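
For readers unfamiliar with the measures reported above, the following sketch shows one common way to compute mean average precision and recall from ranked result lists. The function names and the sample data are illustrative assumptions; the abstract does not say exactly how the authors computed their figures.

```python
def average_precision(ranked, relevant):
    """Average of the precision values at each rank where a relevant document appears."""
    hits = 0
    total = 0.0
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


def recall(ranked, relevant):
    """Fraction of the relevant documents that were retrieved at all."""
    if not relevant:
        return 0.0
    return len(set(ranked) & set(relevant)) / len(relevant)


def mean_average_precision(runs):
    """Mean of average precision over (ranked list, relevant set) pairs, one per query."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


# Made-up example: two of the three relevant documents are retrieved.
ranked_docs = ["d1", "d2", "d3", "d4"]
relevant_docs = {"d1", "d3", "d5"}
ap = average_precision(ranked_docs, relevant_docs)
rc = recall(ranked_docs, relevant_docs)
```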

Practical implications

When a concept is not present in the domain-specific ontology, it cannot be indexed. When a given query term is not available in the ontology, term-based results are retrieved instead.

Originality/value

In addition to super- and sub-concepts, we incorporate the concepts present at the same level (siblings) into the ontological index. The structural information from the ontology is used for the query expansion. The ranking of the documents depends on the type of the query (single-concept queries, multiple-concept queries and concept-with-relation queries) and on the ontological relations that exist in the query and the documents. With this ontological structural information, the search results showed better coverage of concepts with respect to the query.

Details

Aslib Journal of Information Management, vol. 66 no. 6
Type: Research Article
ISSN: 2050-3806

Article
Publication date: 20 April 2012

E. Fersini and F. Sartori

Abstract

Purpose

The need for tools for content analysis, information extraction and retrieval of multimedia objects in their native form is strongly felt in the judicial domain: digital videos represent a fundamental source of information about events occurring during judicial proceedings, and they should be stored, organized and retrieved quickly and at low cost. This paper seeks to address these issues.

Design/methodology/approach

In this context, the JUMAS system, which stems from the European project of the same name (www.jumasproject.eu), takes up the challenge of exploiting semantics and machine learning techniques to improve the usability of multimedia judicial folders.

Findings

This paper describes one of the most challenging issues addressed by the JUMAS project: extracting meaningful abstracts of given judicial debates in order to access salient contents efficiently. In particular, the authors present an ontology-enhanced multimedia summarization environment able to derive a synthetic representation of judicial media contents with a limited loss of meaningful information while overcoming the information overload problem.

Originality/value

The adoption of ontology‐based query expansion has made it possible to improve the performance of multimedia summarization algorithms with respect to the traditional approaches based on statistics. The effectiveness of the proposed approach has been evaluated on real media contents, highlighting a good potential for extracting key events in the challenging area of judicial proceedings.

Details

Program, vol. 46 no. 2
Type: Research Article
ISSN: 0033-0337

Article
Publication date: 1 May 2005

D. Wollersheim and J. W. Rahayu

Abstract

This paper presents a framework which combines data and text retrieval techniques to exercise and evaluate ontology-based query expansions. We prepare by using linguistic techniques to identify query and document concepts, locating them in an ontologically defined semantic space. Expansions originate from the identified query concepts, with success determined by matching in the relevant document set. We identify three orthogonal dimensions that can affect query expansion success: relationship source, success measure technique, and query expansion technique. Expansion technique is further divided into six different categories: simple pruning, complex probability, voting, directional, semantic propagation, and multiple source concept. We describe each technique and show examples where they would be useful. The system architecture facilitates plugging in various expansion and evaluation routines and passing results from one method to the next. The system is useful for microanalysis of query expansion, discovering which components of ontologically derived knowledge most influence query expansion success. In this work, we apply our framework to the medical domain.

Details

International Journal of Web Information Systems, vol. 1 no. 2
Type: Research Article
ISSN: 1744-0084

Article
Publication date: 20 June 2016

Awny Sayed and Amal Al Muqrishi

Abstract

Purpose

The purpose of this paper is to present an efficient and scalable Arabic semantic search engine based on a domain-specific ontological graph for Colleges of Applied Science, Sultanate of Oman (CASOnto). It also supports factorial question answering and offers two types of search, keyword-based and semantics-based, in both Arabic and English. The engine is built on a variety of technologies such as resource description framework data and an ontological graph. Furthermore, two experiments are conducted: the first compares entity search with classical search within the system itself; the second compares CASOnto with well-known semantic search engines such as Kngine, Wolfram Alpha and Google to measure their performance and efficiency.

Design/methodology/approach

The design and implementation of the system comprise the following phases: designing inference, storing, indexing, searching, query processing and the user-friendly interface. The system is designed for the specific domain of the IBRI CAS (College of Applied Science) and covers its academic and nonacademic departments. The ontologically inferred data are stored in the tuple database (TDB) and MySQL to handle keyword-based as well as entity-based search. Indexing and searching are built on Lucene for the keyword search, while TDB is used for the entity search. Query processing is a very important component of a search engine: it helps to improve the user’s search results and makes the system efficient and scalable. CASOnto handles Arabic-specific issues such as spelling correction, query completion, stop-word removal and diacritics removal. It also supports the analysis of factorial question answering.
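
Two of the query-processing steps mentioned above, diacritics removal and stop-word removal, can be illustrated with a short sketch. It is only an assumption about how such steps might look, not CASOnto's code: the function name, the tiny stop-word list and the choice to strip the Unicode range U+064B to U+0652 plus the tatweel character are all illustrative.

```python
import re

# Arabic tanwin, short vowels, shadda and sukun occupy U+064B..U+0652.
DIACRITICS = re.compile("[\u064B-\u0652]")
# Tatweel (kashida) is a purely typographic elongation character.
TATWEEL = "\u0640"

# Tiny illustrative stop-word list (assumption): "fi" (in), "min" (from), "ala" (on).
STOP_WORDS = {"\u0641\u064A", "\u0645\u0646", "\u0639\u0644\u0649"}


def normalize_query(query):
    """Strip diacritics and tatweel, split on whitespace, then drop stop words."""
    text = DIACRITICS.sub("", query).replace(TATWEEL, "")
    return [token for token in text.split() if token not in STOP_WORDS]


# Vocalized query "madrasa fi Oman" ("a school in Oman"), written with escapes.
vocalized = (
    "\u0645\u064E\u062F\u0652\u0631\u064E\u0633\u064E\u0629 "
    "\u0641\u0650\u064A "
    "\u0639\u064F\u0645\u064E\u0627\u0646"
)
tokens = normalize_query(vocalized)
```

Stripping an explicit code-point range, rather than every Unicode combining mark, avoids accidentally altering letters whose decomposed form contains a combining character.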

Findings

In this paper, an efficient and scalable Arabic semantic search engine is proposed. The results show that the semantic search built on SPARQL is better than the classical search for both simple and complex queries: the accuracy of the semantic search equals 100 per cent for both types of queries. In addition, the comparison of CASOnto with Wolfram Alpha, Kngine and Google shows better results for CASOnto. Consequently, the proposed engine appears to retrieve better and more efficient results than the other engines. Because it is built on a domain-specific ontology, it offers highly scalable performance and handles complex queries well by understanding the context behind the query.

Research limitations/implications

The proposed engine is built on a specific domain (CAS Ibri – Oman). Future work will address nonfactorial question answering and expand the domain of CASOnto to integrate additional domains.

Originality/value

The main contribution of this paper is an efficient and scalable Arabic semantic search engine. The widespread use of search engines creates a new challenge: keeping up with the evolution of the semantic Web. Catering to the needs of users has become a matter of paramount importance in light of artificial intelligence and technological development, so that accurate information can be accessed efficiently in the least possible time. However, research on this challenge is still in its infancy because of the lack of search engines that support the Arabic language, which can be traced back to the complexity of Arabic morphology and grammar rules.

Details

International Journal of Web Information Systems, vol. 12 no. 2
Type: Research Article
ISSN: 1744-0084

Article
Publication date: 8 July 2010

Andreas Vlachidis, Ceri Binding, Douglas Tudhope and Keith May

Abstract

Purpose

This paper sets out to discuss the use of information extraction (IE), a natural language‐processing (NLP) technique to assist “rich” semantic indexing of diverse archaeological text resources. The focus of the research is to direct a semantic‐aware “rich” indexing of diverse natural language resources with properties capable of satisfying information retrieval from online publications and datasets associated with the Semantic Technologies for Archaeological Resources (STAR) project.

Design/methodology/approach

The paper proposes the use of the English Heritage extension (CRM‐EH) of the standard core ontology in cultural heritage, CIDOC CRM, and the exploitation of domain thesauri resources for driving and enhancing an Ontology‐Oriented Information Extraction process. The process of semantic indexing is based on a rule‐based Information Extraction technique, which is facilitated by the General Architecture for Text Engineering (GATE) toolkit and expressed by Java Annotation Pattern Engine (JAPE) rules.

Findings

Initial results suggest that the combination of information extraction with knowledge resources and standard conceptual models is capable of supporting semantic‐aware term indexing. Additional efforts are required for further exploitation of the technique and adoption of formal evaluation methods for assessing the performance of the method in measurable terms.

Originality/value

The value of the paper lies in the semantic indexing of 535 unpublished online documents, often referred to as “Grey Literature”, from the Archaeological Data Service OASIS corpus (Online AccesS to the Index of archaeological investigationS), with respect to the CRM ontological concepts E49.Time Appellation and E19.Physical Object.

Details

Aslib Proceedings, vol. 62 no. 4/5
Type: Research Article
ISSN: 0001-253X

Article
Publication date: 2 September 2013

Ya-Ning Chen and Hao-Ren Ke

Abstract

Purpose

This paper seeks to adopt FRBRoo as an ontological approach to integrate heterogeneous metadata, and transform human-understandable format into machine-understandable format for semantic query.

Design/methodology/approach

Two use cases, involving museum artefacts and literary works, were used to illustrate how FRBRoo can re-contextualize the semantics of elements and the semantic relationships embedded in those elements. The shared ontology was then RDFized and examples were explored to examine the feasibility of the proposed approach.

Findings

FRBRoo can play a role as an interlingua, aligning museum and library metadata to achieve heterogeneous metadata integration and semantic query without changing either of the original approaches to fit the other.

Research limitations/implications

Exploration of more diverse use cases is required to further align the different approaches of museums and libraries using FRBRoo and make revisions.

Practical implications

Solid evidence is provided for the use of FRBRoo in heterogeneous metadata integration and semantic query.

Originality/value

This is the first study to elaborate how FRBRoo can play a role as a shared ontology to integrate the heterogeneous metadata generated by museums and libraries. This paper also shows how the proposed approach is distinct from the Dublin Core format crosswalk in re-contextualizing semantic meanings and their relationships, and further provides four new sub-types for mapping description language.

Details

Journal of Documentation, vol. 69 no. 5
Type: Research Article
ISSN: 0022-0418

Article
Publication date: 25 September 2019

Dimitrios A. Koutsomitropoulos

Abstract

Purpose

Effective synthesis of learning material is a multidimensional problem, which often relies on handpicking approaches and human expertise. Sources of educational content exist in a variety of forms, each offering proprietary metadata information and search facilities. This paper aims to show that it is possible to harvest scholarly resources from various repositories of open educational resources (OERs) in a federated manner. In addition, their subject can be automatically annotated using ontology inference and standard thematic terminologies.

Design/methodology/approach

Based on a semantic interpretation of their metadata, authors can align external collections and maintain them in a shared knowledge pool known as the Learning Object Ontology Repository (LOOR). The author leverages the LOOR and shows that it is possible to search through the metadata of various educational repositories and amalgamate their semantics into a common learning object (LO) ontology. The author then proceeds with automatic subject classification of LOs using keyword expansion and referencing standard taxonomic vocabularies for thematic classification, expressed in SKOS.

Findings

The approach for automatic subject classification takes advantage of the implicit information in the searching and selection process and combines it with expert knowledge in the domain of reference (SKOS thesauri). This is shown to improve recall by a considerable factor, while precision remains unaffected.

Originality/value

To the best of the author’s knowledge, the idea of subject classification of LOs through the reuse of search query terms combined with SKOS-based matching and expansion has not been investigated before in a federated scholarly setting.

Details

Digital Library Perspectives, vol. 35 no. 3/4
Type: Research Article
ISSN: 2059-5816

Article
Publication date: 1 February 2021

Omar El Midaoui, Btihal El Ghali, Abderrahim El Qadi and Moulay Driss Rahmani

Abstract

Purpose

Geographical query formulation is one of the key difficulties for users in search engines. The purpose of this study is to improve geographical search by proposing a novel geographical query reformulation (GQR) technique using a geographical taxonomy and word senses.

Design/methodology/approach

This work introduces an approach for GQR, which combines a method of query components separation that uses GeoNames, a technique for reformulating these components using WordNet and a geographic taxonomy constructed using the latent semantic analysis method.

Findings

The proposed approach was compared to two methods from the literature, using the mean average precision (MAP) and the precision at 20 documents (P@20). The experimental results show that it outperforms the other techniques by 15.73% to 31.21% in terms of P@20 and by 17.81% to 35.52% in terms of MAP.
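
As a rough illustration of the P@20 measure and of how a percentage gain over a baseline can be computed, consider the sketch below. The function names and sample values are hypothetical, and reading the reported percentages as relative improvements is an assumption; the abstract does not say whether they are relative or absolute differences.

```python
def precision_at_k(ranked, relevant, k=20):
    """Fraction of the top-k retrieved documents that are relevant."""
    top = ranked[:k]
    return sum(1 for doc in top if doc in relevant) / k


def relative_improvement(new, baseline):
    """Percentage gain of a new score over a baseline score."""
    return 100.0 * (new - baseline) / baseline


# Made-up example with k=4 for brevity: two of the top four documents are relevant.
p_at_4 = precision_at_k(["a", "b", "c", "d"], {"a", "c"}, k=4)
gain = relative_improvement(0.6, 0.5)
```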

Research limitations/implications

According to the experimental results, the best taxonomy created using the geographical adjacency taxonomy builder contains 7.67% incorrect links. The authors believe that using a much larger amount of data for taxonomy building can give better results. Thus, in future work, they intend to apply the approach in a big data context.

Originality/value

Despite this limitation, the reformulation of geographical queries using the proposed approach considerably improves the precision of queries and retrieves relevant documents that were not retrieved using the original queries. The strengths of the technique lie in reformulating both thematic and spatial entities and in replacing the spatial entity of the query with terms that explain the intent of the query more precisely using a geographical taxonomy.

Details

Journal of Systems and Information Technology, vol. 23 no. 1
Type: Research Article
ISSN: 1328-7265

Article
Publication date: 23 November 2012

Chihli Hung, Chih‐Fong Tsai, Shin‐Yuan Hung and Chang‐Jiang Ku

Abstract

Purpose

A grid information retrieval model has benefits for sharing resources and processing mass information, but cannot handle conceptual heterogeneity without integration of semantic information. The purpose of this research is to propose a concept‐based retrieval mechanism to capture the user's query intentions in a grid environment. This research re‐ranks documents over distributed data sources and evaluates performance based on user judgment and processing time.

Design/methodology/approach

This research uses the ontology lookup service to build the concept set in the ontology and captures the user's query intentions as a means of query expansion for searching. The Globus toolkit is used to implement the grid service. The modification of the collection retrieval inference (CORI) algorithm is used for re‐ranking documents over distributed data sources.
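
To give a sense of what re-ranking over distributed sources involves, the sketch below applies the result-merging heuristic usually associated with CORI in the distributed retrieval literature: document and collection scores are min-max normalized and combined as (D' + 0.4 * D' * C') / 1.4. This is not the authors' modified algorithm, which the abstract does not specify. The function names and data are hypothetical, and using the observed minimum and maximum scores for normalization is a simplifying assumption.

```python
def min_max(value, low, high):
    """Scale value into [0, 1]; returns 0.0 when the range is empty."""
    return (value - low) / (high - low) if high > low else 0.0


def cori_merge(results, collection_scores):
    """Merge per-source ranked lists into one list with a global score.

    results: {source: [(doc_id, score), ...]}
    collection_scores: {source: score of that source for the query}
    """
    c_low = min(collection_scores.values())
    c_high = max(collection_scores.values())
    merged = []
    for source, docs in results.items():
        if not docs:
            continue
        c_norm = min_max(collection_scores[source], c_low, c_high)
        scores = [score for _, score in docs]
        d_low, d_high = min(scores), max(scores)
        for doc_id, score in docs:
            d_norm = min_max(score, d_low, d_high)
            # Documents from better-scoring sources receive a boost.
            merged.append((doc_id, (d_norm + 0.4 * d_norm * c_norm) / 1.4))
    # Highest global score first; ties broken by document id.
    return sorted(merged, key=lambda item: (-item[1], item[0]))


# Made-up example with two sources.
merged = cori_merge(
    {"A": [("a1", 2.0), ("a2", 1.0)], "B": [("b1", 5.0), ("b2", 3.0)]},
    {"A": 0.9, "B": 0.3},
)
```

In this example the top document of source A outranks the top document of source B even though B's raw scores are higher, because raw scores from different sources are not directly comparable and A is judged the better source for the query.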

Findings

The experiments demonstrate that the proposed approach successfully describes the user's query intentions, as evaluated by user judgment. In terms of processing time, building a grid information retrieval model is a suitable strategy for the ontology‐based retrieval model.

Originality/value

Most current semantic grid models focus on construction of the semantic grid, and do not consider re‐ranking search results from distributed data sources. The significance of evaluation from the user's viewpoint is also ignored. This research proposes a method that captures the user's query intentions and re‐ranks documents in a grid based on the CORI algorithm. This proposed ontology‐based retrieval mechanism calculates the global relevance score of all documents in a grid and displays those documents with higher relevance to users.

Details

Online Information Review, vol. 36 no. 6
Type: Research Article
ISSN: 1468-4527

Article
Publication date: 13 January 2012

J. Alfredo Sánchez, María Auxilio Medina, Oleg Starostenko, Antonio Benitez and Eduardo López Domínguez

Abstract

Purpose

This paper seeks to focus on the problems of integrating information from open, distributed scholarly collections, and on the opportunities these collections represent for research communities in developing countries. The paper aims to introduce OntOAIr, a semi‐automatic method for constructing lightweight ontologies of documents in repositories such as those provided by the Open Archives Initiative (OAI).

Design/methodology/approach

OntOAIr uses simplified document representations, a clustering algorithm, and ontological engineering techniques.

Findings

The paper presents experimental results of the potential positive impact of ontologies and specifically of OntOAIr on the use of collections provided by OAI.

Research limitations/implications

By applying OntOAIr, scholars who frequently spend many hours organizing OAI information spaces will obtain support that will allow them to speed up the entire research cycle and, expectedly, participate more fully in global research communities.

Originality/value

The proposed method allows human and software agents to organize and retrieve groups of documents from multiple collections. Applications of OntOAIr include enhanced document retrieval. In this paper, the authors focus particularly on document retrieval applications.

Details

Aslib Proceedings, vol. 64 no. 1
Type: Research Article
ISSN: 0001-253X
