Search results

1 – 10 of over 6000
Article
Publication date: 1 February 1978

S.E. ROBERTSON and N.J. BELKIN

Abstract

It is often suggested that information retrieval systems should rank documents rather than simply retrieving a set. Two separate reasons are adduced for this: that relevance itself is a multi‐valued or continuous variable; and that retrieval is an essentially approximate process. These two reasons lead to different ranking principles, one according to degree of relevance, the other according to probability of relevance. This paper explores the possibility of combining the two principles, but concludes that while neither is adequate alone, nor can any single all‐embracing ranking principle be constructed to replace the two. The only general solution to the problem would be to find an optimal ranking by exploring the effect on the user of every possible ranking. However, some more practical approximate solutions appear possible.

Details

Journal of Documentation, vol. 34 no. 2
Type: Research Article
ISSN: 0022-0418

Article
Publication date: 20 April 2015

Roslina Othman and Ashraf Ali Salahuddin

Abstract

Purpose

The purposes of this study were to measure the relevance status of Index Islamicus, evaluate the semantic correlation between a query and documents and investigate the basis of its ranking. Sorting the retrieved results from the most relevant to the least relevant is the common option of an information retrieval system. This sorting mechanism, or relevance judgment, is computed by measuring the closeness of the query to the documents.

Design/methodology/approach

One hundred queries on Islamic History and Civilization were formed using two indexing elements (keyword and concept), and a laboratory experiment was run on the first ten items of each ranked list. Within an experimental research design, the relevance status value formula was used to measure the system-computed rank and compare it with mean average precision.

Findings

The results showed that the average status value of Index Islamicus’s ranking on the relevance criterion was 18 per cent effective in terms of retrieving precise documents. Although this study focused on only one subject domain and only 1,000 items were assessed, this low percentage showed that the semantic correlation between queries and the subject domain did not reach a satisfactory level.

Research limitations/implications

This study could serve as a guideline for further research on the ranking mechanisms of other search engines; its limitation is that Index Islamicus was the only database examined.

Practical implications

Through this study, Index Islamicus would benefit from knowing the status of its ranking mechanism, and other databases could follow this study in researching their own ranking methods.

Social implications

Researchers and vendors of online databases can offer their users a reliable search platform with a properly ranked result list.

Originality/value

The study presents a relevance status value model for Index Islamicus on Islamic History and Civilization that allows the system to rank documents according to the match between document and query and gives the idea of a better index. The model improves the system’s ranking mechanism, and promotes the use of semantic relationships. This research promotes the computation of relevance status value by domain for capturing subject-specific relevance criteria and semantic relationships.

Details

International Journal of Web Information Systems, vol. 11 no. 1
Type: Research Article
ISSN: 1744-0084

Article
Publication date: 1 April 1977

S.E. ROBERTSON

Abstract

The principle that, for optimal retrieval, documents should be ranked in order of the probability of relevance or usefulness has been brought into question by Cooper. It is shown that the principle can be justified under certain assumptions, but that in cases where these assumptions do not hold, the principle is not valid. The major problem appears to lie in the way the principle considers each document independently of the rest. The nature of the information on the basis of which the system decides whether or not to retrieve the documents determines whether the document‐by‐document approach is valid.
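The document-by-document ranking that this abstract discusses can be illustrated with a minimal Python sketch. It assumes each document already has an estimated probability of relevance; the function names and the example values are hypothetical and do not come from the paper.

```python
def rank_by_probability(doc_probs):
    """Order document ids by decreasing estimated probability of relevance.

    doc_probs maps a document id to an estimated probability (assumed given).
    Each document is scored on its own, independently of the others, which is
    exactly the assumption the abstract identifies as the source of difficulty.
    """
    return sorted(doc_probs, key=lambda doc_id: doc_probs[doc_id], reverse=True)


def expected_relevant_at_cutoff(doc_probs, ranking, k):
    """Expected number of relevant documents among the top k of a ranking."""
    return sum(doc_probs[doc_id] for doc_id in ranking[:k])


# Hypothetical probabilities for three documents.
probs = {"d1": 0.2, "d2": 0.9, "d3": 0.5}
ranking = rank_by_probability(probs)
top2 = expected_relevant_at_cutoff(probs, ranking, 2)
```

Under the independence assumption, no other ordering gives a higher expected count of relevant documents at any cutoff; when the usefulness of one document depends on the others retrieved, that guarantee no longer applies.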

Details

Journal of Documentation, vol. 33 no. 4
Type: Research Article
ISSN: 0022-0418

Article
Publication date: 1 April 2004

Christopher S.G. Khoo and Kwok‐Wai Wan

Abstract

A relevancy‐ranking algorithm for a natural language interface to Boolean online public access catalogs (OPACs) was formulated and compared with that currently used in a knowledge‐based search interface called the E‐Referencer, being developed by the authors. The algorithm makes use of seven well‐known ranking criteria: breadth of match, section weighting, proximity of query words, variant word forms (stemming), document frequency, term frequency and document length. The algorithm converts a natural language query into a series of increasingly broader Boolean search statements. In a small experiment with ten subjects in which the algorithm was simulated by hand, the algorithm obtained good results with a mean overall precision of 0.42 and mean average precision of 0.62, representing a 27 percent improvement in precision and 41 percent improvement in average precision compared to the E‐Referencer. The usefulness of each step in the algorithm was analyzed and suggestions are made for improving the algorithm.
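The precision and mean average precision figures quoted above can be made concrete with a short Python sketch. It uses the common textbook definitions over binary relevance judgments; the abstract does not state the exact variants used in the experiment, so the functions and data below are illustrative assumptions.

```python
def precision_at_k(ranked_relevance, k):
    """Fraction of the top k results that are relevant (1 = relevant, 0 = not)."""
    return sum(ranked_relevance[:k]) / k


def average_precision(ranked_relevance):
    """Mean of the precision values taken at each rank where a relevant item appears."""
    hits = 0
    total = 0.0
    for rank, rel in enumerate(ranked_relevance, start=1):
        if rel:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


def mean_average_precision(runs):
    """Average of average_precision over several queries."""
    return sum(average_precision(run) for run in runs) / len(runs)


# Hypothetical judgments for two queries.
runs = [[1, 0, 1, 0], [0, 1]]
map_score = mean_average_precision(runs)
```

Here the first query scores (1/1 + 2/3) / 2 and the second 1/2, so average precision rewards placing relevant items early, whereas plain precision over the whole list does not.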

Details

The Electronic Library, vol. 22 no. 2
Type: Research Article
ISSN: 0264-0473

Article
Publication date: 30 August 2023

Yi-Hung Liu, Sheng-Fong Chen and Dan-Wei (Marian) Wen

Abstract

Purpose

Online medical repositories provide a platform for users to share information and dynamically access abundant electronic health data. It is important to determine whether case report information can assist the general public in appropriately managing their diseases. Therefore, this paper aims to introduce a novel deep learning-based method that allows non-professionals to make inquiries using ordinary vocabulary, retrieving the most relevant case reports for accurate and effective health information.

Design/methodology/approach

The dataset of case reports was collected from both the patient-generated research network and the digital medical journal repository. To enhance the accuracy of obtaining relevant case reports, the authors propose a retrieval approach that combines BERT and BiLSTM methods. The authors identified representative health-related case reports and analyzed the retrieval performance, as well as user judgments.

Findings

This study aims to provide the necessary functionalities to deliver relevant health case reports based on input from ordinary terms. The proposed framework includes features for health management, user feedback acquisition and ranking by weights to obtain the most pertinent case reports.

Originality/value

This study contributes to health information systems by analyzing patients' experiences and treatments with the case report retrieval model. The results of this study can provide immense benefit to the general public who intend to find treatment decisions and experiences from relevant case reports.

Details

Aslib Journal of Information Management, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 2050-3806

Article
Publication date: 14 October 2020

Haihua Chen, Yunhan Yang, Wei Lu and Jiangping Chen

Abstract

Purpose

Citation contexts have been found useful in many scenarios. However, existing context-based recommendation approaches have ignored the importance of diversity in reducing redundancy and thus cannot cover the broad range of user interests. To address this gap, the paper proposes a novel task: recommending a set of diverse citation contexts extracted from a list of citing articles. This will assist users in understanding how other scholars have cited an article and deciding which articles they should cite in their own writing.

Design/methodology/approach

This research combines three semantic distance algorithms and three diversification re-ranking algorithms for the diversifying recommendation based on the CiteSeerX data set and then evaluates the generated citation context lists by applying a user case study on 30 articles.

Findings

Results show that a diversification strategy that combined “word2vec” and “Integer Linear Programming” leads to better reading experience for participants than other diversification strategies, such as CiteSeerX using a list sorted by citation counts.

Practical implications

This diversifying recommendation task is valuable for developing better systems in information retrieval, automatic academic recommendations and summarization.

Originality/value

The originality of the research lies in the proposal of a novel task that can recommend a diversified context list describing how other scholars cited an article, thereby making citing decisions easier. A novel mixed approach is explored to generate the most efficient diversifying strategy. In addition, rather than traditional information retrieval evaluation, a user evaluation framework is introduced to reflect user information needs more objectively.

Article
Publication date: 1 April 1996

ALEXANDER M. ROBERTSON and PETER WILLETT

Abstract

This paper describes the development of a genetic algorithm (GA) for the assignment of weights to query terms in a ranked‐output document retrieval system. The GA involves a fitness function that is based on full relevance information, and the rankings resulting from the use of these weights are compared with the Robertson‐Sparck Jones F4 retrospective relevance weight. Extended experiments with seven document test collections show that the GA can often find weights that are slightly superior to those produced by the deterministic weighting scheme. That said, there are many cases where the two approaches give the same results, and a few cases where the F4 weights are superior to the GA weights. Since the GA has been designed to identify weights yielding the best possible level of retrospective performance, these results indicate that the F4 weights provide an excellent and practicable alternative. Evidence is presented to suggest that negative weights may play an important role in retrospective relevance weighting.
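For reference, the F4 weight named above is usually written with a 0.5 correction in each cell of the term/relevance contingency table. The Python sketch below follows that standard form; the parameter names and the example counts are illustrative assumptions, and the logarithm base (natural log here) only rescales the weights.

```python
import math


def f4_weight(r, n, R, N):
    """Robertson-Sparck Jones F4 relevance weight for one query term.

    r: relevant documents containing the term
    n: documents containing the term
    R: relevant documents for the query
    N: documents in the collection
    """
    numerator = (r + 0.5) * (N - n - R + r + 0.5)
    denominator = (n - r + 0.5) * (R - r + 0.5)
    return math.log(numerator / denominator)


# Hypothetical counts: a term concentrated in the relevant set,
# and a common term that appears in no relevant document.
good_term = f4_weight(r=8, n=20, R=10, N=1000)
poor_term = f4_weight(r=0, n=500, R=10, N=1000)
```

A term that is frequent in the collection but absent from the relevant documents receives a negative weight, which is the kind of weight the abstract's final sentence refers to.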

Details

Journal of Documentation, vol. 52 no. 4
Type: Research Article
ISSN: 0022-0418

Article
Publication date: 1 May 1994

E. Michael Keen

Abstract

This article reports an attempt to understand how a new non‐Boolean ranked output online search facility works. In December 1993, Data‐Star Dialog released a relevance ranking tool known as TARGET (Dialog 1993b). There is considerable research activity into text retrieval using ranking methods, as instanced by the TREC experiments (Harman 1993), and professional online searchers may wish to know exactly what the ranking algorithm does in order to be able to exploit the facility to best advantage. Though it has not been possible to ‘crack’ the algorithm to the level of calculating its match to the nearest per cent, it is possible to see three or four factors at work in the way TARGET produces items ranked in decreasing order of match with a query. It is emphasised that the analysis presented here is based only on records from one bibliographic database. A more extensive and comparative study of bibliographic and full‐text databases would be needed to provide universal and definitive findings.

Details

Online and CD-Rom Review, vol. 18 no. 5
Type: Research Article
ISSN: 1353-2642

Article
Publication date: 12 April 2013

Jin Zhang, Wei Fei and Taowen Le

Abstract

Purpose

The purpose of this paper is to investigate the effectiveness of selected search features in the major English and Chinese search engines and compare the search engines’ retrieval effectiveness.

Design/approach/methodology

The search engines Google, Google China, and Baidu were selected for this study. Common search features such as title search, basic search, exact phrase search, PDF search, and URL search, were identified and used. Search results from using the five features in the search engines were collected and compared. One‐way ANOVA and regression analysis were used to compare the retrieval effectiveness of the search engines.

Findings

It was found that Google achieved the best retrieval performance with all five search features among the three search engines. Moreover, Google achieved the best webpage ranking performance.

Practical implications

The findings of this study improve the understanding of English and Chinese search engines and the differences between them in terms of search features, and can be used to assist users in choosing appropriate and effective search strategies when they search for information on the internet.

Originality/value

The original contribution of this paper is its comparison of the retrieval effectiveness of search engines in both Chinese and English. Five search features were evaluated, compared, and analysed in the two different language environments by using the discounted cumulative gain method.
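The discounted cumulative gain method mentioned above can be sketched in a few lines of Python. This uses a widely used formulation with a log2(rank + 1) discount; the abstract does not say which variant or gain scale the authors applied, so the function names and example gains are assumptions.

```python
import math


def dcg(gains, k=None):
    """Discounted cumulative gain of graded relevance scores listed in rank order."""
    gains = gains[:k] if k is not None else gains
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains, start=1))


def ndcg(gains, k=None):
    """DCG divided by the DCG of the ideal (descending) ordering of the same gains."""
    ideal = dcg(sorted(gains, reverse=True), k)
    return dcg(gains, k) / ideal if ideal > 0 else 0.0


# Hypothetical graded judgments (0 = not relevant, 3 = highly relevant).
result_gains = [3, 2, 0, 1]
score = dcg(result_gains)
normalised = ndcg(result_gains)
```

Because the discount grows with rank, a highly relevant page contributes less the further down the list it appears, which is what makes the measure suitable for comparing ranked output across engines.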

Details

Online Information Review, vol. 37 no. 2
Type: Research Article
ISSN: 1468-4527

Book part
Publication date: 10 February 2012

Kerstin Denecke

Abstract

Purpose — For some years now, we have been confronted with the phenomenon of information overload. In particular, the web provides a rich source of a variety of information, mainly in textual, i.e. unstructured, form. Thus, web search faces new challenges: how to make the user aware of the variety of content available and how best to satisfy users with such manifold content.

Methodology — This variety of content is considered as diversity, i.e. the reflection of a result set's coverage of multiple interpretations of a query. On the one hand, diversification within web search aims at adapting the ranking so that the top results are diverse. On the other hand, the organization and classification of content within diversification is becoming increasingly important.

Findings — Various approaches to diversification are available or are the focus of current research. They range from an adapted ranking by means of similarity measures or diversity scores to a comprehensive diversity analysis which determines topics and classifies text according to opinions etc.

Implications — Given the high diversity of web content, approaches for diversification are extremely important. Web search tries to address this problem from different perspectives. For the future, combination with image search result diversification is important. Further, benchmarks and standard data sets for evaluations need to be established to ensure comparability of results from various approaches.

Originality/value — This chapter provides an overview of diversity in web search from two directions: (a) Diversity is introduced with its notions and dimensions. (b) Methods to assess diversity within web search are presented.
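One familiar way to adapt a ranking with a similarity measure so that the top results are diverse is maximal marginal relevance (MMR). The Python sketch below is offered only as an illustration of that general idea, not as a method taken from the chapter; the relevance scores, the similarity function and the trade-off parameter `lam` are all hypothetical.

```python
def mmr_rerank(relevance, similarity, k, lam=0.7):
    """Greedy maximal marginal relevance re-ranking.

    relevance: dict mapping item id to a relevance score
    similarity: function (id, id) -> similarity in [0, 1]
    k: number of items to select
    lam: 1.0 ranks purely by relevance; lower values penalise redundancy more
    """
    selected = []
    candidates = set(relevance)
    while candidates and len(selected) < k:
        def score(item):
            # Redundancy is the highest similarity to anything already chosen.
            redundancy = max((similarity(item, s) for s in selected), default=0.0)
            return lam * relevance[item] - (1 - lam) * redundancy

        # Sorting the candidates makes tie-breaking deterministic.
        best = max(sorted(candidates), key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Hypothetical data: "a" and "b" are near-duplicates, "c" covers another reading.
scores = {"a": 0.9, "b": 0.85, "c": 0.5}


def sim(x, y):
    return 0.95 if {x, y} == {"a", "b"} else 0.1


diverse = mmr_rerank(scores, sim, k=2, lam=0.5)
plain = mmr_rerank(scores, sim, k=2, lam=1.0)
```

With `lam=0.5` the near-duplicate "b" is passed over in favour of "c", while `lam=1.0` reproduces the plain relevance order.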

Details

Web Search Engine Research
Type: Book
ISBN: 978-1-78052-636-2
