Search results

1 – 10 of over 140000
Article
Publication date: 1 August 2006

Amanda Spink, Bernard J. Jansen, Vinish Kathuria and Sherry Koshman

Abstract

Purpose

This paper reports the findings of a major study examining the overlap among results retrieved by three major web search engines. The goals of the research were to: measure the overlap (i.e. shared results) on the first results page across three major web search engines, and the differences across a wide range of user‐defined search terms; determine the differences in the first page of search results and their rankings (each web search engine's view of the most relevant content) across single‐source web search engines, including both sponsored and non‐sponsored results; and measure the degree to which a meta‐search web engine, such as Dogpile.com, provides searchers with the most highly ranked search results from three major single‐source web search engines.

Design/methodology/approach

The authors collected 10,316 random Dogpile.com queries. For a given query, the URL of each result for each engine was retrieved from the database, and an overlap algorithm was run on these URLs. The first‐results‐page overlap for each query was then summarized across all 10,316 queries to determine the overall overlap metrics.
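
The overlap computation described here can be sketched as follows. This is a minimal Python illustration, not the authors' implementation: the function names, the input layout (one dictionary per query, mapping an engine name to its first‐page URLs) and the exact‐string URL matching are all assumptions.

```python
from collections import Counter


def overlap_breakdown(results_by_engine):
    """Count distinct URLs by how many engines returned them for one query.

    results_by_engine: dict mapping engine name -> list of first-page URLs.
    Returns a Counter mapping number_of_engines -> number_of_distinct_URLs.
    URLs are compared as exact strings (an assumption; no normalization).
    """
    engines_per_url = Counter()
    for urls in results_by_engine.values():
        for url in set(urls):  # count each URL at most once per engine
            engines_per_url[url] += 1
    return Counter(engines_per_url.values())


def summarize_overlap(queries):
    """Aggregate the per-query breakdowns and return shares that sum to 1."""
    totals = Counter()
    for results_by_engine in queries:
        totals += overlap_breakdown(results_by_engine)
    n_results = sum(totals.values())
    if n_results == 0:
        return {}
    return {k: totals[k] / n_results for k in sorted(totals)}


# Hypothetical toy data: one query, three engines labelled A, B and C.
toy_queries = [
    {"A": ["u1", "u2"], "B": ["u2", "u3"], "C": ["u2", "u4"]},
]
shares = summarize_overlap(toy_queries)
```

In this toy input, three of the four distinct URLs come from a single engine and one comes from all three, so `shares` maps 1 to 0.75 and 3 to 0.25.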

Findings

The percentage of total results retrieved by only one of the three major web search engines was 85 percent, by two web search engines 12 percent, and by all three web search engines 3 percent. This small level of overlap reflects major differences in the web search engines' retrieval and ranking of results.

Research limitations/implications

This study provides an important contribution to the web research literature. The findings point to the value of meta‐search engines in web retrieval to overcome the biases of single search engines.

Practical implications

The results of this research can inform people and organizations that seek to use the web as part of their information seeking efforts, and the design of web search engines.

Originality/value

This research is a large investigation into web search engine overlap using real data from a major web meta‐search engine and single web search engines that sheds light on the uniqueness of top results retrieved by web search engines.

Details

Internet Research, vol. 16 no. 4
Type: Research Article
ISSN: 1066-2243

Article
Publication date: 7 July 2011

Dirk Lewandowski

Abstract

Purpose

The purpose of this paper is to test major web search engines on their performance on navigational queries, i.e. searches for homepages.

Design/methodology/approach

In total, 100 user queries are posed to six search engines (Google, Yahoo!, MSN, Ask, Seekport, and Exalead). Users described the desired pages, and the result positions of these pages were recorded. Measured success and mean reciprocal rank are calculated.
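
The two measures named here can be sketched in Python as follows. This is an illustration under stated assumptions rather than the paper's procedure: success is taken to mean that the desired page appears within a cutoff rank (the cutoff of 10 is hypothetical), and a query whose desired page was not found is recorded as `None` and contributes a reciprocal rank of 0.

```python
def success_rate(ranks, cutoff=10):
    """Share of queries whose desired page appears at or above `cutoff`.

    ranks: list with one entry per query; the 1-based position of the
    desired page, or None if it was not found (assumed encoding).
    """
    if not ranks:
        return 0.0
    hits = sum(1 for r in ranks if r is not None and r <= cutoff)
    return hits / len(ranks)


def mean_reciprocal_rank(ranks):
    """Average of 1/rank over all queries; unfound pages count as 0."""
    if not ranks:
        return 0.0
    total = sum(1.0 / r for r in ranks if r is not None)
    return total / len(ranks)


# Hypothetical positions of the desired homepage for four queries.
example_ranks = [1, 2, None, 1]
mrr = mean_reciprocal_rank(example_ranks)
top1 = success_rate(example_ranks, cutoff=1)
```

For the four hypothetical queries, the reciprocal ranks are 1, 0.5, 0 and 1, giving a mean of 0.625, and two of the four queries succeed at rank 1.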

Findings

The performance of the major search engines Google, Yahoo!, and MSN was found to be the best, with around 90 per cent of queries answered correctly. Ask and Exalead performed worse but received good scores as well.

Research limitations/implications

All queries were in German, and the German‐language interfaces of the search engines were used. Therefore, the results are only valid for German queries.

Practical implications

When designing a search engine to compete with the major search engines, attention should be paid to its performance on navigational queries. Users' quality ratings of search engines can easily be influenced by this performance.

Originality/value

This study systematically compares the major search engines on navigational queries and compares the findings with studies on the retrieval effectiveness of the engines on informational queries.

Article
Publication date: 20 February 2007

Mary L. Robinson and Judith Wusteman

Abstract

Purpose

To describe a small‐scale quantitative evaluation of the scholarly information search engine, Google Scholar.

Design/methodology/approach

Google Scholar's ability to retrieve scholarly information was compared to that of three popular search engines: Ask.com, Google and Yahoo!. Test queries were presented to all four search engines and the following measures were used to compare them: precision; Vaughan's Quality of Result Ranking; relative recall; and Vaughan's Ability to Retrieve Top Ranked Pages.
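
Two of these measures, precision and relative recall, can be sketched in Python as follows. The sketch assumes the common pooled definition of relative recall (relevant documents an engine retrieved, divided by the relevant documents retrieved by any of the compared engines); the abstract does not spell out its exact formula, and the function names and data are hypothetical. Vaughan's two measures are not reproduced here.

```python
def precision(retrieved, relevant):
    """Fraction of an engine's retrieved documents judged relevant."""
    if not retrieved:
        return 0.0
    return sum(1 for doc in retrieved if doc in relevant) / len(retrieved)


def relative_recall(retrieved_by_engine, relevant):
    """Relative recall per engine against the pooled relevant documents.

    The pool is every relevant document found by at least one engine,
    which stands in for the unknown total number of relevant documents.
    """
    pool = set()
    for docs in retrieved_by_engine.values():
        pool |= set(docs) & relevant
    if not pool:
        return {engine: 0.0 for engine in retrieved_by_engine}
    return {
        engine: len(set(docs) & relevant) / len(pool)
        for engine, docs in retrieved_by_engine.items()
    }


# Hypothetical judgments: document "e" is relevant but no engine found it,
# so it does not enter the pool.
relevant_docs = {"a", "c", "d", "e"}
runs = {"X": ["a", "b", "c", "d"], "Y": ["c"]}
recalls = relative_recall(runs, relevant_docs)
```

Because the pool only contains relevant documents that some engine retrieved, relative recall compares engines with each other and is not an estimate of absolute recall.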

Findings

Significant differences were found in the ability to retrieve top ranked pages between Ask.com and Google and between Ask.com and Google Scholar for scientific queries. No other significant differences were found between the search engines. This may be due to the relatively small sample size of eight queries. Results suggest that, for scientific queries, Google Scholar has the highest precision, relative recall and Ability to Retrieve Top Ranked Pages. However, it achieved the lowest score for these three measures for non‐scientific queries. The best overall score for all four measures was achieved by Google. Vaughan's Quality of Result Ranking found a significant correlation between Google and scientific queries.

Research limitations/implications

As with any search engine evaluation, the results pertain only to performance at the time of the study and must be considered in light of any subsequent changes in the search engine's configuration or functioning. Also, the relatively small sample size limits the scope of the study's findings.

Practical implications

These results suggest that, although Google Scholar may prove useful to those in scientific disciplines, further development is necessary if it is to be useful to the scholarly community in general.

Originality/value

This is a preliminary study in applying the accepted performance measures of precision and recall to Google Scholar. It provides information specialists and users with an objective evaluation of Google Scholar's abilities across both scientific and non‐scientific disciplines and paves the way for a larger study.

Details

Program, vol. 41 no. 1
Type: Research Article
ISSN: 0033-0337

Article
Publication date: 1 December 2005

Seda Ozmutlu

Abstract

Purpose

The purpose of this study is to investigate whether question‐format and keyword‐format queries are processed more successfully by search engines that encourage question‐answering searches and keyword‐format querying, respectively. The study also investigates whether web user characteristics and choice of search engine affect the relevancy scores and precision of the results.

Design/methodology/approach

The results of two search engines, Google and AskJeeves, were compared for question and keyword‐format queries. It was observed that AskJeeves was slightly more successful in processing question‐format queries, but this finding was not statistically supported. However, Google provided results on keyword‐format queries and the entire set of queries, which were statistically superior to those of AskJeeves.

Findings

Analysis of variance (ANOVA) showed that the age of the web user does not influence the relevancy score and precision of results as strongly as other factors. Interactions of the main factors also influenced the relevancy scores and precision, meaning that different combinations of factors act synergistically on relevancy scores and precision.

Research limitations/implications

This was a preliminary work on the effect of user characteristics on comprehension and evaluation of search query results. Future work includes expanding this study to include more web user characteristics, more levels of the web user characteristics, and inclusion of more search engines.

Originality/value

The findings of this study provide statistical proof for the relationship between the characteristics of web users, choice of search engine and the relevancy scores and precision of search results.

Details

Online Information Review, vol. 29 no. 6
Type: Research Article
ISSN: 1468-4527

Article
Publication date: 29 November 2011

Judit Bar‐Ilan and Mark Levene

Abstract

Purpose

The aim of this paper is to develop a methodology for assessing search results retrieved from different sources.

Design/methodology/approach

This is a two‐phase method. In the first stage, users select and rank the ten best search results from a randomly ordered set. In the second stage, they are asked to choose the best pre‐ranked result list from a set of possibilities. This two‐stage method allows users to consider each search result separately (in the first stage) and to express their views on the rankings as a whole, as they were retrieved by the search provider. The method was tested in a user study that compared different country‐specific search results of Google and Live Search (now Bing). The users were Israelis and the search results came from six sources: Google Israel, Google.com, Google UK, Live Search Israel, Live Search US and Live Search UK. The users evaluated the results of nine pre‐selected queries, created their own preferred ranking and picked the best ranking from the six sources.

Findings

The results indicate that the group of users in this study preferred their local Google interface, i.e. Google succeeded in its country‐specific customisation of search results. Live Search was much less successful in this aspect.

Research limitations/implications

Search engines are highly dynamic, thus the findings of the case study have to be viewed cautiously.

Originality/value

The main contribution of the paper is a two‐phase methodology for comparing and evaluating search results from different sources.

Details

Online Information Review, vol. 35 no. 6
Type: Research Article
ISSN: 1468-4527

Article
Publication date: 30 November 2010

Hsiao‐Tieh Pu

Abstract

Purpose

Clustering web search results into dynamic clusters and hierarchies provides a promising way to alleviate the overabundance of information typically found in ranked list search engines. This study seeks to investigate the usefulness of clustering textual results in web search by analysing the search performance and users' satisfaction levels with and without the aid of clusters and hierarchies.

Design/methodology/approach

This study utilises two evaluation metrics. One is a usability test of clustering interfaces measured by users' search performances; the other is a comprehension test measured by users' satisfaction levels. Various methods were used to support the two tests, including experiments, observations, questionnaires, interviews, and search log analysis.

Findings

The results showed that there was no significant difference between the ranked list and clustering interfaces, although participants searched slightly faster, retrieved a larger number of relevant pages, and were more satisfied when using the ranked list interface without clustering. Even so, the clustering interface offers opportunities for diversified searching. Moreover, the repetitive ratio of relevant results found by each participant was low. Other advantages of the clustering interface are that it highlights important concepts and offers richer contexts for exploring, learning and discovering related concepts; however, it may induce a certain amount of anxiety about missing or losing important information.

Originality/value

The evaluation of a clustering interface is rather difficult, particularly in the context of the web search environment, which is used by a large heterogeneous user population for a wide variety of tasks. The study employed multiple data collection methods and in particular designed a combination of usability and comprehension tests to offer preliminary results on users' evaluation of real‐world clustering search interfaces. The results may extend the understanding of search characteristics with a cluster‐based web search engine, and could be used as a vehicle for further discussion of user evaluation research into this area.

Details

Online Information Review, vol. 34 no. 6
Type: Research Article
ISSN: 1468-4527

Article
Publication date: 16 July 2021

Young Man Ko, Min Sun Song and Seung Jun Lee

Abstract

Purpose

This study aims to develop metadata of conceptual elements based on the text structure of research articles on Korean studies, to propose a search algorithm that reflects the combination of semantically relevant data in accordance with the search intention for research papers, and to examine whether the algorithm produces a difference in the intention-based search results.

Design/methodology/approach

This study constructed a metadata database of 5,007 research articles on Korean studies arranged by conceptual elements of text structure and developed an F1(w)-score, weighted by conceptual element, based on the F1-score and the number of data points from each element. This study evaluated the algorithm by comparing search results of the F1(w)-score algorithm with those of the Term Frequency-Inverse Document Frequency (TF-IDF) algorithm and simple keyword search.
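
For orientation, the two standard building blocks named here, the F1-score and TF-IDF, can be sketched in Python as follows. This is not the F1(w)-score, whose weighting scheme the abstract does not define; the TF-IDF variant (term frequency normalized by document length, natural-log inverse document frequency), the function names and the toy documents are all assumptions.

```python
import math
from collections import Counter


def f1_score(precision, recall):
    """Standard (unweighted) F1: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def tf_idf_scores(query_terms, documents):
    """Score each tokenized document against the query with basic TF-IDF.

    documents: list of token lists. Returns one score per document,
    summing tf * idf over the query terms.
    """
    n_docs = len(documents)
    doc_freq = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))  # document frequency, not raw counts

    scores = []
    for tokens in documents:
        counts = Counter(tokens)
        score = 0.0
        for term in query_terms:
            if not tokens or doc_freq[term] == 0:
                continue  # term absent from the collection or empty document
            tf = counts[term] / len(tokens)
            idf = math.log(n_docs / doc_freq[term])
            score += tf * idf
        scores.append(score)
    return scores


# Hypothetical three-document collection.
docs = [
    ["korean", "history", "history"],
    ["korean", "art"],
    ["economy"],
]
history_scores = tf_idf_scores(["history"], docs)
korean_scores = tf_idf_scores(["korean"], docs)
```

A baseline like this ranks documents purely by term statistics, which is the kind of word-similarity search the study compares its element-weighted score against.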

Findings

The authors find that the higher the F1(w)-score, the closer the semantic relevance of search intention. Furthermore, F1(w)-score generated search results were more closely related to the search intention than those of TF-IDF and simple keyword search.

Research limitations/implications

Even though the F1(w)-score was developed in this study to evaluate the search results of a metadata database structured by conceptual elements of the text structure of Korean studies articles, the algorithm can be used as a tool for searching the database, although a process of tuning the weights is required.

Practical implications

A metadata database based on text structure and a search method based on weights of metadata elements – F1(w)-score – can be useful for interdisciplinary studies, especially for semantic search in regional studies.

Originality/value

This paper presents a methodology for supporting IR using F1(w)-score—a novel model for weighting metadata elements based on text structure. The F1(w)-score-based search results show the combination of semantically relevant data, which are otherwise difficult to search for using similarity of search words.

Details

The Electronic Library, vol. 39 no. 5
Type: Research Article
ISSN: 0264-0473

Article
Publication date: 1 March 1999

Jason Vaughan

Abstract

In an effort to better speculate whether a certain set of factors plays a role in information professionals’ choice of Internet search tools, this article describes a survey conducted by the author of MSLS/MSIS graduate students and professional librarians at the University of North Carolina at Chapel Hill. Background discussion on Internet search tool design, usability, field testing, and future development is provided. Two sets of factors were defined for this study, one describing utility functions of search tools, the other describing the convenience or ease of use of search tools. The survey reveals a trend in choosing a preferred Internet search tool based on utility factors as opposed to convenience factors. It also suggests a preference for search engines as opposed to subject catalogs. Comprehensive, encompassing results are found to be more important than ease of use of a particular search tool.

Details

Library Hi Tech, vol. 17 no. 1
Type: Research Article
ISSN: 0737-8831

Details

Automated Information Retrieval: Theory and Methods
Type: Book
ISBN: 978-0-12266-170-9

Article
Publication date: 10 January 2024

Artur Strzelecki and Andrej Miklosik

Abstract

Purpose

The landscape of search engine usage has evolved since the last known data were used to calculate click-through rate (CTR) values. The objective was to provide a replicable method for accessing data from the Google search engine using programmatic access and calculating CTR values from the retrieved data to show how the CTRs have changed since the last studies were published.

Design/methodology/approach

In this study, the authors present the estimated CTR values in organic search results based on actual clicks and impressions data, and establish a protocol for collecting this data using Google programmatic access. For this study, the authors collected data on 416,386 clicks, 31,648,226 impressions and 8,861,416 daily queries.
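
The underlying calculation, CTR as clicks divided by impressions, can be sketched in Python as follows. The row layout `(position, clicks, impressions)` and the function names are hypothetical and do not reflect the authors' collection protocol; the aggregate figure computed at the end is simple arithmetic on the totals quoted above, not a value reported in the abstract.

```python
def click_through_rate(clicks, impressions):
    """CTR as a fraction: clicks divided by impressions."""
    if impressions == 0:
        return 0.0
    return clicks / impressions


def ctr_by_position(rows):
    """Aggregate clicks and impressions per ranking position, then divide.

    rows: iterable of (position, clicks, impressions) tuples
    (a hypothetical layout for exported search performance data).
    """
    totals = {}
    for position, clicks, impressions in rows:
        c, i = totals.get(position, (0, 0))
        totals[position] = (c + clicks, i + impressions)
    return {
        position: click_through_rate(c, i)
        for position, (c, i) in sorted(totals.items())
    }


# Hypothetical rows: two observations at position 1, one at position 2.
sample = ctr_by_position([(1, 10, 100), (1, 5, 100), (2, 1, 50)])

# Aggregate CTR across all positions from the totals quoted in the abstract.
overall_ctr = click_through_rate(416_386, 31_648_226)
overall_pct = round(overall_ctr * 100, 2)
```

Summing clicks and impressions before dividing weights each position by its traffic; averaging per-row CTRs instead would give low-impression rows the same influence as high-impression ones.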

Findings

The results show that CTRs have decreased from previously reported values in both academic research and industry benchmarks. The estimates indicate that the top-ranked result in Google's organic search results has a CTR of 9.28%, followed by 5.82% and 3.11% for positions two and three, respectively. The authors also demonstrate that CTRs vary across types of devices. On desktop devices, the CTR decreases steadily with each lower ranking position. On smartphones, the CTR starts high but decreases rapidly, with an unprecedented increase from position 13 onwards. Tablets have the lowest and most variable CTR values.

Practical implications

The theoretical implications include the generation of a current dataset on search engine results and user behavior, made available to the research community, creation of a unique methodology for generating new datasets and presenting the updated information on CTR trends. The managerial implications include the establishment of the need for businesses to focus on optimizing other forms of Google search results in addition to organic text results, and the possibility of application of this study's methodology to determine CTRs for their own websites.

Originality/value

This study provides a novel method to access real CTR data and estimates current CTRs for top organic Google search results, categorized by device.

Details

Aslib Journal of Information Management, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 2050-3806
