Search results

1 – 10 of over 1000

Article

Publication date: 19 October 2010

Classifying the user intent of web queries using k‐means clustering

Ashish Kathuria, Bernard J. Jansen, Carolyn Hafernik and Amanda Spink

Abstract

Purpose

Web search engines are frequently used by people to locate information on the Internet. However, not all queries have an informational goal. Instead of information, some people may be looking for specific web sites or may wish to conduct transactions with web services. This paper focuses on automatically classifying the different user intents behind web queries.

Design/methodology/approach

For the research reported in this paper, 130,000 web search engine queries are categorized as informational, navigational, or transactional using a k‐means clustering approach based on a variety of query traits.
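As a rough illustration of the approach, the sketch below clusters queries with a plain k-means loop. The abstract does not list the query traits used, so the three features in `extract_features` (term count, URL-like token, transactional cue word) and the cue words themselves are hypothetical stand-ins, not the authors' feature set.

```python
import math
import random

def extract_features(query):
    """Hypothetical query traits: term count, URL-like token, transactional cue."""
    terms = query.lower().split()
    has_url = any(t.startswith("www.") or t.endswith((".com", ".org", ".net"))
                  for t in terms)
    has_action = any(t in {"download", "buy", "order"} for t in terms)
    return [float(len(terms)), float(has_url), float(has_action)]

def kmeans(points, k, iterations=20, seed=0):
    """Plain k-means: assign each point to its nearest centroid, then re-average."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: index of the closest centroid (Euclidean distance).
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # leave an empty cluster's centroid where it is
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

if __name__ == "__main__":
    queries = ["www.example.com", "buy running shoes", "history of the printing press",
               "example.org login", "download pdf reader", "how do volcanoes form"]
    labels, _ = kmeans([extract_features(q) for q in queries], k=3)
    for query, label in zip(queries, labels):
        print(label, query)
```

The clusters carry no intent labels of their own; each would still have to be inspected and named as primarily informational, navigational, or transactional.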

Findings

The research findings show that more than 75 percent of web queries (clustered into eight classifications) are informational in nature, with about 12 percent each for navigational and transactional. Results also show that web queries fall into eight clusters, six primarily informational, and one each of primarily transactional and navigational.

Research limitations/implications

This study provides an important contribution to web search literature because it provides information about the goals of searchers and a method for automatically classifying the intents of the user queries. Automatic classification of user intent can lead to improved web search engines by tailoring results to specific user needs.

Practical implications

The paper discusses how web search engines can use automatically classified user queries to provide more targeted and relevant results in web searching by implementing a real time classification method as presented in this research.

Originality/value

This research investigates a new application of a method for automatically classifying the intent of user queries. There has been limited research to date on automatically classifying the user intent of web queries, even though the pay‐off for web search engines can be quite beneficial.

Details

Internet Research, vol. 20 no. 5

Type: Research Article

ISSN: 1066-2243

Article

Publication date: 8 August 2016

Search engine effectiveness using query classification: a study

Sabha Ali and Sumeer Gul

Abstract

Purpose

The purpose of this paper is to highlight the retrieval effectiveness of search engines taking into consideration both precision and relative recall.

Design/methodology/approach

The study is based on search engines selected on the basis of Alexa (Actionable Analytics for the Web) rank. Alexa lists the top 500 sites, including search engines, portals, directories, social networking sites, and networking tools. The scope of the study is confined to general search engines with English-language interfaces, so only two were selected: Alexa reports Google.com as the most visited website worldwide and Yahoo.com as the fourth most visited. A total of 15 queries were selected randomly from PG students of the Department of Library and Information Science over a period of eight days (May 8 to May 15, 2014) and classified manually into navigational, informational and transactional queries. The queries were distributed across the two selected search engines to check their retrieval effectiveness, serving as a training data set to define some characteristics of each type. Each query was submitted to the selected search engines, which retrieved a large number of results, but only the first 30 results were evaluated, since most users look only at the first hits for a query.

Findings

The study estimated the precision and relative recall of Google and Yahoo. Queries using concepts in the field of Library and Information Science were tested and were divided into navigational, informational and transactional queries. The mean precision of Google (1.10) was higher than that of Yahoo (0.88), and the mean relative recall of Google (0.68) was likewise higher than that of Yahoo (0.31).
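For orientation, the two measures can be sketched as follows. This assumes binary relevance judgments and the common pooled definition of relative recall (an engine's relevant hits divided by the relevant hits of all engines combined). A reported mean precision above 1 suggests the study scored results on a graded scale, which the abstract does not describe, so the code below is not the authors' scoring scheme.

```python
def precision(judgments):
    """Fraction of evaluated results judged relevant (1 = relevant, 0 = not)."""
    return sum(judgments) / len(judgments) if judgments else 0.0

def relative_recall(relevant_by_engine):
    """Each engine's relevant count divided by the pooled count over all engines."""
    total = sum(relevant_by_engine.values())
    return {engine: (count / total if total else 0.0)
            for engine, count in relevant_by_engine.items()}

if __name__ == "__main__":
    # Hypothetical judgments for the first results of one query on one engine.
    print(precision([1, 0, 1, 1]))
    # Hypothetical relevant-result counts for the same query on two engines.
    print(relative_recall({"engine_a": 6, "engine_b": 2}))
```

Because relative recall is normalised by the pooled total, the values for all engines on a given query sum to 1.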

Research limitations/implications

The study highlights the retrieval effectiveness of only two search engines.

Originality/value

The research work is authentic and does not contain any plagiarized work.

Details

Online Information Review, vol. 40 no. 4

Type: Research Article

ISSN: 1468-4527

Article

Publication date: 13 November 2020

Usability effectiveness of a federated search system for electronic theses and dissertations in Nigerian institutional repositories

Sadiat Adetoro Salau, F.P. Abifarin, J.A. Alhassan and S.J. Udoudoh

Abstract

Purpose

The purpose of this study was to evaluate the usability effectiveness of a webware for electronic theses and dissertations (ETDs) in Nigerian repositories. The webware (etdsearch.com.ng) is a web application system that curates ETDs from three sampled Federal government-owned universities. The system also links users to the repositories where the theses and dissertations are hosted.

Design/methodology/approach

The case study research strategy was adopted for the study. Sixty postgraduate students from three universities were randomly selected. A usability evaluation questionnaire based on the ISO 9241-11 framework was used to collect data after the participants performed pre-defined queries/tasks based on the informational and transactional query models. The research questions were analysed using the median of the performance score (f_x) of the three universities for each item evaluated, while the Kruskal–Wallis test by ranks was used to test the null hypothesis at a 5% level of significance.
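The test by ranks named above can be sketched in a few lines. The scores below are invented for illustration, and the p-value shortcut holds only for exactly three groups (two degrees of freedom, where the chi-square survival function reduces to exp(-H/2)); none of this reproduces the study's data or software.

```python
import math
from collections import Counter

def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic with the standard correction for ties."""
    pooled = sorted(value for group in groups for value in group)
    n = len(pooled)
    # Give every distinct value the mean of the rank positions it occupies.
    rank_of = {}
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j
    rank_term = sum(sum(rank_of[v] for v in g) ** 2 / len(g) for g in groups)
    h = 12 / (n * (n + 1)) * rank_term - 3 * (n + 1)
    ties = Counter(pooled)
    correction = 1 - sum(t ** 3 - t for t in ties.values()) / (n ** 3 - n)
    # If every value is tied the statistic is undefined; report 0.0 here.
    return h / correction if correction else 0.0

def p_value_three_groups(h):
    """Chi-square survival function for 2 degrees of freedom (three groups only)."""
    return math.exp(-h / 2)

if __name__ == "__main__":
    # Hypothetical questionnaire scores from three universities.
    scores = [[4, 5, 3, 4], [3, 4, 4, 5], [5, 4, 3, 4]]
    h = kruskal_wallis_h(scores)
    print(h, p_value_three_groups(h))
```

At the 5% level, the null hypothesis of no difference between the groups is retained whenever the p-value is 0.05 or more.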

Findings

The study answered two research questions and tested two null hypotheses on the usability effectiveness of the webware based on the informational and transactional queries. The participants found the ETD search system effectively useable. In addition, there was no significant difference in the opinions of the participants.

Research limitations/implications

The webware used simulated repositories as a feed bed for the ETDs in order to have control over the workability of the repositories. Thus, the results may differ slightly when “live” repositories are used.

Practical implications

The effectiveness of a webware that aggregates ETDs in Nigerian repositories will present libraries in Nigeria with evidence on how these systems work and can be improved upon.

Originality/value

There is a dearth of literature on practical usability studies of digital information systems in Nigerian libraries.

Details

Performance Measurement and Metrics, vol. 22 no. 1

Type: Research Article

ISSN: 1467-8047

Article

Publication date: 20 April 2012

Query classification and study of university students' search trends

Majdi A. Maabreh, Mohammed N. Al‐Kabi and Izzat M. Alsmadi

Abstract

Purpose

This study is an attempt to develop an automatic identification method for Arabic web queries and divide them into several query types using data mining. In addition, it seeks to evaluate the impact of the academic environment on using the internet.

Design/methodology/approach

The web log files were collected from the servers of a higher-education institute over a one‐month period. A special program was designed and implemented to extract web search queries from these files and also to automatically classify Arabic queries into three query types (i.e. Navigational, Transactional, and Informational queries) based on predefined specifications for each type.
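Classification from predefined specifications can be pictured as a small rule cascade. The abstract does not give the actual specifications, and the study worked on Arabic queries, so the English cue lists and the rule order below are purely hypothetical.

```python
# Hypothetical cue lists; the study's real specifications are not given here.
NAVIGATIONAL_CUES = ("www.", ".com", ".org", ".net", ".edu")
TRANSACTIONAL_CUES = {"download", "buy", "watch", "play", "chat"}

def classify_query(query):
    """Return 'navigational', 'transactional', or 'informational' for a query."""
    terms = query.lower().split()
    # Rule 1: a URL-like fragment suggests the user wants a specific site.
    if any(cue in term for term in terms for cue in NAVIGATIONAL_CUES):
        return "navigational"
    # Rule 2: an action word suggests the user wants to do something online.
    if any(term in TRANSACTIONAL_CUES for term in terms):
        return "transactional"
    # Default: everything else is treated as a request for information.
    return "informational"

if __name__ == "__main__":
    for q in ["www.example.edu", "download lecture slides", "history of algebra"]:
        print(q, "->", classify_query(q))
```

Rule order matters in a cascade like this: a query that matches both cue lists is labelled by whichever rule is checked first.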

Findings

The results indicate that students are slowly and gradually using the internet for more relevant academic purposes. Tests showed that it is possible to automatically classify Arabic queries based on query terms, with 80.6 per cent and 80.2 per cent accuracy for the two phases of the test, respectively. In their future strategies, Jordanian universities should apply methods to encourage university students to use the internet for academic purposes. Web search engines in general and Arabic search engines in particular may benefit from the proposed classification method in order to improve the effectiveness and relevancy of their results in accordance with users' needs.

Originality/value

Internet web logs have been the subject of many papers. However, the particular domain and the specific focus of this research distinguish it from the others.

Details

Program, vol. 46 no. 2

Type: Research Article

ISSN: 0033-0337

Article

Publication date: 7 July 2011

The retrieval effectiveness of search engines on navigational queries

Dirk Lewandowski

Abstract

Purpose

The purpose of this paper is to test major web search engines on their performance on navigational queries, i.e. searches for homepages.

Design/methodology/approach

In total, 100 user queries are posed to six search engines (Google, Yahoo!, MSN, Ask, Seekport, and Exalead). Users described the desired pages, and the result positions of these pages were recorded. Measured success and mean reciprocal rank are calculated.
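Both measures follow directly from the recorded positions. In the sketch below, a position is the 1-based rank of the desired page, or `None` if it was not found; the cut-off parameter `k` and the reading of "measured success" as the share of queries answered within the top `k` results are assumptions, not details taken from the paper.

```python
def reciprocal_rank(position):
    """1/rank of the desired page, or 0.0 if the page was not found."""
    return 1.0 / position if position else 0.0

def mean_reciprocal_rank(positions):
    """Average reciprocal rank over all queries."""
    return sum(reciprocal_rank(p) for p in positions) / len(positions)

def success_at(positions, k):
    """Share of queries whose desired page appears within the top k results."""
    hits = sum(1 for p in positions if p is not None and p <= k)
    return hits / len(positions)

if __name__ == "__main__":
    # Hypothetical positions of the desired homepage for four queries.
    positions = [1, 2, None, 4]
    print(mean_reciprocal_rank(positions))
    print(success_at(positions, k=1))
```

Mean reciprocal rank rewards an engine most for placing the desired page first, since the contribution of a query halves when the page drops to second place.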

Findings

The performance of the major search engines Google, Yahoo!, and MSN was found to be the best, with around 90 per cent of queries answered correctly. Ask and Exalead performed worse but received good scores as well.

Research limitations/implications

All queries were in German, and the German‐language interfaces of the search engines were used. Therefore, the results are only valid for German queries.

Practical implications

When designing a search engine to compete with the major search engines, care should be taken over its performance on navigational queries. Users' quality ratings of search engines can easily be influenced by this performance.

Originality/value

This study systematically compares the major search engines on navigational queries and compares the findings with studies on the retrieval effectiveness of the engines on informational queries.

Details

Aslib Proceedings, vol. 63 no. 4

Type: Research Article

ISSN: 0001-253X

Article

Publication date: 16 November 2015

Two's company, but three's no crowd: Evaluating exploratory web search for individuals and teams

Chirag Shah, Chathra Hendahewa and Roberto González-Ibáñez

Abstract

Purpose

The purpose of this paper is to investigate when and how people working in collaboration could be benefitted by an exploratory search task, specifically focussing on team size and its effect on the outcomes of such a task.

Design/methodology/approach

The paper investigates the effects of team sizes on exploratory search tasks using a lab study involving 68 participants – 12 individuals, ten dyads, and 12 triads. In order to assess various factors during their exploratory search sessions, an evaluation framework is synthesized using relevant literature. The framework consists of measures for five groups of quantities relevant to exploratory search: information exposure, information relevancy, information search, performance, and learning.

Findings

The analyses of the user study data using the proposed framework reveal that while individuals working alone cover more information than those working in teams, the teams (dyads and triads) are able to achieve better information coverage and search performance due to their collaborative strategies. In many of the measures, the triads are found to be even better than the dyads, demonstrating the value of adding a collaborator to a search process with multiple facets.

Originality/value

The findings shed light on not only how collaborative work could help in achieving better results in exploratory search, but also how team sizes affect specific aspects – information exposure, information relevancy, information search, performance, and learning – of exploratory search. This has implications for system designers, information managers, and educators.

Details

Aslib Journal of Information Management, vol. 67 no. 6

Type: Research Article

ISSN: 2050-3806

Article

Publication date: 27 April 2022

Large-scale analysis of query logs to profile users for dataset search

Romina Sharifpour, Mingfang Wu and Xiuzhen Zhang

Abstract

Purpose

With an explosion of datasets available on the Web, dataset search has gained attention as an emerging research domain. Understanding users' dataset behaviour is imperative for providing effective data discovery services. In this paper, the authors present a study on users' dataset search behaviour through the analysis of search logs from a research data discovery portal.

Design/methodology/approach

Using query- and session-based features, the authors apply cluster analysis to discover distinct user profiles with different search behaviours. One behavioural construct of particular interest is users' expertise, which the authors estimate by computing the semantic similarity between users' search queries and the titles of metadata records in the displayed search results.

Findings

The findings revealed that there are six distinct classes of user behaviours for dataset search, namely Expert Research, Expert Search, Expert Explore, Novice Research, Novice Search and Novice Explore.

Research limitations/implications

The user profiles are derived based on analysis of the search log of the research data catalogue in this study. Further research is needed to generalise the user profiles to other dataset search settings. Future research can take on a confirmatory approach to verify these user groups and establish a deeper understanding of their information needs.

Practical implications

The findings in this paper have implications for designing search systems that tailor search results matching the diverse information needs of different user groups.

Originality/value

We propose for the first time a taxonomy of users for dataset search based on their domain expertise and search behaviour.

Details

Journal of Documentation, vol. 79 no. 1

Type: Research Article

ISSN: 0022-0418

Article

Publication date: 22 June 2010

Podcast search: user goals and retrieval technologies

Jana Besser, Martha Larson and Katja Hofmann

Abstract

Purpose

This research aims to identify users' goals and strategies when searching for podcasts and their impact on the design of podcast retrieval technology. In particular, the paper seeks to explore the potential to address user goals with indexing based on podcast metadata and automatic speech recognition (ASR) transcripts.

Design/methodology/approach

The paper conducted a user study to obtain an overview of podcast search behaviour and goals, using a multi‐method approach of an online survey, a diary study, and contextual interviews. In a subsequent podcast retrieval experiment, the paper investigated the retrieval performance of the two choices of indexing features for search goals identified during the study.

Findings

The paper found that study participants used a variety of search strategies, partially influenced by available tools and their perceptions of these tools. Furthermore the experimental results revealed that retrieval using ASR transcripts performed significantly better than metadata‐based searching. However, a detailed result analysis suggested that the efficacy of the indexing methods was search‐goal dependent.

Research limitations/implications

The research constitutes a step towards a future framework for investigating user needs and addressing them in an experimental set‐up. It was primarily qualitative and exploratory in nature.

Practical implications

Podcast search engines require evidence about suitable indexing methods in order to make an informed decision concerning whether it is worth the resources to generate speech recognition transcripts.

Originality/value

Systematic studies of podcast searching have not previously been reported. Investigations of this kind hold the potential to optimise podcast retrieval in the long term.

Details

Online Information Review, vol. 34 no. 3

Type: Research Article

ISSN: 1468-4527

Article

Publication date: 17 October 2008

The retrieval effectiveness of web search engines: considering results descriptions

Dirk Lewandowski

Abstract

Purpose

The purpose of this paper is to compare five major web search engines (Google, Yahoo, MSN, Ask.com, and Seekport) for their retrieval effectiveness, taking into account not only the results, but also the results descriptions.

Design/methodology/approach

The study uses real‐life queries. Results are made anonymous and are randomized. Results are judged by the persons posing the original queries.

Findings

The two major search engines, Google and Yahoo, perform best, and there are no significant differences between them. Google delivers significantly more relevant result descriptions than any other search engine. This could be one reason for users perceiving this engine as superior.

Research limitations/implications

The study is based on a user model where the user takes into account a certain amount of results rather systematically. This may not be the case in real life.

Practical implications

The paper implies that search engines should focus on relevant descriptions. Searchers are advised to use other search engines in addition to Google.

Originality/value

This is the first major study comparing results and descriptions systematically and proposes new retrieval measures to take into account results descriptions.

Details

Journal of Documentation, vol. 64 no. 6

Type: Research Article

ISSN: 0022-0418

Article

Publication date: 21 April 2020

Retrieval performance of Google, Yahoo and Bing for navigational queries in the field of “life science and biomedicine”

Sumeer Gul, Sabha Ali and Aabid Hussain

Abstract

Purpose

The purpose of this study is to assess the retrieval performance of three search engines, i.e. Google, Yahoo and Bing for navigational queries using two important retrieval measures, i.e. precision and relative recall in the field of life science and biomedicine.

Design/methodology/approach

The top three search engines, namely Google, Yahoo and Bing, were selected on the basis of their ranking as per Alexa, an analytical tool that provides rankings of global websites. Furthermore, the scope of the study was confined to search engines with an English-language interface. Clarivate Analytics' Web of Science was used for the extraction of navigational queries in the field of life science and biomedicine. Navigational queries (classified as one-word, two-word and three-word queries) were extracted from the keywords of the papers representing the top 100 contributing authors in the select field. Keywords were also checked for duplication. Two important evaluation parameters, i.e. precision and relative recall, were used to calculate the performance of the search engines on the navigational queries.

Findings

Google has the highest mean precision (2.30), followed by Yahoo (2.29) and Bing (1.68). Google also has the highest mean relative recall (0.36), followed by Yahoo (0.33) and Bing (0.31).

Research limitations/implications

The study is of great help to the researchers and academia in determining the retrieval efficiency of Google, Yahoo and Bing in terms of navigational query execution in the field of life science and biomedicine. The study can help users to focus on various search processes and the query structuring and its execution across the select search engines for achieving desired result list in a professional search environment. The study can also act as a ready reference source for exploring navigational queries and how these queries can be managed in the context of information retrieval process. It will also help to showcase the retrieval efficiency of various search engines on the basis of subject diversity (life science and biomedicine) highlighting the same in terms of query intention.

Originality/value

Though many studies have been conducted highlighting the retrieval efficiency of search engines, the current work is the first of its kind to study the retrieval effectiveness of Google, Yahoo and Bing on navigational queries in the field of life science and biomedicine. The study will help in understanding various methods and approaches to be adopted by users for navigational query execution across a professional search environment, i.e. “life science and biomedicine”.

Details

Data Technologies and Applications, vol. 54 no. 2

Type: Research Article

ISSN: 2514-9288