Search results

1 – 10 of 270
Article
Publication date: 18 March 2024

Raj Kumar Bhardwaj, Ritesh Kumar and Mohammad Nazim

Abstract

Purpose

This paper evaluates the precision of four metasearch engines (MSEs), namely DuckDuckGo, Dogpile, Metacrawler and Startpage, to determine which one exhibits the highest level of precision and is therefore most likely to return the most relevant search results.

Design/methodology/approach

The research is divided into two phases: the first involves four queries categorized into two segments (4-Q-2-S), while the second includes six queries divided into three segments (6-Q-3-S). The queries vary in complexity and fall into three types: simple, phrase and complex. Precision, average precision and the presence of duplicates are determined for all the evaluated metasearch engines.
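The three measures named above can be illustrated with a minimal Python sketch. It assumes binary relevance judgments (1 for relevant, 0 for not relevant) listed in rank order, and it uses the standard information-retrieval definition of average precision; the paper may define average precision differently (for instance, as the mean of per-query precision values). The function names are illustrative and do not come from the paper.

```python
from typing import Sequence


def precision(relevance: Sequence[int]) -> float:
    """Fraction of retrieved results judged relevant (1) rather than not (0)."""
    return sum(relevance) / len(relevance) if relevance else 0.0


def average_precision(relevance: Sequence[int]) -> float:
    """Mean of precision@k, taken at every rank k that holds a relevant result."""
    hits = 0
    total = 0.0
    for k, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            total += hits / k  # precision over the top k results
    return total / hits if hits else 0.0


def duplicate_count(urls: Sequence[str]) -> int:
    """Number of results whose URL already appeared earlier in the list."""
    return len(urls) - len(set(urls))


# Made-up judgments for one query, in rank order (not data from the study).
judgments = [1, 0, 1, 1]
print(precision(judgments))
print(average_precision(judgments))
print(duplicate_count(["a.example", "b.example", "a.example"]))
```

Counting duplicates by exact URL match is a simplification; a real evaluation would probably normalize URLs first.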

Findings

The study showed that Startpage returned the most relevant results and achieved the highest precision (0.98) among the four MSEs, while DuckDuckGo performed consistently across both phases of the study.

Research limitations/implications

The study only evaluated four metasearch engines, which may not be representative of all available metasearch engines. Additionally, a limited number of queries were used, which may not be sufficient to generalize the findings to all types of queries.

Practical implications

The findings of this study can be valuable for accreditation agencies in managing duplicates, improving their search capabilities and obtaining more relevant and precise results. These findings can also assist users in selecting the best metasearch engine based on precision rather than interface.

Originality/value

The study is the first of its kind which evaluates the four metasearch engines. No similar study has been conducted in the past to measure the performance of metasearch engines.

Details

Performance Measurement and Metrics, vol. 25 no. 1
Type: Research Article
ISSN: 1467-8047

Article
Publication date: 16 February 2022

Maedeh Mosharraf

Abstract

Purpose

The purpose of the paper is to propose a semantic model for describing open source software (OSS) in a machine–human understandable format. The model is derived through a systematic review of related documents and supports source code reuse and revision, the two primary targets of OSS.

Design/methodology/approach

Conducting a systematic review, all the software reusing criteria are identified and introduced to the web of data by an ontology for OSS (O4OSS). The software semantic model introduced in this paper explores OSS through triple expressions in which the O4OSS properties are predicates.
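A small Python sketch can make the idea of describing software through triple expressions more concrete. The `o4oss:` prefix, the property names (`license`, `programmingLanguage`, `dependsOn`) and the `ex:` resources below are hypothetical placeholders; the abstract does not list the actual O4OSS vocabulary.

```python
from typing import List, NamedTuple


class Triple(NamedTuple):
    """One subject-predicate-object statement about a piece of software."""
    subject: str
    predicate: str
    obj: str


# Hypothetical namespace prefix; the real O4OSS terms are not given in the abstract.
O4OSS = "o4oss:"

# Made-up description of an imaginary project, with O4OSS-style properties as predicates.
triples = [
    Triple("ex:ExampleProject", O4OSS + "license", "MIT"),
    Triple("ex:ExampleProject", O4OSS + "programmingLanguage", "Python"),
    Triple("ex:ExampleProject", O4OSS + "dependsOn", "ex:OtherLibrary"),
]


def objects_of(subject: str, predicate: str, store: List[Triple]) -> List[str]:
    """Return every object linked to `subject` through `predicate`."""
    return [t.obj for t in store if t.subject == subject and t.predicate == predicate]


print(objects_of("ex:ExampleProject", O4OSS + "license", triples))
```

A production system would more likely use an RDF library and a triple store; plain tuples are used here only to keep the sketch self-contained.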

Findings

This model improves the quality of web data by describing software in a structured machine–human readable profile, which is linked to the related data that was previously published on the web. Evaluating the OSS semantic model is accomplished through comparing it with previous approaches, comparing the software structured metadata with profile index of software in some well-known repositories, calculating the software retrieval rank and surveying domain experts.

Originality/value

Considering context-specific information and authority levels, the proposed software model would be applicable to any open or closed software. Using this model to publish software provides an infrastructure of connected meaningful data and helps developers overcome some specific challenges. By navigating software data, many questions that could otherwise be answered only by reading multiple documents can be answered automatically on the web of data.

Details

Aslib Journal of Information Management, vol. 75 no. 4
Type: Research Article
ISSN: 2050-3806

Book part
Publication date: 16 August 2023

Abstract

Details

Resilient and Sustainable Destinations After Disaster
Type: Book
ISBN: 978-1-80382-022-4

Book part
Publication date: 16 August 2023

Debasish Batabyal, Nilanjan Ray, Sudin Bag and Kaustav Nag

Abstract

India is the birthplace of four major religions: Hinduism, Jainism, Buddhism, and Sikhism. It is a country where people of all religions live in peace and harmony. Many tourists experience different forms of harassment during their pilgrimage journey, for example, fleecing, extortion of money, harassment by beggars, persistence by vendors and priests, fraud, sexual harassment, and other unacceptable behaviors. In order to appreciate the extent of harassment encountered by tourists, an in-depth study was conducted on the reviews provided by tourists on TripAdvisor's (Indian) website. This study characterizes harassment through an ethnographic research approach applied to published reviews. A total of 260 reviews of 28 top Hindu temples are considered, covering all the states and union territories (excluding Nagaland) where the top Hindu pilgrim centers are located according to TripAdvisor. The reviews are categorized and further investigated through primary data collection, in proportion to the reviews received at the respective temple sites in the study, and analyzed through structural equation modeling (SEM). Important factors have been identified for future policy issues and recommendations in these most crowded places with unique mass tourism practices.

Article
Publication date: 4 April 2024

Artur Strzelecki

Abstract

Purpose

This paper aims to give an overview of the history and evolution of commercial search engines. It traces the development of search engines from their early days to their current form as complex technology-powered systems that offer a wide range of features and services.

Design/methodology/approach

In recent years, advancements in artificial intelligence (AI) technology have led to the development of AI-powered chat services. This study explores official announcements and releases of three major search engines, Google, Bing and Baidu, of AI-powered chat services.

Findings

Three major players in the search engine market, Google, Microsoft and Baidu, have started to integrate AI chat into their search results. Google has released Bard, later upgraded to Gemini, a LaMDA-powered conversational AI service. Microsoft has launched Bing Chat, later renamed Copilot, a search engine powered by OpenAI's GPT. The largest search engine in China, Baidu, released a similar service called Ernie. There are also new AI-based search engines, which are briefly described.

Originality/value

This paper discusses the strengths and weaknesses of traditional, algorithm-powered search engines and of modern search with generative AI support, and the possibilities of merging them into one service. This study stresses the types of inquiries submitted to search engines, users’ habits of using search engines and the technological advantage of search engine infrastructure.

Details

Library Hi Tech News, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 0741-9058

Article
Publication date: 8 September 2023

Oussama Ayoub, Christophe Rodrigues and Nicolas Travers

Abstract

Purpose

This paper aims to manage the word gap in information retrieval (IR), especially for long documents belonging to specific domains. With the continuous growth of text data that modern IR systems have to manage, solutions are needed to efficiently find the best set of documents for a given request. The words used to describe a query can differ from those used in related documents. Even when their meanings are close, nonoverlapping words are challenging for IR systems. This word gap becomes significant for long documents from specific domains.

Design/methodology/approach

To generate new words for a document, a deep learning (DL) masked language model is used to infer related words. The DL models used are pretrained on massive text data and carry common or domain-specific knowledge, which allows them to propose a better document representation.

Findings

The authors evaluate the approach of this study on specific IR domains with long documents to show the genericity of the proposed model and achieve encouraging results.

Originality/value

In this paper, to the best of the authors’ knowledge, an original unsupervised and modular IR system based on recent DL methods is introduced.

Details

International Journal of Web Information Systems, vol. 19 no. 5/6
Type: Research Article
ISSN: 1744-0084

Article
Publication date: 21 June 2023

Debasis Majhi and Bhaskar Mukherjee

Abstract

Purpose

The purpose of this study is to identify the research fronts by analysing highly cited core papers adjusted with the age of a paper in library and information science (LIS) where natural language processing (NLP) is being applied significantly.

Design/methodology/approach

By excavating international databases, 3,087 core papers that received at least 5% of the total citations have been identified. By calculating the average mean years of these core papers, and total citations received, a CPT (citation/publication/time) value was calculated in all 20 fronts to understand how a front is relatively receiving greater attention among peers within a course of time. One theme article has been finally identified from each of these 20 fronts.
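The abstract does not give the exact CPT formula. The Python sketch below shows one plausible reading of the name (citations per publication per year), stated purely as an assumption: total citations are divided by the number of core papers in a front and then by the mean age of those papers in years. The function name, its parameters and the sample figures are all hypothetical.

```python
def cpt_value(total_citations: float, n_papers: int, mean_age_years: float) -> float:
    """Citations per publication per year for one research front.

    Assumed reading of CPT (citation/publication/time); the paper's actual
    formula may differ.
    """
    if n_papers <= 0 or mean_age_years <= 0:
        raise ValueError("n_papers and mean_age_years must be positive")
    return total_citations / n_papers / mean_age_years


# Made-up front: 120 citations across 10 core papers with a mean age of 4 years.
print(cpt_value(120, 10, 4))
```

Under this reading, a younger front with the same citations per paper scores higher, which fits the stated goal of adjusting citation counts for the age of a paper.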

Findings

Bidirectional encoder representations from transformers (CPT value 1.608), followed by sentiment analysis (CPT 1.292), received the highest attention in NLP research. The top contributors in these fronts are Columbia University, New York (university); the Journal of the American Medical Informatics Association (journal); the USA, followed by the People's Republic of China (country); and Xu, H., University of Texas (author). The study finds that NLP applications boost the performance of digital libraries and automated library systems in the digital environment.

Practical implications

Any research front identified in the findings of this paper may be used as a base by researchers who intend to perform extensive research on NLP.

Originality/value

To the best of the authors’ knowledge, the methodology adopted in this paper is the first of its kind in which a meta-analysis approach has been used to understand the research fronts of a subfield like NLP within a broad domain like LIS.

Details

Digital Library Perspectives, vol. 39 no. 3
Type: Research Article
ISSN: 2059-5816

Article
Publication date: 2 May 2023

Carlos Lopezosa, Dimitrios Giomelakis, Leyberson Pedrosa and Lluís Codina

Abstract

Purpose

This paper constitutes the first academic study to be made of Google Discover as applied to online journalism.

Design/methodology/approach

The study involved conducting 61 semi-structured interviews with experts who are representative of a range of professional profiles within the fields of journalism and search engine positioning (SEO) in Brazil, Spain and Greece. Based on the data collected, the authors created five semantic categories and compared the experts' perceptions in order to detect common response patterns.

Findings

The study's results confirm the existence of different degrees of convergence and divergence in the opinions expressed in these three countries regarding the main dimensions of Google Discover, including specific strategies for using the feed, its impact on web traffic, its impact on both quality and sensationalist content, and the degree of responsibility shown by the digital media in its use. The authors are also able to propose a set of best practices that journalists and digital media in-house web visibility teams should take into account to increase their probability of appearing in Google Discover. To this end, the authors consider strategies in the following areas of application: topics, different aspects of publication, elements of user experience, strategic analysis, and diffusion and marketing.

Originality/value

Although research exists on the application of SEO to different areas, there have not, to date, been any studies examining Google Discover.

Peer review

The peer-review history for this article is available at: https://publons.com/publon/10.1108/OIR-10-2022-0574

Details

Online Information Review, vol. 48 no. 1
Type: Research Article
ISSN: 1468-4527

Article
Publication date: 8 January 2024

Morteza Mohammadi Ostani, Jafar Ebadollah Amoughin and Mohadeseh Jalili Manaf

Abstract

Purpose

This study aims to adjust Thesis-type properties on Schema.org using metadata models and standards (MS) (Bibframe, electronic thesis and dissertations [ETD]-MS, Common European Research Information Format [CERIF] and Dublin Core [DC]) to enrich the Thesis-type properties for better description and processing on the Web.

Design/methodology/approach

This study is applied and descriptive-analytic in nature, and uses content analysis as its method. The research population consisted of the elements and attributes of the metadata models and standards (Bibframe, ETD-MS, CERIF and DC) and the Thesis-type properties in Schema.org. The data collection tool was a researcher-made checklist, and the data collection method was structured observation.

Findings

The results show that the 65 Thesis-type properties, together with the two parent levels Thing and CreativeWork on Schema.org, correspond to the elements and attributes of the related models and standards. In addition, 12 properties are specific to the Thesis type, supporting more comprehensive description and processing, and 27 properties are added to the CreativeWork type.
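For readers unfamiliar with Schema.org markup, the following Python sketch builds a small JSON-LD description of a thesis. It uses only properties that already exist on Schema.org (`name`, `author`, `inSupportOf`, `datePublished`, `inLanguage`); it does not reproduce the additional properties proposed in the study, which the abstract does not list. The helper name `thesis_jsonld` and all sample values are hypothetical placeholders.

```python
import json


def thesis_jsonld(title: str, author: str, degree: str,
                  date_published: str, language: str) -> dict:
    """Build a minimal Schema.org Thesis description as a JSON-LD dictionary."""
    return {
        "@context": "https://schema.org",
        "@type": "Thesis",
        "name": title,
        "author": {"@type": "Person", "name": author},
        "inSupportOf": degree,            # the qualification the thesis supports
        "datePublished": date_published,  # ISO 8601 date string
        "inLanguage": language,
    }


# Placeholder values only; none of them come from the study.
record = thesis_jsonld(
    title="An Example Thesis Title",
    author="A. Student",
    degree="PhD in Information Science",
    date_published="2023-01-01",
    language="en",
)
print(json.dumps(record, indent=2))
```

Embedding such a record in a repository landing page is one common way to expose thesis metadata to web crawlers.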

Practical implications

Enrichment and expansion of the Thesis-type properties on Schema.org is one of the practical applications of the present study. It enables more comprehensive description and processing, and increases access points and visibility for ETDs in the Web environment and digital libraries.

Originality/value

This study has offered some new Thesis-type properties and CreativeWork levels on Schema.org. To the best of the authors’ knowledge, this is the first time this issue has been investigated.

Details

Digital Library Perspectives, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 2059-5816

Article
Publication date: 29 November 2023

Emine Sendurur and Sonja Gabriel

Abstract

Purpose

This study aims to discover how domain familiarity and language affect the cognitive load and the strategies applied for the evaluation of search engine results pages (SERP).

Design/methodology/approach

This study used an experimental research design based on repeated measures. Each student was given four SERPs varying in two dimensions: language and content. The repeated measures collected from each participant were the criteria used to decide on the three best links within the SERP, the reasoning behind the selection, and the perceived cognitive load of the given task.

Findings

The evaluation criteria changed according to the language and task type. The cognitive load was reported higher when the content was presented in English or when the content was academic. Regarding the search strategies, a majority of students trusted familiar sources or relied on keywords they found in the short description of the links. A qualitative analysis showed that students can be grouped into different types according to the reasons they stated for their choices. Source seeker, keyword seeker and specific information seeker were the most common types observed.

Originality/value

This study has an international scope with regard to data collection. Moreover, the tasks and findings contribute to the literature on information literacy.

Details

The Electronic Library, vol. 42 no. 2
Type: Research Article
ISSN: 0264-0473
