Search results

1 – 10 of 165
Article
Publication date: 18 March 2024

Raj Kumar Bhardwaj, Ritesh Kumar and Mohammad Nazim

Abstract

Purpose

This paper evaluates the precision of four metasearch engines (MSEs), namely DuckDuckGo, Dogpile, Metacrawler and Startpage, to determine which exhibits the highest level of precision and which is most likely to return the most relevant search results.

Design/methodology/approach

The research is divided into two parts: the first phase involves four queries categorized into two segments (4-Q-2-S), while the second phase includes six queries divided into three segments (6-Q-3-S). These queries vary in complexity, falling into three types: simple, phrase and complex. The precision, average precision and the presence of duplicates across all the evaluated metasearch engines are determined.
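
As a concrete illustration of this kind of evaluation (not the authors' actual instrument), a minimal Python sketch follows; the binary relevance judgments and URLs are invented, and duplicates are counted as repeated URLs within one result list.

```python
# Hypothetical sketch of precision-at-k and duplicate counting for
# metasearch-engine evaluation; relevance judgments are assumed binary.

def precision_at_k(relevance, k=10):
    """Fraction of the top-k results judged relevant."""
    top = relevance[:k]
    return sum(top) / len(top) if top else 0.0

def duplicate_count(urls):
    """Number of results whose URL repeats within the same result list."""
    return len(urls) - len(set(urls))

# Illustrative data: judgments for one query on one engine.
judged = [1, 1, 1, 0, 1, 1, 1, 1, 1, 1]
urls = ["a.com", "b.com", "a.com", "c.com"]
print(precision_at_k(judged))   # 0.9
print(duplicate_count(urls))    # 1

# Average precision over a query set is then the mean of per-query scores.
queries = {"q1": [1, 1, 0, 1], "q2": [1, 0, 1, 1]}
print(sum(precision_at_k(r, k=4) for r in queries.values()) / len(queries))
```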

Findings

The study clearly demonstrated that Startpage returned the most relevant results and achieved the highest precision (0.98) among the four MSEs. Meanwhile, DuckDuckGo exhibited consistent performance across both phases of the study.

Research limitations/implications

The study only evaluated four metasearch engines, which may not be representative of all available metasearch engines. Additionally, a limited number of queries were used, which may not be sufficient to generalize the findings to all types of queries.

Practical implications

The findings of this study can be valuable for accreditation agencies in managing duplicates, improving their search capabilities and obtaining more relevant and precise results. These findings can also assist users in selecting the best metasearch engine based on precision rather than interface.

Originality/value

The study is the first of its kind to evaluate these four metasearch engines. No similar study has previously been conducted to measure the performance of metasearch engines.

Details

Performance Measurement and Metrics, vol. 25 no. 1
Type: Research Article
ISSN: 1467-8047

Article
Publication date: 4 April 2024

Artur Strzelecki

Abstract

Purpose

This paper aims to give an overview of the history and evolution of commercial search engines. It traces the development of search engines from their early days to their current form as complex technology-powered systems that offer a wide range of features and services.

Design/methodology/approach

In recent years, advancements in artificial intelligence (AI) technology have led to the development of AI-powered chat services. This study examines the official announcements and releases of AI-powered chat services by three major search engines: Google, Bing and Baidu.

Findings

Three major players in the search engine market, Google, Microsoft and Baidu, have started to integrate AI chat into their search results. Google has released Bard, later upgraded to Gemini, a LaMDA-powered conversational AI service. Microsoft has launched Bing Chat, later renamed Copilot, a search experience powered by OpenAI's GPT. The largest search engine in China, Baidu, has released a similar service called Ernie. New AI-based search engines are also briefly described.

Originality/value

This paper discusses the strengths and weaknesses of traditional, algorithm-powered search engines and of modern search with generative AI support, as well as the possibilities of merging them into one service. The study stresses the types of queries submitted to search engines, users' habits in using search engines and the technological advantage of search engine infrastructure.

Details

Library Hi Tech News, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 0741-9058

Article
Publication date: 8 September 2023

Oussama Ayoub, Christophe Rodrigues and Nicolas Travers

Abstract

Purpose

This paper aims to manage the word gap in information retrieval (IR), especially for long documents belonging to specific domains. With the continuous growth of text data that modern IR systems have to manage, efficient solutions are needed to find the best set of documents for a given request. The words used to describe a query can differ from those used in related documents; despite their closeness in meaning, nonoverlapping words are challenging for IR systems. This word gap becomes significant for long documents from specific domains.

Design/methodology/approach

To generate new words for a document, a deep learning (DL) masked language model is used to infer related words. The DL models used are pretrained on massive text data and carry common or domain-specific knowledge, which is exploited to propose a better document representation.
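
The authors' exact pipeline is not described here, so the following is a minimal sketch of the general idea only: a pretrained masked language model (assumed to be bert-base-uncased via Hugging Face's fill-mask pipeline) proposes related words that can be appended to a document's representation.

```python
# Illustrative sketch of masked-language-model document expansion; the
# model choice and masking strategy are assumptions, not the paper's setup.
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")

def expand(sentence, top_k=3):
    """Mask the last word of a sentence and collect the model's top guesses."""
    words = sentence.split()
    masked = " ".join(words[:-1] + [fill.tokenizer.mask_token])
    return [p["token_str"] for p in fill(masked, top_k=top_k)]

# Inferred words can be appended to the indexed document so that a query
# using different vocabulary still matches it.
print(expand("The patient was treated with a new drug"))
```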

Findings

The authors evaluate the approach on specific IR domains with long documents to show the genericity of the proposed model, achieving encouraging results.

Originality/value

In this paper, to the best of the authors’ knowledge, an original unsupervised and modular IR system based on recent DL methods is introduced.

Details

International Journal of Web Information Systems, vol. 19 no. 5/6
Type: Research Article
ISSN: 1744-0084

Article
Publication date: 2 May 2023

Carlos Lopezosa, Dimitrios Giomelakis, Leyberson Pedrosa and Lluís Codina

Abstract

Purpose

This paper constitutes the first academic study of Google Discover as applied to online journalism.

Design/methodology/approach

The study involved conducting 61 semi-structured interviews with experts representative of a range of professional profiles within the fields of journalism and search engine optimization (SEO) in Brazil, Spain and Greece. Based on the data collected, the authors created five semantic categories and compared the experts' perceptions in order to detect common response patterns.

Findings

The results confirm different degrees of convergence and divergence in the opinions expressed in the three countries regarding the main dimensions of Google Discover, including specific strategies for using the feed, its impact on web traffic, its impact on both quality and sensationalist content and the degree of responsibility shown by the digital media in its use. The authors also propose a set of best practices that journalists and digital media in-house web visibility teams should take into account to increase their probability of appearing in Google Discover. To this end, the authors consider strategies in the following areas of application: topics, different aspects of publication, elements of user experience, strategic analysis, and diffusion and marketing.

Originality/value

Although research exists on the application of SEO to different areas, there have not, to date, been any studies examining Google Discover.

Peer review

The peer-review history for this article is available at: https://publons.com/publon/10.1108/OIR-10-2022-0574

Details

Online Information Review, vol. 48 no. 1
Type: Research Article
ISSN: 1468-4527

Article
Publication date: 8 January 2024

Morteza Mohammadi Ostani, Jafar Ebadollah Amoughin and Mohadeseh Jalili Manaf

Abstract

Purpose

This study aims to adjust Thesis-type properties on Schema.org using metadata models and standards (MS) – Bibframe, the electronic theses and dissertations metadata standard (ETD-MS), the Common European Research Information Format (CERIF) and Dublin Core (DC) – to enrich the Thesis-type properties for better description and processing on the Web.

Design/methodology/approach

This study is applied and descriptive-analytical in nature, using content analysis as its method. The research population consisted of the elements and attributes of the metadata models and standards (Bibframe, ETD-MS, CERIF and DC) and of the Thesis-type properties on Schema.org. The data collection tool was a researcher-made checklist, and the data collection method was structured observation.

Findings

The results show that the 65 Thesis-type properties, together with the two parent levels of Thing and CreativeWork on Schema.org, correspond to the elements and attributes of the related models and standards. In addition, 12 properties specific to the Thesis type are identified for more comprehensive description and processing, and 27 properties are added to the CreativeWork type.
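
To make the target structure concrete, here is a minimal sketch of a Schema.org Thesis description serialized as JSON-LD from Python; the values are invented, inSupportOf is an existing Thesis-specific property on Schema.org, and name and author are inherited from the Thing and CreativeWork parents.

```python
# Minimal JSON-LD sketch of a Schema.org Thesis description; all values
# are invented for illustration.
import json

thesis = {
    "@context": "https://schema.org",
    "@type": "Thesis",
    "name": "Example Dissertation Title",                 # from Thing
    "author": {"@type": "Person", "name": "A. Student"},  # from CreativeWork
    "datePublished": "2023",                              # from CreativeWork
    "inSupportOf": "PhD",                                 # Thesis-specific
}
print(json.dumps(thesis, indent=2))
```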

Practical implications

The enrichment and expansion of Thesis-type properties on Schema.org is one of the practical applications of the present study, enabling more comprehensive description and processing and increasing access points and visibility for ETDs in the Web environment and in digital libraries.

Originality/value

This study offers some new Thesis-type properties and CreativeWork-level properties on Schema.org. To the best of the authors’ knowledge, this is the first time this issue has been investigated.

Details

Digital Library Perspectives, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 2059-5816

Article
Publication date: 29 November 2023

Emine Sendurur and Sonja Gabriel

Abstract

Purpose

This study aims to discover how domain familiarity and language affect the cognitive load and the strategies applied in the evaluation of search engine results pages (SERPs).

Design/methodology/approach

This study used an experimental research design based on repeated measures. Each student was given four SERPs varying in two dimensions: language and content. The criteria students used to select the three best links on each SERP, the reasoning behind their selections and their perceived cognitive load for the given task were the repeated measures collected from each participant.

Findings

The evaluation criteria changed according to language and task type. Cognitive load was reported to be higher when the content was presented in English or when the content was academic. Regarding search strategies, a majority of students trusted familiar sources or relied on keywords they found in the short descriptions of the links. A qualitative analysis showed that students can be grouped into different types according to the reasons they stated for their choices; source seekers, keyword seekers and specific information seekers were the most common types observed.

Originality/value

This study has an international scope with regard to data collection. Moreover, the tasks and findings contribute to the literature on information literacy.

Details

The Electronic Library, vol. 42 no. 2
Type: Research Article
ISSN: 0264-0473

Article
Publication date: 26 March 2024

Keyu Chen, Beiyu You, Yanbo Zhang and Zhengyi Chen

Abstract

Purpose

Prefabricated buildings have been widely adopted in the construction industry worldwide and, compared with conventional approaches, can significantly reduce labor consumption and improve construction efficiency. During the construction of prefabricated buildings, the overall efficiency largely depends on the lifting sequence and path of each prefabricated component. To improve the efficiency and safety of the lifting process, this study proposes a framework for automatically optimizing the lifting paths of prefabricated building components using building information modeling (BIM), an improved 3D-A* algorithm and a physics-informed genetic algorithm (GA).

Design/methodology/approach

Firstly, an Industry Foundation Classes (IFC) schema for prefabricated buildings is established to enrich the semantic information of BIM. After the corresponding component attributes are extracted from BIM, the models of typical prefabricated components and their slings are simplified. The rotations of the slings and elements are then considered to build a safety bounding box. Secondly, an efficient 3D-A* algorithm is proposed for element path planning, integrating both safety factors and a variable step size. Finally, an efficient GA is designed to obtain the optimal lifting sequence under physical constraints.
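
As an illustration of the path-planning core only, here is a minimal grid-based 3D A* sketch in Python; the paper's safety bounding box, variable step size and sling rotations are not modeled, and the grid, obstacles and unit step cost are assumptions.

```python
# Minimal grid-based 3D A* sketch (illustration only): the paper's safety
# bounding box, variable step size and sling rotations are not modeled.
import heapq

def astar_3d(start, goal, blocked, bound=10):
    """Shortest 6-connected path on a cubic grid, avoiding blocked cells."""
    def h(p):
        # Manhattan distance: admissible for unit-cost axis-aligned moves.
        return sum(abs(a - b) for a, b in zip(p, goal))

    open_heap = [(h(start), start)]
    g = {start: 0}
    parent = {start: None}
    while open_heap:
        _, node = heapq.heappop(open_heap)
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for step in ((1,0,0), (-1,0,0), (0,1,0), (0,-1,0), (0,0,1), (0,0,-1)):
            nxt = tuple(a + b for a, b in zip(node, step))
            if nxt in blocked or not all(0 <= c < bound for c in nxt):
                continue
            if g[node] + 1 < g.get(nxt, float("inf")):
                g[nxt] = g[node] + 1
                parent[nxt] = node
                heapq.heappush(open_heap, (g[nxt] + h(nxt), nxt))
    return None  # no collision-free path within the grid

print(astar_3d((0, 0, 0), (3, 3, 3), blocked={(1, 0, 0), (0, 1, 0)}))
```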

Findings

The proposed optimization framework is validated with a pilot project in a physics engine, which makes the lifting process easier to understand. The results show that the framework can intuitively and automatically generate the optimal lifting path for each type of prefabricated building component. Compared with traditional algorithms, the improved path-planning algorithm reduces the number of computed nodes by 91.48% and the search time by 75.68%.

Originality/value

In this study, a prefabricated-component path-planning framework based on the improved A* algorithm and a GA is proposed for the first time. In addition, the study proposes a safety bounding box that considers the effects of torsion and slinging of components during lifting. The semantic information of IFC for component lifting is enriched by taking into account lifting data such as binding positions, lifting methods, lifting angles and lifting offsets.

Details

Engineering, Construction and Architectural Management, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 0969-9988

Article
Publication date: 3 October 2023

Haklae Kim

Abstract

Purpose

Despite ongoing research into archival metadata standards, digital archives are unable to effectively represent records in their appropriate contexts. This study aims to propose a knowledge graph that depicts the diverse relationships between heterogeneous digital archive entities.

Design/methodology/approach

This study introduces and describes a step-by-step method for applying knowledge graphs to digital archives. It examines archival metadata standards, such as the Records in Contexts Ontology (RiC-O), for characterising digital records; explains the process of data refinement, enrichment and reconciliation with examples; and demonstrates the use of the constructed knowledge graph through semantic queries.
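
To illustrate the kind of semantic query involved, here is a minimal rdflib sketch; the record IRI and triples are invented, while the RiC-O namespace IRI and the rico:Record class come from the published ontology.

```python
# Illustrative rdflib sketch of querying a small archival knowledge graph;
# the triples are invented for demonstration purposes.
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import RDF, RDFS

RICO = Namespace("https://www.ica.org/standard/RiC/ontology#")

g = Graph()
rec = URIRef("http://example.org/record/1")  # hypothetical record IRI
g.add((rec, RDF.type, RICO.Record))
g.add((rec, RDFS.label, Literal("1997 IMF bailout press release")))

# A semantic query of this kind can supplement plain keyword search.
q = """
SELECT ?record ?label WHERE {
  ?record a rico:Record ;
          rdfs:label ?label .
}
"""
for row in g.query(q, initNs={"rico": RICO, "rdfs": RDFS}):
    print(row.record, row.label)
```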

Findings

This study introduced the 97imf.kr archive as a knowledge graph, enabling meaningful exploration of the relationships within the archive’s records. This approach facilitated comprehensive descriptions of the different record entities. Applying archival ontologies together with general-purpose vocabularies to digital records is advised to enhance metadata coherence and semantic search.

Originality/value

Most digital archives in service in Korea make only limited proper use of archival metadata standards. The contribution of this study is to propose a practical application of knowledge graph technology for linking and exploring digital records. The study details the process of collecting raw archival data, preprocessing and enriching the data, and demonstrates how to build a knowledge graph connected to external data. In particular, a knowledge graph built on the RiC-O, Wikidata and Schema.org vocabularies, and the semantic queries over it, can supplement keyword search in conventional digital archives.

Details

The Electronic Library, vol. 42 no. 1
Type: Research Article
ISSN: 0264-0473

Article
Publication date: 29 November 2023

Hui Shi, Drew Hwang, Dazhi Chong and Gongjun Yan

Abstract

Purpose

Today’s in-demand skills may not be needed tomorrow. As companies adopt a new group of technologies, they are in great need of information technology (IT) professionals who can fill various IT positions with a mixture of technical and problem-solving skills. This study aims to adopt a semantic analysis approach to explore how US Information Systems (IS) programs meet the challenges of emerging IT topics.

Design/methodology/approach

This study applies a hybrid semantic analysis approach to the analysis of IS higher education programs in the USA. It proposes a semantic analysis framework and a semantic analysis algorithm to analyze and evaluate the context of IS programs. More specifically, the study uses digital transformation as a case study to examine the readiness of US IS programs to meet the challenges of digital transformation. First, the study developed a knowledge pool of 15 principles and 98 keywords from an extensive literature review on digital transformation. Second, it collected 4,093 IS courses from 315 IS programs in the USA and 493,216 scientific publication records from the Web of Science Core Collection.

Findings

Using the knowledge pool and the two collected data sets, the semantic analysis algorithm computes a semantic similarity score (DxScore) between an IS course’s context and digital transformation. To establish the credibility of the results, the ranking of states by similarity score was compared with the state employment ranking. The results can be used by IS educators when updating IS curricula; for IT professionals in industry, they provide insights into the training of current and future employees.
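
The paper's actual DxScore formula is not reproduced here; as a stand-in under that caveat, the following minimal sketch scores a course description against a digital-transformation keyword pool with TF-IDF cosine similarity (scikit-learn), using invented texts.

```python
# Stand-in sketch for a course-to-topic similarity score (the paper's
# actual DxScore formula is not reproduced here): TF-IDF cosine similarity
# between a course description and a keyword pool; the texts are invented.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

keyword_pool = "digital transformation cloud analytics automation ai"
course = "This course covers cloud computing, data analytics and AI."

tfidf = TfidfVectorizer().fit_transform([keyword_pool, course])
score = cosine_similarity(tfidf[0], tfidf[1])[0, 0]
print(round(score, 3))  # higher scores suggest closer topical alignment
```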

Originality/value

This study explores the status of IS programs in the USA by proposing a semantic analysis framework, using digital transformation as a case study to illustrate its application, and developing a knowledge pool, a corpus and a course information collection.

Details

Information Discovery and Delivery, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 2398-6247

Article
Publication date: 16 May 2023

Arun Malik, Shamneesh Sharma, Isha Batra, Chetan Sharma, Mahender Singh Kaswan and Jose Arturo Garza-Reyes

Abstract

Purpose

Environmental sustainability is quickly becoming one of the most critical issues in industrial development. Through a systematic literature review, this study identifies research areas for future researchers to work on and provides insight into Industry 4.0 and environmental sustainability.

Design/methodology/approach

The study accomplishes this by performing a backward analysis using text mining on the Scopus database. Latent semantic analysis (LSA) was used to analyze a corpus of 4,364 articles published between 2013 and 2023. The authors generated ten clusters from keywords in the industrial revolution and environmental sustainability domains, highlighting ten research avenues for further exploration.
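
As a minimal sketch of an LSA pipeline of this kind, assuming a TF-IDF plus truncated SVD formulation followed by k-means clustering (scikit-learn); the toy corpus below stands in for the 4,364 Scopus abstracts.

```python
# Minimal LSA sketch: TF-IDF, truncated SVD into a latent topic space,
# then k-means clustering; the toy corpus stands in for the real abstracts.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.cluster import KMeans

docs = [
    "industry 4.0 smart manufacturing sensors",
    "environmental sustainability green production",
    "iot sensors for smart factories",
    "sustainable supply chains and emissions",
]

X = TfidfVectorizer().fit_transform(docs)
lsa = TruncatedSVD(n_components=2).fit_transform(X)
labels = KMeans(n_clusters=2, n_init=10).fit_predict(lsa)
print(labels)  # documents grouped by latent topic
```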

Findings

In this study, three research questions address the role of environmental sustainability in Industry 4.0. The authors predicted ten clusters, treated as recent trends, on which more insight is required from future researchers. The study provides a year-wise analysis and identifies the top authors, countries and sources related to the topic, together with a network analysis. Finally, it discusses industrialization’s effect on environmental sustainability and the future of automation.

Research limitations/implications

Notwithstanding the size of the sample used, the reliability of the current study may be limited. Incomplete retrieval of the literature corpus can be attributed to the limitations of the search words, synonyms and string construction, to the variety of search engines used, and to the exclusion of results for which the search string was insufficient.

Originality/value

This research is the first study in which a natural language processing technique is implemented to predict future research areas based on the keyword–document relationship.

Details

International Journal of Lean Six Sigma, vol. 15 no. 1
Type: Research Article
ISSN: 2040-4166
