Search results

1 – 10 of 189
Article
Publication date: 10 January 2024

Artur Strzelecki and Andrej Miklosik

The landscape of search engine usage has evolved since the last known data were used to calculate click-through rate (CTR) values. The objective was to provide a replicable method…


Abstract

Purpose

The landscape of search engine usage has evolved since the last known data were used to calculate click-through rate (CTR) values. The objective was to provide a replicable method for accessing data from the Google search engine using programmatic access and calculating CTR values from the retrieved data to show how the CTRs have changed since the last studies were published.

Design/methodology/approach

In this study, the authors present the estimated CTR values in organic search results based on actual clicks and impressions data, and establish a protocol for collecting this data using Google programmatic access. For this study, the authors collected data on 416,386 clicks, 31,648,226 impressions and 8,861,416 daily queries.
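The abstract does not detail the aggregation step, but a position-level CTR estimate of this kind can be sketched as follows (a hypothetical illustration; the record format and function name are assumptions, not the authors' actual Google programmatic-access pipeline):

```python
from collections import defaultdict

def ctr_by_position(records):
    """Aggregate clicks and impressions per ranking position and
    return the CTR (%) for each position."""
    clicks = defaultdict(int)
    impressions = defaultdict(int)
    for position, c, i in records:
        clicks[position] += c
        impressions[position] += i
    return {p: 100.0 * clicks[p] / impressions[p]
            for p in impressions if impressions[p] > 0}

# Toy rows of (position, clicks, impressions)
rows = [(1, 93, 1000), (1, 92, 1000), (2, 58, 1000), (3, 31, 1000)]
print(ctr_by_position(rows))  # {1: 9.25, 2: 5.8, 3: 3.1}
```

In practice the authors' data would come from programmatic access to Google's search data (e.g. per-query daily rows), but the arithmetic reduces to this clicks-over-impressions ratio per position.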

Findings

The results show that CTRs have decreased from previously reported values in both academic research and industry benchmarks. The estimates indicate that the top-ranked result in Google's organic search results has a CTR of 9.28%, followed by 5.82% and 3.11% for positions two and three, respectively. The authors also demonstrate that CTRs vary across device types. On desktop devices, the CTR decreases steadily with each lower ranking position. On smartphones, the CTR starts high but decreases rapidly, with an unexpected increase from position 13 onwards. Tablets have the lowest and most variable CTR values.

Practical implications

The theoretical implications include the generation of a current dataset on search engine results and user behavior, made available to the research community; the creation of a unique methodology for generating new datasets; and the presentation of updated information on CTR trends. The managerial implications include establishing the need for businesses to optimize other forms of Google search results in addition to organic text results, and the possibility of applying this study's methodology to determine CTRs for their own websites.

Originality/value

This study provides a novel method to access real CTR data and estimates current CTRs for top organic Google search results, categorized by device.

Details

Aslib Journal of Information Management, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 2050-3806


Article
Publication date: 18 May 2023

Rongen Yan, Depeng Dang, Hu Gao, Yan Wu and Wenhui Yu

Question answering (QA) answers the questions asked by people in the form of natural language. In the QA, due to the subjectivity of users, the questions they query have different…

Abstract

Purpose

Question answering (QA) answers questions asked by people in natural language. In QA, owing to the subjectivity of users, the questions they submit have different expressions, which increases the difficulty of text retrieval. Therefore, the purpose of this paper is to explore a new query rewriting method for QA that integrates multiple related questions (RQs) to form an optimal question. Moreover, it is important to generate a new dataset of original queries (OQs) with multiple RQs.

Design/methodology/approach

This study collects a new dataset, SQuAD_extend, by crawling a QA community and uses a word graph to model the collected OQs. Next, beam search finds the best path to obtain the best question. To represent the features of the question more deeply, the pretrained model BERT is used to model sentences.
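The word-graph and beam-search step described above can be illustrated with a minimal sketch (the graph encoding, edge scores and vocabulary are invented for illustration; the abstract does not specify the authors' actual scoring model):

```python
def beam_search(graph, start, end, beam_width=3, max_len=20):
    """Keep the `beam_width` best-scoring partial paths and extend
    them until every surviving path reaches the end node."""
    beams = [([start], 0.0)]
    for _ in range(max_len):
        if all(path[-1] == end for path, _ in beams):
            break
        candidates = []
        for path, score in beams:
            if path[-1] == end:          # finished path: carry over as-is
                candidates.append((path, score))
                continue
            for nxt, s in graph.get(path[-1], []):
                candidates.append((path + [nxt], score + s))
        if not candidates:               # all remaining paths are dead ends
            break
        beams = sorted(candidates, key=lambda c: -c[1])[:beam_width]
    return beams[0]

# Toy word graph built from related questions: node -> (next word, edge score)
graph = {
    "<s>": [("how", 1.0), ("what", 0.8)],
    "how": [("to", 0.9)],
    "what": [("is", 0.7)],
    "to": [("fix", 0.5), ("</s>", 0.1)],
    "is": [("</s>", 0.2)],
    "fix": [("</s>", 0.6)],
}
path, score = beam_search(graph, "<s>", "</s>")  # best rewritten question
```

The highest-scoring path through the graph serves as the rewritten query; in the paper this candidate would then be encoded with BERT for retrieval.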

Findings

The experimental results show three outstanding findings. (1) The quality of the answers is better after adding the RQs of the OQs. (2) The word graph used to model the question and choose the optimal path is conducive to finding the best question. (3) BERT can deeply characterize the semantics of the exact question.

Originality/value

The proposed method uses a word graph to construct multiple questions and selects the optimal path for rewriting the question, and the quality of the answers is better than the baseline. In practice, the research results can help guide users to clarify their query intentions and ultimately reach the best answer.

Details

Data Technologies and Applications, vol. 58 no. 1
Type: Research Article
ISSN: 2514-9288


Open Access
Article
Publication date: 23 May 2023

Kimmo Kettunen, Heikki Keskustalo, Sanna Kumpulainen, Tuula Pääkkönen and Juha Rautiainen

This study aims to identify user perception of different qualities of optical character recognition (OCR) in texts. The purpose of this paper is to study the effect of different…

Abstract

Purpose

This study aims to identify user perception of different qualities of optical character recognition (OCR) in texts. The purpose of this paper is to study the effect of different quality OCR on users' subjective perception through an interactive information retrieval task with a collection of one digitized historical Finnish newspaper.

Design/methodology/approach

This study is based on the simulated work task model used in interactive information retrieval. Thirty-two users searched an article collection of the Finnish newspaper Uusi Suometar (1869–1918), which consists of ca. 1.45 million autosegmented articles. The search database contained two versions of each article with different OCR quality. Each user performed six pre-formulated and six self-formulated short queries and subjectively evaluated the top 10 results on a graded relevance scale of 0–3. Users were not informed of the OCR quality differences between the otherwise identical articles.

Findings

The main result of the study is that improved OCR quality affects subjective user perception of historical newspaper articles positively: higher relevance scores are given to better-quality texts.

Originality/value

To the best of the authors’ knowledge, this simulated interactive work task experiment is the first one showing empirically that users' subjective relevance assessments are affected by a change in the quality of an optically read text.

Details

Journal of Documentation, vol. 79 no. 7
Type: Research Article
ISSN: 0022-0418


Article
Publication date: 9 May 2023

Jing Chen, Hongli Chen and Yingyun Li

Cross-app interactive search has become the new normal, but the characteristics of their tactic transitions are still unclear. This study investigated the transitions of daily…

Abstract

Purpose

Cross-app interactive search has become the new normal, but the characteristics of its tactic transitions remain unclear. This study investigated the transitions of daily search tactics during the cross-app interactive search process.

Design/methodology/approach

In total, 204 young participants' memorable cross-app search experiences in real daily situations were collected. The search tactics and tactic transition sequences in their search processes were obtained by open coding. Statistical analysis and sequence analysis were used to identify the frequently applied tactics, the frequency and probability of tactic transitions, and the transition sequences characterizing the beginning, middle and ending phases of a search.
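The transition frequency and probability analysis described above can be sketched as follows (a hypothetical illustration; the tactic labels come from the Findings, but the session data and coding format are invented):

```python
from collections import Counter

def transition_probabilities(sequences):
    """Count adjacent tactic transitions across coded sequences and
    convert the counts to conditional probabilities P(next | current)."""
    counts = Counter()
    for seq in sequences:
        for a, b in zip(seq, seq[1:]):
            counts[(a, b)] += 1
    totals = Counter()
    for (a, _), n in counts.items():
        totals[a] += n
    return {(a, b): n / totals[a] for (a, b), n in counts.items()}

# Toy open-coded sessions using tactic labels from the Findings
sessions = [
    ["Creat", "EvalR", "EvalI", "Rec"],
    ["Creat", "EvalR", "Creat", "EvalR", "EvalI"],
]
probs = transition_probabilities(sessions)
```

Sequence-analysis tooling would add significance testing and phase segmentation on top, but the core transition matrix reduces to this kind of count-and-normalize step.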

Findings

Creating the search statement (Creat), evaluating search results (EvalR), evaluating an individual item (EvalI) and keeping a record (Rec) were the most frequently applied tactics. The frequency and probability of transitions differed significantly between tactic types. “Creat → EvalR → EvalI → Rec” is the typical path. Initiating the search in various ways and modifying the search statement were highlighted in the beginning phase; iteratively creating the search statement was highlighted in the middle phase; and utilization and feedback of information were highlighted in the ending phase.

Originality/value

The present study sheds new light on tactic transitions in the cross-app interactive environment to explore information search behaviour. The findings provide targeted suggestions for optimizing app query, browsing and monitoring systems.

Details

Information Technology & People, vol. 37 no. 3
Type: Research Article
ISSN: 0959-3845


Article
Publication date: 18 March 2024

Raj Kumar Bhardwaj, Ritesh Kumar and Mohammad Nazim

This paper evaluates the precision of four metasearch engines (MSEs) – DuckDuckGo, Dogpile, Metacrawler and Startpage, to determine which metasearch engine exhibits the highest…

Abstract

Purpose

This paper evaluates the precision of four metasearch engines (MSEs) – DuckDuckGo, Dogpile, Metacrawler and Startpage – to determine which exhibits the highest precision and is most likely to return the most relevant search results.

Design/methodology/approach

The research is divided into two parts: the first phase involves four queries categorized into two segments (4-Q-2-S), while the second phase includes six queries divided into three segments (6-Q-3-S). These queries vary in complexity, falling into three types: simple, phrase and complex. The precision, average precision and the presence of duplicates across all the evaluated metasearch engines are determined.
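The precision measurement described above can be sketched as follows (a minimal illustration; "average precision" here is taken to mean the mean of per-query precision values, since the abstract does not specify the exact formula):

```python
def precision(relevance_judgements):
    """Fraction of retrieved results judged relevant, given binary
    judgements (1 = relevant, 0 = not) over one query's result list."""
    if not relevance_judgements:
        return 0.0
    return sum(relevance_judgements) / len(relevance_judgements)

def average_precision_over_queries(per_query):
    """Mean of per-query precision values for one search engine."""
    return sum(precision(q) for q in per_query) / len(per_query)

# Toy judgements for two queries run against one metasearch engine
queries = [[1, 1, 1, 0, 1], [1, 1, 1, 1, 1]]
```

Duplicate detection would be layered on top by comparing result URLs across the evaluated engines; that step is omitted here.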

Findings

The study clearly demonstrated that Startpage returned the most relevant results and achieved the highest precision (0.98) among the four MSEs, while DuckDuckGo exhibited the most consistent performance across both phases of the study.

Research limitations/implications

The study only evaluated four metasearch engines, which may not be representative of all available metasearch engines. Additionally, a limited number of queries were used, which may not be sufficient to generalize the findings to all types of queries.

Practical implications

The findings of this study can be valuable for accreditation agencies in managing duplicates, improving their search capabilities and obtaining more relevant and precise results. These findings can also assist users in selecting the best metasearch engine based on precision rather than interface.

Originality/value

The study is the first of its kind to evaluate these four metasearch engines; no similar study has previously measured their performance.

Details

Performance Measurement and Metrics, vol. 25 no. 1
Type: Research Article
ISSN: 1467-8047


Open Access
Article
Publication date: 30 October 2023

Koraljka Golub, Xu Tan, Ying-Hsang Liu and Jukka Tyrkkö

This exploratory study aims to help contribute to the understanding of online information search behaviour of PhD students from different humanities fields, with a focus on…

Abstract

Purpose

This exploratory study aims to help contribute to the understanding of online information search behaviour of PhD students from different humanities fields, with a focus on subject searching.

Design/methodology/approach

The methodology is based on a semi-structured interview within which the participants are asked to conduct both a controlled search task and a free search task. The sample comprises eight PhD students in several humanities disciplines at Linnaeus University, a medium-sized Swedish university, in 2020.

Findings

Most humanities PhD students in the study have received training in information searching, but it has been too basic. Most rely on web search engines like Google and Google Scholar to search for publications, and on the university's discovery system for known-item searching. As these systems do not rely on controlled vocabularies, the participants often struggle with too many retrieved documents that are not relevant. Most only rarely or never use disciplinary bibliographic databases. The controlled search task showed some benefits of using controlled vocabularies in the disciplinary databases, but incomplete synonym or concept coverage as well as user-unfriendly search interfaces present hindrances.

Originality/value

The paper illuminates an often-forgotten but pervasive challenge of subject searching, especially for humanities researchers. It demonstrates difficulties and shows how most PhD students have missed finding an important resource in their research. It calls for the need to reconsider training in information searching and the need to make use of controlled vocabularies implemented in various search systems with usable search and browse user interfaces.

Article
Publication date: 1 December 2023

Andreas Skalkos, Aggeliki Tsohou, Maria Karyda and Spyros Kokolakis

Search engines, the most popular online services, are associated with several concerns. Users are concerned about the unauthorized processing of their personal data, as well as…

Abstract

Purpose

Search engines, the most popular online services, are associated with several concerns. Users are concerned about the unauthorized processing of their personal data, as well as about search engines keeping track of their search preferences. Various search engines have been introduced to address these concerns, claiming that they protect users’ privacy. The authors call these search engines privacy-preserving search engines (PPSEs). This paper aims to investigate the factors that motivate search engine users to use PPSEs.

Design/methodology/approach

This study adopted protection motivation theory (PMT) and associated its constructs with subjective norms to build a comprehensive research model. The authors tested the research model using survey data from 830 search engine users worldwide.

Findings

The results confirm the interpretive power of PMT in privacy-related decision-making and show that users are more inclined to take protective measures when they consider data abuse a more severe risk and themselves more vulnerable to it. Furthermore, the results highlight the importance of subjective norms in predicting and determining PPSE use. Because subjective norms refer to perceived social influence from important others to engage in or refrain from protective behavior, the authors reveal that recommendations from people whom users consider important motivate them to take protective measures and use PPSEs.

Research limitations/implications

Despite its interesting results, this research also has some limitations. First, because the survey was conducted online, the study environment was less controlled: participants may have been disrupted or affected, for example, by the presence of others or background noise during the session. Second, some of the survey items could have been misinterpreted by respondents, as they did not have access to clarifications that a researcher could provide. Third, another limitation concerns the use of the Amazon Mechanical Turk tool: according to Paolacci and Chandler (2014), MTurk workers are more educated, younger and less religiously and politically diverse than the US population. Fourth, actual use of PPSEs is self-reported by the participants, which could introduce bias, since internet users' statements may contrast with their actions in real life or in an experimental scenario (Berendt et al., 2005; Jensen et al., 2005). Moreover, some limitations emerge from the use of PMT as the background theory. PMT identifies the main factors that affect protection motivation, but other environmental and cognitive factors can also play a significant role in how an individual's attitude is formed. As Rogers (1975) argued, PMT does not attempt to specify all of the possible factors in a fear appeal that may affect persuasion, but rather offers a systematic exposition of a limited set of components and cognitive mediational processes that may account for a significant portion of the variance in acceptance by users. In addition, as Tanner et al. (1991) argue, PMT's assumption that subjects have not already developed a coping mechanism is one of its limitations. Finally, the sample does not include users from China, the second most populated country: DuckDuckGo is blocked in China, so it was not feasible to include Chinese users in this study.

Practical implications

The proposed model and, specifically, the subjective norms construct proved to be successful in predicting PPSE use. This study demonstrates the need for PPSE to exhibit and advertise the technology and measures they use to protect users’ privacy. This will contribute to the effort to persuade internet users to use these tools.

Social implications

This study sought to explore the privacy attitudes of search engine users through PMT and the association of its constructs with subjective norms. It used PMT to elucidate the perceptions that motivate users toward privacy-protective behavior, as well as how these perceptions influence the type of search engine they use. This research is a first step toward gaining a better understanding of the processes that drive people's motivation to protect, or not to protect, their privacy online by means of using PPSEs. At the same time, this study is useful to search engine vendors, revealing that users need to be persuaded not only of vendors' privacy policies but also through new diffusion strategies that could enhance the use of PPSEs.

Originality/value

This research is a first step toward gaining a better understanding of the processes that drive people’s motivation to, or not to, protect their privacy online by means of using PPSEs.

Details

Information & Computer Security, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 2056-4961


Open Access
Article
Publication date: 22 March 2024

Geming Zhang, Lin Yang and Wenxiang Jiang

The purpose of this study is to introduce the top-level design ideas and the overall architecture of earthquake early-warning system for high speed railways in China, which is…

Abstract

Purpose

The purpose of this study is to introduce the top-level design ideas and the overall architecture of the earthquake early-warning system for high-speed railways in China, which is based on P-wave earthquake early warning and multiple modes of rapid treatment.

Design/methodology/approach

The paper describes the key technologies that are involved in the development of the system, such as P-wave identification and earthquake early-warning, multi-source seismic information fusion and earthquake emergency treatment technologies. The paper also presents the test results of the system, which show that it has complete functions and its major performance indicators meet the design requirements.

Findings

The study demonstrates that the high-speed railway earthquake early-warning system serves as an important technical tool for high-speed railways to cope with the threat earthquakes pose to operational safety. The key technical indicators of the system perform excellently: the first report time of the P-wave is less than three seconds; from the first arrival of the P-wave to the beginning of train braking, the total delay of onboard emergency treatment is 3.63 seconds at 95% probability; and the average total delay for power failures triggered by substations is 3.3 seconds.

Originality/value

The paper provides a valuable reference for the research and development of earthquake early-warning system for high speed railways in other countries and regions. It also contributes to the earthquake prevention and disaster reduction efforts.

Article
Publication date: 8 March 2024

Feng Zhang, Youliang Wei and Tao Feng

GraphQL is a new Open API specification that allows clients to send queries and obtain data flexibly according to their needs. However, a high-complexity GraphQL query may lead to…

Abstract

Purpose

GraphQL is a new Open API specification that allows clients to send queries and obtain data flexibly according to their needs. However, a high-complexity GraphQL query may lead to an excessive data volume of the query result, which causes problems such as resource overload of the API server. Therefore, this paper aims to address this issue by predicting the response data volume of a GraphQL query statement.

Design/methodology/approach

This paper proposes a GraphQL response data volume prediction approach based on Code2Vec and AutoML. First, a GraphQL query statement is transformed into a path collection of an abstract syntax tree based on the idea of Code2Vec, and then the query is aggregated into a vector with the fixed length. Finally, the response result data volume is predicted by a fully connected neural network. To further improve the prediction accuracy, the prediction results of embedded features are combined with the field features and summary features of the query statement to predict the final response data volume by the AutoML model.
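A drastically simplified sketch of the path-extraction idea described above (not the authors' implementation: the query representation, the hashing trick and the vector dimensionality are assumptions standing in for the learned Code2Vec embedding and the downstream AutoML model):

```python
def ast_paths(node, prefix=()):
    """Enumerate root-to-leaf field paths of a (simplified) GraphQL
    query represented as nested dicts: {field: subselection or None}."""
    paths = []
    for field, sub in node.items():
        if sub:
            paths.extend(ast_paths(sub, prefix + (field,)))
        else:
            paths.append(prefix + (field,))
    return paths

def path_vector(paths, dim=16):
    """Aggregate the path collection into a fixed-length vector by
    hashing each path into one of `dim` buckets (a crude stand-in
    for the learned embedding that would feed the regressor)."""
    vec = [0.0] * dim
    for p in paths:
        vec[hash("/".join(p)) % dim] += 1.0
    return vec

# Toy query: { repository { issues { title comments { body } } } }
query = {"repository": {"issues": {"title": None, "comments": {"body": None}}}}
paths = ast_paths(query)
```

In the paper, the fixed-length vector is produced by a trained model rather than hashing, and is combined with field and summary features before the final AutoML prediction; this sketch only shows the query-to-path-collection step.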

Findings

Experiments on two public GraphQL API data sets, GitHub and Yelp, show that the accuracy of the proposed approach is 15.85% and 50.31% higher than existing GraphQL response volume prediction approaches based on machine learning techniques, respectively.

Originality/value

This paper proposes an approach that combines Code2Vec and AutoML for GraphQL query response data volume prediction with higher accuracy.

Details

International Journal of Web Information Systems, vol. 20 no. 3
Type: Research Article
ISSN: 1744-0084


Article
Publication date: 13 September 2022

Haixiao Dai, Phong Lam Nguyen and Cat Kutay

Digital learning systems are crucial for education and data collected can analyse students learning performances to improve support. The purpose of this study is to design and…

Abstract

Purpose

Digital learning systems are crucial for education, and the data collected can be analysed to understand students' learning performance and improve support. The purpose of this study is to design and build an asynchronous hardware and software system that can store data on a local device until it is able to share them. It was developed for staff and students at a university who have limited internet access in areas such as the remote Northern Territory. The system can asynchronously link users' devices and the central server at the university over an unstable internet connection.

Design/methodology/approach

A Learning Box has been built based on a minicomputer and a web learning management system (LMS). This study presents different options for creating such a system and discusses various approaches to data syncing. The final setup is a Moodle (Modular Object-Oriented Dynamic Learning Environment) LMS on a Raspberry Pi, which provides a Wi-Fi hotspot. The authors worked with lecturers from X University who work in remote Northern Territory regions to test this and provide feedback. This study also considered suitable data collection and analysis techniques to support learning analysis by the staff.

Findings

The resultant system has been tested in various scenarios to ensure it is robust when students’ submissions are collected. Furthermore, issues around student familiarity and ability to use online systems have been considered due to early feedback.

Research limitations/implications

Monitoring asynchronous collaborative learning systems through analytics can assist students' learning in their own time. Learning Hubs can be easily set up and maintained using now readily available microcomputers. A phone interface is sufficient for learning when video and audio submissions are supported in the LMS.

Practical implications

This study shows that digital learning can be implemented in an offline environment by using a Raspberry Pi as the LMS server. Offline collaborative learning in remote communities can be achieved by applying asynchronous data-syncing techniques, and asynchronous data syncing can be achieved reliably by using change logs and an incremental syncing technique.
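The change-log-based incremental syncing mentioned above can be sketched as follows (a hypothetical illustration; the log-entry format, operations and function name are assumptions, not the authors' implementation):

```python
def incremental_sync(server_state, change_log, last_synced):
    """Apply only the log entries recorded after the last successful
    sync; return the updated state and the new sync cursor, so a
    dropped connection can resume without re-sending everything."""
    state = dict(server_state)
    cursor = last_synced
    for entry in change_log:
        if entry["seq"] <= last_synced:
            continue  # already applied in an earlier sync
        if entry["op"] == "put":
            state[entry["key"]] = entry["value"]
        elif entry["op"] == "delete":
            state.pop(entry["key"], None)
        cursor = max(cursor, entry["seq"])
    return state, cursor

# Toy change log recorded on the local device while offline
log = [
    {"seq": 1, "op": "put", "key": "quiz1", "value": "submitted"},
    {"seq": 2, "op": "put", "key": "quiz2", "value": "draft"},
    {"seq": 3, "op": "delete", "key": "quiz2"},
]
state, cursor = incremental_sync({}, log, last_synced=0)
```

Because each entry carries a sequence number, re-running the sync with the saved cursor is a no-op, which is what makes the technique robust over an unstable connection.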

Social implications

Focus on audio and video submission allows engagement in higher education by students with lower literacy but higher practice skills. Curriculum that clearly supports the level of learning required for a job needs to be developed, and the assumption that literacy is part of the skilled job in the workplace needs to be removed.

Originality/value

To the best of the authors’ knowledge, this is the first remote asynchronous collaborative LMS environment that has been implemented. This provides the hardware and software for opportunities to share learning remotely. Material to support low literacy students is also included.

Details

Interactive Technology and Smart Education, vol. 21 no. 1
Type: Research Article
ISSN: 1741-5659

