Search results

1 – 10 of 733
Article
Publication date: 8 September 2023

Oussama Ayoub, Christophe Rodrigues and Nicolas Travers

Abstract

Purpose

This paper aims to manage the word gap in information retrieval (IR), especially for long documents belonging to specific domains. With the continuous growth of the text data that modern IR systems have to manage, solutions are needed to efficiently find the best set of documents for a given request. The words used to describe a query can differ from those used in related documents. Even when the meanings are close, nonoverlapping words are challenging for IR systems. This word gap becomes significant for long documents from specific domains.

Design/methodology/approach

To generate new words for a document, a deep learning (DL) masked language model is used to infer related words. The DL models used are pretrained on massive text data and carry general or domain-specific knowledge that helps produce a better document representation.
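The idea of inferring related words with a masked language model can be illustrated with a minimal sketch. It is not the authors' implementation: the strategy of masking each word in turn, the `expand_document` function, the `per_slot` parameter and the hand-written `toy_predictor` lookup table are all assumptions made for illustration. A real system would replace `toy_predictor` with a pretrained model.

```python
from typing import Callable, List

MASK = "[MASK]"

# Hypothetical predictor type: takes a text containing one MASK token and
# returns candidate words for that slot, best first.
MaskPredictor = Callable[[str], List[str]]


def expand_document(text: str, predict: MaskPredictor, per_slot: int = 2) -> List[str]:
    """Return new words inferred for `text` by masking each word in turn."""
    words = text.split()
    known = {w.lower() for w in words}
    added: List[str] = []
    for i in range(len(words)):
        masked = " ".join(words[:i] + [MASK] + words[i + 1:])
        # Keep only candidates that are not already in the document.
        for candidate in predict(masked)[:per_slot]:
            if candidate.lower() not in known:
                known.add(candidate.lower())
                added.append(candidate)
    return added


def toy_predictor(masked_text: str) -> List[str]:
    """Stand-in for a pretrained model: a tiny hand-written lookup table."""
    table = {
        "[MASK] attack symptoms": ["heart", "cardiac"],
        "heart [MASK] symptoms": ["attack", "failure"],
        "heart attack [MASK]": ["symptoms", "signs"],
    }
    return table.get(masked_text, [])


new_words = expand_document("heart attack symptoms", toy_predictor)
print(new_words)
```

The added words could then be indexed alongside the original text, so that a query using "cardiac" can still match a document that only says "heart".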

Findings

The authors evaluate the approach on long documents from specific IR domains to show that the proposed model is generic, and achieve encouraging results.

Originality/value

In this paper, to the best of the authors’ knowledge, an original unsupervised and modular IR system based on recent DL methods is introduced.

Details

International Journal of Web Information Systems, vol. 19 no. 5/6
Type: Research Article
ISSN: 1744-0084

Article
Publication date: 31 August 2023

Faycal Touazi and Amel Boustil

Abstract

Purpose

The purpose of this paper is to address the need for new approaches in locating items that closely match user preference criteria due to the rise in data volume of knowledge bases resulting from Open Data initiatives. Specifically, the paper focuses on evaluating qualitative preference queries over user preferences in SPARQL.

Design/methodology/approach

The paper outlines a novel approach for handling SPARQL preference queries by representing preferences through symbolic weights using the possibilistic logic (PL) framework. This approach allows for the management of symbolic weights without relying on numerical values, using a partial ordering system instead. The paper compares this approach with numerous other approaches, including those based on skylines, fuzzy sets and conditional preference networks.
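To make the notion of ranking with symbolic weights under a partial order concrete, here is a small sketch. It does not implement possibilistic logic inference or the paper's SPARQL extension. It only shows how items can be compared when their weights are symbols related by a "stronger than" relation rather than numbers. The weight names, the item names and both functions are hypothetical.

```python
from itertools import product
from typing import Dict, List, Set, Tuple


def transitive_closure(pairs: Set[Tuple[str, str]]) -> Set[Tuple[str, str]]:
    """Close a 'stronger-than' relation under transitivity."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        # Iterate over a snapshot so the set can grow inside the loop.
        for (a, b), (c, d) in product(list(closure), repeat=2):
            if b == c and (a, d) not in closure:
                closure.add((a, d))
                changed = True
    return closure


def best_items(items: Dict[str, str], stronger: Set[Tuple[str, str]]) -> List[str]:
    """Keep items whose weight is not dominated by any other item's weight."""
    order = transitive_closure(stronger)
    return sorted(
        name for name, weight in items.items()
        if not any((other, weight) in order for other in items.values())
    )


# Assumed example: a is stronger than b, b is stronger than c, d is incomparable.
stronger = {("a", "b"), ("b", "c")}
items = {"hotel1": "a", "hotel2": "c", "hotel3": "d", "hotel4": "b"}
top = best_items(items, stronger)
print(top)
```

Because `d` is incomparable to the other weights, `hotel3` stays among the best answers alongside `hotel1`. A purely numerical ranking could not express this.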

Findings

The paper highlights the advantages of the proposed approach, which enables the representation of preference criteria through symbolic weights and qualitative considerations. This approach offers a more intuitive way to convey preferences and manage rankings.

Originality/value

The paper demonstrates the usefulness and originality of the proposed SPARQL language in the PL framework. The approach extends SPARQL by incorporating symbolic weights and qualitative preferences.

Details

International Journal of Web Information Systems, vol. 19 no. 5/6
Type: Research Article
ISSN: 1744-0084

Article
Publication date: 6 November 2023

Daniel Coughlin, Andrew Dudash and Jacob Gordon

Abstract

Purpose

The purpose of this paper is to investigate the feasibility of automating Google Scholar searching to harvest citation data of monographs for collection analysis.

Design/methodology/approach

This study discusses the creation and refinement of a Scraper application programming interface query structure created to match library collection inventories to their Google Scholar listings to retrieve citation counts.

Findings

This paper indicates that Google Scholar is a feasible and usable tool for retrieving monograph citation data.

Originality/value

This study shows that Google Scholar citation data can be harvested for monographs in an automated fashion to serve as a source of bibliographic data, something not typically done outside of individual academics and writers tracking their personal academic impact factors.

Details

Library Hi Tech News, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 0741-9058

Article
Publication date: 30 August 2023

Yi-Hung Liu, Sheng-Fong Chen and Dan-Wei (Marian) Wen

Abstract

Purpose

Online medical repositories provide a platform for users to share information and dynamically access abundant electronic health data. It is important to determine whether case report information can assist the general public in appropriately managing their diseases. Therefore, this paper aims to introduce a novel deep learning-based method that allows non-professionals to make inquiries using ordinary vocabulary, retrieving the most relevant case reports for accurate and effective health information.

Design/methodology/approach

The dataset of case reports was collected from both the patient-generated research network and the digital medical journal repository. To enhance the accuracy of obtaining relevant case reports, the authors propose a retrieval approach that combines BERT and BiLSTM methods. The authors identified representative health-related case reports and analyzed the retrieval performance, as well as user judgments.

Findings

This study aims to provide the necessary functionalities to deliver relevant health case reports based on input from ordinary terms. The proposed framework includes features for health management, user feedback acquisition and ranking by weights to obtain the most pertinent case reports.

Originality/value

This study contributes to health information systems by analyzing patients' experiences and treatments with the case report retrieval model. The results of this study can provide immense benefit to the general public who intend to find treatment decisions and experiences from relevant case reports.

Details

Aslib Journal of Information Management, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 2050-3806

Open Access
Article
Publication date: 23 May 2023

Kimmo Kettunen, Heikki Keskustalo, Sanna Kumpulainen, Tuula Pääkkönen and Juha Rautiainen

Abstract

Purpose

This study aims to identify user perception of different qualities of optical character recognition (OCR) in texts. The purpose of this paper is to study the effect of different quality OCR on users' subjective perception through an interactive information retrieval task with a collection of one digitized historical Finnish newspaper.

Design/methodology/approach

This study is based on the simulated work task model used in interactive information retrieval. Thirty-two users searched an article collection of the Finnish newspaper Uusi Suometar 1869–1918, which consists of ca. 1.45 million autosegmented articles. The article search database had two versions of each article with different quality OCR. Each user performed six pre-formulated and six self-formulated short queries and subjectively evaluated the top 10 results using a graded relevance scale of 0–3. Users were not informed about the OCR quality differences of the otherwise identical articles.

Findings

The main result of the study is that improved OCR quality affects subjective user perception of historical newspaper articles positively: higher relevance scores are given to better-quality texts.

Originality/value

To the best of the authors’ knowledge, this simulated interactive work task experiment is the first one showing empirically that users' subjective relevance assessments are affected by a change in the quality of an optically read text.

Details

Journal of Documentation, vol. 79 no. 7
Type: Research Article
ISSN: 0022-0418

Article
Publication date: 18 March 2024

Raj Kumar Bhardwaj, Ritesh Kumar and Mohammad Nazim

Abstract

Purpose

This paper evaluates the precision of four metasearch engines (MSEs), namely DuckDuckGo, Dogpile, Metacrawler and Startpage, to determine which exhibits the highest level of precision and is therefore most likely to return the most relevant search results.

Design/methodology/approach

The research is divided into two phases: the first involves four queries categorized into two segments (4-Q-2-S), while the second includes six queries divided into three segments (6-Q-3-S). The queries vary in complexity and fall into three types: simple, phrase and complex. For every evaluated metasearch engine, the precision, the average precision and the presence of duplicates are determined.
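The three measures named above can be sketched as follows. The abstract does not give the exact definitions used in the study, so these are the standard IR forms, offered as an assumption. In particular, average precision here is normalized by the number of relevant results retrieved, and duplicates are detected by exact URL match. All function names are illustrative.

```python
from typing import List, Sequence


def precision(relevant_flags: Sequence[bool]) -> float:
    """Fraction of retrieved results judged relevant."""
    return sum(relevant_flags) / len(relevant_flags) if relevant_flags else 0.0


def average_precision(relevant_flags: Sequence[bool]) -> float:
    """Mean of precision@k over the ranks k that hold a relevant result."""
    hits = 0
    total = 0.0
    for rank, is_relevant in enumerate(relevant_flags, start=1):
        if is_relevant:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


def duplicate_count(urls: List[str]) -> int:
    """Number of results whose URL already appeared earlier in the list."""
    return len(urls) - len(set(urls))


# Assumed judgments for one query: results at ranks 1, 3 and 4 are relevant.
flags = [True, False, True, True]
print(precision(flags))            # 3 relevant out of 4 retrieved
print(average_precision(flags))    # (1/1 + 2/3 + 3/4) / 3
print(duplicate_count(["a", "b", "a"]))
```

Average precision rewards engines that place relevant results early, so two engines with the same precision can still differ on it.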

Findings

The study clearly demonstrated that Startpage returned the most relevant results and achieved the highest precision (0.98) among the four MSEs. Conversely, DuckDuckGo exhibited consistent performance across both phases of the study.

Research limitations/implications

The study only evaluated four metasearch engines, which may not be representative of all available metasearch engines. Additionally, a limited number of queries were used, which may not be sufficient to generalize the findings to all types of queries.

Practical implications

The findings of this study can be valuable for accreditation agencies in managing duplicates, improving their search capabilities and obtaining more relevant and precise results. These findings can also assist users in selecting the best metasearch engine based on precision rather than interface.

Originality/value

The study is the first of its kind to evaluate these four metasearch engines. No similar study has been conducted in the past to measure the performance of metasearch engines.

Details

Performance Measurement and Metrics, vol. 25 no. 1
Type: Research Article
ISSN: 1467-8047

Article
Publication date: 19 April 2024

Hui-Min Lai, Shin-Yuan Hung and David C. Yen

Abstract

Purpose

Seekers who visit professional virtual communities (PVCs) are usually motivated by knowledge-seeking, which is a complex cognitive process. How do seekers search for knowledge, and how is their search linked to prior knowledge or PVC situation factors? From the cognitive process and interactional psychology perspectives, this study investigated the three-way interactions between seekers’ expertise, task complexity, and perceptions of PVC features (i.e. knowledge quality and system quality) on knowledge-seeking strategies and resultant outcomes.

Design/methodology/approach

A field experiment was conducted with 119 seekers in a PVC using a 2 × 2 factorial design of seekers’ expertise (i.e. expert versus novice) and task complexity (i.e. low versus high).

Findings

The study reveals three significant insights: (1) For a high-complexity task, experts adopt an ask-directed searching strategy, whereas novices adopt a browsing strategy; (2) For a high-complexity task, experts who perceive a high system quality are more likely than novices to adopt an ask-directed searching strategy; and (3) Task completion time and task quality are associated with the adoption of ask-directed searching strategies, whereas knowledge seekers’ satisfaction is more associated with the adoption of a browsing strategy.

Originality/value

We draw on the perspectives of cognitive process and interactional psychology to explore potential two- and three-way interactions of seekers’ expertise, task complexity, and PVC features on the adoption of knowledge-seeking strategies in a PVC context. Our findings provide deep insights into seekers’ behavior in a PVC, given the popularity of the search for knowledge in PVCs.

Details

Information Technology & People, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 0959-3845

Details

Technology vs. Government: The Irresistible Force Meets the Immovable Object
Type: Book
ISBN: 978-1-83867-951-4

Article
Publication date: 3 February 2023

Frendy and Fumiko Takeda

Abstract

Purpose

Partners are responsible for allocating audit tasks and facilitating knowledge sharing among team members. This study considers changes in the composition of partners to proxy for the continuity of the audit team. This study examines the effect of audit team continuity on audit outcomes (audit quality and report lags), pricing and its determinant (lead partner experience), which have not been thoroughly examined in previous studies.

Design/methodology/approach

This study employs string similarity metrics to measure audit team continuity. It estimates multivariate panel data regression models on a sample of 26,007 firm-years of listed Japanese companies from 2008 to 2019.
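A string similarity measure of team continuity could look like the sketch below. The abstract does not say which metric the study uses. The choice of `difflib.SequenceMatcher` from the Python standard library, the canonical team string and the partner names are therefore assumptions for illustration only.

```python
from difflib import SequenceMatcher
from typing import Iterable


def team_string(partners: Iterable[str]) -> str:
    """Canonical string for a team: sorted, lower-cased names joined by ';'."""
    return ";".join(sorted(p.strip().lower() for p in partners))


def continuity(prev_team: Iterable[str], curr_team: Iterable[str]) -> float:
    """Similarity in [0, 1] between consecutive years' partner line-ups."""
    return SequenceMatcher(None, team_string(prev_team), team_string(curr_team)).ratio()


# Hypothetical engagement teams in two consecutive years.
same_team = continuity(["Sato", "Suzuki"], ["Suzuki", "Sato"])
one_change = continuity(["Sato", "Suzuki"], ["Sato", "Tanaka"])
print(same_team, one_change)
```

Sorting the names before comparison makes the score independent of signing order, so an unchanged team scores 1.0 and any replacement of a partner lowers the score.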

Findings

The study reveals that audit team continuity is negatively associated with audit fees, regardless of the auditor’s size. This finding contributes to the existing literature by showing that audit team continuity represents one of the determinant factors of audit fee. For clients of large audit firms, companies with higher (lower) audit team continuity issue audit reports in less (more) time. The experience of lead partners is a strong predictor of audit team continuity, irrespective of audit firm size. Audit quality is not associated with audit team continuity for either large or small audit firms.

Originality/value

This study proposes and examines audit team continuity measures that employ string similarity metrics to quantify changes in the composition of partners in consecutive audit engagements. Audit team continuity expands upon the tenure of individual audit partners, which is commonly used in prior literature as a measure of client–partner relationships.

Details

Journal of Accounting Literature, vol. 45 no. 2
Type: Research Article
ISSN: 0737-4607

Article
Publication date: 11 September 2023

Ying Gao, Qiang Zhang, Xiaoran Wang, Yanmei Huang, Fanshuang Meng and Wan Tao

Abstract

Purpose

Currently, the Tang tomb mural cultural relic resources are presented in a multi-source and heterogeneous manner, with a lack of effective organization and sharing between resources. Therefore, this study aims to propose a multidimensional knowledge discovery solution for Tang tomb mural cultural relic resources.

Design/methodology/approach

Taking the Tang tomb murals collected by the Shaanxi History Museum as an example, the study first clarifies the relevant concepts of Tang tomb mural resources and, considering both dynamic and static dimensions, adopts a top-down approach to construct an ontology model of Tang tomb mural cultural relic resources. The actual case data is then imported into the Neo4j graph database according to the defined pattern hierarchy to complete the static organization of knowledge, and is presented in multimodal form for knowledge reasoning and retrieval. In addition, geographic information system (GIS) technology is used to dynamically display the spatiotemporal distribution of Tang tomb mural resources, and the distribution trend is analysed from a digital humanities perspective.

Findings

The multi-dimensional knowledge discovery of Tang tomb mural cultural relic resources can help establish the correlation and spatiotemporal relationship between resources, providing support for tasks such as semantic retrieval and navigation, knowledge discovery and visualization.

Originality/value

Taking the murals in the collection of the Shaanxi History Museum as an example, this study reveals potential knowledge associations in a static and intelligent way, achieving knowledge discovery and management of Tang tomb murals. It also dynamically presents the spatial distribution of Tang tomb murals through GIS technology, meeting the knowledge presentation needs of different users and opening up new ideas for the study of Tang tomb murals.

Details

The Electronic Library, vol. 42 no. 1
Type: Research Article
ISSN: 0264-0473
