Search results

1 – 10 of 186
Article
Publication date: 4 January 2022

Mohammad Moradi and Mohammad Reza Keyvanpour

Abstract

Purpose

Image annotation plays an important role in the image retrieval process, especially when it comes to content-based image retrieval. To compensate for the intrinsic weakness of machines in performing the cognitive task of (human-like) image annotation, leveraging humans’ knowledge and abilities in the form of crowdsourcing-based annotation has gained momentum. Among various approaches for this purpose, an innovative one is integrating the annotation process into the CAPTCHA workflow. This paper investigates the current state of research in the field and presents an experimental efficiency analysis of this approach.

Design/methodology/approach

First, to report on the current state of research in the field, a comprehensive literature review is provided. Then, several experiments and statistical analyses are conducted to investigate how reliable, accurate and efficient CAPTCHA-based image annotation is.

Findings

In addition to surveying current trends and best practices for CAPTCHA-based image annotation, the study shows experimentally that, despite some intrinsic limitations of the CAPTCHA as a crowdsourcing platform, CAPTCHA-based image annotation can be more efficient than traditional approaches when the challenge, i.e. the annotation task, is selected and designed appropriately. Nonetheless, there are several design considerations that should be taken into account when the CAPTCHA is used as an image annotation platform.

Originality/value

To the best of the authors’ knowledge, this is the first study to analyze different aspects of the titular topic through exploration of the literature and experimental investigation. Therefore, it is anticipated that the outcomes of this study can draw a roadmap for not only CAPTCHA-based image annotation but also CAPTCHA-mediated crowdsourcing and even image annotation.

Details

Aslib Journal of Information Management, vol. 74 no. 3
Type: Research Article
ISSN: 2050-3806

Article
Publication date: 6 November 2017

Chih-Ming Chen and Ming-Yueh Tsay

Abstract

Purpose

Collaboratively annotating digital texts allows users to add valued information, share ideas and create knowledge. Most importantly, annotated content can help users obtain a deeper and broader understanding of a text compared to digital content without annotations. This work proposes a novel collaborative annotation system (CAS) with four types of multimedia annotations (text, picture, voice and video) that can be embedded into any HTML Web page. The CAS enables users to collaboratively add and manage annotations on these pages and provides a mechanism for discussing shared annotations among multiple users. By applying the CAS in a mashup on static HTML Web pages, this study aims to discuss the applications of the CAS in digital curation, crowdsourcing and digital humanities and to strengthen the existing relations among them.

Design/methodology/approach

This work adopted asynchronous JavaScript and XML (Ajax) and a model-view-controller framework to implement a CAS with reading annotation tools for knowledge creation, archiving and sharing services, and applied the implemented CAS to support digital curation, crowdsourcing and digital humanities. A questionnaire survey was used to investigate the ideas and satisfaction of visitors who attended a digital curation exhibition with CAS support, along three item dimensions: interactivity with displayed products, attraction, and content absorption effect. To collect qualitative data that the questionnaire survey might not reveal, semi-structured interviews were performed at the end of the digital curation exhibition activity. The effects of crowdsourcing and digital humanities with CAS support on collecting and organizing ideas and opinions about historical events and on promoting humanities research outcomes were left as future work because investigating them takes a long time.

Findings

Based on the questionnaire survey, this work found that digital curation with CAS support received its highest rating on the attraction effect dimension. The result shows that applying the CAS to support digital curation is practicable, novel and interesting to visitors. This work also successfully applied the developed CAS to crowdsourcing and digital humanities, which may open new ground for both research fields.

Originality/value

Based on the CAS, this work developed three things: a novel digital curation approach that visitors rated highly on attraction effect; an innovative crowdsourcing platform that combines with a digital archive system to efficiently gather collective intelligence for the difficult problem of identifying digital archive contents; and a high-potential digital humanities research mode that helps humanities scholars annotate texts on an ancient map with distinct interpretations and viewpoints and discuss them with other scholars to stimulate discussion on further issues.

Details

The Electronic Library, vol. 35 no. 6
Type: Research Article
ISSN: 0264-0473

Article
Publication date: 13 December 2022

Chengxi Yan, Xuemei Tang, Hao Yang and Jun Wang

Abstract

Purpose

The majority of existing studies about named entity recognition (NER) concentrate on the prediction enhancement of deep neural network (DNN)-based models themselves, but the issues about the scarcity of training corpus and the difficulty of annotation quality control are not fully solved, especially for Chinese ancient corpora. Therefore, designing a new integrated solution for Chinese historical NER, including automatic entity extraction and man-machine cooperative annotation, is quite valuable for improving the effectiveness of Chinese historical NER and fostering the development of low-resource information extraction.

Design/methodology/approach

The research provides a systematic approach for Chinese historical NER with a three-stage framework. In addition to the stage of basic preprocessing, the authors create, retrain and yield a high-performance NER model using only limited labeled resources during the stage of augmented deep active learning (ADAL), which entails three steps: DNN-based NER modeling, hybrid pool-based sampling (HPS) based on active learning (AL), and NER-oriented data augmentation (DA). ADAL is intended to keep the performance of the DNN as high as possible under the few-shot constraint. Then, to realize machine-aided quality control in crowdsourcing settings, the authors design a stage of globally-optimized automatic label consolidation (GALC). The core of GALC is a newly-designed label consolidation model called simulated annealing-based automatic label aggregation (“SA-ALC”), which incorporates the factors of worker reliability and global label estimation. The model is designed to assure the annotation quality of data from a crowdsourcing annotation system.
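
To make the idea of reliability-aware label consolidation more concrete, the following is a minimal sketch and not the SA-ALC model itself: it swaps simulated annealing for a plain iterative scheme that alternates a reliability-weighted vote with a re-estimate of each worker's agreement rate. The function name `consolidate_labels`, the tuple layout and the sample labels are all hypothetical and do not come from the paper.

```python
from collections import defaultdict


def consolidate_labels(annotations, rounds=5):
    """Consolidate crowd labels with a reliability-weighted vote.

    annotations: list of (item_id, worker_id, label) tuples (hypothetical layout).
    Returns (consensus, reliability): one label per item, one score per worker.
    This is a simple iterative heuristic, not simulated annealing.
    """
    workers = {worker for _, worker, _ in annotations}
    reliability = {worker: 1.0 for worker in workers}  # trust everyone equally at first
    consensus = {}
    for _ in range(rounds):
        # Step 1: each item takes the label with the largest total worker weight.
        votes = defaultdict(lambda: defaultdict(float))
        for item, worker, label in annotations:
            votes[item][label] += reliability[worker]
        # Iterating labels in sorted order makes tie-breaking deterministic.
        consensus = {
            item: max(sorted(tally), key=tally.get)
            for item, tally in votes.items()
        }
        # Step 2: a worker's reliability is the share of their labels
        # that agree with the current consensus.
        agree = defaultdict(int)
        total = defaultdict(int)
        for item, worker, label in annotations:
            total[worker] += 1
            if label == consensus[item]:
                agree[worker] += 1
        reliability = {worker: agree[worker] / total[worker] for worker in workers}
    return consensus, reliability


# Hypothetical entity-type labels from three workers on three text spans.
sample = [
    ("t1", "a", "PER"), ("t1", "b", "PER"), ("t1", "c", "LOC"),
    ("t2", "a", "LOC"), ("t2", "b", "LOC"), ("t2", "c", "LOC"),
    ("t3", "a", "ORG"), ("t3", "b", "ORG"), ("t3", "c", "PER"),
]
labels, scores = consolidate_labels(sample)
```

In this toy data, worker `c` disagrees with the other two on two of three spans, so its weight drops after the first round and it has less influence on later votes. A global optimizer such as the one the abstract describes would search over label assignments instead of greedily voting.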

Findings

Extensive experiments on two types of Chinese classical historical datasets show that the authors’ solution can effectively reduce the corpus dependency of a DNN-based NER model and alleviate the problem of label quality. Moreover, the results also show the superior performance of the authors’ pipeline approaches (i.e. HPS + DA and SA-ALC) compared to equivalent baselines in each stage.

Originality/value

The study sheds new light on the automatic extraction of Chinese historical entities in an all-technological-process integration. The solution helps to effectively reduce the annotation cost and control the labeling quality for the NER task. It can be further applied to similar tasks of information extraction and other low-resource fields in theoretical and practical ways.

Details

Aslib Journal of Information Management, vol. 75 no. 3
Type: Research Article
ISSN: 2050-3806

Open Access
Article
Publication date: 16 August 2019

Morteza Moradi, Mohammad Moradi, Farhad Bayat and Adel Nadjaran Toosi

Abstract

Purpose

Human or machine, which one is more intelligent and powerful for performing computing and processing tasks? Over the years, researchers and scientists have spent significant amounts of money and effort to answer this question. Nonetheless, despite some outstanding achievements, replacing humans in intellectual tasks is not yet a reality. Instead, to compensate for the weakness of machines in some (mostly cognitive) tasks, the idea of putting humans in the loop has been introduced and widely accepted. This paper introduces the notion of collective hybrid intelligence as a new computing framework.

Design/methodology/approach

Building on the wide acceptance and efficiency of crowdsourcing, hybrid intelligence and distributed computing concepts, the authors have come up with the (complementary) idea of collective hybrid intelligence. In this regard, besides providing a brief review of the efforts made in the related contexts, conceptual foundations and building blocks of the proposed framework are delineated. Moreover, some discussion of architectural and realization issues is presented.

Findings

The paper describes the conceptual architecture, workflow and schematic representation of a new hybrid computing concept. Moreover, by introducing three sample scenarios, its benefits, requirements, practical roadmap and architectural notes are explained.

Originality/value

The major contribution of this work is introducing the conceptual foundations to combine and integrate the collective intelligence of humans and machines to achieve higher efficiency and (computing) performance. To the best of the authors’ knowledge, this is the first study in which such an integration is considered. Therefore, it is believed that the proposed computing concept could inspire researchers toward realizing such unprecedented possibilities in practical and theoretical contexts.

Details

International Journal of Crowd Science, vol. 3 no. 2
Type: Research Article
ISSN: 2398-7294

Article
Publication date: 7 January 2020

Xuanhui Zhang, Si Chen, Yuxiang Chris Zhao, Shijie Song and Qinghua Zhu

Abstract

Purpose

The purpose of this paper is to explore how social value orientation and domain knowledge affect cooperation levels and transcription quality in crowdsourced manuscript transcription, and contribute to the recruitment of participants in such projects in practice.

Design/methodology/approach

The authors conducted a quasi-experiment using Transcribe-Sheng, which is a well-known crowdsourced manuscript transcription project in China, to investigate the influences of social value orientation and domain knowledge. The experiment lasted one month and involved 60 participants. ANOVA was used to test the research hypotheses. Moreover, interviews and thematic analyses were conducted to analyze the qualitative data in order to provide additional insights.

Findings

The analysis confirmed that in crowdsourced manuscript transcription, social value orientation has a significant effect on participants’ cooperation level and transcription quality; domain knowledge has a significant effect on participants’ transcription quality, but not on their cooperation level. The results also reveal the interactive effect of social value orientation and domain knowledge on cooperation levels and quality of transcription. The analysis of the qualitative data illustrated the influences of social value orientation and domain knowledge on crowdsourced manuscript transcription in detail.

Originality/value

Researchers have paid little attention to the impacts of the psychological and cognitive factors on crowdsourced manuscript transcription. This study investigated the effect of social value orientation and the combined effect of social value orientation and domain knowledge in this context. The findings shed light on crowdsourcing transcription initiatives in the cultural heritage domain and can be used to facilitate participant selection in such projects.

Details

Aslib Journal of Information Management, vol. 72 no. 2
Type: Research Article
ISSN: 2050-3806

Article
Publication date: 25 March 2020

Jihong Liang, Hao Wang and Xiaojing Li

Abstract

Purpose

The purpose of this paper is to explore the task design and assignment of full-text generation on mass Chinese historical archives (CHAs) by crowdsourcing, with special attention paid to how to best divide full-text generation tasks into smaller ones assigned to crowdsourced volunteers and to improve the digitization of mass CHAs and the data-oriented processing of the digital humanities.

Design/methodology/approach

This paper starts from the complexities of character recognition of mass CHAs, takes the Sheng Xuanhuai archives crowdsourcing project of Shanghai Library as a case study, and uses theories of archival science, including the diplomatics of Chinese archival documents and the historical approach of Chinese archival traditions, as its theoretical basis and analysis methods. The results are derived from this combined analysis.

Findings

This paper points out that volunteer tasks of full-text generation include transcription, punctuation, proofreading, metadata description, segmentation, and attribute annotation in digital humanities and provides a metadata element set for volunteers to use in creating or revising metadata descriptions and also provides an attribute tag set. The two sets can be used across the humanities to construct overall observations about texts and the archives of which they are a part. Along these lines, this paper presents significant insights for application in outlining the principles, methods, activities, and procedures of crowdsourced full-text generation for mass CHAs.

Originality/value

This study is the first to explore and identify the effective design and allocation of tasks for crowdsourced volunteers completing full-text generation on CHAs in digital humanities.

Details

Aslib Journal of Information Management, vol. 72 no. 2
Type: Research Article
ISSN: 2050-3806

Article
Publication date: 26 August 2014

Giannis Skevakis, Chrisa Tsinaraki, Ioanna Trochatou and Stavros Christodoulakis

Abstract

Purpose

This paper aims to describe MoM-NOCS, a Framework and a System that support communities with common interests in nature to capture and share multimedia observations of nature objects or events using mobile devices.

Design/methodology/approach

The observations are automatically associated with contextual metadata that allow them to be visualized on top of 2D or 3D maps. The observations are managed by a multimedia management system, and annotated by the same and/or other users with common interests. Annotations made by the crowd support the knowledge distillation of the data and data provenance processes in the system.

Findings

MoM-NOCS is complementary and interoperable with systems that are managed by natural history museums like MMAT (Makris et al., 2013) and biodiversity metadata management systems like BIOCASE (BioCASE) and GBIF (GBIF) so that they can link to interesting observations in the system, and the statistics of the observations that they manage can be visualized by the software.

Originality/value

The Framework offers rich functionality for visualizing the observations made by the crowd as a function of time.

Details

International Journal of Pervasive Computing and Communications, vol. 10 no. 3
Type: Research Article
ISSN: 1742-7371

Article
Publication date: 11 October 2021

Changro Lee

Abstract

Purpose

Sampling taxpayers for audits has always been a major concern for policymakers of tax administration. The purpose of this study is to propose a systematic method to select a small number of taxpayers with a high probability of tax fraud.

Design/methodology/approach

An efficient sampling method for taxpayers for an audit is investigated in the context of a property acquisition tax. An autoencoder, a popular unsupervised learning algorithm, is applied to 2,228 tax returns, and reconstruction errors are calculated to determine the probability of tax deficiencies for each return. The reasonableness of the estimated reconstruction errors is verified using the Apriori algorithm, a well-known marketing tool for identifying patterns in purchased item sets.
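
The selection step described above can be sketched as follows, assuming the reconstructions have already been produced by some trained autoencoder (training is not shown here). The function names, the use of mean squared error as the reconstruction error, and the three toy "returns" are illustrative assumptions and are not taken from the study.

```python
def reconstruction_errors(originals, reconstructions):
    """Mean squared error between each input row and its reconstruction."""
    errors = []
    for row, rebuilt in zip(originals, reconstructions):
        squared = [(a - b) ** 2 for a, b in zip(row, rebuilt)]
        errors.append(sum(squared) / len(row))
    return errors


def select_for_audit(ids, originals, reconstructions, k):
    """Return the k (id, error) pairs with the largest reconstruction error.

    A large error means the model reproduces the row poorly, i.e. the row
    looks unusual relative to the bulk of the data.
    """
    errors = reconstruction_errors(originals, reconstructions)
    ranked = sorted(zip(ids, errors), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Hypothetical, already-scaled feature rows and their reconstructions.
ids = ["r1", "r2", "r3"]
originals = [[1.0, 2.0], [0.5, 0.5], [3.0, 1.0]]
reconstructions = [[1.0, 2.0], [0.0, 1.5], [2.0, 1.0]]
shortlist = select_for_audit(ids, originals, reconstructions, k=2)
```

Sorting by error and keeping only the top `k` rows mirrors the goal stated in the abstract of auditing a small number of returns; how the threshold `k` is chosen in practice is outside this sketch.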

Findings

The sorted reconstruction scores are reasonably consistent with actual fraudulent/non-fraudulent cases, indicating that the reconstruction errors can be utilized to select suspected taxpayers for an audit in a cost-effective manner.

Originality/value

The proposed deep learning-based approach is expected to be applied in a real-world tax administration, promoting voluntary compliance of taxpayers, and reinforcing the self-assessing acquisition tax system.

Details

Data Technologies and Applications, vol. 56 no. 3
Type: Research Article
ISSN: 2514-9288

Open Access
Article
Publication date: 16 April 2019

Mohammad Moradi

Abstract

Purpose

As a relatively new computing paradigm, crowdsourcing has gained enormous attention in the recent decade. Its compliance with the Web 2.0 principles, also, puts forward unprecedented opportunities to empower the related services and mechanisms by leveraging humans’ intelligence and problem solving abilities. With respect to the pivotal role of search engines in the Web and information community, this paper aims to investigate the advantages and challenges of incorporating people – as intelligent agents – into search engines’ workflow.

Design/methodology/approach

To emphasize the role of the human in computational processes, some specific and related areas are studied. Then, through studying the current trends in the field of crowd-powered search engines and analyzing the actual needs and requirements, the perspectives and challenges are discussed.

Findings

As the research on this topic is still in its infancy, it is believed that this study can be considered as a roadmap for future works in the field. In this regard, current status and development trends are delineated through providing a general overview of the literature. Moreover, several recommendations for extending the applicability and efficiency of next generation of crowd-powered search engines are presented. In fact, becoming aware of different aspects and challenges of constructing search engines of this kind can shed light on the way of developing working systems with respect to essential considerations.

Originality/value

The present study aimed to portray the big picture of crowd-powered search engines and their possible challenges and issues. As one of the early works to provide a comprehensive report on different aspects of the topic, it can be regarded as a reference point.

Details

International Journal of Crowd Science, vol. 3 no. 1
Type: Research Article
ISSN: 2398-7294

Open Access
Article
Publication date: 18 January 2022

Srinimalan Balakrishnan Selvakumaran and Daniel Mark Hall

Abstract

Purpose

The purpose of this paper is to investigate the feasibility of an end-to-end simplified and automated reconstruction pipeline for digital building assets using the design science research approach. Current methods to create digital assets by capturing the state of existing buildings can provide high accuracy but are time-consuming, expensive and difficult.

Design/methodology/approach

Using design science research, this research identifies the need for a crowdsourced and cloud-based approach to reconstruct digital building assets. The research then develops and tests a fully functional smartphone application prototype. The proposed end-to-end smartphone workflow begins with data capture and ends with user applications.

Findings

The resulting implementation can achieve a realistic three-dimensional (3D) model characterized by different typologies, minimal trade-off in accuracy and low processing costs. By crowdsourcing the images, the proposed approach can reduce costs for asset reconstruction by an estimated 93% compared to manual modeling and 80% compared to locally processed reconstruction algorithms.

Practical implications

The resulting implementation achieves “good enough” reconstruction of as-is 3D models with minimal tradeoffs in accuracy compared to automated approaches and 15× cost savings compared to a manual approach. Potential facility management use cases include issue and information tracking, 3D mark-up and multi-model configurators.
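
Cost savings are quoted both as percentages and as a multiple, so a small conversion helper can make the two units comparable. This is only an arithmetic sketch: the function names are hypothetical, and it assumes a percentage reduction and a savings factor are measured against the same baseline cost.

```python
def savings_factor(percent_reduction):
    """Convert a percentage cost reduction into a 'times cheaper' factor."""
    if not 0 <= percent_reduction < 100:
        raise ValueError("percent_reduction must be in [0, 100)")
    return 100.0 / (100.0 - percent_reduction)


def percent_reduction(factor):
    """Convert a 'times cheaper' factor back into a percentage reduction."""
    if factor < 1:
        raise ValueError("factor must be at least 1")
    return 100.0 * (1.0 - 1.0 / factor)


manual_factor = savings_factor(93)   # roughly 14.3 times cheaper
local_factor = savings_factor(80)    # 5 times cheaper
fifteen_as_percent = percent_reduction(15)  # roughly 93.3 percent
```

Under that assumption, a 93% reduction corresponds to a factor of about 14.3, which is in line with a rounded figure of 15×.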

Originality/value

Through user engagement, development, testing and validation, this work demonstrates the feasibility and impact of a novel crowdsourced and cloud-based approach for the reconstruction of digital building assets.

Details

Journal of Facilities Management, vol. 20 no. 3
Type: Research Article
ISSN: 1472-5967
