Search results

1 – 10 of 643
Content available
Article
Publication date: 10 June 2019

Eric M. Meyers

Details

Information and Learning Sciences, vol. 120 no. 5/6
Type: Research Article
ISSN: 2398-5348

Open Access
Article
Publication date: 15 February 2022

Martin Nečaský, Petr Škoda, David Bernhauer, Jakub Klímek and Tomáš Skopal

Abstract

Purpose

Semantic retrieval and discovery of datasets published as open data remains a challenging task. The datasets inherently originate in the globally distributed web jungle, lacking the luxury of centralized database administration, database schemas, shared attributes, vocabulary, structure and semantics. The existing dataset catalogs provide basic search functionality relying on keyword search in brief, incomplete or misleading textual metadata attached to the datasets. The search results are thus often insufficient. However, there exist many ways of improving dataset discovery by employing content-based retrieval, machine learning tools, third-party (external) knowledge bases, countless feature extraction methods and description models and so forth.

Design/methodology/approach

In this paper, the authors propose a modular framework for rapid experimentation with methods for similarity-based dataset discovery. The framework consists of an extensible catalog of components prepared to form custom pipelines for dataset representation and discovery.
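As an illustration of this pipeline-style composition, the following is a minimal sketch. The component names and interfaces (Dataset, Representer, bag_of_words_representer, cosine_similarity, discover) are hypothetical and do not reproduce the authors' framework, whose prototype is available on GitHub; the sketch only mimics the separation of representation and discovery steps that an extensible catalog of components would allow.

```python
# Illustrative sketch only: a minimal component-based pipeline for
# similarity-based dataset discovery. All names below are hypothetical
# and are NOT taken from the authors' framework.
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Dataset:
    identifier: str
    metadata_text: str  # e.g. title, description and keywords concatenated

# A "representation" component maps a dataset to a feature vector.
Representer = Callable[[Dataset], List[float]]

def bag_of_words_representer(vocabulary: List[str]) -> Representer:
    """Build a simple term-frequency representation over a fixed vocabulary."""
    def represent(dataset: Dataset) -> List[float]:
        tokens = dataset.metadata_text.lower().split()
        return [float(tokens.count(term)) for term in vocabulary]
    return represent

def cosine_similarity(a: List[float], b: List[float]) -> float:
    """A 'similarity' component comparing two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sum(x * x for x in a) ** 0.5
    norm_b = sum(y * y for y in b) ** 0.5
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def discover(query: Dataset, catalog: List[Dataset],
             represent: Representer, top_k: int = 5) -> List[str]:
    """A 'discovery' component: rank catalog datasets by similarity to the query."""
    query_vec = represent(query)
    scored = [(cosine_similarity(query_vec, represent(d)), d.identifier)
              for d in catalog]
    return [identifier for _, identifier in sorted(scored, reverse=True)[:top_k]]
```

Swapping a different representer or similarity function into `discover` is the kind of rapid experimentation such a modular pipeline is meant to support.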

Findings

The study presents several proof-of-concept pipelines, together with an experimental evaluation, which showcase the usage of the framework.

Originality/value

To the best of the authors’ knowledge, there is no similar formal framework for experimentation with various similarity methods in the context of dataset discovery. The framework aims to establish a platform for reproducible and comparable research in the area of dataset discovery. The prototype implementation of the framework is available on GitHub.

Details

Data Technologies and Applications, vol. 56 no. 4
Type: Research Article
ISSN: 2514-9288

Content available
Book part
Publication date: 3 September 2020

Details

Cultural Competence in Higher Education
Type: Book
ISBN: 978-1-78769-772-0

Content available
Book part
Publication date: 22 May 2017

Jürgen Deters

Details

Global Leadership Talent Management
Type: Book
ISBN: 978-1-78714-543-6

Content available
Book part
Publication date: 16 August 2016

Details

University Partnerships for Academic Programs and Professional Development
Type: Book
ISBN: 978-1-78635-299-6

Content available
Book part
Publication date: 16 August 2021

Details

Intercultural Management in Practice
Type: Book
ISBN: 978-1-83982-827-0

Open Access
Article
Publication date: 17 October 2019

Qiong Bu, Elena Simperl, Adriane Chapman and Eddy Maddalena

Abstract

Purpose

Ensuring quality is one of the most significant challenges in microtask crowdsourcing. Aggregating the data collected from the crowd is an important step in inferring the correct answer, but existing studies appear to be limited to single-step tasks. This study aims to examine multiple-step classification tasks and to understand aggregation in such cases, which is useful for assessing classification quality.

Design/methodology/approach

The authors present a model that captures the workflow, questions and answers of both single- and multiple-question classification tasks. They adapt the classic approach so that the model can handle tasks with several multiple-choice questions in general, rather than being tied to a specific domain or a specific hierarchical classification. They evaluate their approach on three representative tasks from existing citizen science projects for which an expert-created gold standard is available.
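As context for the aggregation step, the following is a minimal sketch of per-question majority voting over a multi-step workflow. Majority voting is only a classic baseline; the authors' adapted model, data format and evaluation tasks are not reproduced here, and the response layout and names below are assumptions made for the example.

```python
# Illustrative sketch only: step-wise majority voting for a multi-question
# classification item. This is a classic baseline, not the authors' method.
from collections import Counter
from typing import Dict, List

# One worker's response to an item: question id -> chosen option,
# e.g. {"q1": "bird", "q2": "flying"} for a two-step task (assumed layout).
Response = Dict[str, str]

def aggregate_item(responses: List[Response]) -> Dict[str, str]:
    """Infer one answer per question by taking the most frequent option."""
    votes: Dict[str, Counter] = {}
    for response in responses:
        for question, option in response.items():
            votes.setdefault(question, Counter())[option] += 1
    return {question: counter.most_common(1)[0][0]
            for question, counter in votes.items()}

if __name__ == "__main__":
    crowd = [
        {"q1": "bird", "q2": "flying"},
        {"q1": "bird", "q2": "perched"},
        {"q1": "bat", "q2": "flying"},
    ]
    print(aggregate_item(crowd))  # {'q1': 'bird', 'q2': 'flying'}
```

The example run yields one inferred answer per step of the workflow, which is the quantity an aggregation method for multi-step tasks must produce.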

Findings

The results show that the approach can provide significant improvements to the overall classification accuracy. The authors’ analysis also demonstrates that all algorithms can achieve higher accuracy for the volunteer- versus paid-generated data sets for the same task. Furthermore, the authors observed interesting patterns in the relationship between the performance of different algorithms and workflow-specific factors including the number of steps and the number of available options in each step.

Originality/value

Due to the nature of crowdsourcing, aggregating the collected data is an important step in understanding the quality of crowdsourcing results. Different inference algorithms have been studied for simple microtasks consisting of a single question with two or more answers. As classification tasks typically contain many questions, however, the proposed method can be applied to a wider range of tasks, including both single- and multiple-question classification tasks.

Details

International Journal of Crowd Science, vol. 3 no. 3
Type: Research Article
ISSN: 2398-7294

Content available
Book part
Publication date: 3 April 2020

Chris Brown

Details

The Networked School Leader
Type: Book
ISBN: 978-1-83867-722-0

Content available
Book part
Publication date: 20 June 2017

David Shinar

Details

Traffic Safety and Human Behavior
Type: Book
ISBN: 978-1-78635-222-4

Content available
Book part
Publication date: 3 August 2020

Details

Leadership Strategies for Promoting Social Responsibility in Higher Education
Type: Book
ISBN: 978-1-83909-427-9
