Search results

Open Access
Article
Publication date: 14 August 2017

Xiu Susie Fang, Quan Z. Sheng, Xianzhi Wang, Anne H.H. Ngu and Yihong Zhang

Abstract

Purpose

This paper aims to propose a system for generating actionable knowledge from Big Data and use this system to construct a comprehensive knowledge base (KB), called GrandBase.

Design/methodology/approach

In particular, this study extracts new predicates from four types of data sources, namely, Web texts, Document Object Model (DOM) trees, existing KBs and query streams, to augment the ontology of the existing KB (i.e. Freebase). In addition, a graph-based approach for better truth discovery on multi-valued predicates is proposed.

Findings

Empirical studies demonstrate the effectiveness of the approaches presented in this study and the potential of GrandBase. Future research directions regarding GrandBase construction and extension are also discussed.

Originality/value

To revolutionize our modern society by using the wisdom of Big Data, a considerable number of KBs have been constructed to feed massive knowledge-driven applications with Resource Description Framework triples. The important challenges for KB construction include extracting information from large-scale, possibly conflicting and differently structured data sources (i.e. the knowledge extraction problem) and reconciling the conflicts that reside in the sources (i.e. the truth discovery problem). Tremendous research effort has been devoted to both problems. However, the existing KBs are far from being comprehensive and accurate: first, existing knowledge extraction systems retrieve data from limited types of Web sources; second, existing truth discovery approaches commonly assume each predicate has only one true value. In this paper, the focus is on the problem of generating actionable knowledge from Big Data. A system is proposed, which consists of two phases, namely, knowledge extraction and truth discovery, to construct a broader KB, called GrandBase.

Details

PSU Research Review, vol. 1 no. 2
Type: Research Article
ISSN: 2399-1747

Content available
Article
Publication date: 17 April 2007

Details

Online Information Review, vol. 31 no. 2
Type: Research Article
ISSN: 1468-4527

Open Access
Article
Publication date: 6 March 2017

Zhuoxuan Jiang, Chunyan Miao and Xiaoming Li

Abstract

Purpose

Recent years have witnessed the rapid development of massive open online courses (MOOCs). With more and more courses being produced by instructors and attended by learners all over the world, an unprecedented volume of educational resources has been aggregated. The educational resources include videos, subtitles, lecture notes, quizzes, etc., on the teaching side, and forum contents, Wiki, log of learning behavior, log of homework, etc., on the learning side. However, the data are both unstructured and diverse. To facilitate knowledge management and mining on MOOCs, extracting keywords from the resources is important. This paper aims to adapt state-of-the-art techniques to MOOC settings and evaluate their effectiveness on real data. In terms of practice, this paper also tries to answer, for the first time, two questions: to what extent MOOC resources can support keyword extraction models, and how much human effort is required to make the models work well.

Design/methodology/approach

Based on which side generates the data, i.e. instructors or learners, the data are classified into teaching resources and learning resources, respectively. The approach used on teaching resources is based on machine learning models with labels, while the approach used on learning resources is based on a graph model without labels.

Findings

From the teaching resources, the methods used by the authors can accurately extract keywords with only 10 per cent labeled data. The authors find that resources of various forms, e.g. subtitles and PPTs, should be considered separately because they differ in modeling ability. From the learning resources, the keywords extracted from MOOC forums are not as domain-specific as those extracted from teaching resources, but they can reflect the topics that are actively discussed in forums. Instructors can then use these keywords as feedback. The authors implement two applications with the extracted keywords: generating a concept map and generating a learning path. The visual demos show they have the potential to improve learning efficiency when they are integrated into a real MOOC platform.

Research limitations/implications

Conducting keyword extraction on MOOC resources is quite difficult because teaching resources are hard to obtain owing to copyright. Also, getting labeled data is difficult because expertise in the corresponding domain is usually required.

Practical implications

The experimental results indicate that MOOC resources are good enough for building keyword extraction models, and that an acceptable balance between human effort and model accuracy can be achieved.

Originality/value

This paper presents a pioneering study on keyword extraction on MOOC resources and obtains some new findings.

Details

International Journal of Crowd Science, vol. 1 no. 1
Type: Research Article
ISSN: 2398-7294

Open Access
Article
Publication date: 22 October 2019

Renato Ribeiro Nogueira Ferraz, Marcus Vinícius Cesso da Silva, Renan Antônio da Silva and Luc Quoniam

Abstract

Purpose

The purpose of this paper is to present the use of a free code computational tool, Patent2net, in searching for patents for the implementation of distance learning aimed at Continuing Medical Education.

Design/methodology/approach

This technical report is based on the extraction, organization and presentation, in the form of graphs and dynamic tables, of information on patents on the subject made available in the Espacenet database.

Findings

As a result, it was possible to identify a Chinese patent, free for reproduction in Brazil, which describes an e-learning system that simulates 3D scenarios for training nursing teams.

Research limitations/implications

The paper used a single patent database, albeit one containing more than 100 million documents.

Practical implications

The selected patent can contribute to improving the care and behavioral techniques of health professionals.

Social implications

The training of health professionals can improve the public and supplementary health systems.

Originality/value

This is the first paper in which the technometric analysis of patents was used to solve a problem regarding the training of health professionals.

Details

Revista de Gestão, vol. 27 no. 1
Type: Research Article
ISSN: 2177-8736

Open Access
Article
Publication date: 15 February 2022

Martin Nečaský, Petr Škoda, David Bernhauer, Jakub Klímek and Tomáš Skopal

Abstract

Purpose

Semantic retrieval and discovery of datasets published as open data remains a challenging task. The datasets inherently originate in the globally distributed web jungle, lacking the luxury of centralized database administration, database schemas, shared attributes, vocabulary, structure and semantics. The existing dataset catalogs provide basic search functionality relying on keyword search in brief, incomplete or misleading textual metadata attached to the datasets. The search results are thus often insufficient. However, there are many ways of improving dataset discovery, such as content-based retrieval, machine learning tools, third-party (external) knowledge bases, and a wide range of feature extraction methods and description models.

Design/methodology/approach

In this paper, the authors propose a modular framework for rapid experimentation with methods for similarity-based dataset discovery. The framework consists of an extensible catalog of components prepared to form custom pipelines for dataset representation and discovery.

Findings

The study proposes several proof-of-concept pipelines including experimental evaluation, which showcase the usage of the framework.

Originality/value

To the best of the authors’ knowledge, there is no similar formal framework for experimentation with various similarity methods in the context of dataset discovery. The framework aims to establish a platform for reproducible and comparable research in the area of dataset discovery. The prototype implementation of the framework is available on GitHub.

Details

Data Technologies and Applications, vol. 56 no. 4
Type: Research Article
ISSN: 2514-9288

Content available
Article
Publication date: 23 January 2009

Details

Library Hi Tech News, vol. 26 no. 1/2
Type: Research Article
ISSN: 0741-9058

Content available
Article
Publication date: 7 August 2009

Details

Library Hi Tech News, vol. 26 no. 7
Type: Research Article
ISSN: 0741-9058

Content available
Article
Publication date: 4 July 2008

Details

Library Hi Tech News, vol. 25 no. 6
Type: Research Article
ISSN: 0741-9058

Content available

Details

Library Hi Tech News, vol. 16 no. 9/10
Type: Research Article
ISSN: 0741-9058

Content available
Book part
Publication date: 30 July 2018

Details

Marketing Management in Turkey
Type: Book
ISBN: 978-1-78714-558-0
