Search results

1 – 10 of over 1000
Open Access
Article
Publication date: 14 August 2017

Xiu Susie Fang, Quan Z. Sheng, Xianzhi Wang, Anne H.H. Ngu and Yihong Zhang

Abstract

Purpose

This paper aims to propose a system for generating actionable knowledge from Big Data and use this system to construct a comprehensive knowledge base (KB), called GrandBase.

Design/methodology/approach

In particular, this study extracts new predicates from four types of data sources, namely, Web texts, Document Object Model (DOM) trees, existing KBs and query streams, to augment the ontology of an existing KB (i.e. Freebase). In addition, a graph-based approach for better truth discovery on multi-valued predicates is proposed.

Findings

Empirical studies demonstrate the effectiveness of the approaches presented in this study and the potential of GrandBase. Future research directions regarding GrandBase construction and extension have also been discussed.

Originality/value

To revolutionize our modern society by using the wisdom of Big Data, a considerable number of KBs have been constructed to feed massive knowledge-driven applications with Resource Description Framework triples. The important challenges for KB construction include extracting information from large-scale, possibly conflicting and differently structured data sources (i.e. the knowledge extraction problem) and reconciling the conflicts that reside in the sources (i.e. the truth discovery problem). Tremendous research efforts have been devoted to both problems. However, the existing KBs are far from being comprehensive and accurate: first, existing knowledge extraction systems retrieve data from limited types of Web sources; second, existing truth discovery approaches commonly assume each predicate has only one true value. In this paper, the focus is on the problem of generating actionable knowledge from Big Data. A system is proposed, which consists of two phases, namely, knowledge extraction and truth discovery, to construct a broader KB, called GrandBase.

Details

PSU Research Review, vol. 1 no. 2
Type: Research Article
ISSN: 2399-1747

Content available
Article
Publication date: 28 January 2014

Details

Program, vol. 48 no. 1
Type: Research Article
ISSN: 0033-0337

Open Access
Article
Publication date: 30 April 2014

Sonia Froufe, Mame Gningue and Charles–Henri Fredouet

Abstract

Due to the globalization of trade, hundreds of millions of containers pass through world ports every year. Such a situation is extremely challenging in terms of securing freight transport operations. However, costs and lead-times are still very important components of supply chains' performance models. Therefore, the drive for enhanced safety and security cannot come at the expense of these other two factors of competitiveness, and the processes implemented by the links of the global supply chain, including the maritime port, should aim at a joint optimization of trade facilitation and operational safety / security.

The research reported in this paper falls within the framework of this mixed performance requirement. More specifically, the paper presents a decision-support system dedicated to managing the risks associated with land and maritime container transportation; this system is based on the modeling of the knowledge of a group of experts, and covers the three phases of risk identification, assessment and avoidance / mitigation.

Abstract

Details

Aslib Journal of Information Management, vol. 75 no. 3
Type: Research Article
ISSN: 2050-3806

Open Access
Article
Publication date: 6 March 2017

Zhuoxuan Jiang, Chunyan Miao and Xiaoming Li

Abstract

Purpose

Recent years have witnessed the rapid development of massive open online courses (MOOCs). With more and more courses being produced by instructors and taken by learners all over the world, an unprecedented volume of educational resources has been aggregated. The educational resources include videos, subtitles, lecture notes, quizzes, etc., on the teaching side, and forum contents, Wiki, logs of learning behavior, logs of homework, etc., on the learning side. However, the data are both unstructured and diverse. To facilitate knowledge management and mining on MOOCs, extracting keywords from the resources is important. This paper aims to adapt state-of-the-art techniques to MOOC settings and evaluate their effectiveness on real data. In terms of practice, this paper also tries to answer, for the first time, two questions: to what extent can MOOC resources support keyword extraction models, and how much human effort is required to make the models work well.

Design/methodology/approach

Based on which side generates the data, i.e. instructors or learners, the data are classified into teaching resources and learning resources, respectively. The approach used on teaching resources is based on machine learning models with labels, while the approach used on learning resources is based on a graph model without labels.

Findings

From the teaching resources, the methods used by the authors can accurately extract keywords with only 10 per cent labeled data. The authors find that resources of various forms, e.g. subtitles and PPTs, should be considered separately because their modeling ability differs. From the learning resources, the keywords extracted from MOOC forums are not as domain-specific as those extracted from teaching resources, but they reflect the topics that are actively discussed in forums. Instructors can then obtain feedback from this indication. The authors implement two applications with the extracted keywords: generating a concept map and generating a learning path. The visual demos show that these applications have the potential to improve learning efficiency when integrated into a real MOOC platform.

Research limitations/implications

Conducting keyword extraction on MOOC resources is quite difficult because teaching resources are hard to obtain due to copyright. Also, getting labeled data is tough because expertise in the corresponding domain is usually required.

Practical implications

The experimental results indicate that MOOC resources are good enough for building keyword extraction models, and that an acceptable balance between human effort and model accuracy can be achieved.

Originality/value

This paper presents a pioneering study on keyword extraction from MOOC resources and obtains some new findings.

Details

International Journal of Crowd Science, vol. 1 no. 1
Type: Research Article
ISSN: 2398-7294

Open Access
Article
Publication date: 15 February 2022

Martin Nečaský, Petr Škoda, David Bernhauer, Jakub Klímek and Tomáš Skopal

Abstract

Purpose

Semantic retrieval and discovery of datasets published as open data remains a challenging task. The datasets inherently originate in the globally distributed web jungle, lacking the luxury of centralized database administration, database schemes, shared attributes, vocabulary, structure and semantics. The existing dataset catalogs provide basic search functionality relying on keyword search in brief, incomplete or misleading textual metadata attached to the datasets. The search results are thus often insufficient. However, there exist many ways of improving the dataset discovery by employing content-based retrieval, machine learning tools, third-party (external) knowledge bases, countless feature extraction methods and description models and so forth.

Design/methodology/approach

In this paper, the authors propose a modular framework for rapid experimentation with methods for similarity-based dataset discovery. The framework consists of an extensible catalog of components prepared to form custom pipelines for dataset representation and discovery.

Findings

The study proposes several proof-of-concept pipelines including experimental evaluation, which showcase the usage of the framework.

Originality/value

To the best of the authors’ knowledge, there is no similar formal framework for experimentation with various similarity methods in the context of dataset discovery. The framework has the ambition to establish a platform for reproducible and comparable research in the area of dataset discovery. The prototype implementation of the framework is available on GitHub.

Details

Data Technologies and Applications, vol. 56 no. 4
Type: Research Article
ISSN: 2514-9288

Open Access
Article
Publication date: 7 June 2018

Zhang Yanjie and Sun Hongbo

Abstract

Purpose

For many pattern recognition problems, the relation between the sample vectors and the class labels is known during the data acquisition procedure. However, finding the useful rules or knowledge hidden in the data is both important and challenging. Rule extraction methods are very useful in mining the important and heuristic knowledge hidden in the original high-dimensional data. They can help us to construct predictive models with few attributes of the data so as to provide valuable model interpretability and shorter training times.

Design/methodology/approach

In this paper, a novel rule extraction method with the application of biclustering algorithm is proposed.

Findings

To choose the most significant biclusters from the huge number of detected biclusters, a specially modified information entropy calculation method is also provided. It will be shown that all of the important knowledge is in practice hidden in these biclusters.

Originality/value

The novelty of the new method lies in the fact that the detected biclusters can be conveniently translated into if-then rules. It provides an intuitively explainable and comprehensive approach to extract rules from high-dimensional data while keeping high classification accuracy.

Details

International Journal of Crowd Science, vol. 2 no. 2
Type: Research Article
ISSN: 2398-7294

Open Access
Article
Publication date: 12 December 2023

Laura Lucantoni, Sara Antomarioni, Filippo Emanuele Ciarapica and Maurizio Bevilacqua

Abstract

Purpose

The Overall Equipment Effectiveness (OEE) is considered a standard for measuring equipment productivity in terms of efficiency. Still, Artificial Intelligence solutions are rarely used for analyzing OEE results and identifying corrective actions. Therefore, the approach proposed in this paper aims to provide a new rule-based Machine Learning (ML) framework for OEE enhancement and the selection of improvement actions.

Design/methodology/approach

Association Rules (ARs) are used as a rule-based ML method for extracting knowledge from huge data. First, the dominant loss class is identified and traditional methodologies are used with ARs for anomaly classification and prioritization. Once priority anomalies are selected, a detailed analysis is conducted to investigate their influence on the OEE loss factors using ARs and Network Analysis (NA). Then, a Deming Cycle is used as a roadmap for applying the proposed methodology, testing and implementing proactive actions by monitoring the OEE variation.
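
For readers unfamiliar with association rules, the following is a minimal sketch of the standard support, confidence and lift measures that such rules are usually ranked by. It is not the authors' implementation: the records, the item names (`jam`, `misfeed`, `availability_loss`, and so on) and the helper functions `support` and `rule_metrics` are hypothetical and chosen only for illustration.

```python
# Hypothetical toy data: each record is the set of anomalies and loss factors
# observed together (item names are illustrative, not taken from the paper).
records = [
    {"jam", "availability_loss"},
    {"jam", "availability_loss", "quality_loss"},
    {"misfeed", "performance_loss"},
    {"jam", "performance_loss"},
    {"misfeed", "performance_loss", "quality_loss"},
]


def support(itemset, records):
    """Fraction of records that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= record for record in records) / len(records)


def rule_metrics(antecedent, consequent, records):
    """Return (support, confidence, lift) of the rule antecedent -> consequent."""
    joint = support(set(antecedent) | set(consequent), records)
    ante = support(antecedent, records)
    cons = support(consequent, records)
    if ante == 0 or cons == 0:
        raise ValueError("antecedent and consequent must each occur at least once")
    confidence = joint / ante   # estimate of P(consequent | antecedent)
    lift = confidence / cons    # values above 1 suggest positive association
    return joint, confidence, lift


s, c, l = rule_metrics({"jam"}, {"availability_loss"}, records)
print(f"support={s:.2f} confidence={c:.2f} lift={l:.2f}")
```

In this toy data the rule `jam -> availability_loss` has a lift above 1, which is the kind of signal that would mark an anomaly as worth prioritizing; real analyses would apply minimum support and confidence thresholds over far larger data.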

Findings

The method proposed in this work has also been tested in an automotive company for framework validation and impact measuring. In particular, results highlighted that the rule-based ML methodology for OEE improvement addressed seven anomalies within a year through appropriate proactive actions: on average, each action has ensured an OEE gain of 5.4%.

Originality/value

The originality is related to the dual application of association rules in two different ways for extracting knowledge from the overall OEE. In particular, the co-occurrences of priority anomalies and their impact on asset Availability, Performance and Quality are investigated.

Details

International Journal of Quality & Reliability Management, vol. 41 no. 5
Type: Research Article
ISSN: 0265-671X

Open Access
Article
Publication date: 5 December 2023

Manuel J. Sánchez-Franco and Sierra Rey-Tienda

Abstract

Purpose

This research proposes to organise and distil this massive amount of data, making it easier to understand. Using data mining, machine learning techniques and visual approaches, researchers and managers can extract valuable insights (on guests' preferences) and convert them into strategic thinking based on exploration and predictive analysis. Consequently, this research aims to assist hotel managers in making informed decisions, thus improving the overall guest experience and increasing competitiveness.

Design/methodology/approach

This research employs natural language processing techniques, data visualisation proposals and machine learning methodologies to analyse unstructured guest service experience content. In particular, this research (1) applies data mining to evaluate the role and significance of critical terms and semantic structures in hotel assessments; (2) identifies salient tokens to depict guests' narratives based on term frequency and the information quantity they convey; and (3) tackles the challenge of managing extensive document repositories through automated identification of latent topics in reviews by using machine learning methods for semantic grouping and pattern visualisation.

Findings

This study’s findings (1) aim to identify critical features and topics that guests highlight during their hotel stays, (2) visually explore the relationships between these features and differences among diverse types of travellers through online hotel reviews and (3) determine predictive power. Their implications are crucial for the hospitality domain, as they provide real-time insights into guests' perceptions and business performance and are essential for making informed decisions and staying competitive.

Originality/value

This research seeks to minimise the cognitive processing costs of the enormous amount of content published by users through a better organisation of hotel service reviews and their visualisation. Likewise, this research aims to propose a methodology and method available to tourism organisations to obtain truly usable knowledge in the design of the hotel offer and its value propositions.

Details

Management Decision, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 0025-1747

Open Access
Article
Publication date: 6 December 2021

Anna Visvizi, Orlando Troisi, Mara Grimaldi and Francesca Loia

Abstract

Purpose

The study queries the drivers of innovation management in contemporary data-driven organizations/companies. It is argued that data-driven organizations that integrate a strategic orientation grounded in data, human abilities and proactive management are more effective in triggering innovation.

Design/methodology/approach

Research reported in this paper employs constructivist grounded theory, Gioia methodology, and the abductive approach. The data collected through semi-structured interviews administered to 20 Italian start-up founders are then examined.

Findings

The paper identifies the key enablers of innovation development in data-driven companies and reveals that data-driven companies may generate different innovation patterns depending on the kind of capabilities activated.

Originality/value

The study provides evidence of how the combination of data-driven culture, skills' enhancement and the promotion of human resources may boost the emergence of innovation.

Details

European Journal of Innovation Management, vol. 25 no. 6
Type: Research Article
ISSN: 1460-1060
