Big Data analytics for prediction: parallel processing of the big learning base with the possibility of improving the final result of the prediction

Laouni Djafri (Department of Computer Science, Djillali Liabes University, EEDIS laboratory -univ-SBA-Algeria, Sidi-Bel-Abbes, Algeria)
Djamel Amar Bensaber (Superior School of Computer Science, LabRI laboratory -ESI-SBA-Algeria, Sidi Bel-Abbes, Algeria)
Reda Adjoudj (Department of Computer Science, Djillali Liabes University, EEDIS laboratory -univ-SBA-Algeria, Sidi Bel-Abbes, Algeria)

Information Discovery and Delivery

ISSN: 2398-6247

Article publication date: 20 August 2018

Abstract

Purpose

This paper aims to address the volume, veracity and velocity problems of Big Data analytics for prediction by raising the prediction result to an acceptable level in the shortest possible time.

Design/methodology/approach

This paper is divided into two parts. The first part aims to improve the prediction result and proposes two ideas: a double pruning enhanced random forest algorithm, and a shared learning base extracted by stratified random sampling so that the learning base is representative of all the original data. The second part proposes a distributed architecture, supported by new technology solutions, that works coherently and efficiently with the sampling strategy under the supervision of the Map-Reduce algorithm.
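As a rough illustration of the stratified random sampling step mentioned above, the following Python sketch draws the same fraction of records from every class so that the sample keeps the class proportions of the full data set. It is a minimal sketch under stated assumptions, not the authors' implementation: the function name `stratified_sample`, the `fraction` and `seed` parameters, and the toy data are all hypothetical.

```python
import random
from collections import defaultdict


def stratified_sample(records, label_of, fraction, seed=0):
    """Draw roughly the same fraction of records from every class (stratum).

    Hypothetical helper for illustration only; it is not taken from the paper.
    """
    rng = random.Random(seed)

    # Group the records by class label: one stratum per label.
    strata = defaultdict(list)
    for rec in records:
        strata[label_of(rec)].append(rec)

    sample = []
    for label in sorted(strata):
        members = strata[label]
        # Keep at least one record per stratum so no class disappears.
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample


# Toy data set (assumed): 1000 records, 25% labelled "yes" and 75% "no".
data = [(i, "yes" if i % 4 == 0 else "no") for i in range(1000)]

# A 10% stratified sample stands in for the "shared learning base".
shared_base = stratified_sample(data, label_of=lambda r: r[1], fraction=0.1)
```

Because each stratum is sampled at the same rate, the 25/75 class split of the toy data is preserved in `shared_base`, which is the property that makes such a sample representative of the original data.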

Findings

The representative learning base, obtained by integrating two learning bases (the partial base and the shared base), gives an excellent representation of the original data set and very good results for Big Data predictive analytics. These results were further supported by the improved random forest supervised learning method, which played a key role in this context.

Originality/value

The approach concerns all companies, especially those that hold large amounts of information and want to screen it to improve their knowledge of their customers and optimize their campaigns.

Citation

Djafri, L., Amar Bensaber, D. and Adjoudj, R. (2018), "Big Data analytics for prediction: parallel processing of the big learning base with the possibility of improving the final result of the prediction", Information Discovery and Delivery, Vol. 46 No. 3, pp. 147-160. https://doi.org/10.1108/IDD-02-2018-0002

Publisher

Emerald Publishing Limited

Copyright © 2018, Emerald Publishing Limited