Search results

1 – 10 of over 16000
Article
Publication date: 21 December 2021

Laouni Djafri

Abstract

Purpose

This work can be used as a building block in other settings such as GPU, Map-Reduce, Spark or others. DDPML can also be deployed on other distributed systems such as P2P networks, clusters, cloud computing or other technologies.

Design/methodology/approach

In the age of Big Data, companies want to benefit from large amounts of data. These data can help them understand their internal and external environment and anticipate associated phenomena, as the data turn into knowledge that can later be used for prediction. This knowledge becomes a great asset in companies' hands, which is precisely the objective of data mining. Because data and knowledge are now produced in large volumes and at a faster pace, the authors speak of Big Data mining. The proposed work therefore aims mainly at solving the problems of volume, veracity, validity and velocity when classifying Big Data using distributed and parallel processing techniques. The question raised in this work is how to make machine learning algorithms work in a distributed and parallel way at the same time without losing classification accuracy. To answer it, the authors propose a system called Dynamic Distributed and Parallel Machine Learning (DDPML) algorithms, built in two parts. In the first, the authors propose a distributed architecture controlled by a Map-Reduce algorithm, which in turn depends on a random sampling technique. The architecture is designed specifically for big data processing and operates coherently and efficiently with the sampling strategy proposed in this work. It also helps the authors verify the classification results obtained using the representative learning base (RLB). In the second part, the authors extract the representative learning base by sampling at two levels using the stratified random sampling method. The same sampling method is applied to extract the shared learning base (SLB), the partial learning base for the first level (PLBL1) and the partial learning base for the second level (PLBL2).
The experimental results show the efficiency of the proposed solution without significant loss in the classification results. In practical terms, the DDPML system is dedicated to big data mining and works effectively in distributed systems with a simple structure, such as client-server networks.
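As a rough illustration of the stratified random sampling mentioned above, the following Python sketch draws the same fraction from every class and applies the draw twice to mimic sampling at two levels. It is not the DDPML implementation: the function name `stratified_sample`, the toy data, the sampling fractions and the seeds are all assumptions made for this example, and the abstract does not say how the RLB, SLB, PLBL1 or PLBL2 are actually assembled.

```python
import random
from collections import defaultdict


def stratified_sample(records, label_of, fraction, seed=0):
    """Draw the same fraction from every class (stratum) of `records`.

    Hypothetical helper for illustration; not taken from the paper.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for rec in records:
        strata[label_of(rec)].append(rec)

    sample = []
    for label in sorted(strata):
        members = strata[label]
        # Keep at least one record per class so no class disappears.
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample


# Toy data (assumed): 900 records of class "a" and 100 of class "b".
data = [(i, "a") for i in range(900)] + [(i, "b") for i in range(900, 1000)]

# Two successive stratified draws, standing in for two sampling levels.
level1 = stratified_sample(data, lambda r: r[1], 0.5, seed=1)
level2 = stratified_sample(level1, lambda r: r[1], 0.2, seed=2)
```

Because each class is sampled separately, the 9:1 class ratio of the toy data is preserved at both levels, which is the property that makes a stratified sample a plausible stand-in for the full data set when training a classifier.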

Findings

The authors obtained very satisfactory classification results.

Originality/value

The DDPML system is specially designed to handle big data mining classification smoothly.

Details

Data Technologies and Applications, vol. 56 no. 4
Type: Research Article
ISSN: 2514-9288

Article
Publication date: 21 April 2020

Mohammed Anouar Naoui, Brahim Lejdel, Mouloud Ayad, Abdelfattah Amamra and Okba Kazar

Abstract

Purpose

The purpose of this paper is to propose a distributed deep learning architecture for smart cities in big data systems.

Design/methodology/approach

We propose a multilayer architecture to describe distributed deep learning for smart cities in big data systems. The components of our system are the Smart city layer, the big data layer and the deep learning layer. The Smart city layer is responsible for the Smart city components, its Internet of things, sensors and effectors, and their integration in the system. The big data layer concerns data characteristics and data distribution over the system. The deep learning layer is the model of our system and is responsible for data analysis.

Findings

We apply our proposed architecture to a Smart environment and to Smart energy. In the Smart environment, we study toluene forecasting in the Madrid Smart city. For Smart energy, we study wind energy forecasting in Australia. Our proposed architecture can reduce execution time and improve deep learning models such as Long Short-Term Memory (LSTM).

Research limitations/implications

This research needs to be extended to other deep learning models, such as convolutional neural networks and autoencoders.

Practical implications

Findings of the research will be helpful in Smart city architecture. It can provide a clear view into a Smart city, data storage and data analysis. The toluene forecasting in a Smart environment can help the decision-maker to ensure environmental safety. The Smart energy application of our proposed model can give a clear prediction of power generation.

Originality/value

The findings of this study are expected to contribute valuable information to decision-makers for a better understanding of the key to Smart city architecture and its relation with data storage, processing and data analysis.

Details

Smart and Sustainable Built Environment, vol. 10 no. 1
Type: Research Article
ISSN: 2046-6099

Article
Publication date: 30 October 2018

Anuoluwapo Ajayi, Lukumon Oyedele, Juan Manuel Davila Delgado, Lukman Akanbi, Muhammad Bilal, Olugbenga Akinade and Oladimeji Olawale

Abstract

Purpose

The purpose of this paper is to highlight the use of the big data technologies for health and safety risks analytics in the power infrastructure domain with large data sets of health and safety risks, which are usually sparse and noisy.

Design/methodology/approach

The study focuses on using big data frameworks to design a robust architecture for handling and analysing (exploratory and predictive analytics) accidents in power infrastructure. The designed architecture is based on a coherent health risk analytics lifecycle. A prototype of the architecture, interfacing various technology artefacts, was implemented in the Java language to predict the likelihood of health hazards occurring. A preliminary evaluation of the proposed architecture was carried out with a subset of objective data obtained from a leading UK power infrastructure company offering a broad range of power infrastructure services.

Findings

The proposed architecture was able to identify relevant variables and improve preliminary prediction accuracies and explanatory capacities. It has also enabled conclusions to be drawn regarding the causes of health risks. The results represent a significant improvement in terms of managing information on construction accidents, particularly in the power infrastructure domain.

Originality/value

This study carries out a comprehensive literature review to advance health and safety risk management in construction. It also highlights the inability of conventional technologies to handle unstructured and incomplete data sets for real-time analytics processing. The study proposes a technique in big data technology for finding complex patterns and establishing the statistical cohesion of hidden patterns for optimal future decision making.

Details

World Journal of Science, Technology and Sustainable Development, vol. 16 no. 1
Type: Research Article
ISSN: 2042-5945

Article
Publication date: 24 January 2018

David Maynard Gerrard, James Edward Mooney and Dave Thompson

Abstract

Purpose

The purpose of this paper is to consider how digital preservation system architectures will support business analysis of large-scale collections of preserved resources, and the use of Big Data analyses by future researchers.

Design/methodology/approach

This paper reviews the architecture of existing systems, then discusses experimental surveys of large digital collections using existing digital preservation tools at Big Data scales. Finally, it introduces the design of a proposed new architecture to work with Big Data volumes of preserved digital resources – also based upon experience of managing a collection of 30 million digital images.

Findings

Modern visualisation tools enable business analyses based on file-related metadata, but most currently available systems need more of this functionality “out-of-the-box”. Scalability of preservation architecture to Big Data volumes depends upon the ability to run preservation processes in parallel, so indexes that enable effective sub-division of collections are vital. Not all processes scale easily, and those that do not scale easily require complex management.
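The point about running preservation processes in parallel over an indexed, sub-divided collection can be sketched in Python as follows. This is only an illustration under stated assumptions: the index keyed by year, the item names, and the use of a SHA-256 checksum as the stand-in "preservation process" are all hypothetical and do not come from the paper.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def checksum_partition(items):
    """Stand-in preservation process: checksum every item in one partition."""
    return {name: hashlib.sha256(payload).hexdigest() for name, payload in items}


def run_in_parallel(index, workers=4):
    """Run the process over each index partition concurrently, then merge."""
    merged = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for result in pool.map(checksum_partition, index.values()):
            merged.update(result)
    return merged


# Hypothetical index that sub-divides a small collection by year.
index = {
    "2016": [("img-001", b"aaa"), ("img-002", b"bbb")],
    "2017": [("img-003", b"ccc")],
}
checksums = run_in_parallel(index)
```

The sketch works because each partition can be processed without reference to any other. A process that needs cross-partition state would not fit this pattern, which is consistent with the abstract's remark that not all processes scale easily.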

Practical implications

The complexities caused by scaling up to Big Data volumes can be seen as being at odds with preservation, where simplicity matters. However, the sustainability of preservation systems relates directly to their usefulness, and maintaining usefulness will increasingly depend upon being able to process digital resources at Big Data volumes. An effective balance between these conflicting situations must be struck.

Originality/value

Preservation systems are at a step-change as they move to Big Data scale architectures and respond to more technical research processes. This paper is a timely illustration of the state of play at this pivotal moment.

Details

Library Hi Tech, vol. 36 no. 3
Type: Research Article
ISSN: 0737-8831

Article
Publication date: 27 March 2020

Bokolo Anthony Jnr, Sobah Abbas Petersen, Dirk Ahlers and John Krogstie

Abstract

Purpose

Electric mobility as a service (eMaaS) is suggested as a possible solution to ease transportation and lessen environmental issues by providing a collaborative transport sharing infrastructure that is based on electric vehicles (EVs) such as electric cars, electric bicycles and so on. Accordingly, this study aims to propose a multi-tier architecture to support the collection, processing, analytics and usage of mobility data in providing eMaaS within smart cities. The architecture uses application programming interfaces to enable interoperability between the different infrastructures required for eMaaS and to allow multiple partners to exchange and share data for making decisions regarding electric mobility services.

Design/methodology/approach

Design science methodology based on a case study by interview was used to collect data from an infrastructure company in Norway to verify the applicability of the proposed multi-tier architecture.

Findings

Findings suggest that the architecture offers an approach for collecting, aggregating, processing and provisioning data originating from different sources to improve electric mobility in smart cities. More importantly, findings from this study provide guidance for municipalities and policymakers in improving electric mobility services. Moreover, the authors’ findings provide a practical data-driven mobility use case that can be used by transport companies in deploying eMaaS in smart cities.

Research limitations/implications

Data was collected from a single company in Norway; hence, the architecture needs to be verified further with data collected from other companies.

Practical implications

eMaaS operates on heterogeneous data, which are generated from EVs and used by citizens and stakeholders such as city administration, municipality transport providers, charging station providers and so on. Therefore, the proposed architecture enables the sharing and usage of generated data as openly available data to be used in creating value-added services to improve citizens’ quality of life and the viability of businesses.

Social implications

This study proposes the deployment of electric mobility to address the increased usage of vehicles, which contributes to environmental pollution that has a serious effect on citizens’ quality of life.

Originality/value

This study proposes a multi-tier architecture that stores, processes, analyzes and provides data and related services to improve electric mobility within smart cities. The multi-tier architecture aims to support and increase eMaaS operation of EVs toward improving transportation services for city transport operators and citizens for a sustainable transport and mobility system.

Article
Publication date: 9 October 2019

Elham Ali Shammar and Ammar Thabit Zahary

Abstract

Purpose

The Internet has radically changed the way people interact in the virtual world, in their careers and in their social relationships. IoT technology has added a new vision to this process by enabling connections between smart objects and humans, and also between smart objects themselves, which leads to anything, anytime, anywhere and any media communications. IoT allows objects to physically see, hear, think and perform tasks by making them talk to each other, share information and coordinate decisions. To enable this vision, IoT utilizes technologies such as ubiquitous computing, context awareness, RFID, WSN, embedded devices, CPS, communication technologies and internet protocols. IoT is considered to be the future internet, which is significantly different from the Internet we use today. The purpose of this paper is to provide up-to-date literature on trends in IoT research, which is driven by the need for convergence of several interdisciplinary technologies and new applications.

Design/methodology/approach

A comprehensive IoT literature review has been performed in this paper as a survey. The survey starts by providing an overview of IoT concepts, visions and evolutions. IoT architectures are also explored. Then, the most important components of IoT are discussed, including a thorough discussion of IoT operating systems such as TinyOS, Contiki OS, FreeRTOS and RIOT. A review of IoT applications is also presented and, finally, IoT challenges recently encountered by researchers are introduced.

Findings

Studies of IoT literature and projects show the disproportionate importance of technology in IoT projects, which are often driven by technological interventions rather than innovation in the business model. There are a number of serious concerns about the dangers of IoT growth, particularly in the areas of privacy and security; hence, industry and government have begun addressing these concerns. In the end, what makes IoT exciting is that we do not yet know the exact use cases that will significantly influence our lives.

Originality/value

This survey provides a comprehensive literature review on IoT techniques, operating systems and trends.

Details

Library Hi Tech, vol. 38 no. 1
Type: Research Article
ISSN: 0737-8831

Article
Publication date: 19 October 2015

Kun Chen, Xin Li and Huaiqing Wang

Abstract

Purpose

Although big data analytics has reaped great business rewards, big data system design and integration still face challenges resulting from the demanding environment, including challenges involving variety, uncertainty, and complexity. These characteristics in big data systems demand flexible and agile integration architectures. Furthermore, a formal model is needed to support design and verification. The purpose of this paper is to resolve the two problems with a collective intelligence (CI) model.

Design/methodology/approach

In the conceptual CI framework proposed by Schut (2010), a CI design should comprise a general model, which has a formal form for verification and validation, and a specific model, which is an implementable system architecture. After analyzing the requirements of system integration in big data environments, the authors apply the CI framework to resolve the integration problem. In the model instantiation, the authors use the multi-agent paradigm as the specific model and the hierarchical colored Petri Net (PN) as the general model.
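To give a feel for how a Petri Net can serve as a formal, simulable model, the following Python sketch implements the firing rule of a basic place/transition net. It deliberately omits colors and hierarchy, so it is far simpler than the hierarchical colored PN the authors use. The place names (`queued`, `idle_agent`, `done`) and the `analyze` transition are hypothetical and chosen only to suggest an agent picking up analytics jobs.

```python
def enabled(marking, transition):
    """A transition is enabled when every input place holds enough tokens."""
    return all(marking.get(p, 0) >= n for p, n in transition["consume"].items())


def fire(marking, transition):
    """Return the new marking after firing; the input marking is not mutated."""
    if not enabled(marking, transition):
        raise ValueError("transition is not enabled")
    new = dict(marking)
    for place, n in transition["consume"].items():
        new[place] = new.get(place, 0) - n
    for place, n in transition["produce"].items():
        new[place] = new.get(place, 0) + n
    return new


# Hypothetical net: one agent takes a queued job, finishes it, and is idle again.
analyze = {
    "consume": {"queued": 1, "idle_agent": 1},
    "produce": {"done": 1, "idle_agent": 1},
}

marking = {"queued": 3, "idle_agent": 1, "done": 0}
while enabled(marking, analyze):
    marking = fire(marking, analyze)
```

Stepping a marking forward like this is the basic mechanism behind simulating a net, for example to see how many jobs queue up behind a given number of agents.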

Findings

First, the multi-agent paradigm is a good implementation for reusing and integrating big data analytics modules in an agile and loosely coupled way. Second, the PN models provide effective simulation results in the system design period and give advice on business process design and workload balance control. Third, the CI framework provides an incrementally built and deployed method for system integration, which is especially suitable for the dynamic data analytics environment. These findings have both theoretical and managerial implications.

Originality/value

In this paper, the authors propose a CI framework, which includes both practical architectures and theoretical foundations, to solve the system integration problem in big data environments. It provides a new point of view for dynamically integrating large-scale modules in an organization. This paper also offers practical suggestions for Chief Technical Officers who want to employ big data technologies in their companies.

Details

Industrial Management & Data Systems, vol. 115 no. 9
Type: Research Article
ISSN: 0263-5577

Article
Publication date: 20 August 2018

Laouni Djafri, Djamel Amar Bensaber and Reda Adjoudj

Abstract

Purpose

This paper aims to solve the problems of big data analytics for prediction including volume, veracity and velocity by improving the prediction result to an acceptable level and in the shortest possible time.

Design/methodology/approach

This paper is divided into two parts. The first part aims to improve the result of the prediction. In this part, two ideas are proposed: the double pruning enhanced random forest algorithm, and extracting a shared learning base with the stratified random sampling method to obtain a representative learning base of all the original data. The second part proposes a distributed architecture supported by new technology solutions, which in turn works in a coherent and efficient way with the sampling strategy under the supervision of the Map-Reduce algorithm.

Findings

The representative learning base, obtained by integrating two learning bases (the partial base and the shared base), presents an excellent representation of the original data set and gives very good results for Big Data predictive analytics. Furthermore, these results were supported by the improved random forests supervised learning method, which played a key role in this context.

Originality/value

All companies are concerned, especially those that hold large amounts of information and want to screen it to improve their knowledge of the customer and optimize their campaigns.

Details

Information Discovery and Delivery, vol. 46 no. 3
Type: Research Article
ISSN: 2398-6247

Article
Publication date: 14 December 2018

Erion Çano and Maurizio Morisio

Abstract

Purpose

The fabulous results of convolutional neural networks in image-related tasks attracted the attention of text mining, sentiment analysis and other text analysis researchers. It is, however, difficult to find enough data for feeding such networks, to optimize their parameters and to make the right design choices when constructing network architectures. The purpose of this paper is to present the creation steps of two big data sets of song emotions. The authors also explore the usage of convolution and max-pooling neural layers on song lyrics, product and movie review text data sets. Three variants of a simple and flexible neural network architecture are also compared.

Design/methodology/approach

The intention was to spot any important patterns that can serve as guidelines for parameter optimization of similar models. The authors also wanted to identify architecture design choices which lead to high performing sentiment analysis models. To this end, the authors conducted a series of experiments with neural architectures of various configurations.

Findings

The results indicate that parallel convolutions of filter lengths up to 3 are usually enough for capturing relevant text features. Also, max-pooling region size should be adapted to the length of text documents for producing the best feature maps.
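The idea of parallel convolutions of filter lengths 1 to 3 followed by regional max-pooling can be illustrated with a small pure-Python sketch. It is not the authors' architecture: the two-dimensional token vectors, the hand-picked kernel weights and the pooling region of 2 are assumptions chosen so the arithmetic is easy to follow, and there is no activation function or training step.

```python
def conv1d(tokens, kernel):
    """Valid 1-D convolution over a sequence of token vectors.

    `kernel` holds one weight vector per position in the sliding window,
    so len(kernel) is the filter length.
    """
    width = len(kernel)
    out = []
    for start in range(len(tokens) - width + 1):
        window = tokens[start:start + width]
        out.append(sum(w * x
                       for wv, xv in zip(kernel, window)
                       for w, x in zip(wv, xv)))
    return out


def max_pool(values, region):
    """Keep the maximum of each consecutive block of `region` values."""
    return [max(values[i:i + region]) for i in range(0, len(values), region)]


def parallel_conv_features(tokens, kernels, region):
    """Apply every kernel to the same input and concatenate the pooled maps."""
    features = []
    for kernel in kernels:
        features.extend(max_pool(conv1d(tokens, kernel), region))
    return features


# Six hypothetical token embeddings of dimension 2.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5], [0.0, 2.0], [2.0, 0.0]]
kernels = [
    [[1.0, 1.0]],                          # filter length 1
    [[1.0, 0.0], [0.0, 1.0]],              # filter length 2
    [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]],  # filter length 3
]
features = parallel_conv_features(tokens, kernels, region=2)
# features == [1.0, 2.0, 2.0, 2.0, 2.5, 0.0, 3.0, 4.5]
```

The pooling region controls how long each feature map is for a given text length: a larger region on the same six tokens would yield fewer pooled values per filter. That is the knob the findings suggest adapting to document length.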

Originality/value

The authors obtained their top results with feature maps of lengths 6–18. A possible improvement for future neural network models for sentiment analysis is to predict the sentiment polarity of a document by aggregating predictions made on smaller excerpts of the entire text.

Details

Data Technologies and Applications, vol. 53 no. 1
Type: Research Article
ISSN: 2514-9288

Article
Publication date: 4 April 2016

Alain Yee Loong Chong, Boying Li, Eric W.T. Ngai, Eugene Ch'ng and Filbert Lee

Abstract

Purpose

The purpose of this paper is to investigate if online reviews (e.g. valence and volume), online promotional strategies (e.g. free delivery and discounts) and sentiments from user reviews can help predict product sales.

Design/methodology/approach

The authors designed a big data architecture and deployed Node.js agents for scraping the Amazon.com pages using asynchronous input/output calls. The completed web crawling and scraping data sets were then preprocessed for sentiment and neural network analysis. The neural network was employed to examine which variables in the study are important predictors of product sales.

Findings

This study found that although online reviews, online promotional strategies and online sentiments can all predict product sales, some variables are more important predictors than others. The authors found that the interplay effects of these variables are more important than the individual variables themselves. For example, the interactions of online volume with sentiments and discounts are more important than the individual predictors of discounts, sentiments or online volume.

Originality/value

This study designed a big data architecture, in combination with sentiment and neural network analysis, that can facilitate future business research for predicting product sales in an online environment. This study also employed a predictive analytic approach (e.g. neural network) to examine the variables, and this approach is useful for future data analysis in a big data environment where prediction can have more practical implications than significance testing. This study also examined the interplay between online reviews, sentiments and promotional strategies, which up to now have mostly been examined individually.

Details

International Journal of Operations & Production Management, vol. 36 no. 4
Type: Research Article
ISSN: 0144-3577
