Search results

1 – 10 of over 19000
Article
Publication date: 20 October 2021

Sumeer Gul, Shohar Bano and Taseen Shah

Abstract

Purpose

Data mining along with its varied technologies like numerical mining, textual mining, multimedia mining, web mining, sentiment analysis and big data mining proves itself as an emerging field and manifests itself in the form of different techniques such as information mining; big data mining; big data mining and Internet of Things (IoT); and educational data mining. This paper aims to discuss how these technologies and techniques are used to derive information and, eventually, knowledge from data.

Design/methodology/approach

An extensive review of literature on data mining and its allied techniques was carried out to ascertain the emerging procedures and techniques in the domain of data mining. Clarivate Analytics' Web of Science and SciVerse Scopus were explored to discover the extent of literature published on data mining and its varied facets. Literature was searched against various keywords such as data mining; information mining; big data; big data and IoT; and educational data mining. Further, the works citing the literature on data mining were also explored to visualize a broad gamut of emerging techniques in this growing field.

Findings

The study validates that knowledge discovery in databases has rendered data mining an emerging field; the data present in these databases pave the way for data mining techniques and analytics. This paper provides a unique view of the usage of data and the logical patterns derived from it, and of how new procedures, algorithms and mining techniques are continuously upgraded for multipurpose use in the betterment of human life and experiences.

Practical implications

The paper highlights different aspects of data mining, its different technological approaches, and how these emerging data technologies are used to derive logical insights from data and make data more meaningful.

Originality/value

The paper tries to highlight the current trends and facets of data mining.

Details

Digital Library Perspectives, vol. 37 no. 4
Type: Research Article
ISSN: 2059-5816

Article
Publication date: 26 September 2023

Mohammed Ayoub Ledhem and Warda Moussaoui

Abstract

Purpose

This paper aims to apply several data mining techniques for predicting the daily precision improvement of Jakarta Islamic Index (JKII) prices based on big data of symmetric volatility in Indonesia’s Islamic stock market.

Design/methodology/approach

This research uses big data mining techniques to predict the daily precision improvement of JKII prices by applying AdaBoost, K-nearest neighbor, random forest and artificial neural network techniques. It uses big data with symmetric volatility as inputs to the prediction model, whereas the closing prices of JKII are used as the target outputs of daily precision improvement. To choose the optimal prediction performance according to the criterion of the lowest prediction errors, this research uses four metrics: mean absolute error, mean squared error, root mean squared error and R-squared.
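
The four evaluation metrics named above can be sketched in a few lines of Python. This is only an illustration of the standard definitions, not the authors' code; the function name `regression_errors` and the sample prices are made up for the example.

```python
import math

def regression_errors(actual, predicted):
    """Return MAE, MSE, RMSE and R-squared for two equal-length sequences."""
    n = len(actual)
    residuals = [a - p for a, p in zip(actual, predicted)]
    mae = sum(abs(r) for r in residuals) / n
    mse = sum(r * r for r in residuals) / n
    rmse = math.sqrt(mse)
    # R-squared compares the residual error with the spread around the mean.
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    ss_res = sum(r * r for r in residuals)
    r2 = 1.0 - ss_res / ss_tot
    return {"mae": mae, "mse": mse, "rmse": rmse, "r2": r2}

# Hypothetical closing prices and model predictions, for illustration only.
actual = [10.0, 12.0, 14.0, 16.0]
predicted = [11.0, 12.0, 13.0, 16.0]
scores = regression_errors(actual, predicted)
print(scores)
```

Lower MAE, MSE and RMSE indicate smaller prediction errors, while an R-squared closer to 1 indicates a better fit, so a model comparison of this kind would rank candidates on all four values together.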

Findings

The experimental results show that the optimal technique for predicting the daily precision improvement of JKII prices in Indonesia’s Islamic stock market is AdaBoost, which generates the best predictive performance with the lowest prediction errors and provides the optimum knowledge from the big data of symmetric volatility in Indonesia’s Islamic stock market. In addition, the random forest technique is considered another robust technique for predicting the daily precision improvement of JKII prices, as it delivers values close to the optimal performance of the AdaBoost technique.

Practical implications

This research fills a gap in the literature, namely the absence of big data mining techniques in the prediction process of Islamic stock markets, by delivering new operational techniques for predicting the daily stock precision improvement. It also helps investors to manage optimal portfolios and to decrease the risk of trading in global Islamic stock markets by using big data mining of symmetric volatility.

Originality/value

This research is a pioneer in using big data mining of symmetric volatility in the prediction of an Islamic stock market index.

Details

Journal of Modelling in Management, vol. 19 no. 3
Type: Research Article
ISSN: 1746-5664

Article
Publication date: 21 December 2021

Laouni Djafri

Abstract

Purpose

This work can be used as a building block in other settings such as GPU, Map-Reduce, Spark or any other. Also, DDPML can be deployed on other distributed systems such as P2P networks, clusters, cloud computing or other technologies.

Design/methodology/approach

In the age of Big Data, all companies want to benefit from large amounts of data. These data can help them understand their internal and external environment and anticipate associated phenomena, as the data turn into knowledge that can later be used for prediction. This knowledge thus becomes a great asset in companies' hands, which is precisely the objective of data mining. With large amounts of data and knowledge now being produced at a faster pace, the authors speak of Big Data mining. Their proposed work therefore aims mainly at solving the problems of volume, veracity, validity and velocity when classifying Big Data using distributed and parallel processing techniques. The problem raised in this work is how to make machine learning algorithms work in a distributed and parallel way at the same time without losing the accuracy of the classification results. To solve this problem, the authors propose a system called Dynamic Distributed and Parallel Machine Learning (DDPML) algorithms. To build it, the authors divided their work into two parts. In the first, they propose a distributed architecture controlled by a Map-Reduce algorithm, which in turn depends on a random sampling technique. This architecture is specifically designed to handle big data processing and operates coherently and efficiently with the sampling strategy proposed in this work. It also helps the authors to verify the classification results obtained using the representative learning base (RLB). In the second part, the authors extract the representative learning base by sampling at two levels using the stratified random sampling method. This sampling method is also applied to extract the shared learning base (SLB), the partial learning base for the first level (PLBL1) and the partial learning base for the second level (PLBL2).
The experimental results show the efficiency of the authors' solution, with no significant loss in the classification results. Thus, in practical terms, the DDPML system is generally dedicated to big data mining processing and works effectively in distributed systems with a simple structure, such as client-server networks.
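
The two-level stratified random sampling mentioned in this abstract can be illustrated with a small Python sketch. The abstract does not say how the RLB, SLB, PLBL1 or PLBL2 are actually built, so the function `stratified_sample`, the sampling fraction and the toy records below are all assumptions made for illustration.

```python
import random
from collections import defaultdict

def stratified_sample(records, label_of, fraction, seed=0):
    """Draw the same fraction from every class so class proportions are kept."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for rec in records:
        strata[label_of(rec)].append(rec)
    sample = []
    for label in sorted(strata):
        members = strata[label]
        # Keep at least one record per class so no class disappears.
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical data: 80 records of class "a" and 20 of class "b".
records = [("a", i) for i in range(80)] + [("b", i) for i in range(20)]
first_level = stratified_sample(records, lambda r: r[0], fraction=0.5)
second_level = stratified_sample(first_level, lambda r: r[0], fraction=0.5, seed=1)
print(len(first_level), len(second_level))
```

Sampling each class separately keeps the 80/20 class ratio at both levels, which is the property that makes a reduced learning base representative of the full data set.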

Findings

The authors got very satisfactory classification results.

Originality/value

The DDPML system is specially designed to smoothly handle big data mining classification.

Details

Data Technologies and Applications, vol. 56 no. 4
Type: Research Article
ISSN: 2514-9288

Article
Publication date: 19 December 2022

Sukjin You, Soohyung Joo and Marie Katsurai

Abstract

Purpose

The purpose of this study is to explore the extent to which data mining research is associated with the library and information science (LIS) discipline. This study aims to identify data mining-related subject terms and topics in representative LIS scholarly publications.

Design/methodology/approach

A large set of more than 38,000 bibliographic records was collected from a scholarly database, representing the fields of LIS and data mining, respectively. A multitude of text mining techniques, such as influential term analysis and Dirichlet multinomial regression topic modeling, were applied to investigate prevailing subject terms and research topics.

Findings

The findings of this study revealed the relationship between the LIS and data mining research domains. Various data mining method terms were observed in recent LIS publications, such as machine learning, artificial intelligence and neural networks. The topic modeling result identified prevailing data mining-related research topics in LIS, such as machine learning, deep learning and big data, among others. In addition, this study investigated the trends of popular topics in LIS over the recent decade.

Originality/value

This investigation is one of a few studies that empirically investigated the relationships between the LIS and data mining research domains. Multiple text mining techniques were employed to delineate the extent to which the two research domains are associated with each other, based on both term-level and topic-level analysis. Methodologically, the study identified influential terms in each domain using multiple feature selection indices. In addition, Dirichlet multinomial regression was applied to explore LIS topics in relation to data mining.

Details

Aslib Journal of Information Management, vol. 76 no. 1
Type: Research Article
ISSN: 2050-3806

Article
Publication date: 23 November 2021

Feifei Sun and Guohong Shi

Abstract

Purpose

This paper aims to explore the application effect of big data techniques based on an α-support vector machine-stochastic gradient descent (SVMSGD) algorithm in third-party logistics, to obtain the valuable information hidden in logistics big data and to help logistics enterprises make more reasonable planning schemes.

Design/methodology/approach

In this paper, a forgetting factor is introduced without changing the algorithm's complexity, and an algorithm based on the forgetting factor, called the α-SVMSGD algorithm, is proposed. The algorithm selectively deletes or retains historical data, which improves the adaptability of the classifier to new real-time logistics data. The simulation results verify the application effect of the algorithm.
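
To make the idea of a forgetting factor concrete, here is a rough Python sketch of hinge-loss SGD for a linear SVM in which buffered samples lose weight over time and are deleted once they fade below a threshold. The abstract does not give the α-SVMSGD update rule, so this is not the authors' algorithm: the decay scheme, the function `sgd_svm_with_forgetting` and every parameter value are assumptions chosen for illustration.

```python
def decision(w, b, x):
    """Signed score of a linear classifier; the sign is the predicted class."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b

def sgd_svm_with_forgetting(stream, alpha=0.9, threshold=0.3, lr=0.1, lam=0.01):
    """Online linear SVM (hinge loss) whose sample buffer forgets old data.

    Every buffered sample has a weight that is multiplied by `alpha` each time
    a new sample arrives; samples whose weight falls below `threshold` are
    deleted. Both rules are assumptions, not taken from the paper.
    """
    w = [0.0] * len(stream[0][0])
    b = 0.0
    buffer = []  # entries are [features, label, weight]
    for x, y in stream:
        # Age the retained samples, then drop the ones that have faded out.
        for item in buffer:
            item[2] *= alpha
        buffer = [item for item in buffer if item[2] >= threshold]
        buffer.append([x, y, 1.0])
        # One weighted SGD step per retained sample.
        for xi, yi, weight in buffer:
            margin = yi * decision(w, b, xi)
            # L2 regularisation shrinks w slightly on every step.
            w = [wj * (1.0 - lr * lam) for wj in w]
            if margin < 1.0:
                w = [wj + lr * weight * yi * xj for wj, xj in zip(w, xi)]
                b += lr * weight * yi
    return w, b, len(buffer)

# Hypothetical two-class stream, for illustration only.
stream = [((2.0, 1.0), 1), ((-2.0, -1.0), -1)] * 20
w, b, retained = sgd_svm_with_forgetting(stream)
print(retained, decision(w, b, (2.0, 1.0)) > 0)
```

With these assumed settings only the most recent samples stay in the buffer, so the classifier keeps adapting to new data while the cost of each update remains bounded.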

Findings

As the number of training rounds increases, the test error percentages of the gradient descent (GD) algorithm, the stochastic gradient descent (SGD) algorithm and the α-SVMSGD algorithm decrease gradually. In logistics big data processing, the α-SVMSGD algorithm has the efficiency of the SGD algorithm while ensuring that the GD direction approaches the direction of the optimal solution; it can use a small amount of data to obtain more accurate results and enhance convergence accuracy.

Research limitations/implications

The threshold setting of the forgetting factor still needs to be improved. Setting thresholds for different data types in self-learning has become a research direction. The amount of forgotten data can be effectively controlled through big data processing technology to improve data support for the normal operation of third-party logistics.

Practical implications

The approach can effectively reduce the time consumed by data mining, achieve rapid and accurate convergence on sample data without increasing the complexity of samples, improve the efficiency of logistics big data mining and reduce the redundancy of historical data. It has a certain reference value in promoting the development of the logistics industry.

Originality/value

The classification algorithm proposed in this paper is feasible and shows high convergence in third-party logistics big data mining. The α-SVMSGD algorithm proposed in this paper has a certain application value in real-time logistics data mining, but the design of the forgetting factor threshold needs to be improved. In the future, the authors will continue to study how to set thresholds for different data types in self-learning.

Details

Journal of Enterprise Information Management, vol. 35 no. 4/5
Type: Research Article
ISSN: 1741-0398

Article
Publication date: 18 October 2018

Arfan Majeed, Jingxiang Lv and Tao Peng

Abstract

Purpose

This paper aims to present an overall framework of big data-based analytics to optimize the production performance of the additive manufacturing (AM) process.

Design/methodology/approach

Four components, namely big data application; big data sensing and acquisition; big data processing and storage; and model establishing, data mining and process optimization, were presented to comprise the framework. Key technologies, including big data acquisition and integration, big data mining and a knowledge sharing mechanism, were developed for big data analytics for AM.

Findings

The presented framework was demonstrated by an application scenario from a company providing three-dimensional printing solutions. The results show that the proposed framework benefited customers, manufacturers, the environment and even all aspects of the manufacturing phase.

Research limitations/implications

This study only proposed a framework and did not include the realization of algorithms for data analysis, such as association, classification and clustering.

Practical implications

The proposed framework can be used to optimize the quality, energy consumption and production efficiency of the AM process.

Originality/value

This paper introduces the concept of big data in the field of AM. The proposed framework can be used to make better decisions based on big data during the manufacturing process.

Details

Rapid Prototyping Journal, vol. 25 no. 2
Type: Research Article
ISSN: 1355-2546

Article
Publication date: 14 August 2017

Neha Verma and Jatinder Singh

Abstract

Purpose

The purpose of this paper is to explore various limitations of conventional mining systems in extracting useful buying patterns from retail transactional databases flooded with Big Data. The key objective is to assist retail business owners to better understand the purchase needs of their customers and hence to attract customers to physical retail stores away from competitor e-commerce websites.

Design/methodology/approach

This paper employs a systematic and category-based review of relevant literature to explore the challenges posed by Big Data for the retail industry, followed by discussion and implementation of an association between MapReduce-based Apriori association mining and a Hadoop-based intelligent cloud architecture.
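
The support-counting step of Apriori maps naturally onto a map phase and a reduce phase, which the following Python sketch illustrates on a single machine. It is not the paper's MR-Apriori implementation and does not use Hadoop; candidate pruning between levels is omitted, and the function names and shopping baskets are invented for the example.

```python
from collections import Counter
from itertools import combinations

def map_phase(transaction, k):
    """Emit (itemset, 1) for every k-item combination in one transaction."""
    items = sorted(set(transaction))
    return [(itemset, 1) for itemset in combinations(items, k)]

def reduce_phase(pairs):
    """Sum the counts emitted for each itemset across all transactions."""
    totals = Counter()
    for itemset, count in pairs:
        totals[itemset] += count
    return totals

def frequent_itemsets(transactions, k, min_support):
    """Keep the k-itemsets that occur in at least `min_support` transactions."""
    pairs = [pair for t in transactions for pair in map_phase(t, k)]
    totals = reduce_phase(pairs)
    return {itemset: n for itemset, n in totals.items() if n >= min_support}

# Hypothetical retail baskets, for illustration only.
baskets = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "milk"],
    ["bread", "butter"],
    ["milk", "eggs"],
]
frequent_pairs = frequent_itemsets(baskets, k=2, min_support=2)
print(frequent_pairs)
```

Because each transaction is mapped independently, the map phase can run on separate nodes and only the per-itemset counts need to be brought together, which is what makes this step a good fit for a MapReduce cluster.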

Findings

The findings reveal that conventional mining algorithms have not evolved to support Big Data analysis as required by modern retail businesses. They require a lot of resources such as memory and computational engines. This study aims to develop the MR-Apriori algorithm in the form of an IRM tool to address all these issues in an efficient manner.

Research limitations/implications

The paper suggests that a lot of research is yet to be done in market basket analysis if the full potential of a cloud-based Big Data framework is to be utilized.

Originality/value

This research arms retail business owners with an innovative IRM tool to easily extract comprehensive knowledge of customers' useful buying patterns and so increase profits. This study experimentally verifies the effectiveness of the proposed algorithm.

Details

Industrial Management & Data Systems, vol. 117 no. 7
Type: Research Article
ISSN: 0263-5577

Article
Publication date: 6 January 2023

Lisa Higgins, Anthony Marshall, Kirsten Crysel and Jacob Dencik

Abstract

Purpose

Because of its effectiveness, process mining is rapidly becoming ubiquitous. A recent IBM Institute for Business Value (IBV) survey found that 65 percent of organizations report actively using process mining to improve processes. And in partnership with software-as-a-service (SaaS) providers to add even greater insight into their processes, 69 percent compare their organization’s data with other SaaS customers. And as many as 77 percent of supply chain executives say they are at least at the implementation stage of process and task mining.

Design/methodology/approach

The IBM Institute for Business Value and APQC, in cooperation with Oxford Economics, surveyed 2,000 C-level executives in the first half of 2022 from 13 countries in all major geographies and across 22 industries. The IBV and APQC carried out an in-depth analysis of how organizations use benchmarking and process mining tools, the benefits they gain from the use of these tools and how they anticipate using them in the future.

Findings

Big data and digital technologies also create new possibilities for measuring performance and revealing process improvement opportunities through process mining, a relatively new discipline that applies data science to discover, validate and improve workflows in real time.

Practical implications

By utilizing data from IT systems to create a process model and then examining the end-to-end process, process mining enables root causes of variations from norms to be identified using specialized algorithms, and these insights enable management to see if processes are functioning as intended and identify new opportunities to optimize them.
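
One basic way to turn IT system data into a process model is to count which activity directly follows which within each case, giving a so-called directly-follows graph. The abstract does not name the algorithms used, so the Python sketch below is only an illustration; the event log layout of (case id, timestamp, activity) and the sample events are assumptions.

```python
from collections import Counter, defaultdict

def directly_follows(event_log):
    """Count how often activity A is immediately followed by B within a case."""
    traces = defaultdict(list)
    # Order events by case and then by timestamp to rebuild each trace.
    for case_id, _timestamp, activity in sorted(event_log, key=lambda e: (e[0], e[1])):
        traces[case_id].append(activity)
    edges = Counter()
    for activities in traces.values():
        for a, b in zip(activities, activities[1:]):
            edges[(a, b)] += 1
    return edges

# Hypothetical invoice-handling event log, for illustration only.
log = [
    ("c1", 1, "receive"), ("c1", 2, "approve"), ("c1", 3, "pay"),
    ("c2", 1, "receive"), ("c2", 2, "reject"),
    ("c3", 1, "receive"), ("c3", 2, "approve"), ("c3", 3, "pay"),
]
edges = directly_follows(log)
print(edges)
```

Rare edges in such a graph, here the single receive-to-reject transition, are candidates for the variations from the norm that the abstract describes as the starting point for root cause analysis.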

Originality/value

More recently, the scope of process mining initiatives has widened to encompass more sophisticated mission-critical functions, notably human capital, cybersecurity and sales. Organizations that embrace process mining outperform others across key business measures, including profitability, innovation, agility, customer satisfaction and technological sophistication.

Details

Strategy & Leadership, vol. 51 no. 2
Type: Research Article
ISSN: 1087-8572

Book part
Publication date: 15 May 2023

Birol Yıldız and Şafak Ağdeniz

Abstract

Purpose: The main aim of the study is to provide a tool for non-financial information in decision-making. We analysed the non-financial data in the annual reports in order to show the usage of this information in financial decision processes.

Need for the Study: Main financial reports such as balance sheets and income statements can be analysed by statistical methods. However, an expanded financial reporting framework needs new analysing methods due to unstructured and big data. The study offers a solution to the analysis problem that comes with non-financial reporting, which is an essential communication tool in corporate reporting.

Methodology: Text mining analysis of annual reports is conducted using the R software. To simplify the problem, we try to predict the companies’ corporate governance qualifications using text mining. K Nearest Neighbor, Naive Bayes and Decision Tree machine learning algorithms were used.
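
As a rough illustration of how K Nearest Neighbor can classify text, the Python sketch below represents each document as a bag of words and votes among the most similar training documents by cosine similarity. The study itself used R and does not describe its features or distance measure, so the function names, the labels "high" and "low" and the sample sentences are all assumptions.

```python
import math
from collections import Counter

def bag_of_words(text):
    """Lower-case the text and count each whitespace-separated token."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def knn_predict(train, text, k=3):
    """Return the majority label among the k most similar training texts."""
    query = bag_of_words(text)
    ranked = sorted(train, key=lambda item: cosine(bag_of_words(item[0]), query), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Hypothetical report snippets with made-up governance labels.
train = [
    ("independent board audit committee transparent disclosure", "high"),
    ("board independence and transparent shareholder disclosure", "high"),
    ("audit committee oversight with independent directors", "high"),
    ("related party transactions undisclosed concentrated ownership", "low"),
    ("concentrated ownership limited disclosure no audit committee", "low"),
    ("undisclosed transactions and weak board oversight", "low"),
]
first = knn_predict(train, "transparent disclosure by an independent audit committee")
second = knn_predict(train, "undisclosed related party transactions")
print(first, second)
```

A real pipeline would normally add steps such as stop-word removal and term weighting before measuring similarity; they are left out here to keep the sketch short.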

Findings: Our analysis illustrates that K Nearest Neighbor achieved the highest rate of correct classifications, at 85%, compared to 50% for the random walk. The empirical evidence suggests that text mining can be used by all stakeholders as a financial analysis method.

Practical Implications: Combining financial statement analyses with financial reporting analyses will decrease the information asymmetry between the company and stakeholders, so stakeholders can make more accurate decisions. Analysis of non-financial data with text mining will provide a decisive competitive advantage, especially for investors seeking to make the right decisions. This method will lead to scarce resources being allocated more effectively. Another contribution of the study is that stakeholders can predict the corporate governance qualification of a company from its annual reports even if it is not included in the Corporate Governance Index (CGI).

Details

Contemporary Studies of Risks in Emerging Technology, Part B
Type: Book
ISBN: 978-1-80455-567-5

Article
Publication date: 2 October 2018

Vian Ahmed, Zeeshan Aziz, Algan Tezel and Zainab Riaz

Abstract

Purpose

The purpose of this paper is to explore the current challenges and drivers for data mining in the AEC sector.

Design/methodology/approach

Following a comprehensive literature review, the data mining concept was investigated through a workshop with industry experts and academics.

Findings

The results showed that the key drivers for using data mining within the AEC sector are associated with the sustainability, process improvement, market intelligence, cost certainty and cost reduction, performance certainty and decision support systems agendas in the sector. As for the processes with the greatest potential for data mining application, design, construction, procurement, forensic analysis, sustainability and energy consumption, and reuse of digital components were perceived as the main process areas. The key challenges were perceived as being data issues due to the fragmented nature of the construction process, the need for a cultural change, IT systems used in silos, skills requirements and having clearly defined business goals.

Originality/value

With the increasing abundance of data, business intelligence and analytics and its related concepts, data mining and Big Data, have captured the attention of practitioners and academics for the last 20 years. Despite the growing amount of data in its business context, however, the AEC sector still lags behind in utilising those concepts in its end products and daily operations, with limited research conducted to explore those issues at the sector level. This paper investigates the main opportunities and barriers for data mining in the AEC sector with a practical focus.

Details

Engineering, Construction and Architectural Management, vol. 25 no. 11
Type: Research Article
ISSN: 0969-9988
