Search results
Parag C. Pendharkar and James A. Rodger
Abstract
Client/server (C/S) systems have revolutionized the systems development approach. Among the drivers of C/S systems is their lower price/performance ratio compared to mainframe‐based transaction processing systems. Data mining is a process of identifying patterns in corporate transactional and operational databases (also called data warehouses). As most Fortune 500 companies are moving quickly toward client/server systems, it is becoming increasingly important that data mining approaches be adapted for C/S systems. In the current paper, we describe different data mining approaches that are used in C/S systems.
Richard S. Segall, Gauri S. Guha and Sarath A. Nonis
Abstract
Purpose
This paper seeks to present a complete set of graphical and numerical outputs of data mining performed for microarray databases of plant data as described in earlier research by the authors. A brief description of data mining is also presented, as well as a brief background of previous research.
Design/methodology/approach
The paper applies data mining with SAS Enterprise Miner Version 4 to plant data from the Osmotic Stress Microarray Information Database (OSMID), which is available on the web, for both normalized and log(2) transformed data.
Findings
This paper illustrates that useful information about the effects of environmental stress tolerances (ESTs) on plants can be obtained by using data mining.
Research limitations/implications
Use of SAS Enterprise Miner was very effective for performing data mining of microarray databases with its modules of cluster analysis, decision trees, and descriptive and visual statistics.
Practical implications
The data used from the OSMID database are considered to be representative of those that could be used for biotech applications such as the manufacture of plant‐made‐pharmaceuticals and genetically modified foods.
Originality/value
This paper contributes to the discussion on the use of data mining for microarray databases and specifically for studying the effects of ESTs on plants.
Mohammed Ayoub Ledhem and Warda Moussaoui
Abstract
Purpose
This paper aims to apply several data mining techniques for predicting the daily precision improvement of Jakarta Islamic Index (JKII) prices based on big data of symmetric volatility in Indonesia’s Islamic stock market.
Design/methodology/approach
This research uses big data mining techniques to predict the daily precision improvement of JKII prices by applying AdaBoost, K-nearest neighbor, random forest and artificial neural networks. Big data with symmetric volatility serve as inputs to the prediction model, whereas the closing prices of JKII are used as the target outputs of daily precision improvement. To choose the optimal prediction performance according to the criterion of the lowest prediction errors, this research uses four metrics: mean absolute error, mean squared error, root mean squared error and R-squared.
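The four error metrics named above can be illustrated with a minimal sketch. The function name `regression_errors` and the sample prices are hypothetical and are not taken from the paper; the formulas are the standard definitions of the metrics.

```python
import math

def regression_errors(actual, predicted):
    """Return MAE, MSE, RMSE and R-squared for two equal-length sequences."""
    n = len(actual)
    residuals = [a - p for a, p in zip(actual, predicted)]
    mae = sum(abs(r) for r in residuals) / n       # mean absolute error
    mse = sum(r * r for r in residuals) / n        # mean squared error
    rmse = math.sqrt(mse)                          # root mean squared error
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)  # total sum of squares
    ss_res = sum(r * r for r in residuals)                # residual sum of squares
    r2 = 1.0 - ss_res / ss_tot                     # coefficient of determination
    return {"mae": mae, "mse": mse, "rmse": rmse, "r2": r2}

# Hypothetical closing prices and model predictions (not data from the paper).
actual = [100.0, 102.0, 101.0, 105.0]
predicted = [101.0, 101.0, 102.0, 104.0]
metrics = regression_errors(actual, predicted)
```

Under this selection rule, a technique is preferred when its three error values are lower and its R-squared is higher than those of the alternatives.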
Findings
The experimental results determine that the optimal technique for predicting the daily precision improvement of the JKII prices in Indonesia’s Islamic stock market is the AdaBoost technique, which generates the optimal predicting performance with the lowest prediction errors, and provides the optimum knowledge from the big data of symmetric volatility in Indonesia’s Islamic stock market. In addition, the random forest technique is also considered another robust technique in predicting the daily precision improvement of the JKII prices as it delivers closer values to the optimal performance of the AdaBoost technique.
Practical implications
This research fills a gap in the literature, namely the absence of big data mining techniques in the prediction of Islamic stock markets, by delivering new operational techniques for predicting the daily stock precision improvement. It also helps investors to manage optimal portfolios and to decrease the risk of trading in global Islamic stock markets by using big data mining of symmetric volatility.
Originality/value
This research is a pioneer in using big data mining of symmetric volatility in the prediction of an Islamic stock market index.
Sukjin You, Soohyung Joo and Marie Katsurai
Abstract
Purpose
The purpose of this study is to explore the extent to which data mining research is associated with the library and information science (LIS) discipline. This study aims to identify data mining related subject terms and topics in representative LIS scholarly publications.
Design/methodology/approach
A large set of over 38,000 bibliographic records was collected from a scholarly database, representing the fields of LIS and data mining, respectively. A multitude of text mining techniques, such as influential term analysis and Dirichlet multinomial regression topic modeling, were applied to investigate prevailing subject terms and research topics.
Findings
The findings of this study revealed the relationship between the LIS and data mining research domains. Various data mining method terms were observed in recent LIS publications, such as machine learning, artificial intelligence and neural networks. The topic modeling result identified prevailing data mining related research topics in LIS, such as machine learning, deep learning and big data, among others. In addition, this study investigated the trends of popular topics in LIS over the most recent decade.
Originality/value
This investigation is one of a few studies that empirically investigated the relationships between the LIS and data mining research domains. Multiple text mining techniques were employed to delineate the extent to which the two research domains are associated with each other, based on both term-level and topic-level analysis. Methodologically, the study identified influential terms in each domain using multiple feature selection indices. In addition, Dirichlet multinomial regression was applied to explore LIS topics in relation to data mining.
Abstract
Purpose
Process mining provides a generic collection of techniques to turn event data into valuable insights, improvement ideas, predictions, and recommendations. This paper uses spreadsheets as a metaphor to introduce process mining as an essential tool for data scientists and business analysts. The purpose of this paper is to illustrate that process mining can do with events what spreadsheets can do with numbers.
Design/methodology/approach
The paper discusses the main concepts in both spreadsheets and process mining. Using a concrete data set as a running example, the different types of process mining are explained. Where spreadsheets work with numbers, process mining starts from event data with the aim of analyzing processes.
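To make "starting from event data" concrete, the sketch below counts how often one activity directly follows another within a case, a common first step in process discovery. The event log, its activity names and the function `directly_follows` are hypothetical illustrations; they are not the paper's running example and not the API of any process mining tool.

```python
from collections import Counter, defaultdict

# Hypothetical event log: (case id, activity, timestamp) rows.
events = [
    ("case1", "register", 1),
    ("case1", "check", 2),
    ("case1", "approve", 3),
    ("case2", "register", 1),
    ("case2", "check", 2),
    ("case2", "reject", 4),
    ("case3", "register", 2),
    ("case3", "check", 3),
    ("case3", "approve", 5),
]

def directly_follows(events):
    """Count pairs (a, b) where activity b directly follows a in the same case."""
    traces = defaultdict(list)
    # Order events by case and then by timestamp to rebuild each case's trace.
    for case_id, activity, _timestamp in sorted(events, key=lambda e: (e[0], e[2])):
        traces[case_id].append(activity)
    counts = Counter()
    for trace in traces.values():
        counts.update(zip(trace, trace[1:]))  # consecutive activity pairs
    return counts

dfg = directly_follows(events)
```

The resulting counts can be read as a weighted graph over activities, which is the kind of structure a spreadsheet of plain numbers does not capture.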
Findings
Differences and commonalities between spreadsheets and process mining are described. Unlike process mining tools such as ProM, spreadsheet programs cannot be used to discover processes, check compliance, analyze bottlenecks, animate event data, and provide operational process support. Pointers to existing process mining tools and their functionality are given.
Practical implications
Event logs and operational processes can be found everywhere and process mining techniques are not limited to specific application domains. Comparable to spreadsheet software widely used in finance, production, sales, education, and sports, process mining software can be used in a broad range of organizations.
Originality/value
The paper provides an original view on process mining by relating it to spreadsheets. The value of spreadsheet-like technology tailored toward the analysis of behavior rather than numbers is illustrated by the over 20 commercial process mining tools available today and the growing adoption in a variety of application domains.
Ana Rocío Cárdenas Maita, Lucas Corrêa Martins, Carlos Ramón López Paz, Sarajane Marques Peres and Marcelo Fantinato
Abstract
Purpose
Process mining is a research area used to discover, monitor and improve real business processes by extracting knowledge from event logs available in process-aware information systems. The purpose of this paper is to evaluate the application of artificial neural networks (ANNs) and support vector machines (SVMs) in data mining tasks in the process mining context. The goal was to understand how these computational intelligence techniques are currently being applied in process mining.
Design/methodology/approach
The authors conducted a systematic literature review with three research questions formulated to evaluate the use of ANNs and SVMs in process mining.
Findings
The authors identified 11 papers as primary studies according to the criteria established in the review protocol. Most of them deal with process mining enhancement, mainly using ANNs. Regarding the data mining task, the authors identified three types of tasks used: categorical prediction (or classification); numeric prediction, considering the “regression” type, and clustering analysis.
Originality/value
Although there is scientific interest in process mining, little attention has been specifically given to ANNs and SVMs. This scenario does not reflect the general context of data mining, where these two techniques are widely used. This low use may be due to a relative lack of knowledge about their potential for this type of problem, which the authors seek to reverse with the completion of this study.
Tim France, Dave Yen, Jyun‐Cheng Wang and Chia‐Ming Chang
Abstract
In recent years, the World Wide Web (WWW) has become incredibly popular in homes and offices alike. Consumers need to search for relevant information to help solve purchasing problems on various Web sites. Although there is no question that great numbers of WWW users will continue using search engines for information retrieval, consumers still hesitate before making a final decision, often because only rough and limited information about the products is made available. Consequently, consumers need the help of data mining to make informed decisions. Herein we propose a new approach to integrating a search engine with data mining in an effort to support customer‐oriented information search. This approach also illustrates how to reduce the consumer’s information search perplexity.
Chan‐Chine Chang and Ruey‐Shun Chen
Abstract
Purpose
Traditional library catalogs have become inefficient and inconvenient in assisting library users. Readers may spend a lot of time searching library materials via printed catalogs. Readers need an intelligent and innovative solution to overcome this problem. The paper seeks to examine data mining technology, which is a good approach to fulfilling readers' requirements.
Design/methodology/approach
Data mining is considered to be the non‐trivial extraction of implicit, previously unknown, and potentially useful information from data. This paper analyzes readers' borrowing records using the techniques of data analysis, building a data warehouse, and data mining.
Findings
The paper finds that after mining data, readers can be classified into different groups according to the publications in which they are interested. Some people on the campus also have a greater preference for multimedia data.
Originality/value
The data mining results show that all readers can be categorized into five clusters, and each cluster has its own characteristics. The frequency with which graduates and associate researchers borrow multimedia data is much higher. This phenomenon shows that these readers have a higher preference for accepting digitized publications. Also, the number of readers borrowing multimedia data has increased over the years. This trend indicates that readers' preferences are gradually shifting towards reading digital publications.
Sumeer Gul, Shohar Bano and Taseen Shah
Abstract
Purpose
Data mining along with its varied technologies like numerical mining, textual mining, multimedia mining, web mining, sentiment analysis and big data mining proves itself as an emerging field and manifests itself in the form of different techniques such as information mining; big data mining; big data mining and Internet of Things (IoT); and educational data mining. This paper aims to discuss how these technologies and techniques are used to derive information and, eventually, knowledge from data.
Design/methodology/approach
An extensive review of literature on data mining and its allied techniques was carried out to ascertain the emerging procedures and techniques in the domain of data mining. Clarivate Analytics' Web of Science and SciVerse Scopus were explored to discover the extent of literature published on data mining and its varied facets. Literature was searched against various keywords such as data mining; information mining; big data; big data and IoT; and educational data mining. Further, the works citing the literature on data mining were also explored to visualize a broad gamut of emerging techniques in this growing field.
Findings
The study validates that knowledge discovery in databases has established data mining as an emerging field; the data present in these databases pave the way for data mining techniques and analytics. This paper provides a unique view of the usage of data and the logical patterns derived from it, and of how new procedures, algorithms and mining techniques are continuously being upgraded for multipurpose use for the betterment of human life and experiences.
Practical implications
The paper highlights different aspects of data mining, its different technological approaches, and how these emerging data technologies are used to derive logical insights from data and make data more meaningful.
Originality/value
The paper tries to highlight the current trends and facets of data mining.
Yanyan Wang and Jin Zhang
Abstract
Purpose
Data mining has been a popular research area in the past decades. Many researchers study data-mining theories, methods, applications and trends; however, there are very few studies on data-mining-related topics in social media. This paper aims to explore the topics related to data mining based on the data collected from Wikipedia.
Design/methodology/approach
In total, 402 data-mining-related articles were obtained from Wikipedia. These articles were manually classified into several categories by the coding method. Each category formed an article-term matrix. These matrices were analysed and visualized by the self-organizing map approach. Several clusters were observed in each category. Finally, the topics of these clusters were extracted by content analysis.
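An article-term matrix of the kind described above can be sketched as follows. The two article titles, their texts and the function `article_term_matrix` are hypothetical stand-ins for illustration; the study's own preprocessing and its self-organizing map step are not reproduced here.

```python
from collections import Counter

# Hypothetical article texts keyed by title (not the study's Wikipedia data).
articles = {
    "Cluster analysis": "cluster analysis groups similar data objects",
    "Decision tree": "decision tree models split data by attribute",
}

def article_term_matrix(articles):
    """Return (vocabulary, matrix) with one row per article and one column per term."""
    # Term frequencies per article, using simple whitespace tokenization.
    counts = {title: Counter(text.lower().split()) for title, text in articles.items()}
    # Sorted union of all terms gives a stable column order.
    vocabulary = sorted(set().union(*counts.values()))
    # A Counter returns 0 for terms that an article does not contain.
    matrix = [[counts[title][term] for term in vocabulary] for title in articles]
    return vocabulary, matrix

vocabulary, matrix = article_term_matrix(articles)
```

Each row is then a numeric vector describing one article, which is the form of input that a clustering or visualization method expects.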
Findings
The articles obtained were classified into six categories: applications, foundation and concepts, methodologies, organizations, related fields and topics and technology support. Business, biology and security were the three prominent topics of the applications category. The technologies supporting data mining were software, systems, databases, programming languages and so forth. The general public was more interested in data-mining organizations than researchers were. They also focused on the applications of data mining in business more than in other fields.
Originality/value
This study will help researchers gain insight into the general public’s perceptions of data mining and discover the gap between the general public and themselves. It will assist researchers in finding new techniques and methods which will potentially provide them with new data-mining methods and research topics.