Search results

1 – 10 of over 30000
Article
Publication date: 1 December 2000

Parag C. Pendharkar and James A. Rodger

Client/server (C/S) systems have revolutionized the systems development approach. Among the drivers of the C/S systems is the lower price/performance ratio compared to the…

Abstract

Client/server (C/S) systems have revolutionized the systems development approach. Among the drivers of C/S systems is their lower price/performance ratio compared to mainframe‐based transaction processing systems. Data mining is a process of identifying patterns in corporate transactional and operational databases (also called data warehouses). As most Fortune 500 companies are moving quickly towards client/server systems, it is becoming increasingly important that data mining approaches be adapted for C/S systems. In the current paper, we describe different data mining approaches that are used in C/S systems.

Details

Journal of Systems and Information Technology, vol. 4 no. 2
Type: Research Article
ISSN: 1328-7265

Article
Publication date: 15 February 2008

Richard S. Segall, Gauri S. Guha and Sarath A. Nonis

This paper seeks to present a complete set of graphical and numerical outputs of data mining performed for microarray databases of plant data as described in earlier…

Abstract

Purpose

This paper seeks to present a complete set of graphical and numerical outputs of data mining performed for microarray databases of plant data as described in earlier research by the authors. A brief description of data mining is also presented, as well as a brief background of previous research.

Design/methodology/approach

The paper applies data mining, using SAS Enterprise Miner Version 4, to plant data from the Osmotic Stress Microarray Information Database (OSMID), which is available on the web as both normalized and log(2) transformed data.

Findings

This paper illustrates that useful information about the effects of environmental stress tolerances (ESTs) on plants can be obtained by using data mining.

Research limitations/implications

Use of SAS Enterprise Miner was very effective for performing data mining of microarray databases with its modules of cluster analysis, decision trees, and descriptive and visual statistics.

Practical implications

The data used from the OSMID database are considered to be representative of those that could be used for biotech applications such as the manufacture of plant‐made‐pharmaceuticals and genetically modified foods.

Originality/value

This paper contributes to the discussion on the use of data mining for microarray databases and specifically for studying the effects of ESTs on plants.

Details

Kybernetes, vol. 37 no. 1
Type: Research Article
ISSN: 0368-492X

Article
Publication date: 17 April 2020

Kevin Watson and Dinah M. Payne

The purpose of this paper is to review current practice in sharing and mining medical data revealing benefits, costs and ethical issues. Based on stakeholder perspectives…

Abstract

Purpose

The purpose of this paper is to review current practice in sharing and mining medical data revealing benefits, costs and ethical issues. Based on stakeholder perspectives and values, the authors create an ethical code to regulate the sharing and mining of medical information.

Design/methodology/approach

The framework is based on a review of academic, practitioner and legal research.

Findings

Owing to the inability of current safeguards to protect consumers from risks related to the disclosure of medical information, the authors develop a framework for the ethical sharing and mining of medical data, STRACQ, which espouses security, transparency, respect, accountability, community and quality as the basic tenets of ethical data sharing and mining practice.

Research limitations/implications

The STRACQ framework is an original, previously unpublished contribution that will require modification over time based on discussion and debate within and among the academy, medical community and public policymakers.

Social implications

The framework for sharing borrows from the Fair Credit Reporting Act, allowing the collection and dissemination of identified medical data but placing strict limitations on use. Following this framework, benefits of shared and mined medical data are freely available with appropriate safeguards for consumer privacy.

Originality/value

Mandates for adoption of electronic health-care records require an understanding of medical data mining. This paper presents a review of data mining techniques and of reasons for engaging in the practice, identifying benefits, costs and ethical issues. The authors create an original framework, STRACQ, for ethical sharing and mining of medical information, allowing knowledge exploration while protecting consumer privacy.

Details

Journal of Information, Communication and Ethics in Society, vol. 19 no. 1
Type: Research Article
ISSN: 1477-996X

Content available
Article
Publication date: 2 February 2018

Wil van der Aalst

Process mining provides a generic collection of techniques to turn event data into valuable insights, improvement ideas, predictions, and recommendations. This paper uses…

Abstract

Purpose

Process mining provides a generic collection of techniques to turn event data into valuable insights, improvement ideas, predictions, and recommendations. This paper uses spreadsheets as a metaphor to introduce process mining as an essential tool for data scientists and business analysts. The purpose of this paper is to illustrate that process mining can do with events what spreadsheets can do with numbers.

Design/methodology/approach

The paper discusses the main concepts in both spreadsheets and process mining. Using a concrete data set as a running example, the different types of process mining are explained. Where spreadsheets work with numbers, process mining starts from event data with the aim to analyze processes.

Findings

Differences and commonalities between spreadsheets and process mining are described. Unlike process mining tools like ProM, spreadsheet programs cannot be used to discover processes, check compliance, analyze bottlenecks, animate event data, and provide operational process support. Pointers to existing process mining tools and their functionality are given.

Practical implications

Event logs and operational processes can be found everywhere and process mining techniques are not limited to specific application domains. Comparable to spreadsheet software widely used in finance, production, sales, education, and sports, process mining software can be used in a broad range of organizations.

Originality/value

The paper provides an original view on process mining by relating it to spreadsheets. The value of spreadsheet-like technology tailored toward the analysis of behavior rather than numbers is illustrated by the more than 20 commercial process mining tools available today and the growing adoption in a variety of application domains.

Details

Business Process Management Journal, vol. 24 no. 1
Type: Research Article
ISSN: 1463-7154

Article
Publication date: 2 November 2015

Ana Rocío Cárdenas Maita, Lucas Corrêa Martins, Carlos Ramón López Paz, Sarajane Marques Peres and Marcelo Fantinato

Process mining is a research area used to discover, monitor and improve real business processes by extracting knowledge from event logs available in process-aware…

Abstract

Purpose

Process mining is a research area used to discover, monitor and improve real business processes by extracting knowledge from event logs available in process-aware information systems. The purpose of this paper is to evaluate the application of artificial neural networks (ANNs) and support vector machines (SVMs) in data mining tasks in the process mining context. The goal was to understand how these computational intelligence techniques are currently being applied in process mining.

Design/methodology/approach

The authors conducted a systematic literature review with three research questions formulated to evaluate the use of ANNs and SVMs in process mining.

Findings

The authors identified 11 papers as primary studies according to the criteria established in the review protocol. Most of them deal with process mining enhancement, mainly using ANNs. Regarding the data mining task, the authors identified three types of tasks used: categorical prediction (or classification); numeric prediction, considering the “regression” type; and clustering analysis.

Originality/value

Although there is scientific interest in process mining, little attention has been specifically given to ANNs and SVMs. This scenario does not reflect the general context of data mining, where these two techniques are widely used. This low use may be due to a relative lack of knowledge about their potential for this type of problem, which the authors seek to reverse with the completion of this study.

Details

Business Process Management Journal, vol. 21 no. 6
Type: Research Article
ISSN: 1463-7154

Article
Publication date: 1 December 2002

Tim France, Dave Yen, Jyun‐Cheng Wang and Chia‐Ming Chang

In recent years, the World Wide Web (WWW) has become incredibly popular in homes and offices alike. Consumers need to search for relevant information to help solve…

Abstract

In recent years, the World Wide Web (WWW) has become incredibly popular in homes and offices alike. Consumers need to search for relevant information to help solve purchasing problems on various Web sites. Although there is no question that great numbers of WWW users will continue using search engines for information retrieval, consumers still hesitate before making a final decision, often because only rough and limited information about the products is made available. Consequently, consumers need data mining to help them make informed decisions. Herein we propose a new approach to integrating a search engine with data mining in an effort to support customer‐oriented information search action. This approach also illustrates how to reduce the consumer’s information search perplexity.

Details

Information Management & Computer Security, vol. 10 no. 5
Type: Research Article
ISSN: 0968-5227

Article
Publication date: 1 May 2006

Chan‐Chine Chang and Ruey‐Shun Chen

Traditional library catalogs have become inefficient and inconvenient in assisting library users. Readers may spend a lot of time searching library materials via printed…

Abstract

Purpose

Traditional library catalogs have become inefficient and inconvenient in assisting library users. Readers may spend a lot of time searching library materials via printed catalogs. Readers need an intelligent and innovative solution to overcome this problem. The paper seeks to examine data mining technology, which is a good approach to fulfilling readers' requirements.

Design/methodology/approach

Data mining is considered to be the non‐trivial extraction of implicit, previously unknown, and potentially useful information from data. This paper analyzes readers' borrowing records using the techniques of data analysis, building a data warehouse, and data mining.

Findings

The paper finds that after mining data, readers can be classified into different groups according to the publications in which they are interested. Some people on the campus also have a greater preference for multimedia data.

Originality/value

The data mining results show that all readers can be categorized into five clusters, and each cluster has its own characteristics. The frequency with which graduates and associate researchers borrow multimedia data is much higher. This phenomenon shows that these readers have a higher preference for accepting digitized publications. Also, the number of readers borrowing multimedia data has increased over the years. This trend indicates that readers' preferences are gradually shifting towards reading digital publications.

Details

The Electronic Library, vol. 24 no. 3
Type: Research Article
ISSN: 0264-0473

Article
Publication date: 7 August 2017

Yanyan Wang and Jin Zhang

Data mining has been a popular research area in the past decades. Many researchers study data-mining theories, methods, applications and trends; however, there are very…

Abstract

Purpose

Data mining has been a popular research area in the past decades. Many researchers study data-mining theories, methods, applications and trends; however, there are very few studies on data-mining-related topics in social media. This paper aims to explore the topics related to data mining based on the data collected from Wikipedia.

Design/methodology/approach

In total, 402 data-mining-related articles were obtained from Wikipedia. These articles were manually classified into several categories by the coding method. Each category formed an article-term matrix. These matrices were analysed and visualized by the self-organizing map approach. Several clusters were observed in each category. Finally, the topics of these clusters were extracted by content analysis.

Findings

The articles obtained were classified into six categories: applications; foundation and concepts; methodologies; organizations; related fields and topics; and technology support. Business, biology and security were the three prominent topics of the applications category. The technologies supporting data mining were software, systems, databases, programming languages and so forth. The general public was more interested in data-mining organizations than researchers were. They also focused on the applications of data mining in business more than in other fields.

Originality/value

This study will help researchers gain insight into the general public’s perceptions of data mining and discover the gap between the general public and themselves. It will assist researchers in finding new techniques and methods which will potentially provide them with new data-mining methods and research topics.

Details

The Electronic Library, vol. 35 no. 4
Type: Research Article
ISSN: 0264-0473

Article
Publication date: 4 May 2010

Qingyu Zhang and Richard S. Segall

The purpose of this paper is to review and compare selected software for data mining, text mining (TM), and web mining that is not available as free open‐source software.

Abstract

Purpose

The purpose of this paper is to review and compare selected software for data mining, text mining (TM), and web mining that is not available as free open‐source software.

Design/methodology/approach

Selected software packages are compared on their common and unique features. The software packages for data mining are SAS® Enterprise Miner™, Megaputer PolyAnalyst® 5.0, NeuralWare Predict®, and BioDiscovery GeneSight®. Those for TM are CompareSuite, SAS® Text Miner, TextAnalyst, VisualText, Megaputer PolyAnalyst® 5.0, and WordStat. Those for web mining are Megaputer PolyAnalyst®, SPSS Clementine®, ClickTracks, and QL2.

Findings

This paper discusses and compares the existing features, characteristics, and algorithms of selected software for data mining, TM, and web mining, respectively. These software packages are also applied to available data sets.

Research limitations/implications

The limitations are the inclusion of selected software and datasets rather than considering the entire realm of these. This review could be used as a framework for comparing other data, text, and web mining software.

Practical implications

This paper can be helpful for an organization or individual when choosing proper software to meet their mining needs.

Originality/value

Each of the software packages selected for this research has its own unique characteristics, properties, and algorithms. No other paper compares these selected software packages both visually and descriptively for all three types of mining: data, text, and web.

Details

Kybernetes, vol. 39 no. 4
Type: Research Article
ISSN: 0368-492X

Article
Publication date: 13 March 2009

Ranjit Bose

Advanced analytics‐driven data analyses allow enterprises to have a complete or “360 degrees” view of their operations and customers. The insight that they gain from such…

Abstract

Purpose

Advanced analytics‐driven data analyses allow enterprises to have a complete or “360 degrees” view of their operations and customers. The insight that they gain from such analyses is then used to direct, optimize, and automate their decision making to successfully achieve their organizational goals. Data, text, and web mining technologies are some of the key contributors to making advanced analytics possible. This paper aims to investigate these three mining technologies in terms of how they are used and the issues that are related to their effective implementation and management within the broader context of predictive or advanced analytics.

Design/methodology/approach

A range of recently published research literature on business intelligence (BI); predictive analytics; and data, text and web mining is reviewed to explore their current state, issues and challenges learned from their practice.

Findings

The findings are reported in two parts. The first part discusses a framework for BI using the data, text, and web mining technologies for advanced analytics; the second part identifies and discusses the opportunities and challenges that business managers dealing with these technologies face in gaining competitive advantage for their businesses.

Originality/value

The study findings are intended to assist business managers in understanding the issues and emerging technologies behind advanced analytics implementation.

Details

Industrial Management & Data Systems, vol. 109 no. 2
Type: Research Article
ISSN: 0263-5577
