Search results

1 – 10 of over 102,000
Article
Publication date: 29 April 2014

Wei-Chao Lin, Chih-Fong Tsai and Shih-Wen Ke

Abstract

Purpose

Churn prediction is a very important task for successful customer relationship management. In general, churn prediction can be achieved by many data mining techniques. However, during data mining, dimensionality reduction (or feature selection) and data reduction are the two important data preprocessing steps. In particular, feature selection and data reduction aim to filter out irrelevant features and noisy data samples, respectively. The purpose of this paper is to examine how performing these preprocessing tasks enables the mining algorithm to produce good-quality results.

Design/methodology/approach

Based on a real telecom customer churn data set, seven different preprocessed data sets based on performing feature selection and data reduction by different priorities are used to train the artificial neural network as the churn prediction model.

Findings

The results show that performing data reduction first by self-organizing maps and feature selection second by principal component analysis allows the prediction model to provide the highest prediction accuracy. In addition, this ordering enables more efficient learning by the prediction model, since 66 percent of the original features and 62 percent of the data samples are removed.
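The data-reduction-first, feature-selection-second ordering can be sketched as follows. This is a minimal illustration, not the paper's implementation: the toy SOM, the keep-one-sample-per-unit reduction rule, the synthetic data, and all parameters (grid size, learning rate, 95% retained variance) are assumptions.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 20))  # synthetic stand-in for customer records

def train_som(X, grid=(8, 8), iters=2000, lr0=0.5, sigma0=2.0):
    """Minimal self-organizing map trained by online updates."""
    n_units = grid[0] * grid[1]
    w = rng.normal(size=(n_units, X.shape[1]))
    coords = np.array([(i, j) for i in range(grid[0])
                       for j in range(grid[1])], dtype=float)
    for t in range(iters):
        x = X[rng.integers(len(X))]
        bmu = np.argmin(((w - x) ** 2).sum(axis=1))   # best-matching unit
        frac = t / iters
        lr = lr0 * (1.0 - frac)
        sigma = sigma0 * (1.0 - frac) + 0.5
        d2 = ((coords - coords[bmu]) ** 2).sum(axis=1)
        h = np.exp(-d2 / (2.0 * sigma ** 2))          # neighbourhood kernel
        w += lr * h[:, None] * (x - w)
    return w

# Step 1: data reduction -- keep one representative sample per SOM unit
w = train_som(X)
bmus = np.argmin(((X[:, None, :] - w[None, :, :]) ** 2).sum(axis=2), axis=1)
keep = []
for u in np.unique(bmus):
    members = np.where(bmus == u)[0]
    dists = ((X[members] - w[u]) ** 2).sum(axis=1)
    keep.append(members[np.argmin(dists)])
X_reduced = X[np.array(keep)]

# Step 2: feature reduction -- PCA keeping 95% of the variance
X_final = PCA(n_components=0.95).fit_transform(X_reduced)
print(X.shape, "->", X_reduced.shape, "->", X_final.shape)
```

Reversing the two steps simply swaps the order of the calls; the paper's point is that this ordering performed best on its telecom data set.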

Originality/value

The contribution of this paper is to identify the better order in which to perform the two important data preprocessing steps for telecom churn prediction.

Details

Kybernetes, vol. 43 no. 5
Type: Research Article
ISSN: 0368-492X

Article
Publication date: 29 April 2014

Mohammad Amin Shayegan and Saeed Aghabozorgi

Abstract

Purpose

Pattern recognition systems often have to handle the problem of large training data sets that include duplicate and similar training samples. This problem leads to large memory requirements for saving and processing data, and to high time complexity for training algorithms. The purpose of the paper is to reduce the volume of the training part of a data set, in order to increase the system speed, without any significant decrease in system accuracy.

Design/methodology/approach

A new technique for data set size reduction, using a version of a modified frequency diagram approach, is presented. To reduce processing time, the proposed method compares the samples of a class to other samples in the same class, instead of comparing samples from different classes. It removes only those patterns that are similar to the class template generated for each class. No feature extraction was carried out, in order to produce a more precise assessment of the proposed data size reduction technique.
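The within-class reduction idea can be sketched as follows. This is an illustrative assumption, not the paper's method: the per-class mean stands in for the modified frequency-diagram template, Euclidean distance stands in for the similarity measure, and the keep fraction is arbitrary.

```python
import numpy as np

def reduce_by_template(X, y, keep_frac=0.85):
    """Drop the samples most similar to their own class template.
    Sketch only: template = per-class mean and similarity = Euclidean
    distance are assumptions. Comparisons stay within each class, so
    cost grows with class size rather than with the whole data set."""
    keep = []
    for c in np.unique(y):
        idx = np.where(y == c)[0]
        template = X[idx].mean(axis=0)
        d = np.linalg.norm(X[idx] - template, axis=1)
        order = np.argsort(-d)                      # farthest first
        n_keep = max(1, int(np.ceil(keep_frac * len(idx))))
        keep.extend(idx[order[:n_keep]])            # near-duplicates of the
    return np.sort(np.array(keep))                  # template are dropped

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 16))          # synthetic feature vectors
y = rng.integers(0, 3, size=200)        # three synthetic classes
kept = reduce_by_template(X, y)
print(len(kept), "of", len(X), "samples kept")
```

Keeping the farthest-from-template samples preserves outliers and boundary cases while discarding redundant near-duplicates, mirroring the paper's stated goal.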

Findings

Experiments on one of the largest standard handwritten numeral optical character recognition (OCR) data sets, Hoda, show a 14.88 percent decrease in data set volume without a significant decrease in performance.

Practical implications

The proposed technique is effective for size reduction in all pictorial databases, such as OCR data sets.

Originality/value

State-of-the-art algorithms currently used for data set size reduction usually remove samples near class centers, or support vector (SV) samples between different classes. However, the samples near a class center carry valuable information about class characteristics and are necessary to build a system model. Also, SVs are important samples for evaluating system efficiency. The proposed technique, unlike other available methods, keeps both outlier samples and samples close to the class centers.

Article
Publication date: 29 January 2018

Wasim Ahmad Bhat

Abstract

Purpose

The purpose of this paper is to investigate the prospects of current storage technologies for long-term preservation of big data in digital libraries.

Design/methodology/approach

The study employs a systematic and critical review of the relevant literature to explore the prospects of current storage technologies for long-term preservation of big data in digital libraries. Online computer databases were searched to identify the relevant literature published between 2000 and 2016. A specific inclusion and exclusion criterion was formulated and applied in two distinct rounds to determine the most relevant papers.

Findings

The study concludes that the current storage technologies are not viable for long-term preservation of big data in digital libraries. They can neither fulfil all the storage demands nor alleviate the financial expenditures of digital libraries. The study also points out that migrating to emerging storage technologies in digital libraries is a long-term viable solution.

Research limitations/implications

The study suggests that continuous innovation and research efforts in current storage technologies are required to lessen the impact of storage shortage on digital libraries, and to allow emerging storage technologies to advance further and take over. At the same time, more aggressive research and development efforts are required by academics and industry to further advance the emerging storage technologies for their timely and swift adoption by digital libraries.

Practical implications

The study reveals that digital libraries, besides incurring significant financial expenditures, will suffer from potential loss of information due to storage shortage for long-term preservation of big data, if current storage technologies are employed by them. Therefore, policy makers and practitioners should meticulously choose storage technologies for long-term preservation of big data in digital libraries.

Originality/value

This type of holistic study, investigating the prospects of magnetic drive technology, solid-state drive technology and data-reduction techniques for long-term preservation of big data in digital libraries, has not previously been conducted in the field, and so provides a novel contribution. The study arms academics, practitioners, policy makers and industry with a deep understanding of the problem, the technical details to choose storage technologies meticulously, greater insight to frame sustainable policies, and opportunities to address various research problems.

Details

Library Hi Tech, vol. 36 no. 3
Type: Research Article
ISSN: 0737-8831

Article
Publication date: 27 March 2008

H. Ahmadi‐Noubari, A. Pourshaghaghy, F. Kowsary and A. Hakkaki‐Fard

Abstract

Purpose

The purpose of this paper is to reduce the destructive effects of existing unavoidable noises contaminating temperature data in inverse heat conduction problems (IHCP) utilizing the wavelets.

Design/methodology/approach

For noise reduction, sensor data were treated as input to the filter bank used for signal decomposition and implementation of the discrete wavelet transform. A wavelet denoising algorithm was then applied to the wavelet coefficients of the signal components at different resolution levels. Both noisy and de-noised measurement temperatures were then used as input data to a numerical experiment of the IHCP. The inverse problem concerns the estimation of an unknown surface heat flux in a 2D slab and is solved by the variable metric method.
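The decomposition-threshold-reconstruct pipeline can be illustrated with a minimal Haar filter bank. This is a sketch under stated assumptions: the paper does not specify its wavelet family or threshold rule, so the Haar basis, three decomposition levels, and the universal soft threshold here are all illustrative choices.

```python
import numpy as np

def haar_denoise(signal, levels=3, thresh=None):
    """Haar-wavelet soft-threshold denoising (assumes the signal
    length is divisible by 2**levels)."""
    x = np.asarray(signal, dtype=float)
    n = len(x)
    details = []
    approx = x.copy()
    for _ in range(levels):                      # analysis filter bank
        a = (approx[0::2] + approx[1::2]) / np.sqrt(2)
        d = (approx[0::2] - approx[1::2]) / np.sqrt(2)
        details.append(d)
        approx = a
    if thresh is None:
        # universal threshold; noise level estimated from finest details
        sigma = np.median(np.abs(details[0])) / 0.6745
        thresh = sigma * np.sqrt(2.0 * np.log(n))
    for lvl in range(levels - 1, -1, -1):        # synthesis filter bank
        d = details[lvl]
        d = np.sign(d) * np.maximum(np.abs(d) - thresh, 0.0)  # soft threshold
        rec = np.empty(2 * len(approx))
        rec[0::2] = (approx + d) / np.sqrt(2)
        rec[1::2] = (approx - d) / np.sqrt(2)
        approx = rec
    return approx

# synthetic stand-in for a noisy temperature sensor trace
t = np.arange(512)
clean = np.sin(2 * np.pi * 4 * t / 512)
rng = np.random.default_rng(0)
noisy = clean + 0.3 * rng.normal(size=512)
denoised = haar_denoise(noisy)
```

In the paper's setting, `denoised` rather than `noisy` would feed the inverse heat conduction solver, which is where the accuracy improvement is measured.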

Findings

Comparison of estimated heat fluxes obtained using denoised data with those using original sensor data indicates that noise reduction by wavelets has the potential to be a powerful tool for improving IHCP results.

Originality/value

Noise reduction using wavelets, while it can be implemented very easily, may also significantly relegate (or even eliminate) conventional regularization schemes commonly used in IHCP.

Details

International Journal of Numerical Methods for Heat & Fluid Flow, vol. 18 no. 2
Type: Research Article
ISSN: 0961-5539

Open Access
Article
Publication date: 29 July 2020

Walaa M. El-Sayed, Hazem M. El-Bakry and Salah M. El-Sayed

Abstract

Wireless sensor networks (WSNs) periodically collect data through randomly dispersed sensors (motes), which typically consume considerable energy in radio communication, mainly for data transmission within the network. Furthermore, the dissemination mode in a WSN usually produces noisy values, incorrect measurements or missing information that affect the behaviour of the WSN. In this article, a Distributed Data Predictive Model (DDPM) is proposed to extend the network lifetime by decreasing the energy consumption of sensor nodes. It is built upon a distributive clustering model for predicting dissemination faults in WSNs. The proposed model was developed using a recursive least squares (RLS) adaptive filter integrated with a finite impulse response (FIR) filter, which removes unwanted reflections and noise accompanying the signals transferred among the sensors, aiming to minimize the size of transferred data and so provide energy efficiency. The experimental results demonstrate that DDPM reduced the rate of data transmission to ∼20%. It also decreased energy consumption to 95% throughout the dataset sample and improved the performance of the sensory network by about 19.5%, thus prolonging the lifetime of the network.
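A common way such prediction-based schemes cut transmissions is dual prediction: sensor and sink run identical adaptive predictors, and a reading is sent only when the prediction error exceeds a tolerance. The sketch below illustrates that mechanic with a plain RLS filter; it is an assumed simplification, not the paper's DDPM (no clustering, no FIR stage, invented parameters).

```python
import numpy as np

def rls_dual_prediction(readings, order=4, lam=0.98, eps=0.1):
    """Dual-prediction data reduction with an RLS predictor.
    Transmit only when |prediction - reading| > eps; otherwise the
    sink keeps its prediction, so recovery error is bounded by eps."""
    w = np.zeros(order)
    P = np.eye(order) * 1000.0       # large initial inverse correlation
    buf = np.zeros(order)            # most recent values, newest first
    sent = 0
    recovered = []
    for d in readings:
        pred = float(w @ buf)
        if abs(pred - d) > eps:
            sent += 1                # transmit the actual reading
            used = d
        else:
            used = pred              # sink substitutes its prediction
        # identical RLS update on both sides (uses `used`, known to both)
        Px = P @ buf
        k = Px / (lam + buf @ Px)
        w = w + k * (used - pred)
        P = (P - np.outer(k, Px)) / lam
        buf = np.roll(buf, 1)
        buf[0] = used
        recovered.append(used)
    return np.array(recovered), sent

# synthetic stand-in for a slowly varying sensor signal
t = np.arange(600)
readings = np.sin(2 * np.pi * t / 120)
recovered, sent = rls_dual_prediction(readings)
print(f"transmitted {sent} of {len(readings)} readings")
```

The fraction `sent / len(readings)` is the transmission rate the abstract reports DDPM driving down to roughly 20%.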

Details

Applied Computing and Informatics, vol. 19 no. 1/2
Type: Research Article
ISSN: 2634-1964

Article
Publication date: 28 September 2021

Abdelrahman M. Farouk, Rahimi A. Rahman and Noor Suraya Romali

Abstract

Purpose

Sustainability involves ensuring that sufficient resources are available for current and future generations. Non-revenue water (NRW) creates a barrier to sustainability through energy and water loss. However, a comprehensive overview of NRW reduction strategies is lacking. This study reviews the existing literature to identify available strategies for reducing NRW and its components and discusses their merits.

Design/methodology/approach

A systematic literature review was conducted to identify and analyze different strategies for reducing NRW. The initial search identified 158 articles, with 41 of these deemed suitably relevant following further examination. Finally, 14 NRW reduction strategies were identified from the selected articles.

Findings

The identified NRW reduction strategies were grouped into strategies for reducing (1) apparent losses (AL), (2) real losses (RL) and (3) water losses, with the latter involving the combination of AL and RL. The strategies adopted most frequently are “prevent water leakage” and “control water pressure.” In addition, water distribution network (WDN) rehabilitation has additional benefits over other RL reduction strategies, including saving water and energy, increasing hydraulic performance and enhancing reliability. Finally, utilizing decision support systems is the only strategy capable of reducing multiple NRW categories.

Originality/value

This review provides insights into the overall NRW problem and the strategies best equipped to address it. Authorities can use these findings to develop case-specific NRW reduction action plans that save water and energy, while providing other economic benefits. In addition, NRW reduction can improve WDN reliability.

Details

Smart and Sustainable Built Environment, vol. 12 no. 1
Type: Research Article
ISSN: 2046-6099

Article
Publication date: 24 August 2012

Cebrail Çiflikli and Esra Kahya‐Özyirmidokuz

Abstract

Purpose

Data mining (DM) is used to improve the performance of manufacturing quality control activity and to reduce productivity loss. The purpose of this paper is to discover useful hidden patterns from fabric data to reduce the amount of defective goods and increase overall quality.

Design/methodology/approach

This research examines the improvement of the manufacturing process via DM techniques. The paper explores the use of different preprocessing and DM techniques (rough set theory, attribute relevance analysis, anomaly detection analysis, decision trees and rule induction) in carpet manufacturing as a real-world application problem. The SPSS Clementine Programme, the Rosetta Toolkit, ASP (Active Server Pages) and the VBScript programming language are used.
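The decision-tree and rule-induction step can be sketched with scikit-learn as a stand-in for the proprietary SPSS Clementine workflow. The feature names and the defect-generating rule below are invented for illustration and are not the paper's fabric data.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(1)
# synthetic stand-in for carpet process attributes (invented names)
X = rng.uniform(size=(300, 3))                 # tension, speed, temperature
y = ((X[:, 0] > 0.7) & (X[:, 1] < 0.3)).astype(int)   # assumed defect rule

# fit a shallow tree and export it as human-readable decision rules,
# the kind of output an operator-facing quality program could consume
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)
rules = export_text(tree, feature_names=["tension", "speed", "temperature"])
print(rules)
```

In the paper's pipeline, rules like these drive an on-line programme that flags process settings likely to yield defective goods.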

Findings

The attribute variables most influential on product quality are determined. A decision tree (DT) and decision rules are generated, through which faults in the process are detected. An on‐line programme is generated, and the model's results are used to prevent faulty products.

Research limitations/implications

In time, this model will lose its validity. Therefore, it must be redeveloped periodically.

Practical implications

The productivity achieved in this study can be increased further with the help of artificial intelligence technology. This research can also be applied to different industries.

Originality/value

The size and complexity of data make extraction difficult. Attribute relevance analysis is proposed for the selection of the attribute variables. The knowledge discovery in databases process is used. In addition, the system can be followed on‐line with this interactive ability.

Details

Industrial Management & Data Systems, vol. 112 no. 8
Type: Research Article
ISSN: 0263-5577

Article
Publication date: 10 June 2020

Vincent Onyango and Neil Burford

Abstract

Purpose

The purpose of the study is to assess the performance of local-level planning policies that required new buildings to avoid a specified and rising proportion of the greenhouse gases (GHGs) projected from their use; the avoided proportion is calculated on the basis of the approved design and plans for the specific development and through the installation and operation of low and zero-carbon generating technologies (LZCGTs).

Design/methodology/approach

Data were extracted from a random sample of 911 new builds from 403 planning applications and related documents, across five Scottish local planning authorities (LPAs) that adopted GHG reduction policies. The data included GHG reduction, LZCGT installation and performance, use of plan designs to meet GHG reductions and exemptions from the GHG policies. Descriptive statistics using SPSS software, complemented by qualitative responses from questionnaires, helped to explain the observed performance.

Findings

The policies performed poorly, delivering only low-hanging fruit, with significant room for improvement. Design-led opportunities in the GHG policies were not actively pursued; most LZCGT installations were exempted from the GHG policies, and the policies were poor at targeting the relationship between building unit size and GHG emissions and reductions.

Research limitations/implications

The source documents from which the data came varied in quality and completeness, and some LPAs are over-represented in the data. The study applied limited criteria to evaluate policy performance.

Practical implications

Areas for policymakers to focus on further when exploring how to enhance the role and performance of LZCGTs are highlighted, including practical suggestions.

Originality/value

As one of the few studies assessing policy performance and distilling lessons from early adopters of GHG policies at the local planning level, it offers performance benchmarks and raises points of concern for policymakers.

Details

Management of Environmental Quality: An International Journal, vol. 31 no. 4
Type: Research Article
ISSN: 1477-7835

Article
Publication date: 20 November 2017

Jun Li, Ming Lu, Guowei Dou and Shanyong Wang

Abstract

Purpose

The purpose of this study is to introduce the concept of big data and provide a comprehensive overview to readers to understand big data application framework in libraries.

Design/methodology/approach

The authors first used text analysis and inductive analysis to understand the concept of big data, summarize the challenges and opportunities of applying big data in libraries and propose a big data application framework for libraries. They then used a questionnaire survey to collect data from librarians and assess the feasibility of applying the big data application framework in libraries.

Findings

The challenges of applying big data in libraries mainly include data accuracy, data reduction and compression, data confidentiality and security, and big data processing systems and technology. The opportunities mainly include enriching the library database, enhancing the skills of librarians, promoting interlibrary loan service and providing personalized knowledge service. A big data application framework in libraries can be considered along five dimensions: human resources, literature resources, technology support, service innovation and infrastructure construction. Most libraries consider the big data application framework feasible and tend towards applying it. The main obstacles preventing them from doing so are human resources and the level of information technology.

Originality/value

This research offers several implications and practical solutions for libraries to apply the big data application framework.

Details

Information Discovery and Delivery, vol. 45 no. 4
Type: Research Article
ISSN: 2398-6247

Details

Qualitative Research in the Study of Leadership
Type: Book
ISBN: 978-1-78560-651-9
