Search results

1 – 10 of 992
Article
Publication date: 9 March 2020

Zahra Nematzadeh, Roliana Ibrahim, Ali Selamat and Vahdat Nazerian

The purpose of this study is to enhance data quality and overall accuracy and improve certainty by reducing the negative impacts of the FCM algorithm while clustering real-world…

Abstract

Purpose

The purpose of this study is to enhance data quality, overall accuracy and certainty by reducing the negative impacts of the FCM algorithm when clustering real-world data and by decreasing the inherent noise in data sets.

Design/methodology/approach

The present study proposed a new, effective model based on fuzzy C-means (FCM), ensemble filtering (ENS) and machine learning algorithms, called the FCM-ENS model. The model is composed of three parts: noise detection, noise filtering and noise classification.
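As a rough, hedged illustration of this kind of pipeline (not the authors' implementation), the sketch below pairs a minimal fuzzy C-means routine with a cross-validated ensemble noise filter; the function names, base learners and thresholds are illustrative assumptions.

```python
import numpy as np
from sklearn.model_selection import cross_val_predict
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier

def fcm_memberships(X, n_clusters=2, m=2.0, n_iter=100, seed=0):
    """Minimal fuzzy C-means: returns membership matrix U of shape (n_samples, n_clusters)."""
    rng = np.random.default_rng(seed)
    U = rng.dirichlet(np.ones(n_clusters), size=len(X))      # random initial fuzzy partition
    for _ in range(n_iter):
        Um = U ** m
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]       # membership-weighted cluster centers
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        U = 1.0 / d ** (2.0 / (m - 1.0))
        U /= U.sum(axis=1, keepdims=True)                    # renormalize memberships per sample
    return U

def ensemble_noise_filter(X, y, threshold=0.5):
    """Flag instances misclassified by a majority of cross-validated base learners (class-noise candidates)."""
    learners = [DecisionTreeClassifier(), GaussianNB(), KNeighborsClassifier()]
    votes = np.mean([cross_val_predict(clf, X, y, cv=5) != y for clf in learners], axis=0)
    return votes > threshold                                  # True = suspected class noise
```

In a scheme of this kind, instances with low FCM memberships for their own class and a majority misclassification vote would be treated as class noise and removed before the final classification step.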

Findings

The performance of the proposed model was tested in experiments on six data sets from the UCI repository. The results show that the proposed noise detection model detected class noise very effectively and that performance improved when the identified class-noise instances were removed.

Originality/value

To the best of the authors’ knowledge, no effort has previously been made to improve the FCM algorithm with respect to class noise detection. The novelty of the present research therefore lies in combining the FCM algorithm, as a noise detection technique, with ENS to reduce the negative effect of inherent noise and to increase data quality and accuracy.

Details

Engineering Computations, vol. 37 no. 7
Type: Research Article
ISSN: 0264-4401


Article
Publication date: 22 October 2018

Fumiya Togashi, Takashi Misaka, Rainald Löhner and Shigeru Obayashi

It is of paramount importance to ensure safe and fast evacuation routes in cities in case of natural disasters, environmental accidents or acts of terrorism. The same applies to…

Abstract

Purpose

It is of paramount importance to ensure safe and fast evacuation routes in cities in case of natural disasters, environmental accidents or acts of terrorism. The same applies to large-scale events such as concerts, sporting events and religious pilgrimages, as well as to traffic hubs such as airports and train stations. The prediction of pedestrian flow is notoriously difficult because it varies with circumstances (age group, cultural characteristics, etc.). In this study, the Ensemble Kalman Filter (EnKF) data assimilation technique, which uses updated observation data to improve the accuracy of a simulation, was applied to improve the accuracy of numerical simulations of pedestrian flow.

Design/methodology/approach

The EnKF, a data assimilation technique, was applied to an in-house numerical simulation code for pedestrian flow. Two cases were studied: a simplified one-directional experimental pedestrian flow and the real pedestrian flow at the Kaaba in Mecca. First, numerical simulations were conducted using empirical input parameter sets. Then, using the observation data, the EnKF estimated appropriate input parameter sets. Finally, numerical simulations using the estimated parameter sets were conducted.
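For readers unfamiliar with the EnKF analysis step, the following generic sketch (assumed, not the paper's in-house code; `pedestrian_model` stands in for the forward simulation) updates an ensemble of input parameter sets from observations.

```python
import numpy as np

def enkf_update(theta_ens, y_obs, pedestrian_model, obs_noise_std, seed=0):
    """One EnKF analysis step; theta_ens has shape (n_ensemble, n_params)."""
    rng = np.random.default_rng(seed)
    Y = np.array([pedestrian_model(t) for t in theta_ens])    # predicted observations, (n_ens, n_obs)
    A = theta_ens - theta_ens.mean(axis=0)                    # parameter anomalies
    B = Y - Y.mean(axis=0)                                    # observation anomalies
    C_ty = A.T @ B / (len(theta_ens) - 1)                     # parameter/observation cross-covariance
    C_yy = B.T @ B / (len(theta_ens) - 1)                     # observation covariance
    R = np.eye(len(y_obs)) * obs_noise_std**2                 # observation-error covariance
    K = C_ty @ np.linalg.inv(C_yy + R)                        # Kalman gain
    perturbed_obs = y_obs + rng.normal(0.0, obs_noise_std, size=Y.shape)
    return theta_ens + (perturbed_obs - Y) @ K.T              # updated parameter ensemble
```

Each assimilation cycle in such a workflow amounts to running the simulations with the current ensemble, applying an update like this one with the latest observations, and re-running the simulations with the updated parameters.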

Findings

The EnKF worked very effectively on the numerical simulations of pedestrian flow. In both cases, the simplified experiment and the real pedestrian flow, the EnKF estimated proper input parameter sets that greatly improved the accuracy of the numerical simulation. The authors believe that techniques such as the EnKF could also be used effectively in other fields of computational engineering where simulations and data have to be merged.

Practical implications

This technique can be used to improve both design and operational implementations of pedestrian and crowd dynamics predictions. It should be of high interest to command and control centers for large crowd events such as concerts, airports, train stations and pilgrimage centers.

Originality/value

To the authors’ knowledge, data assimilation techniques have not previously been applied to numerical simulations of pedestrian flow, especially to real pedestrian flows involving millions of pedestrians, such as the Mataf at the Kaaba. This study validated the capability and usefulness of data assimilation for numerical simulations of pedestrian flow.

Details

Engineering Computations, vol. 35 no. 7
Type: Research Article
ISSN: 0264-4401


Article
Publication date: 27 May 2021

Sara Tavassoli and Hamidreza Koosha

Customer churn prediction is one of the most well-known approaches to manage and improve customer retention. Machine learning techniques, especially classification algorithms, are…

Abstract

Purpose

Customer churn prediction is one of the most well-known approaches to manage and improve customer retention. Machine learning techniques, especially classification algorithms, are very popular tools for predicting churners. In this paper, three ensemble classifiers based on bagging and boosting are proposed for customer churn prediction.

Design/methodology/approach

In this paper, three ensemble classifiers based on bagging and boosting are proposed for customer churn prediction. The first classifier, called boosted bagging, uses boosting for each bagging sample: before the final results of the bagging algorithm are combined, the authors try to improve the prediction by applying a boosting algorithm to each bootstrap sample. The second proposed ensemble classifier, called bagged bagging, combines bagging with itself; in other words, bagging is applied to each sample of the bagging algorithm. Finally, the third approach uses bagging of neural networks whose learning is based on a genetic algorithm.
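A rough scikit-learn analogue of the first two classifiers could look like the sketch below (illustrative only, not the authors' code; the GA-trained neural network variant is omitted, and the base learners and ensemble sizes are assumptions).

```python
from sklearn.ensemble import BaggingClassifier, AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier

# Boosted bagging: boosting is applied inside every bootstrap sample of the bagging ensemble.
boosted_bagging = BaggingClassifier(
    AdaBoostClassifier(DecisionTreeClassifier(max_depth=3), n_estimators=50),
    n_estimators=10,
)

# Bagged bagging: bagging is applied again to every bootstrap sample.
bagged_bagging = BaggingClassifier(
    BaggingClassifier(DecisionTreeClassifier(), n_estimators=10),
    n_estimators=10,
)

# Usage on a churn-style binary dataset (X, y):
# boosted_bagging.fit(X, y); predictions = boosted_bagging.predict(X_new)
```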

Findings

To examine the performance of all proposed ensemble classifiers, they are applied to two datasets. Numerical simulations illustrate that the proposed hybrid approaches outperform the simple bagging and boosting algorithms as well as the base classifiers. In particular, bagged bagging provides high accuracy and precision.

Originality/value

In this paper, three novel ensemble classifiers based on bagging and boosting are proposed for customer churn prediction. The proposed approaches can be applied not only to customer churn prediction but also to any other binary classification task.

Details

Kybernetes, vol. 51 no. 3
Type: Research Article
ISSN: 0368-492X


Article
Publication date: 2 May 2017

Candance Doerr-Stevens

This study aims to explore civic participation within multimodal expression. With the rise of content produced and circulated within participatory cultures online, there has…

Abstract

Purpose

This study aims to explore civic participation within multimodal expression. With the rise of content produced and circulated within participatory cultures online, much attention has been paid to questions of audience and attention to this content. For example, does the production of media content constitute having a voice if no one is paying attention?

Design/methodology/approach

Using multimodal analysis and mediated discourse analysis, this study examines adolescents’ school-based media production and use of multimodal ensembles to recruit and maintain audience attention to specific content in their radio and video documentaries.

Findings

Research findings reveal deliberate attempts to connect with audience needs when creating media as well as exploration of emerging civic identities.

Research limitations/implications

Questions for how researchers in literacy and learning can further investigate and articulate civic engagement and advocacy are suggested.

Practical implications

Implications for how teachers can use multimodality to create spaces for civic engagement are provided.

Originality/value

This study is original in that few studies have applied the concepts of participatory politics to media products and processes created in school settings. The study begins to test the utility of these constructs.

Details

English Teaching: Practice & Critique, vol. 16 no. 1
Type: Research Article
ISSN: 1175-8708


Article
Publication date: 10 November 2020

Samira Khodabandehlou, S. Alireza Hashemi Golpayegani and Mahmoud Zivari Rahman

Improving the performance of recommender systems (RSs) has always been a major challenge in the area of e-commerce because the systems face issues such as cold start, sparsity…

Abstract

Purpose

Improving the performance of recommender systems (RSs) has always been a major challenge in the area of e-commerce because the systems face issues such as cold start, sparsity, scalability and interest drift that affect their performance. Despite the efforts made to solve these problems, there is still no RS that can solve or reduce all the problems simultaneously. Therefore, the purpose of this study is to provide an effective and comprehensive RS to solve or reduce all of the above issues, which uses a combination of basic customer information as well as big data techniques.

Design/methodology/approach

The most important steps in the proposed RS are: (1) collecting demographic and behavioral data of customers from an e-clothing store; (2) assessing customer personality traits; (3) creating a new user-item matrix based on customer/user interest; (4) calculating the similarity between customers with efficient k-nearest neighbor (EKNN) algorithm based on locality-sensitive hashing (LSH) approach and (5) defining a new similarity function based on a combination of personality traits, demographic characteristics and time-based purchasing behavior that are the key incentives for customers' purchases.
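Step (4) can be pictured with a generic random-hyperplane LSH sketch (illustrative only; the paper's EKNN algorithm and composite similarity function are not reproduced here, and plain cosine similarity stands in for them).

```python
import numpy as np
from collections import defaultdict

def lsh_buckets(user_vectors, n_planes=16, seed=0):
    """Hash each customer profile vector into a bucket via random-hyperplane signatures."""
    rng = np.random.default_rng(seed)
    planes = rng.normal(size=(n_planes, user_vectors.shape[1]))
    signatures = (user_vectors @ planes.T) > 0               # boolean signature per customer
    buckets = defaultdict(list)
    for idx, sig in enumerate(signatures):
        buckets[sig.tobytes()].append(idx)                   # same bucket = candidate neighbours
    return buckets

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12))

# The exact similarity (here plain cosine; the paper combines personality, demographic and
# time-weighted purchase features) is then computed only within each candidate bucket,
# which is what keeps the nearest-neighbour search scalable.
```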

Findings

The proposed method was compared with different baselines (matrix factorization and ensemble). The results showed that the proposed method in terms of all evaluation measures led to a significant improvement in traditional collaborative filtering (CF) performance, and with a significant difference (more than 40%), performed better than all baselines. According to the results, we find that our proposed method, which uses a combination of personality information and demographics, as well as tracking the recent interests and needs of the customer with the LSH approach, helps to improve the effectiveness of the recommendations more than the baselines. This is due to the fact that this method, which uses the above information in conjunction with the LSH technique, is more effective and more accurate in solving problems of cold start, scalability, sparsity and interest drift.

Research limitations/implications

The research data were limited to only one e-clothing store.

Practical implications

In order to achieve an accurate and real-time RS in e-commerce, it is essential to use a combination of customer information with efficient techniques. In this regard, according to the results of the research, the use of personality traits and demographic characteristics leads to a more accurate knowledge of customers' interests and thus better identification of similar customers. Therefore, this information should be considered as a solution to reduce the problems of cold start and sparsity. Also, a better judgment can be made about customers' interests by considering their recent purchases; therefore, in order to solve the problem of interest drift, different weights should be assigned to purchases and the launch time of products/items at different times (the more recent, the more weight). Finally, the LSH technique is used to increase RS scalability in e-commerce. Overall, a combination of personality traits, demographics and customer purchasing behavior over time with the LSH technique should be used to achieve an ideal RS. Using the RS proposed in this research, it is possible to create a comfortable and enjoyable shopping experience for customers by providing real-time recommendations that match customers' preferences, which can result in an increase in the profitability of e-shops.

Originality/value

In this study, by considering a combination of personality traits, demographic characteristics and time-based purchasing behavior of customers along with the LSH technique, we were able for the first time to simultaneously solve the basic problems of CF, namely cold start, scalability, sparsity and interest drift, which led to a significant decrease in recommendation errors and an increase in the accuracy of CF. The average error of the recommendations provided to users based on the proposed model is only about 13%, and the accuracy and compliance of these recommendations with the interests of customers are about 92%. In addition, a 40% difference between the accuracy of the proposed method and the traditional CF method has been observed. This level of accuracy in RSs is very significant and is certainly welcome to e-business owners. This is also a new scientific finding that is very useful for programmers, users and researchers. In general, the main contributions of this research are: 1) proposing an accurate RS using personality traits, demographic characteristics and time-based purchasing behavior; 2) proposing an effective and comprehensive RS for a “clothing” online store; 3) improving the RS performance by solving the cold start issue using personality traits and demographic characteristics; 4) improving the scalability issue in RS through efficient k-nearest neighbors; 5) mitigating the sparsity issue by using personality traits and demographic characteristics and also by densifying the user-item matrix; and 6) improving the RS accuracy by solving the interest drift issue through developing a time-based user-item matrix.

Article
Publication date: 1 April 1972

GERHARD H.R. REISIG

The progressive utilization of information necessitates the evaluation of the information‐content of scientific documents over conventional bibliotechnical indexing‐operations…

Abstract

The progressive utilization of information necessitates the evaluation of the information‐content of scientific documents over conventional bibliotechnical indexing‐operations. The technique of optimal economical evaluation of information rests in the alternative between the keyword and subject‐index as the symbolic operator. From the analysis of 350 scientific documents the keyword‐concept was, by empirical decision, derived to be the superior operator. Conversely, the concept of subject‐indexing leads to ambiguities of subcategories, since certain branch‐nodes of the categorical trees (code trees) may belong to several simultaneous subject‐categories. This ambiguity can be resolved only by means of coordination‐tables correlating each keyword with the pertinent simultaneous subject‐categories. The application of synonym‐registers affects the concentration of information‐sources with typical completeness for each keyword‐entry in the system. The phrasing of syntactically associated keywords produces a significant precision of the information‐content of the documentary text. The efficiency of a documentary system is predominantly accomplished by information‐filtering in terms of relevant keyword‐phrases. This system‐efficiency is displayed as a function of utility‐rate versus precision‐rate, which is considered superior to Salton's efficiency‐rate of “precision versus recall.” The phrasing of key‐word‐phrases is utilized as the structure for the design of a truly topic‐related abstract to any document. Empirical findings from this study of economical optimization of documentary systems reveal the actual efficiency of the concept of information‐evaluation.

Details

Kybernetes, vol. 1 no. 4
Type: Research Article
ISSN: 0368-492X

Article
Publication date: 23 September 2020

Z.F. Zhang, Wei Liu, Egon Ostrosi, Yongjie Tian and Jianping Yi

During the production of steel strip, defects may appear on the surface, and traditional manual inspection cannot meet the requirements of low-cost and…

Abstract

Purpose

During the production of steel strip, defects may appear on the surface, and traditional manual inspection cannot meet the requirements of low-cost and high-efficiency production. The purpose of this paper is to propose a feature selection method based on filter methods combined with a hidden Bayesian classifier, to improve the efficiency of defect recognition and reduce computational complexity. The method can select the optimal hybrid model for accurate classification of steel strip surface defects.

Design/methodology/approach

A large image feature set was initially obtained with a discrete wavelet transform feature extraction method. Three feature selection methods (correlation-based feature selection, the consistency subset evaluator [CSE] and information gain) were then used to optimize the feature space. Parameters for the feature selection methods were based on the classification accuracy results of the hidden naive Bayes (HNB) algorithm. The selected feature subset was then applied to the traditional NB classifier and to leading extended NB classifiers.
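As a hedged, simplified stand-in for this kind of pipeline (scikit-learn has no hidden naive Bayes, so GaussianNB is used here purely as a placeholder for HNB; the information-gain-style selector and the feature count are likewise assumptions):

```python
from sklearn.pipeline import Pipeline
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import cross_val_score

pipe = Pipeline([
    ("select", SelectKBest(mutual_info_classif, k=20)),   # filter-based feature selection
    ("clf", GaussianNB()),                                # placeholder for the HNB classifier
])

# X: wavelet-derived image features (e.g. per-subband energies), y: defect class labels.
# scores = cross_val_score(pipe, X, y, cv=5)
# The selector parameters (here k) would be tuned on the cross-validated accuracy,
# mirroring how the paper tunes its filter methods against HNB accuracy.
```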

Findings

The experimental results demonstrated that the HNB model combined with feature selection approaches has better classification performance than other defect recognition models. Among the models tested, the proposed hybrid CSE + HNB model is the most robust and effective, achieving the highest classification accuracy on the optimal feature subset of the surface defect database.

Originality/value

The main contribution of this paper is the development of a hybrid model combining feature selection and multi-class classification algorithms for steel strip surface inspection. The proposed hybrid model is primarily robust and effective for steel strip surface inspection.

Details

Engineering Computations, vol. 38 no. 4
Type: Research Article
ISSN: 0264-4401


Article
Publication date: 6 January 2022

Deepti Sisodia and Dilip Singh Sisodia

The problem of choosing the most useful features from among hundreds of features in time-series user click data arises in online advertising toward fraudulent publishers'…

Abstract

Purpose

The problem of choosing the most useful features from among hundreds of features in time-series user click data arises in online advertising when classifying fraudulent publishers. Selecting feature subsets is a key issue in such classification tasks. In practice, filter approaches are commonly used; however, they neglect the correlations among features. Conversely, wrapper approaches are often impractical due to their computational complexity. Moreover, existing feature selection methods cannot handle such data well, which is one of the major causes of instability in feature selection.

Design/methodology/approach

To overcome these issues, a majority voting-based hybrid feature selection method, namely feature distillation and accumulated selection (FDAS), is proposed to investigate the optimal subset of relevant features for analyzing publishers' fraudulent conduct. FDAS works in two phases: (1) feature distillation, where significant features from standard filter and wrapper feature selection methods are obtained using majority voting; and (2) accumulated selection, where the relevant feature subsets are evaluated cumulatively to search for an optimal feature subset using effective machine learning (ML) models.
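The two phases can be pictured with the following rough sketch (the base selectors, voting threshold and evaluation classifier are illustrative assumptions, not the authors' exact configuration):

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif, mutual_info_classif, RFE
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def fdas_like_selection(X, y, k=30):
    # Phase 1 (distillation): keep features chosen by a majority of base selectors.
    masks = [
        SelectKBest(f_classif, k=k).fit(X, y).get_support(),
        SelectKBest(mutual_info_classif, k=k).fit(X, y).get_support(),
        RFE(LogisticRegression(max_iter=1000), n_features_to_select=k).fit(X, y).get_support(),
    ]
    votes = np.sum(masks, axis=0)
    distilled = np.where(votes >= 2)[0]                     # majority vote over three selectors

    # Phase 2 (accumulated selection): grow the subset feature by feature, keep the best scorer.
    best_subset, best_score, current = [], -np.inf, []
    for f in distilled:
        current = current + [f]
        score = cross_val_score(LogisticRegression(max_iter=1000), X[:, current], y, cv=5).mean()
        if score > best_score:
            best_subset, best_score = list(current), score
    return best_subset
```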

Findings

Empirical results show enhanced classification performance with the proposed features in terms of average precision, recall, F1-score and AUC for publisher identification and classification.

Originality/value

FDAS is evaluated on the FDMA2012 user-click data and nine other benchmark datasets to gauge its generalizing characteristics: first, with the original features; second, with the relevant feature subsets selected by feature selection (FS) methods; and third, with the optimal feature subset obtained by the proposed approach. An ANOVA significance test is conducted to demonstrate significant differences between independent features.

Details

Data Technologies and Applications, vol. 56 no. 4
Type: Research Article
ISSN: 2514-9288


Article
Publication date: 17 July 2009

Emmanuel Blanchard, Adrian Sandu and Corina Sandu

The purpose of this paper is to propose a new computational approach for parameter estimation in the Bayesian framework. A posteriori probability density functions are obtained…

Abstract

Purpose

The purpose of this paper is to propose a new computational approach for parameter estimation in the Bayesian framework. A posteriori probability density functions are obtained using the polynomial chaos theory for propagating uncertainties through system dynamics. The new method has the advantage of being able to deal with large parametric uncertainties, non‐Gaussian probability densities and nonlinear dynamics.

Design/methodology/approach

The maximum likelihood estimates are obtained by minimizing a cost function derived from the Bayesian theorem. Direct stochastic collocation is used as a less computationally expensive alternative to the traditional Galerkin approach to propagate the uncertainties through the system in the polynomial chaos framework.
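A minimal picture of the kind of Bayesian cost function being minimized is given below (a generic Gaussian misfit-plus-prior form, assumed for illustration; in the paper the forward model is evaluated through a polynomial chaos surrogate, which is not reproduced here).

```python
import numpy as np
from scipy.optimize import minimize

def bayesian_cost(theta, forward_model, y_obs, sigma_obs, theta_prior, sigma_prior):
    """Negative log posterior (up to a constant): data misfit plus Gaussian prior penalty."""
    misfit = np.sum((forward_model(theta) - y_obs) ** 2) / (2.0 * sigma_obs**2)
    prior = np.sum((theta - theta_prior) ** 2) / (2.0 * sigma_prior**2)
    return misfit + prior

# With a toy forward model standing in for the mechanical system:
# theta_est = minimize(bayesian_cost, x0=theta_prior,
#                      args=(forward_model, y_obs, 0.1, theta_prior, 1.0)).x
```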

Findings

The new approach is explained and applied to very simple mechanical systems in order to illustrate how the Bayesian cost function is affected by the noise level in the measurements, undersampling, non‐identifiability of the system, non‐observability and excitation signals that are not rich enough. When the system is non‐identifiable and a priori knowledge of the parameter uncertainties is available, regularization techniques can still yield the most likely values among the possible combinations of uncertain parameters that result in the same time responses as the ones observed.

Originality/value

The polynomial chaos method has been shown to be considerably more efficient than Monte Carlo in the simulation of systems with a small number of uncertain parameters. This is believed to be the first time the polynomial chaos theory has been applied to Bayesian estimation.

Details

Engineering Computations, vol. 26 no. 5
Type: Research Article
ISSN: 0264-4401


Article
Publication date: 3 July 2017

Gaurav Kumar, Ashoke De and Harish Gopalan

Hybrid Reynolds-averaged Navier–Stokes large eddy simulation (RANS-LES) methods have become popular for simulation of massively separated flows at high Reynolds numbers due to…

Abstract

Purpose

Hybrid Reynolds-averaged Navier–Stokes large eddy simulation (RANS-LES) methods have become popular for simulation of massively separated flows at high Reynolds numbers due to their reduced computational cost and good accuracy. The current study aims to examine the performance of LES and hybrid RANS-LES models for a given grid resolution.

Design/methodology/approach

For better assessment and contrast of model performance, both mean and instantaneous flow fields have been investigated. For studying the instantaneous flow, proper orthogonal decomposition (POD) has been used.
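For reference, POD of an unsteady flow is commonly computed from a snapshot matrix via the singular value decomposition; the sketch below is a generic version of that procedure (assumed here, not the authors' code).

```python
import numpy as np

def pod_modes(snapshots):
    """snapshots: (n_points, n_snapshots) array of instantaneous flow fields."""
    fluctuations = snapshots - snapshots.mean(axis=1, keepdims=True)   # subtract the mean field
    U, s, Vt = np.linalg.svd(fluctuations, full_matrices=False)
    energy = s**2 / np.sum(s**2)               # fraction of fluctuation energy captured per mode
    return U, energy, Vt                       # spatial modes, modal energies, temporal coefficients
```

The leading columns of U give the dominant coherent structures, and the energy spectrum indicates how many modes are needed to represent the unsteady flow, which is the basis for comparing the LES and hybrid RANS-LES predictions of two-point statistics.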

Findings

The current analysis shows that hybrid RANS-LES is capable of achieving accuracy similar to LES in the prediction of both mean and instantaneous flow fields on a much coarser grid.

Originality/value

Most previous computational work, focused on practical applications, has concentrated on the prediction of one-point flow statistics, with little consideration given to two-point statistics. Here, two-point statistics are considered using POD to investigate the unsteady turbulent flow.

Details

International Journal of Numerical Methods for Heat & Fluid Flow, vol. 27 no. 7
Type: Research Article
ISSN: 0961-5539

