Search results

1 – 10 of over 20000
Article
Publication date: 23 August 2019

Janani Balakumar and S. Vijayarani Mohan

Abstract

Purpose

Owing to the huge volume of documents available on the internet, text classification becomes a necessary task to handle these documents. To achieve optimal text classification results, feature selection, an important stage, is used to curtail the dimensionality of text documents by choosing suitable features. The main purpose of this research work is to classify the personal computer documents based on their content.

Design/methodology/approach

This paper proposes a new algorithm for feature selection based on artificial bee colony (ABCFS) to enhance text classification accuracy. The proposed algorithm (ABCFS) is evaluated on real and benchmark data sets and compared with existing feature selection approaches such as information gain and the χ2 statistic. To assess the efficiency of the proposed algorithm, the support vector machine (SVM) and an improved SVM classifier are used in this paper.
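For readers unfamiliar with the χ2 statistic named above as a baseline, the following is a minimal sketch of χ2 term scoring for text feature selection. It is the textbook 2x2 contingency form, not the paper's ABCFS algorithm; the function names `chi_square` and `select_top_k` and the toy documents are illustrative assumptions.

```python
from collections import Counter


def chi_square(n11, n10, n01, n00):
    """Chi-squared score of one term for one class (2x2 contingency table).

    n11: documents in the class that contain the term
    n10: documents outside the class that contain the term
    n01: documents in the class that lack the term
    n00: documents outside the class that lack the term
    """
    n = n11 + n10 + n01 + n00
    denom = (n11 + n01) * (n11 + n10) * (n10 + n00) * (n01 + n00)
    if denom == 0:
        return 0.0  # degenerate table: the term carries no information
    return n * (n11 * n00 - n10 * n01) ** 2 / denom


def select_top_k(docs, target, k):
    """docs: list of (set_of_terms, label). Return the k highest-scoring terms."""
    in_class = sum(1 for _, label in docs if label == target)
    out_class = len(docs) - in_class
    df_in, df_out = Counter(), Counter()  # document frequencies per side
    for terms, label in docs:
        (df_in if label == target else df_out).update(set(terms))
    scores = {}
    for term in set(df_in) | set(df_out):
        n11, n10 = df_in[term], df_out[term]
        scores[term] = chi_square(n11, n10, in_class - n11, out_class - n10)
    # Sort by descending score, breaking ties alphabetically
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


# Hypothetical toy corpus, for illustration only
docs = [
    ({"cpu", "ram"}, "hw"),
    ({"cpu", "disk"}, "hw"),
    ({"goal", "team"}, "sport"),
    ({"team", "ram"}, "sport"),
]
top_terms = select_top_k(docs, "hw", 2)
```

Note that χ2 is two-sided: a term that appears only outside the target class scores as highly as one that appears only inside it, because both separate the classes equally well.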

Findings

The experiment was conducted on real and benchmark data sets. The real data set was collected in the form of documents that were stored in the personal computer, and the benchmark data set was collected from Reuters and 20 Newsgroups corpus. The results prove the performance of the proposed feature selection algorithm by enhancing the text document classification accuracy.

Originality/value

This paper proposes a new ABCFS algorithm for feature selection, evaluates the efficiency of the ABCFS algorithm and improves the support vector machine. In this paper, the ABCFS algorithm is used to select the features from text (unstructured) documents. Existing work applies the algorithm to select features from structured data, but not to text feature selection. The proposed algorithm will classify the documents automatically based on their content.

Open Access
Article
Publication date: 28 July 2020

Noura AlNuaimi, Mohammad Mehedy Masud, Mohamed Adel Serhani and Nazar Zaki

Abstract

Organizations in many domains generate a considerable amount of heterogeneous data every day. Such data can be processed to enhance these organizations’ decisions in real time. However, storing and processing large and varied datasets (known as big data) is challenging to do in real time. In machine learning, streaming feature selection has always been considered a superior technique for selecting the relevant subset of features from high-dimensional data and thus reducing learning complexity. In the relevant literature, streaming feature selection refers to the setting in which features arrive consecutively over time: the total number of features is not known in advance, whereas the number of instances is fixed. Many scholars in the field have proposed streaming-feature-selection algorithms in attempts to find a proper solution to this problem. This paper presents an exhaustive and methodical introduction to these techniques. This study provides a review of the traditional feature-selection algorithms and then scrutinizes the current algorithms that use streaming feature selection to determine their strengths and weaknesses. The survey also sheds light on the ongoing challenges in big-data research.

Details

Applied Computing and Informatics, vol. 18 no. 1/2
Type: Research Article
ISSN: 2634-1964

Article
Publication date: 26 February 2024

Chong Wu, Xiaofang Chen and Yongjie Jiang

Abstract

Purpose

While the Chinese securities market is booming, the phenomenon of listed companies falling into financial distress is also emerging, which affects the operation and development of enterprises and also jeopardizes the interests of investors. Therefore, it is important to understand how to accurately and reasonably predict the financial distress of enterprises.

Design/methodology/approach

In the present study, ensemble feature selection (EFS) and improved stacking were used for financial distress prediction (FDP). Mutual information, analysis of variance (ANOVA), random forest (RF), genetic algorithms, and recursive feature elimination (RFE) were chosen for EFS to select features. Since there may be missing information when feeding the results of the base learner directly into the meta-learner, the features with high importance were fed into the meta-learner together. A screening layer was added to select the meta-learner with better performance. Finally, Optima hyperparameters were used for parameter tuning by the learners.

Findings

An empirical study was conducted with a sample of A-share listed companies in China. The F1-score of the model constructed using the features screened by EFS reached 84.55%, representing an improvement of 4.37% compared to the original features. To verify the effectiveness of improved stacking, benchmark model comparison experiments were conducted. Compared to the original stacking model, the accuracy of the improved stacking model was improved by 0.44%, and the F1-score was improved by 0.51%. In addition, the improved stacking model had the highest area under the curve (AUC) value (0.905) among all the compared models.

Originality/value

Compared to previous models, the proposed FDP model has better performance, thus bridging the research gap of feature selection. The present study provides new ideas for stacking improvement research and a reference for subsequent research in this field.

Details

Kybernetes, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 0368-492X

Article
Publication date: 3 April 2017

Farshad Faezy Razi and Seyed Hooman Shariat

Abstract

Purpose

The purpose of this paper is twofold: the selection of project portfolios through hybrid artificial neural network algorithms, feature selection based on grey relational analysis, decision tree and regression; and the identification of the features affecting project portfolio selection using the artificial neural network algorithm, decision tree and regression. The authors also aim to classify the available options using the decision tree algorithm.

Design/methodology/approach

In order to achieve the research goals, a project-oriented organization was selected and studied. In all, 49 project management indicators were chosen from A Guide to the Project Management Body of Knowledge (PMBOK Guide), and the most important indicators were identified using a feature selection algorithm and decision tree. After the extraction of rules, decision rule-based multi-criteria decision making matrices were produced. Each matrix was ranked through grey relational analysis, similarity to ideal solution method and multi-criteria optimization. Finally, a model for choosing the best ranking method was designed and implemented using the genetic algorithm. To analyze the responses, stability of the classes was investigated.
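Grey relational analysis, used above to rank each decision matrix, can be sketched as follows. This is the standard textbook form with a distinguishing coefficient of 0.5, offered as an assumption; the abstract does not state the authors' normalization or parameter choices, and the function name `grey_relational_grades` is hypothetical.

```python
def grey_relational_grades(reference, alternatives, rho=0.5):
    """Return one grey relational grade per alternative.

    reference:    ideal sequence of (already normalized) criterion values
    alternatives: list of sequences of the same length as `reference`
    rho:          distinguishing coefficient (0.5 is a common default)
    """
    # Absolute deviation of every alternative from the reference, per criterion
    deltas = [[abs(r - x) for r, x in zip(reference, alt)] for alt in alternatives]
    flat = [d for row in deltas for d in row]
    d_min, d_max = min(flat), max(flat)
    if d_max == 0:
        # Every alternative coincides with the reference
        return [1.0] * len(alternatives)
    grades = []
    for row in deltas:
        # Grey relational coefficient per criterion, then averaged into a grade
        coeffs = [(d_min + rho * d_max) / (d + rho * d_max) for d in row]
        grades.append(sum(coeffs) / len(coeffs))
    return grades


# Hypothetical example: the first alternative matches the ideal exactly
grades = grey_relational_grades([1.0, 1.0], [[1.0, 1.0], [0.0, 0.0]])
```

A higher grade means the alternative lies closer to the reference sequence, so alternatives are ranked in descending order of grade.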

Findings

The results showed that projects ranked based on neural network weights by the grey relational analysis method prove to be better options for the selection of a project portfolio. The process of identification of the features affecting project portfolio selection resulted in the following factors: scope management, project charter, project management plan, stakeholders and risk.

Originality/value

This study presents the most effective features affecting project portfolio selection which is highly impressive in organizational decision making and must be considered seriously. Deploying sensitivity analysis, which is an innovation in such studies, played a constructive role in examining the accuracy and reliability of the proposed models, and it can be firmly argued that the results have had an important role in validating the findings of this study.

Details

Benchmarking: An International Journal, vol. 24 no. 3
Type: Research Article
ISSN: 1463-5771

Article
Publication date: 1 April 2014

Svetlana Boudko, Wolfgang Leister and Stein Gjessing

Abstract

Purpose

Coexistence of various wireless access networks and the ability of mobile terminals to switch between them make an optimal selection of serving networks for multicast groups a challenging problem. Since optimal network selection requires large dimensions of data to be collected from several network locations and sent between several network components, the scalability can easily become a bottleneck in large-scale systems. Therefore, reducing data exchange within heterogeneous wireless networks is important. The paper aims to discuss these issues.

Design/methodology/approach

The authors study the decision-making process and the data that need to be sent between different network components. To analyze the operation of the wireless heterogeneous network, the authors built a mathematical model of the network. The objective is defined as a minimization of multicast streams in the system. To evaluate the heuristic solutions, the authors define the upper and lower bounds to their operation.

Findings

The proposed heuristic solutions substantially reduce the usage of bandwidth in mobile networks and exchange of information between the network components.

Originality/value

The authors propose an approach that allows network selection in a decentralized manner with only limited information shared among the decision makers. The authors study how different sets of information available to decision makers influence the performance of the system. The work also investigates the usage of multiple paths for multicast in heterogeneous mobile environments.

Details

International Journal of Pervasive Computing and Communications, vol. 10 no. 1
Type: Research Article
ISSN: 1742-7371

Article
Publication date: 8 December 2022

Jonathan S. Greipel, Regina M. Frank, Meike Huber, Ansgar Steland and Robert H. Schmitt

Abstract

Purpose

To ensure product quality within a manufacturing process, inspection processes are indispensable. One task of inspection planning is the selection of inspection characteristics. For optimization of costs and benefits, key characteristics can be defined by which the product quality can be checked with sufficient accuracy. The manual selection of key characteristics requires substantial planning effort and becomes uneconomic if many product variants prevail. This paper, therefore, aims to show a method for the efficient determination of key characteristics.

Design/methodology/approach

The authors present a novel Algorithm for the Selection of Key Characteristics (ASKC) based on an auto-encoder and a risk analysis. Given historical measurement data and tolerances, the algorithm clusters characteristics with redundant information and selects key characteristics based on a risk assessment. The authors compare ASKC with the algorithm Principal Feature Analysis (PFA) using artificial and historical measurement data.

Findings

The authors find that ASKC delivers better results than PFA. Findings show that the algorithms enable the cost-efficient selection of key characteristics while maintaining the informative value of the inspection concerning the quality.

Originality/value

This paper fills an identified gap for simplified inspection planning with the method for the efficient selection of key features via ASKC.

Details

International Journal of Quality & Reliability Management, vol. 40 no. 7
Type: Research Article
ISSN: 0265-671X

Article
Publication date: 16 August 2018

Rama Rao A., Satyananda Reddy and Valli Kumari V.

Abstract

Purpose

Multimedia applications such as digital audio and video have stringent quality of service (QoS) requirements in mobile ad hoc networks. To support a wide range of QoS, complex routing protocols with multiple QoS constraints are necessary. In QoS routing, the basic problem is to find a path that satisfies multiple QoS constraints. Moreover, mobility, congestion and packet loss in the dynamic topology of the network also lead to degradation of the protocol's QoS performance.

Design/methodology/approach

In this paper, the authors propose a multi-path selection scheme for QoS-aware routing in mobile ad hoc networks based on the fractional cuckoo search algorithm (FCS-MQARP). Here, multiple QoS constraints (energy, link lifetime, distance and delay) are considered for path selection.

Findings

The proposed FCS-MQARP is compared with the existing QoS-aware routing protocols AOMDV, MMQARP and CS-MQARP using measures such as normalized delay, energy and throughput. The extensive simulation study shows that the proposed FCS-based multipath selection performs better than the existing routing protocols, with a maximal energy of 99.1501 and a minimal delay of 0.0554.

Originality/value

This paper presents a hybrid optimization algorithm called the FCS algorithm for the multi-path selection. Also, a new fitness function is developed by considering the QoS constraints such as energy, link life time, distance and delay.

Details

Sensor Review, vol. 39 no. 2
Type: Research Article
ISSN: 0260-2288

Article
Publication date: 18 May 2020

Abhishek Dixit, Ashish Mani and Rohit Bansal

Abstract

Purpose

Feature selection is an important data pre-processing step, especially for high-dimensional data sets. A model trained on a high-dimensional data set performs worse and yields poor classification accuracy. Therefore, applying feature selection to the data set before training the model is an important step for improving performance and classification accuracy.

Design/methodology/approach

A novel optimization approach that hybridizes binary particle swarm optimization (BPSO) and differential evolution (DE) for fine tuning of SVM classifier is presented. The name of the implemented classifier is given as DEPSOSVM.

Findings

This approach is evaluated using 20 UCI benchmark text classification data sets. Further, the performance of the proposed technique is also evaluated on a UCI benchmark image data set of cancer images. From the results, it can be observed that the proposed DEPSOSVM technique shows a significant improvement in performance over other feature selection algorithms in the literature. The proposed technique shows better classification accuracy as well.

Originality/value

The proposed approach differs from previous work in two ways. First, all previous work uses the DE/(rand/1) mutation strategy, whereas this study uses DE/(rand/2) and updates the mutation strategy with BPSO. Second, the crossover uses a novel approach that compares the best particle with a sigmoid function. The core contribution of this paper is to hybridize DE with BPSO combined with an SVM classifier (DEPSOSVM) to handle feature selection problems.
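As a rough illustration of the DE/(rand/2) strategy and the sigmoid function mentioned above, here is a minimal sketch using the textbook definitions. It is not the authors' DEPSOSVM operator: the abstract does not give the exact update rules, so the scale factor `F`, the clamping bound and the function names are assumptions, and the usual exclusion of the target vector from the random draw is omitted for brevity.

```python
import math
import random


def de_rand_2(population, F, rng):
    """Textbook DE/rand/2 mutant: x_r1 + F*(x_r2 - x_r3) + F*(x_r4 - x_r5).

    population: list of at least five real-valued vectors of equal length
    F:          scale factor applied to both difference vectors
    rng:        a random.Random instance, for reproducibility
    """
    r1, r2, r3, r4, r5 = rng.sample(population, 5)  # five distinct members
    return [a + F * (b - c) + F * (d - e)
            for a, b, c, d, e in zip(r1, r2, r3, r4, r5)]


def sigmoid(x):
    """Logistic function, clamped to avoid overflow in math.exp."""
    x = max(-60.0, min(60.0, x))
    return 1.0 / (1.0 + math.exp(-x))


def binarize(vector, rng):
    """Turn a real vector into a 0/1 feature mask, as in binary PSO:
    each bit is 1 with probability sigmoid(value)."""
    return [1 if rng.random() < sigmoid(v) else 0 for v in vector]


rng = random.Random(0)
population = [[rng.uniform(-1.0, 1.0) for _ in range(4)] for _ in range(6)]
mutant = de_rand_2(population, 0.5, rng)
mask = binarize(mutant, rng)  # 1 = keep the feature, 0 = drop it
```

DE/rand/2 adds a second difference vector to the DE/rand/1 form, which generally increases exploration at the cost of slower convergence.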

Details

International Journal of Intelligent Computing and Cybernetics, vol. 13 no. 2
Type: Research Article
ISSN: 1756-378X

Open Access
Article
Publication date: 11 April 2018

Mohamed A. Tawhid and Kevin B. Dsouza

Abstract

In this paper, we present a new hybrid binary version of the bat and enhanced particle swarm optimization algorithms in order to solve feature selection problems. The proposed algorithm is called Hybrid Binary Bat Enhanced Particle Swarm Optimization Algorithm (HBBEPSO). In the proposed HBBEPSO algorithm, we combine the bat algorithm, whose capacity for echolocation helps explore the feature space, with an enhanced version of particle swarm optimization, which can converge to the best global solution in the search space. In order to investigate the general performance of the proposed HBBEPSO algorithm, the proposed algorithm is compared with the original optimizers and other optimizers that have been used for feature selection in the past. A set of assessment indicators is used to evaluate and compare the different optimizers over 20 standard data sets obtained from the UCI repository. Results prove the ability of the proposed HBBEPSO algorithm to search the feature space for optimal feature combinations.

Details

Applied Computing and Informatics, vol. 16 no. 1/2
Type: Research Article
ISSN: 2634-1964

Article
Publication date: 8 June 2010

Pablo A.D. Castro and Fernando J. Von Zuben

Abstract

Purpose

The purpose of this paper is to apply a multi‐objective Bayesian artificial immune system (MOBAIS) to feature selection in classification problems aiming at minimizing both the classification error and cardinality of the subset of features. The algorithm is able to perform a multimodal search maintaining population diversity and controlling automatically the population size according to the problem. In addition, it is capable of identifying and preserving building blocks (partial components of the whole solution) effectively.

Design/methodology/approach

The algorithm evolves candidate subsets of features by replacing the traditional mutation operator in immune‐inspired algorithms with a probabilistic model which represents the probability distribution of the promising solutions found so far. Then, the probabilistic model is used to generate new individuals. A Bayesian network is adopted as the probabilistic model due to its capability of capturing expressive interactions among the variables of the problem. In order to evaluate the proposal, it was applied to ten datasets and the results compared with those generated by state‐of‐the‐art algorithms.
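The two objectives of this entry (classification error and the number of selected features) imply a Pareto comparison between candidate subsets. The sketch below shows only that comparison, under the assumption that both objectives are minimized; it does not implement MOBAIS or its Bayesian network, and the candidate values are made up for illustration.

```python
def dominates(a, b):
    """a and b are (error, n_features) pairs, both to be minimized.

    a dominates b if it is no worse on both objectives
    and strictly better on at least one.
    """
    return (a[0] <= b[0] and a[1] <= b[1]) and (a[0] < b[0] or a[1] < b[1])


def pareto_front(candidates):
    """Keep the candidates that no other candidate dominates."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]


# Hypothetical (error, n_features) results for five feature subsets
candidates = [(0.10, 5), (0.12, 3), (0.10, 7), (0.20, 3), (0.08, 9)]
front = pareto_front(candidates)
```

Here (0.10, 7) is dropped because (0.10, 5) reaches the same error with fewer features, and (0.20, 3) is dropped because (0.12, 3) has a lower error at the same size.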

Findings

The experiments demonstrate the effectiveness of the multi‐objective approach to feature selection. The algorithm found parsimonious subsets of features and the classifiers produced a significant improvement in the accuracy. In addition, the maintenance of building blocks avoids the disruption of partial solutions, leading to a quick convergence.

Originality/value

The originality of this paper lies in the proposal of a novel algorithm for multi‐objective feature selection.

Details

International Journal of Intelligent Computing and Cybernetics, vol. 3 no. 2
Type: Research Article
ISSN: 1756-378X
