Search results

1 – 10 of 819
Article
Publication date: 19 December 2023

Jinchao Huang

Abstract

Purpose

Single-shot multi-category clothing recognition and retrieval play a crucial role in online searching and offline settlement scenarios. Existing clothing recognition methods based on RGBD clothing images often suffer from high-dimensional feature representations, leading to compromised performance and efficiency.

Design/methodology/approach

To address this issue, this paper proposes a novel method called Manifold Embedded Discriminative Feature Selection (MEDFS) to select global and local features, thereby reducing the dimensionality of the feature representation and improving performance. Specifically, by combining three global features and three local features, a low-dimensional embedding is constructed to capture the correlations between features and categories. The MEDFS method designs an optimization framework utilizing manifold mapping and sparse regularization to achieve feature selection. The optimization objective is solved using an alternating iterative strategy, ensuring convergence.

Findings

Empirical studies conducted on a publicly available RGBD clothing image dataset demonstrate that the proposed MEDFS method achieves highly competitive clothing classification performance while maintaining efficiency in clothing recognition and retrieval.

Originality/value

This paper introduces a novel approach for multi-category clothing recognition and retrieval, incorporating the selection of global and local features. The proposed method holds potential for practical applications in real-world clothing scenarios.

Details

International Journal of Intelligent Computing and Cybernetics, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 1756-378X

Article
Publication date: 12 June 2020

Sandeepkumar Hegde and Monica R. Mundada

Abstract

Purpose

According to the World Health Organization, by 2025, the contribution of chronic disease is expected to rise by 73% compared to all deaths, and it is considered a global burden of disease with a rate of 60%. These diseases persist for a long time, are almost incurable and can only be controlled. Cardiovascular disease, chronic kidney disease (CKD) and diabetes mellitus are considered the three major chronic diseases, and the risk they pose to adults increases with age. Among these, CKD is considered a major disease. Overall, 10% of the world's population is affected by CKD, and this figure is likely to double in 2030. The paper aims to propose a novel feature selection approach, in combination with a machine-learning algorithm, that can predict chronic disease early and with the utmost accuracy. Hence, a novel adaptive probabilistic divergence-based feature selection (APDFS) algorithm is proposed in combination with the hyper-parameterized logistic regression model (HLRM) for the early prediction of chronic disease.

Design/methodology/approach

A novel feature selection algorithm, APDFS, is proposed, which explicitly handles the features associated with the class label through relevance and redundancy analysis. The algorithm applies statistical divergence-based information theory to identify the relationship between the distant features of the chronic disease data set. The data set used in the experiments was obtained from several medical labs and hospitals in India. The HLRM is used as the machine-learning classifier. The predictive ability of the framework is compared with various algorithms and across various chronic disease data sets. The experimental results illustrate that the proposed framework is efficient and achieves competitive results compared to existing work in most cases.
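
The abstract does not say which statistical divergence APDFS uses, so the following is only a minimal sketch of the general idea of divergence-based relevance scoring, not the authors' algorithm. It assumes a discrete feature and a binary class label, and it uses a symmetrised Kullback-Leibler divergence between the two class-conditional distributions of the feature. The function names, the add-one smoothing and the toy data are all hypothetical.

```python
import math
from collections import Counter


def smoothed_distribution(values, support, alpha=1.0):
    """Empirical distribution over `support` with additive smoothing (assumed, avoids zeros)."""
    counts = Counter(values)
    total = len(values) + alpha * len(support)
    return {v: (counts[v] + alpha) / total for v in support}


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) in nats for two distributions on the same support."""
    return sum(p[v] * math.log(p[v] / q[v]) for v in p)


def divergence_relevance(feature_values, labels, positive):
    """Symmetrised KL divergence between the feature's distribution in each class."""
    support = sorted(set(feature_values))
    pos = [f for f, y in zip(feature_values, labels) if y == positive]
    neg = [f for f, y in zip(feature_values, labels) if y != positive]
    p = smoothed_distribution(pos, support)
    q = smoothed_distribution(neg, support)
    return kl_divergence(p, q) + kl_divergence(q, p)


# Toy data (hypothetical): one feature that tracks the label and one that does not.
labels = ["ckd", "ckd", "healthy", "healthy"]
informative = ["high", "high", "low", "low"]
uninformative = ["high", "low", "high", "low"]

score_informative = divergence_relevance(informative, labels, positive="ckd")
score_uninformative = divergence_relevance(uninformative, labels, positive="ckd")
```

A feature whose distribution differs sharply between classes scores high; one whose distribution is the same in both classes scores zero. A redundancy analysis could apply the same divergence between pairs of features, but that step is not sketched here.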

Findings

The performance of the proposed framework is validated using metrics such as recall, precision, F1 measure and ROC. Its predictive performance is analyzed on data sets for various chronic diseases such as CKD, diabetes and heart disease. The diagnostic ability of the proposed approach is demonstrated by comparing its results with those of existing algorithms. The experimental figures illustrate that the proposed framework performed exceptionally well in the early prediction of CKD, with an accuracy of 91.6.

Originality/value

The capability of machine learning algorithms depends on feature selection (FS) algorithms to identify the relevant traits in the data set, which affect the predictive result. FS is the process of choosing the relevant features from the data set by removing redundant and irrelevant ones. Although many approaches have already been proposed toward this objective, they are computationally complex because they follow a one-step scheme in selecting the features. In this paper, a novel feature selection algorithm, APDFS, is proposed, which explicitly handles the features associated with the class label through relevance and redundancy analysis. The proposed algorithm handles the process of feature selection in two separate indices. Hence, the computational complexity of the algorithm is reduced to O(nk+1). The algorithm applies statistical divergence-based information theory to identify the relationship between the distant features of the chronic disease data set. The data set used in the experiments was obtained from several medical labs and hospitals of Karkala taluk, India. The HLRM is used as the machine learning classifier. The predictive ability of the framework is compared with various algorithms and across various chronic disease data sets. The experimental results illustrate that the proposed framework is efficient and achieves competitive results compared to existing work in most cases.

Details

International Journal of Pervasive Computing and Communications, vol. 17 no. 1
Type: Research Article
ISSN: 1742-7371

Article
Publication date: 23 September 2020

Z.F. Zhang, Wei Liu, Egon Ostrosi, Yongjie Tian and Jianping Yi

Abstract

Purpose

During the production of steel strip, defects may appear on the surface, and traditional manual inspection cannot meet the requirements of low-cost and high-efficiency production. The purpose of this paper is to propose a feature selection method based on filter methods combined with a hidden Bayesian classifier to improve the efficiency of defect recognition and reduce the complexity of calculation. The method can select the optimal hybrid model for the accurate classification of steel strip surface defects.

Design/methodology/approach

A large image feature set was initially obtained based on the discrete wavelet transform feature extraction method. Three feature selection methods (including correlation-based feature selection, consistency subset evaluator [CSE] and information gain) were then used to optimize the feature space. Parameters for the feature selection methods were based on the classification accuracy results of hidden Naive Bayes (HNB) algorithm. The selected feature subset was then applied to the traditional NB classifier and leading extended NB classifiers.
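
Of the three filter methods named above, information gain is the simplest to illustrate. The following is a minimal sketch, assuming discrete (already binned) feature values; the feature names, labels and toy data are hypothetical and are not taken from the paper. It ranks features by how much knowing the feature reduces the entropy of the class label.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """Label entropy minus the weighted label entropy after splitting on a discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def rank_by_information_gain(columns, labels):
    """Return feature names sorted from most to least informative."""
    return sorted(columns, key=lambda name: information_gain(columns[name], labels), reverse=True)


# Toy data (hypothetical): two binned wavelet-style features and a defect label.
columns = {
    "energy_band": ["hi", "hi", "lo", "lo"],
    "contrast_band": ["hi", "lo", "hi", "lo"],
}
labels = ["scratch", "scratch", "pit", "pit"]

ranking = rank_by_information_gain(columns, labels)
```

Here `energy_band` separates the two classes perfectly and ranks first, while `contrast_band` carries no information about the label. In a filter pipeline, the top-ranked features would then be passed to the classifier.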

Findings

The experimental results demonstrated that the HNB model combined with feature selection approaches has better classification performance than other models of defect recognition. Among the models studied, the proposed CSE + HNB hybrid is the most robust and effective, with the highest classification accuracy, in identifying the optimal subset of the surface defect database.

Originality/value

The main contribution of this paper is the development of a hybrid model combining feature selection and multi-class classification algorithms for steel strip surface inspection. The proposed hybrid model is primarily robust and effective for steel strip surface inspection.

Details

Engineering Computations, vol. 38 no. 4
Type: Research Article
ISSN: 0264-4401

Article
Publication date: 26 February 2024

Chong Wu, Xiaofang Chen and Yongjie Jiang

Abstract

Purpose

While the Chinese securities market is booming, the phenomenon of listed companies falling into financial distress is also emerging, which affects the operation and development of enterprises and also jeopardizes the interests of investors. Therefore, it is important to understand how to accurately and reasonably predict the financial distress of enterprises.

Design/methodology/approach

In the present study, ensemble feature selection (EFS) and improved stacking were used for financial distress prediction (FDP). Mutual information, analysis of variance (ANOVA), random forest (RF), genetic algorithms, and recursive feature elimination (RFE) were chosen for EFS to select features. Since there may be missing information when feeding the results of the base learner directly into the meta-learner, the features with high importance were fed into the meta-learner together. A screening layer was added to select the meta-learner with better performance. Finally, Optima hyperparameters were used for parameter tuning by the learners.

Findings

An empirical study was conducted with a sample of A-share listed companies in China. The F1-score of the model constructed using the features screened by EFS reached 84.55%, representing an improvement of 4.37% compared to the original features. To verify the effectiveness of improved stacking, benchmark model comparison experiments were conducted. Compared to the original stacking model, the accuracy of the improved stacking model was improved by 0.44%, and the F1-score was improved by 0.51%. In addition, the improved stacking model had the highest area under the curve (AUC) value (0.905) among all the compared models.

Originality/value

Compared to previous models, the proposed FDP model has better performance, thus bridging the research gap of feature selection. The present study provides new ideas for stacking improvement research and a reference for subsequent research in this field.

Details

Kybernetes, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 0368-492X

Article
Publication date: 15 March 2013

Zied Kechaou, Ali Wali, Mohamed Ben Ammar, Hichem Karray and Adel M. Alimi

Abstract

Purpose

Despite the prevalence of diverse types of multimedia information, research on video news is still at an early stage. Improving the accessibility of video news seems worth investigating; therefore, the purpose of this paper is to present a new combination mode of video news text clustering and selection. This method is useful for sorting and classifying various types of news videos and media texts based on sentiment analysis.

Design/methodology/approach

A novel system is proposed, whereby video news items are identified and categorized as good or bad via the authors' suggested Hidden Markov Model (HMM) and Support Vector Machine (SVM) hybrid learning method. An exploratory video news sentiment analysis case study, conducted on various news databases, has shown that the feature‐selection‐combining method, encompassing Information Gain (IG), Mutual Information (MI) and the CHI‐statistic (CHI), yields the best classification, which highlights the value of the designed framework.

Findings

In fact, the system turns out to be applicable to several areas, especially video news, where annotation and personal perspectives affect the accuracy aspect.

Research limitations/implications

The present work shows the way for further research pertaining to the personal attitudes and the application of different linguistic techniques during the classification.

Originality/value

The results achieved are promising and encouraging, and they highlight the originality and efficiency of the authors' approach as an effective tool for providing easy access to video news and multi‐media texts.

Details

Journal of Systems and Information Technology, vol. 15 no. 1
Type: Research Article
ISSN: 1328-7265

Article
Publication date: 7 July 2020

Jiaming Liu, Liuan Wang, Linan Zhang, Zeming Zhang and Sicheng Zhang

Abstract

Purpose

The primary objective of this study was to recognize critical indicators in predicting blood glucose (BG) through data-driven methods and to compare the prediction performance of four tree-based ensemble models, i.e. bagging with tree regressors (bagging-decision tree [Bagging-DT]), AdaBoost with tree regressors (Adaboost-DT), random forest (RF) and gradient boosting decision tree (GBDT).

Design/methodology/approach

This study proposed a majority voting feature selection method that combines lasso regression with the Akaike information criterion (AIC) (LR-AIC), lasso regression with the Bayesian information criterion (BIC) (LR-BIC) and RF to select indicators with excellent predictive performance from an initial 38 indicators in 5,642 samples. The selected features were used to build the tree-based ensemble models. The 10-fold cross-validation (CV) method was used to evaluate the performance of each ensemble model.
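
The voting step itself can be sketched in a few lines. This is only an illustration of majority voting over selector outputs, assuming each of the three selectors (LR-AIC, LR-BIC and RF) has already produced a set of chosen indicators. The abstract does not give the voting threshold or the per-selector outputs, so the strict-majority default, the function name and the three example sets below are hypothetical.

```python
from collections import Counter


def majority_vote(selections, min_votes=None):
    """Keep features chosen by at least `min_votes` selectors (default: strict majority)."""
    if min_votes is None:
        min_votes = len(selections) // 2 + 1
    # Count each feature once per selector, even if a selector lists it twice.
    votes = Counter(feature for selected in selections for feature in set(selected))
    return sorted(feature for feature, n in votes.items() if n >= min_votes)


# Hypothetical selector outputs, invented for illustration only.
lr_aic = {"age", "CHC", "RBCVDW", "leucocyte_count"}
lr_bic = {"age", "CHC"}
rf = {"age", "RBCVDW", "red_blood_cell_volume"}

selected = majority_vote([lr_aic, lr_bic, rf])
```

With three selectors, the default threshold is two votes, so an indicator picked by only one selector is dropped. Raising `min_votes` to 3 would keep only the indicators on which all selectors agree.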

Findings

The results of feature selection indicated that age, corpuscular hemoglobin concentration (CHC), red blood cell volume distribution width (RBCVDW), red blood cell volume and leucocyte count are the five most important clinical/physical indicators in BG prediction. Furthermore, this study also found that the GBDT ensemble model combined with the proposed majority voting feature selection method is better than the other three models with respect to prediction performance and stability.

Practical implications

This study proposed a novel BG prediction framework for better predictive analytics in health care.

Social implications

This study incorporated medical background and machine learning technology to reduce diabetes morbidity and formulate precise medical schemes.

Originality/value

The majority voting feature selection method combined with the GBDT ensemble model provides an effective decision-making tool for predicting BG and detecting diabetes risk in advance.

Open Access
Article
Publication date: 28 July 2020

Noura AlNuaimi, Mohammad Mehedy Masud, Mohamed Adel Serhani and Nazar Zaki

Abstract

Organizations in many domains generate a considerable amount of heterogeneous data every day. Such data can be processed to enhance these organizations’ decisions in real time. However, storing and processing large and varied datasets (known as big data) is challenging to do in real time. In machine learning, streaming feature selection has always been considered a superior technique for selecting the relevant subset of features from high-dimensional data and thus reducing learning complexity. In the relevant literature, streaming feature selection refers to features that arrive consecutively over time; although the exact number of features is unknown, the number of instances is well established. Many scholars in the field have proposed streaming-feature-selection algorithms in attempts to find a proper solution to this problem. This paper presents an exhaustive and methodological introduction to these techniques. This study provides a review of the traditional feature-selection algorithms and then scrutinizes the current algorithms that use streaming feature selection to determine their strengths and weaknesses. The survey also sheds light on the ongoing challenges in big-data research.

Details

Applied Computing and Informatics, vol. 18 no. 1/2
Type: Research Article
ISSN: 2634-1964

Open Access
Article
Publication date: 11 April 2018

Mohamed A. Tawhid and Kevin B. Dsouza

Abstract

In this paper, we present a new hybrid binary version of the bat and enhanced particle swarm optimization algorithms in order to solve feature selection problems. The proposed algorithm is called Hybrid Binary Bat Enhanced Particle Swarm Optimization Algorithm (HBBEPSO). In the proposed HBBEPSO algorithm, we combine the bat algorithm, whose capacity for echolocation helps explore the feature space, with an enhanced version of particle swarm optimization, which is able to converge to the best global solution in the search space. In order to investigate the general performance of the proposed HBBEPSO algorithm, it is compared with the original optimizers and with other optimizers that have been used for feature selection in the past. A set of assessment indicators is used to evaluate and compare the different optimizers over 20 standard data sets obtained from the UCI repository. Results prove the ability of the proposed HBBEPSO algorithm to search the feature space for optimal feature combinations.

Details

Applied Computing and Informatics, vol. 16 no. 1/2
Type: Research Article
ISSN: 2634-1964

Article
Publication date: 23 August 2019

Janani Balakumar and S. Vijayarani Mohan

Abstract

Purpose

Owing to the huge volume of documents available on the internet, text classification becomes a necessary task to handle these documents. To achieve optimal text classification results, feature selection, an important stage, is used to curtail the dimensionality of text documents by choosing suitable features. The main purpose of this research work is to classify the personal computer documents based on their content.

Design/methodology/approach

This paper proposes a new algorithm for feature selection based on artificial bee colony (ABCFS) to enhance the text classification accuracy. The proposed ABCFS algorithm is evaluated on real and benchmark data sets and contrasted with existing feature selection approaches such as information gain and the χ2 statistic. To demonstrate the efficiency of the proposed algorithm, the support vector machine (SVM) and an improved SVM classifier are used in this paper.

Findings

The experiment was conducted on real and benchmark data sets. The real data set was collected in the form of documents that were stored in the personal computer, and the benchmark data set was collected from Reuters and 20 Newsgroups corpus. The results prove the performance of the proposed feature selection algorithm by enhancing the text document classification accuracy.

Originality/value

This paper proposes a new ABCFS algorithm for feature selection, evaluates the efficiency of the ABCFS algorithm and improves the support vector machine. In this paper, the ABCFS algorithm is used to select the features from text (unstructured) documents. Although there is no text feature selection algorithm in the existing work, the ABCFS algorithm is used to select the data (structured) features. The proposed algorithm will classify the documents automatically based on their content.

Article
Publication date: 4 May 2021

Nor Hamizah Miswan, Chee Seng Chan and Chong Guan Ng

Abstract

Purpose

This paper develops a robust hospital readmission prediction framework by combining a feature selection algorithm with machine learning (ML) classifiers. The improved feature selection is proposed by considering the uncertainty in patients' attributes that lead to the output variable.

Design/methodology/approach

First, data preprocessing is conducted, which covers how the raw data is managed. Second, the impactful features are selected through a feature selection process. It starts by calculating the relational grade of each patient towards readmission using grey relational analysis (GRA), and the grade is used as the target value for feature selection. Then, the influential features are selected using the Least Absolute Shrinkage and Selection Operator (LASSO) method. This proposed method is termed Grey-LASSO feature selection. The final task is the readmission prediction using ML classifiers.
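
The first step of the process above, computing a grey relational grade per patient, can be sketched as follows. This is a minimal illustration using the commonly cited form of the grey relational coefficient, and the abstract does not confirm that the authors use exactly this form. It assumes attribute values are already normalised to [0, 1], that the reference sequence is a hypothetical "ideal" profile, and that the distinguishing coefficient rho is 0.5. The function name and data are invented, and the LASSO step is not shown.

```python
def grey_relational_grades(reference, sequences, rho=0.5):
    """Grey relational grade of each sequence with respect to a reference sequence.

    Assumes all values are already normalised to a common scale.
    rho is the distinguishing coefficient (0.5 is a conventional choice).
    """
    # Absolute deviation of every sequence from the reference, attribute by attribute.
    deltas = [[abs(r - x) for r, x in zip(reference, seq)] for seq in sequences]
    flat = [d for row in deltas for d in row]
    d_min, d_max = min(flat), max(flat)
    if d_max == 0:
        # Every sequence equals the reference, so every grade is maximal.
        return [1.0] * len(sequences)
    # Grey relational coefficient per attribute, averaged into one grade per sequence.
    return [
        sum((d_min + rho * d_max) / (d + rho * d_max) for d in row) / len(row)
        for row in deltas
    ]


# Toy data (hypothetical): a reference profile and two normalised patient records.
reference = [1.0, 1.0, 1.0]
patients = [
    [1.0, 1.0, 1.0],  # identical to the reference
    [0.0, 0.5, 1.0],  # deviates on two of the three attributes
]

grades = grey_relational_grades(reference, patients)
```

A record identical to the reference gets a grade of 1.0, and grades fall as a record deviates from it. In a Grey-LASSO style pipeline, these grades would then serve as the regression target when LASSO selects the features.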

Findings

The proposed method offered good performance with a minimal feature subset, discarding up to 54–65% of the features. Multi-Layer Perceptron with Grey-LASSO gave the best performance.

Research limitations/implications

The performance of Grey-LASSO is justified in two readmission datasets. Further research is required to examine the generalisability to other datasets.

Originality/value

In designing the feature selection algorithm, the selection of influential input variables was based on the integration of GRA and LASSO. Specifically, GRA is a part of grey system theory and was employed to analyse the relation between systems under uncertain conditions. The LASSO approach was adopted for its ability to produce sparse data representations.

Details

Grey Systems: Theory and Application, vol. 11 no. 4
Type: Research Article
ISSN: 2043-9377
