Search results

1 – 5 of 5

Open Access

Article

Publication date: 28 July 2020

Streaming feature selection algorithms for big data: A survey

Noura AlNuaimi, Mohammad Mehedy Masud, Mohamed Adel Serhani and Nazar Zaki

Abstract

Organizations in many domains generate a considerable amount of heterogeneous data every day. Such data can be processed to enhance these organizations’ decisions in real time. However, storing and processing large and varied datasets (known as big data) in real time is challenging. In machine learning, streaming feature selection has always been considered a superior technique for selecting the relevant subset of features from high-dimensional data and thus reducing learning complexity. In the relevant literature, streaming feature selection refers to settings in which features arrive consecutively over time: the total number of features is not known in advance, whereas the number of instances is known. Many scholars in the field have proposed streaming-feature-selection algorithms in attempts to solve this problem. This paper presents an exhaustive and methodical introduction to these techniques. It reviews the traditional feature-selection algorithms and then scrutinizes the current streaming-feature-selection algorithms to determine their strengths and weaknesses. The survey also sheds light on the ongoing challenges in big-data research.
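The streaming setting described in this abstract can be illustrated with a minimal sketch. It is not one of the surveyed algorithms: it assumes numeric features, uses absolute Pearson correlation as a stand-in for both relevance and redundancy, and the names `stream_select`, `relevance_min` and `redundancy_max` are hypothetical.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation of two equal-length numeric sequences (0.0 if either is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)


def stream_select(feature_stream, labels, relevance_min=0.5, redundancy_max=0.9):
    """Process features one at a time; the instances (rows) are fixed, the feature count is not.

    Both thresholds are illustrative assumptions, not values from the survey.
    """
    selected = {}
    for name, values in feature_stream:
        # Discard a feature that is only weakly related to the label.
        if abs(pearson(values, labels)) < relevance_min:
            continue
        # Discard a feature that nearly duplicates one already kept.
        if any(abs(pearson(values, kept)) > redundancy_max for kept in selected.values()):
            continue
        selected[name] = values
    return list(selected)


labels = [0, 0, 1, 1, 0, 1]
stream = iter([
    ("f1", [1, 2, 8, 9, 1, 7]),      # tracks the label
    ("f2", [5, 3, 4, 5, 4, 4]),      # mostly noise
    ("f3", [2, 4, 16, 18, 2, 14]),   # exact multiple of f1, so redundant
])
print(stream_select(stream, labels))
```

Because each feature is judged when it arrives, the sketch never needs to know how many features the stream will eventually deliver.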

Details

Applied Computing and Informatics, vol. 18 no. 1/2

Type: Research Article

ISSN: 2634-1964

Open Access

Article

Publication date: 25 July 2022

Improving handwritten digit recognition using hybrid feature selection algorithm

Fung Yuen Chin, Kong Hoong Lem and Khye Mun Wong

Abstract

Purpose

The number of features in handwritten digit data is often very large because of the many variations in personal handwriting, leading to high-dimensional data. A feature selection algorithm therefore becomes crucial for successful classification modeling, because including irrelevant or redundant features can mislead the modeling algorithms, resulting in overfitting and a decrease in efficiency.

Design/methodology/approach

Minimum redundancy and maximum relevance (mRMR) and recursive feature elimination (RFE) are two frequently used feature selection algorithms. While mRMR is capable of identifying a subset of features that are highly relevant to the targeted classification variable, it still has the weakness of capturing some redundant features. RFE, on the other hand, can effectively eliminate the less important features and exclude redundant ones, but the features it selects are not ranked by importance.
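As an illustration of the relevance-versus-redundancy trade-off behind mRMR, the following is a minimal sketch of a greedy selection in its common "difference" form. It is not the authors' implementation: it assumes discrete features, and the function names and toy data are hypothetical.

```python
from collections import Counter
from math import log2


def mutual_info(x, y):
    """Mutual information (in bits) between two equal-length discrete sequences."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum((c / n) * log2(c * n / (px[a] * py[b])) for (a, b), c in pxy.items())


def mrmr(features, target, k):
    """Greedily pick k features, scoring each as relevance minus mean redundancy."""
    relevance = {name: mutual_info(col, target) for name, col in features.items()}
    selected = []
    while len(selected) < min(k, len(features)):
        def score(name):
            if not selected:
                return relevance[name]
            redundancy = sum(
                mutual_info(features[name], features[s]) for s in selected
            ) / len(selected)
            return relevance[name] - redundancy

        best = max((n for n in features if n not in selected), key=score)
        selected.append(best)
    return selected


target = [0, 0, 0, 0, 1, 1, 1, 1]
features = {
    "a":     [0, 0, 0, 0, 1, 1, 1, 0],  # highly relevant
    "a_dup": [0, 0, 0, 0, 1, 1, 1, 0],  # identical copy of "a"
    "b":     [0, 0, 1, 0, 1, 1, 0, 1],  # less relevant, but adds new information
}
print(mrmr(features, target, k=2))
```

In this toy data a ranking by relevance alone would pick "a" and its copy, whereas the redundancy penalty makes the second pick "b".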

Findings

The hybrid method was demonstrated on two binary classification tasks, between digits “4” and “9” and between digits “6” and “8”, from a multiple features dataset. The results showed that the hybrid of mRMR and support vector machine recursive feature elimination (SVMRFE) performs better than either the support vector machine (SVM) or mRMR used alone.

Originality/value

In view of the respective strengths and deficiencies of mRMR and RFE, this study combined the two methods and used an SVM as the underlying classifier, anticipating that mRMR would complement SVMRFE well.

Details

Applied Computing and Informatics, vol. ahead-of-print no. ahead-of-print

Type: Research Article

ISSN: 2634-1964

Open Access

Article

Publication date: 28 July 2020

Sport analytics for cricket game results using machine learning: An experimental study

Kumash Kapadia, Hussein Abdel-Jaber, Fadi Thabtah and Wael Hadi

Abstract

The Indian Premier League (IPL) is one of the more popular cricket tournaments in the world: its financial value is increasing each season, its viewership has increased markedly and the betting market for the IPL is growing significantly every year. With cricket being a very dynamic game, bettors and bookies are incentivised to bet on the match results because it is a game that changes ball-by-ball. This paper investigates machine learning technology to deal with the problem of predicting cricket match results based on historical match data of the IPL. Influential features of the dataset have been identified using filter-based methods including Correlation-based Feature Selection, Information Gain (IG), ReliefF and Wrapper. More importantly, machine learning techniques including Naïve Bayes, Random Forest, K-Nearest Neighbour (KNN) and Model Trees (classification via regression) have been adopted to generate predictive models from distinctive feature sets derived by the filter-based methods. Two feature subsets were formulated, one based on home team advantage and the other based on the toss decision. Selected machine learning techniques were applied to both feature sets to determine a predictive model. Experimental tests show that tree-based models, particularly Random Forest, performed better in terms of accuracy, precision and recall when compared with probabilistic and statistical models. However, on the toss feature subset, none of the considered machine learning algorithms produced accurate predictive models.

Details

Applied Computing and Informatics, vol. 18 no. 3/4

Type: Research Article

ISSN: 2634-1964

Open Access

Article

Publication date: 25 May 2023

Analysis of barriers of mHealth adoption in the context of sustainable operational practices in health care supply chains

Suchismita Swain, Kamalakanta Muduli, Anil Kumar and Sunil Luthra

Abstract

Purpose

The goal of this research is to analyse the obstacles to the implementation of mobile health (mHealth) in India and to gain an understanding of the contextual inter-relationships that exist amongst those obstacles.

Design/methodology/approach

Potential barriers and their interrelationships in their respective contexts have been uncovered. Using MICMAC analysis, the categorization of these barriers was done based on their degree of reliance and driving power (DP). Furthermore, an interpretive structural modeling (ISM) framework for the barriers to mHealth activities in India has been proposed.
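The MICMAC categorization step can be sketched as follows. This is a generic illustration under stated assumptions, not the study's data or procedure: the four barriers B1 to B4 and their influence matrix are hypothetical, driving power and dependence are taken as the row and column sums of the transitively closed reachability matrix, and the cut-off of half the number of factors is a common convention assumed here.

```python
def transitive_closure(matrix):
    """Reachability matrix: each factor reaches itself plus everything reachable via others."""
    n = len(matrix)
    reach = [row[:] for row in matrix]
    for i in range(n):
        reach[i][i] = 1
    for k in range(n):  # Warshall's algorithm
        for i in range(n):
            if reach[i][k]:
                for j in range(n):
                    if reach[k][j]:
                        reach[i][j] = 1
    return reach


def micmac(matrix, names):
    """Return {name: (driving_power, dependence, cluster)} for each factor."""
    reach = transitive_closure(matrix)
    n = len(names)
    cut = n / 2  # assumed cut-off between "low" and "high"
    result = {}
    for i, name in enumerate(names):
        driving = sum(reach[i])                          # row sum
        dependence = sum(reach[j][i] for j in range(n))  # column sum
        if driving > cut and dependence > cut:
            cluster = "linkage"
        elif driving > cut:
            cluster = "independent"
        elif dependence > cut:
            cluster = "dependent"
        else:
            cluster = "autonomous"
        result[name] = (driving, dependence, cluster)
    return result


# Hypothetical influences: B1 drives B2, B2 drives B3, B4 stands alone.
names = ["B1", "B2", "B3", "B4"]
influence = [
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
for name, row in micmac(influence, names).items():
    print(name, row)
```

Factors in the "independent" cluster have high driving power and low dependence, which is the profile the abstract associates with primary obstacles.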

Findings

The study explores a total of 15 factors that reduce the efficiency of mHealth adoption in India. The findings of the Matrix Cross-Reference Multiplication Applied to a Classification (MICMAC) investigation show that the economic situation of the government, concerns regarding the safety of intellectual technologies and privacy issues are the primary obstacles because of the significant driving power they have in mHealth applications.

Practical implications

Promoters of mHealth practices may be able to make better plans if they understand the social barriers and how they affect each other; this leads to easier adoption of these practices. The findings of this study might be helpful for governments of developing nations to produce standards relating to the deployment of mHealth; this will increase the efficiency with which it is adopted.

Originality/value

At this time, there is no comprehensive analysis of the factors that influence the adoption of mobile health care with social cognitive theory in developing nations like India. In addition, there is a lack of research in investigating how each of these elements affects the success of mHealth activities and how the others interact with them. Because developed nations learnt the value of mHealth practices during the recent pandemic, this study, by investigating the obstacles to the adoption of mHealth and their inter-relationships, makes an important addition to both theory and practice.

Details

International Journal of Industrial Engineering and Operations Management, vol. 6 no. 2

Type: Research Article

ISSN: 2690-6090

Content available

Article

Publication date: 30 December 2020

Mutual information analysis of the factors influencing port throughput

Majid Eskafi, Milad Kowsari, Ali Dastgheib, Gudmundur F. Ulfarsson, Poonam Taneja and Ragnheidur I. Thorarinsdottir

Abstract

Purpose

Port throughput analysis is a challenging task, as it consists of intertwined interactions between a variety of cargos and numerous influencing factors. This study aims to propose a quantitative method to facilitate port throughput analysis by identification of important cargos and key macroeconomic variables.

Design/methodology/approach

Mutual information is applied to measure the linear and nonlinear correlation among variables. The method gives a unique measure of dependence between two variables by quantifying the amount of information held in one variable through another variable.
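A small sketch can show why mutual information is described as covering nonlinear as well as linear dependence. It assumes discrete variables and uses the entropy identity I(X;Y) = H(X) + H(Y) - H(X,Y); the toy data are hypothetical and unrelated to the port dataset.

```python
from collections import Counter
from math import log2, sqrt


def entropy(values):
    """Shannon entropy (in bits) of a discrete sequence."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def mutual_information(x, y):
    """I(X;Y) = H(X) + H(Y) - H(X,Y): the information one variable carries about the other."""
    return entropy(x) + entropy(y) - entropy(list(zip(x, y)))


def pearson(a, b):
    """Pearson correlation, which measures linear association only."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / sqrt(var_a * var_b)


x = [-2, -1, 0, 1, 2]
y = [v * v for v in x]  # y is fully determined by x, but not linearly

print("Pearson correlation:", pearson(x, y))
print("Mutual information (bits):", mutual_information(x, y))
```

Here the linear correlation is zero even though y is a deterministic function of x, while the mutual information is clearly positive.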

Findings

This study applies mutual information to the Port of Isafjordur in Iceland to underpin the port throughput analysis. The results show that marine products are the main export cargo, whereas most imports are fuel oil, industrial materials and marine products. The aggregation of these cargos, handled in the port, meaningfully determines the non-containerized port throughput. The relation between non-containerized export and the national gross domestic product (GDP) is relatively high. However, non-containerized import is mostly related to the world GDP. The non-containerized throughput shows a strong relation to the national GDP. Furthermore, the results reveal that the volume of national export trade is the key macroeconomic variable influencing the containerized throughput.

Originality/value

Application of the mutual information in port throughput analysis effectively reduces epistemic uncertainty in the identification of important cargos and key influencing macroeconomic variables. Thus, it increases the reliability of the port throughput forecast.

Details

Maritime Business Review, vol. 6 no. 2

Type: Research Article

ISSN: 2397-3757
