Search results
Harleen Kaur and Vinita Kumari
Abstract
Diabetes is a major metabolic disorder that can adversely affect the entire body. Undiagnosed diabetes can increase the risk of cardiac stroke, diabetic nephropathy and other disorders. Millions of people around the world are affected by this disease. Early detection of diabetes is very important for maintaining a healthy life. The disease is a cause of global concern because cases are rising rapidly. Machine learning (ML) is a computational method that learns automatically from experience and improves its performance to make more accurate predictions. In the current research we applied machine learning techniques to the Pima Indian diabetes dataset to identify trends and detect patterns in risk factors using the R data manipulation tool. To classify patients as diabetic or non-diabetic, we developed and analyzed five predictive models in R. For this purpose we used the following supervised machine learning algorithms: linear kernel support vector machine (SVM-linear), radial basis function (RBF) kernel support vector machine, k-nearest neighbour (k-NN), artificial neural network (ANN) and multifactor dimensionality reduction (MDR).
M'hamed Bilal Abidine, Mourad Oussalah, Belkacem Fergani and Hakim Lounis
Abstract
Purpose
Mobile phone-based human activity recognition (HAR) consists of inferring user’s activity type from the analysis of the inertial mobile sensor data. This paper aims to mainly introduce a new classification approach called adaptive k-nearest neighbors (AKNN) for intelligent HAR using smartphone inertial sensors with a potential real-time implementation on smartphone platform.
Design/methodology/approach
The proposed method introduces several modifications to the AKNN baseline: it uses kernel discriminant analysis for feature reduction and hybridizes weighted support vector machines with KNN to handle class-imbalanced data sets.
Findings
Extensive experiments on five large-scale daily activity recognition data sets were performed to demonstrate the effectiveness of the method in terms of error rate, recall, precision, F1-score and computational/memory resources, with several comparisons against state-of-the-art methods and other hybridization modes. The results showed that the proposed method can achieve more than 50% improvement in error rate and up to 5.6% in F1-score. The training phase is also shown to be shortened by a factor of six compared to the baseline, which is a solid asset for smartphone implementation.
Practical implications
This work connects to the growing body of machine learning research on learning with small data sets. In addition, the availability of systems that can perform on-the-fly activity recognition on a smartphone will have a significant impact on pervasive health care, supporting a variety of practical applications such as elderly care, ambient assisted living and remote monitoring.
Originality/value
The purpose of this study is to build and test an accurate offline model using only a compact training data set, which can reduce the computational and memory complexity of the system. This provides grounds for developing new hybridization modes in the context of daily activity recognition and smartphone-based implementation. This study demonstrates that the new AKNN is able to classify the data without any training step, because it does not fit a model and only uses memory resources to store the corresponding support vectors.
Arunit Maity, P. Prakasam and Sarthak Bhargava
Abstract
Purpose
Due to the continuous and rapid evolution of telecommunication equipment, the demand for more efficient and noise-robust detection of dual-tone multi-frequency (DTMF) signals is most significant.
Design/methodology/approach
A novel machine learning-based approach is proposed that employs the k-nearest neighbour (KNN) algorithm to detect DTMF tones affected by noise, frequency and time variations. The features required for training the proposed KNN classifier are extracted using Goertzel's algorithm, which estimates the absolute discrete Fourier transform (DFT) coefficient values for the fundamental DTMF frequencies, with or without their second harmonic frequencies. The proposed KNN classifier model is configured in four ways, which differ in whether the model is trained with or without augmented data and with or without second harmonic frequency DFT coefficient values as features.
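The feature extraction step described above can be illustrated with a minimal sketch of the Goertzel recurrence. This is not the authors' code: the function names, the `with_second_harmonic` flag and the way the target frequency is rounded to the nearest DFT bin are assumptions made for illustration. The eight frequencies are the standard DTMF row and column tones.

```python
import math

# The eight fundamental DTMF frequencies in Hz (four row tones, four column tones).
DTMF_FREQS = [697, 770, 852, 941, 1209, 1336, 1477, 1633]


def goertzel_magnitude(samples, sample_rate, target_freq):
    """Return the absolute DFT coefficient for the bin nearest target_freq."""
    n = len(samples)
    k = round(n * target_freq / sample_rate)  # nearest DFT bin (assumption)
    coeff = 2.0 * math.cos(2.0 * math.pi * k / n)
    s_prev, s_prev2 = 0.0, 0.0
    for x in samples:
        # Second-order Goertzel recurrence.
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    power = s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2
    return math.sqrt(max(power, 0.0))


def dtmf_features(samples, sample_rate, with_second_harmonic=False):
    """Build a feature vector of 8 magnitudes, or 16 with second harmonics."""
    freqs = list(DTMF_FREQS)
    if with_second_harmonic:
        freqs += [2 * f for f in DTMF_FREQS]
    return [goertzel_magnitude(samples, sample_rate, f) for f in freqs]
```

A vector produced by `dtmf_features` could then be passed to any KNN classifier. The sample rate must exceed twice the highest frequency of interest, so including second harmonics needs a rate above about 6.5 kHz.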
Findings
The model that is trained on the augmented data set and additionally includes, as features, the absolute DFT values of the second harmonics of the eight fundamental DTMF frequencies achieved the best performance, with a macro classification F1 score of 0.980835, a five-fold stratified cross-validation accuracy of 98.47% and a test data set detection accuracy of 98.1053%.
Originality/value
The generated DTMF signal has been classified and detected using the proposed KNN classifier which utilizes the DFT coefficient along with second harmonic frequencies for better classification. Additionally, the proposed KNN classifier has been compared with existing models to ascertain its superiority and proclaim its state-of-the-art performance.
Sudhaman Parthasarathy and S.T. Padmapriya
Abstract
Purpose
Algorithm bias refers to repetitive computer program errors that give some users more weight than others. The aim of this article is to provide deeper insight into algorithm bias in AI-enabled ERP software customization. Although algorithmic bias in machine learning models has uneven, unfair and unjust impacts, research on it is mostly anecdotal and scattered.
Design/methodology/approach
Guided by previous research (Akter et al., 2022), this study presents the possible design bias (model, data and method) one may experience with an enterprise resource planning (ERP) software customization algorithm. This study then presents the artificial intelligence (AI) version of the ERP customization algorithm, which uses the k-nearest neighbours algorithm.
Findings
This study illustrates the possible bias when the prioritized requirements customization estimation (PRCE) algorithm available in the ERP literature is executed without any AI. The authors then present their newly developed AI version of the PRCE algorithm, which uses ML techniques, and discuss the accompanying algorithmic bias with an illustration. The authors also draw a roadmap for managing algorithmic bias during ERP customization in practice.
Originality/value
To the best of the authors’ knowledge, no prior research has attempted to understand the algorithmic bias that occurs during the execution of the ERP customization algorithm (with or without AI).
Abstract
Purpose
The financial health of a corporation is a great concern for investors at every level and for decision-makers. For many years, financial solvency prediction has been a significant issue throughout academia, particularly in finance. This motivates the study to examine whether machine learning can be applied to financial solvency prediction.
Design/methodology/approach
This study analyzed 244 companies publicly listed on the Dhaka Stock Exchange over the 2015–2019 period, and split the data into training and testing subsets. For machine learning model building, samples are classified as secure, healthy or insolvent by the Altman Z-score. R statistical software is used to build predictive models with five classifiers, and all models are evaluated with performance metrics such as logarithmic loss (logLoss), area under the curve (AUC), precision-recall AUC (prAUC), accuracy, kappa, sensitivity and specificity.
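Of the metrics listed above, logLoss is perhaps the least familiar, so a minimal sketch of the multi-class version may help. The study itself used R. The Python function below, its name and the dictionary-based probability format are assumptions for illustration only. The class names are the three labels mentioned in the abstract.

```python
import math


def log_loss(true_labels, predicted_probs, eps=1e-15):
    """Mean negative log-probability assigned to the true class.

    true_labels: list of class names, e.g. "secure", "healthy", "insolvent".
    predicted_probs: list of dicts mapping each class name to a probability.
    """
    total = 0.0
    for label, probs in zip(true_labels, predicted_probs):
        # Clip to avoid log(0) when a model is overconfident and wrong.
        p = min(max(probs[label], eps), 1.0 - eps)
        total -= math.log(p)
    return total / len(true_labels)
```

Lower values are better. A classifier that spreads probability evenly over three classes scores ln 3, roughly 1.10, so any useful three-class model should fall below that.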
Findings
This study found that the artificial neural network classifier achieves 88% accuracy and sensitivity, and an AUC of 96%. However, the ensemble classifier outperforms all other models on logLoss and other metrics.
Research limitations/implications
The main results of this study can be applied by financial institutions for credit scoring, credit rating, loan classification and similar tasks. Other companies can integrate machine learning models into their enterprise resource planning software to track their financial solvency.
Practical implications
Finally, a predictive application is developed through training a model with 1,200 observations and making it available for all rational and novice investors (Abdullah, 2020).
Originality/value
To the best of the author's knowledge, no prior machine learning study of financial solvency in Bangladesh has examined a data set of comparable size with all these models.
Hongyuan Wang and Jingcheng Wang
Abstract
Purpose
This paper aims to design an optimization control for a tunnel boring machine (TBM) based on geological identification. Unknown geological conditions must be identified before further optimization. To fully account for multiple crucial performance measures of the TBM, the authors formulate an optimization problem that lets the TBM adapt to varying geology. That is, the TBM can operate optimally under the corresponding geology, which is called geology-adaptability.
Design/methodology/approach
This paper adopted a modified k-nearest neighbor (KNN) algorithm to identify geological conditions. The modification adjusts the weights in the voting procedure and the similarity distance measurement, which makes the method suitable for engineering use and enhances prediction accuracy. The authors also define several key performance measures of the TBM during operation and build a multi-objective function. The multi-objective function is then transformed into a single-objective function by weighted combination. Finally, the reformulated optimization problem is solved by a genetic algorithm.
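The abstract does not give the exact weighting scheme, so the following is only a sketch of one common way to modify both the distance measure and the vote in KNN. It assumes per-feature weights in a Euclidean distance and inverse-distance vote weights. The function names, parameters and the `eps` smoothing term are hypothetical.

```python
import math
from collections import defaultdict


def weighted_distance(a, b, feature_weights):
    """Euclidean distance with a weight per feature (assumed form)."""
    return math.sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, feature_weights)))


def weighted_knn_predict(train_points, train_labels, query, k=3,
                         feature_weights=None, eps=1e-9):
    """Predict a label by inverse-distance-weighted voting among k neighbours."""
    if feature_weights is None:
        feature_weights = [1.0] * len(query)
    neighbours = sorted(
        (weighted_distance(p, query, feature_weights), label)
        for p, label in zip(train_points, train_labels)
    )[:k]
    votes = defaultdict(float)
    for dist, label in neighbours:
        # Closer neighbours contribute more to the vote.
        votes[label] += 1.0 / (dist + eps)
    return max(votes, key=votes.get)
```

With inverse-distance weights, a single very close neighbour can outvote two distant ones, which a plain majority vote would not allow.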
Findings
This paper provides support for decision-making in TBM control. With the proposed optimization control, the advance speed of the TBM is enhanced dramatically in each geological condition compared with the results before optimization. Meanwhile, the other performance measures remain acceptable, and the method is verified with in situ data.
Originality/value
This paper realizes an optimization control of the TBM that considers several key performance measures during excavation. The optimization is conducted under different geological conditions so that the TBM has geology-adaptability.
Kumash Kapadia, Hussein Abdel-Jaber, Fadi Thabtah and Wael Hadi
Abstract
The Indian Premier League (IPL) is one of the world's more popular cricket tournaments: its finances are growing each season, its viewership has increased markedly and the betting market for the IPL is growing significantly every year. Because cricket is a very dynamic game that changes ball by ball, bettors and bookies are incentivised to bet on match results. This paper investigates machine learning technology for predicting cricket match results from historical IPL match data. Influential features of the dataset have been identified using filter-based methods including Correlation-based Feature Selection, Information Gain (IG), ReliefF and Wrapper. More importantly, machine learning techniques including Naïve Bayes, Random Forest, K-Nearest Neighbour (KNN) and Model Trees (classification via regression) have been adopted to generate predictive models from distinctive feature sets derived by the filter-based methods. Two feature subsets were formulated, one based on home team advantage and the other on the toss decision. The selected machine learning techniques were applied to both feature sets to determine a predictive model. Experimental tests show that tree-based models, particularly Random Forest, performed better in terms of accuracy, precision and recall than probabilistic and statistical models. However, on the toss feature subset, none of the considered machine learning algorithms produced accurate predictive models.
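As an illustration of one of the filter methods named above, the sketch below computes Information Gain for a single categorical feature against the match outcome. It is a generic textbook formulation, not the authors' implementation, and the function names are chosen here for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """Reduction in label entropy obtained by splitting on a categorical feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # Entropy remaining after the split, weighted by group size.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional
```

A feature that separates wins from losses perfectly on a balanced sample scores 1 bit, while a feature unrelated to the outcome scores 0. Ranking features by this score is how an IG filter selects the influential ones.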
Rahila Umer, Teo Susnjak, Anuradha Mathrani and Suriadi Suriadi
Abstract
Purpose
The purpose of this paper is to propose a process mining approach to help in making early predictions to improve students’ learning experience in massive open online courses (MOOCs). It investigates the impact of various machine learning techniques in combination with process mining features to measure effectiveness of these techniques.
Design/methodology/approach
Student data (e.g. assessment grades, demographic information) and weekly interaction data based on event logs (e.g. video lecture interaction, solution submission time, time spent weekly) have guided this design. This study evaluates four machine learning classification techniques used in the literature (logistic regression (LR), Naïve Bayes (NB), random forest (RF) and K-nearest neighbor) to monitor the weekly progression of students’ performance and to predict their overall performance outcome. Two data sets have been used: one with traditional features and a second with features obtained from process conformance testing.
Findings
The results show that the techniques used in the study are able to predict student performance. The overall accuracy (F1-score, area under curve) of machine learning techniques can be improved by integrating process mining features with standard features. Specifically, the LR and NB classifiers outperform the other techniques in a statistically significant way.
Practical implications
Although MOOCs provide a platform for learning in highly scalable and flexible manner, they are prone to early dropout and low completion rate. This study outlines a data-driven approach to improve students’ learning experience and decrease the dropout rate.
Social implications
Early predictions based on individual’s participation can help educators provide support to students who are struggling in the course.
Originality/value
This study outlines the innovative use of process mining techniques in education data mining to help educators gather data-driven insight on student performances in the enrolled courses.
Abstract
Purpose
This study focuses on the classification of targets with varying shapes using radar cross section (RCS), which is influenced by the target’s shape. This study aims to develop a robust classification method by considering an incident angle with minor random fluctuations and using a physical optics simulation to generate data sets.
Design/methodology/approach
The approach involves several supervised machine learning and classification methods, including traditional algorithms and a deep neural network classifier. It uses histogram-based definitions of the RCS for feature extraction, with an emphasis on resilience against noise in the RCS data. Data enrichment techniques are incorporated, including the use of noise-impacted histogram data sets.
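The abstract does not specify how the RCS histograms are defined, so the sketch below shows only one plausible reading. It assumes a normalized fixed-bin histogram of RCS samples used as a feature vector. The function name, the bin count and the optional `value_range` argument are all assumptions made for illustration.

```python
def histogram_features(rcs_values, num_bins=8, value_range=None):
    """Return a normalized histogram of RCS samples as a feature vector."""
    if value_range is None:
        lo, hi = min(rcs_values), max(rcs_values)
    else:
        lo, hi = value_range
    if hi <= lo:
        raise ValueError("value_range must have hi > lo")
    width = (hi - lo) / num_bins
    counts = [0] * num_bins
    for v in rcs_values:
        if lo <= v <= hi:
            # Values equal to hi fall into the last bin.
            idx = min(int((v - lo) / width), num_bins - 1)
            counts[idx] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * num_bins
```

Because the histogram discards the order of the samples, small random fluctuations in the incident angle or additive noise shift only a few samples between neighbouring bins. That is one reason such features can be resilient to noise.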
Findings
The classification algorithms are extensively evaluated, highlighting their efficacy in feature extraction from RCS histograms. Among the studied algorithms, the K-nearest neighbour is found to be the most accurate of the traditional methods, but it is surpassed in accuracy by a deep learning network classifier. The results demonstrate the robustness of the feature extraction from the RCS histograms, motivated by mm-wave radar applications.
Originality/value
This study presents a novel approach to target classification that extends beyond traditional methods by integrating deep neural networks and focusing on histogram-based methodologies. It also incorporates data enrichment techniques to enhance the analysis, providing a comprehensive perspective for target detection using RCS.
Abstract
Purpose
In this research, the authors demonstrate the advantage of reinforcement learning (RL) based intrusion detection systems (IDS) for solving very complex problems (e.g. selecting input features, considering scarce resources and constraints) that cannot be solved by classical machine learning. The authors include a comparative study of intrusion detection built on statistical machine learning and representational learning, using knowledge discovery in databases (KDD) Cup99 and Installation Support Center of Expertise (ISCX) 2012.
Design/methodology/approach
The methodology applies a data analytics approach, consisting of data exploration and machine learning model training and evaluation. To build a network-based intrusion detection system, the authors apply a dueling double deep Q-networks architecture enabled with costly features, k-nearest neighbors (K-NN), support-vector machines (SVM) and convolutional neural networks (CNN).
Findings
Machine learning-based intrusion detection systems are trained on historical datasets, which leads to model drift and a lack of generalization, whereas RL is trained with data collected through interactions. RL is bound to learn from its interactions with a stochastic environment in the absence of a training dataset, whereas supervised learning simply learns from collected data and requires fewer computational resources.
Research limitations/implications
All machine learning models achieved high accuracy and performance. One potential reason is that both datasets are simulated and not realistic. It was not clear whether any validation was ever performed to show that the data were collected from real network traffic.
Practical implications
The study provides guidelines to implement IDS with classical supervised learning, deep learning and RL.
Originality/value
The research applied the dueling double deep Q-networks architecture enabled with costly features to build network-based intrusion detection from network traffic. This research presents a comparative study of reinforcement learning-based intrusion detection with counterparts built with statistical and representational machine learning.