Search results

1 – 10 of 49
Book part
Publication date: 1 September 2021

Son Nguyen, Phyllis Schumacher, Alan Olinsky and John Quinn

Abstract

We study the performance of various predictive models, including decision trees, random forests, neural networks, and linear discriminant analysis, on an imbalanced data set of home loan applications. During the process, we propose an undersampling algorithm to cope with the issues created by the imbalance of the data. Our technique is shown to be competitive with popular resampling techniques such as random oversampling, undersampling, synthetic minority oversampling technique (SMOTE), and random oversampling examples (ROSE). We also investigate the relation between the true positive rate, true negative rate, and the imbalance of the data.
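As a point of reference for the resampling baselines named in this abstract, the following is a minimal sketch of plain random undersampling. It is not the authors' proposed algorithm, which the abstract does not describe; the function name `random_undersample` and the toy data are hypothetical.

```python
import random
from collections import Counter


def random_undersample(rows, labels, seed=0):
    """Randomly drop rows until every class is as small as the rarest class."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    n_min = min(len(group) for group in by_class.values())
    out_rows, out_labels = [], []
    for label, group in by_class.items():
        # Sample without replacement; the rarest class is kept in full.
        for row in rng.sample(group, n_min):
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


# Hypothetical toy data: 95 majority-class and 5 minority-class applications.
rows = [[i] for i in range(100)]
labels = [0] * 95 + [1] * 5
balanced_rows, balanced_labels = random_undersample(rows, labels)
print(Counter(balanced_labels))
```

The trade-off of this baseline is that it discards most majority-class rows, which is one reason alternative resampling schemes are studied.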

Book part
Publication date: 26 October 2017

Son Nguyen, John Quinn and Alan Olinsky

Abstract

We propose an oversampling technique to increase the true positive rate (sensitivity) in classifying imbalanced datasets (i.e., those with a value for the target variable that occurs with a small frequency) and hence boost the overall performance measurements such as balanced accuracy, G-mean and area under the receiver operating characteristic (ROC) curve, AUC. This oversampling method is based on the idea of applying the Synthetic Minority Oversampling Technique (SMOTE) on only a selective portion of the dataset instead of the entire dataset. We demonstrate the effectiveness of our oversampling method with four real and simulated datasets generated from three models.
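The performance measures named in this abstract can be computed directly from a confusion matrix. The sketch below uses the standard definitions of sensitivity, specificity, balanced accuracy, and G-mean; the function name `imbalance_metrics` and the counts are hypothetical and are not taken from the chapter.

```python
import math


def imbalance_metrics(tp, fn, tn, fp):
    """Return class-wise rates and two summary scores for a binary classifier."""
    tpr = tp / (tp + fn)  # true positive rate (sensitivity)
    tnr = tn / (tn + fp)  # true negative rate (specificity)
    return {
        "tpr": tpr,
        "tnr": tnr,
        "balanced_accuracy": (tpr + tnr) / 2,  # arithmetic mean of the two rates
        "g_mean": math.sqrt(tpr * tnr),        # geometric mean of the two rates
    }


# Hypothetical counts: 10 minority-class cases and 90 majority-class cases.
metrics = imbalance_metrics(tp=8, fn=2, tn=81, fp=9)
print(metrics)
```

Both summary scores weight the two classes equally, so raising the true positive rate on the rare class raises them even when the rare class is a small share of the data.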

Details

Advances in Business and Management Forecasting
Type: Book
ISBN: 978-1-78743-069-3

Book part
Publication date: 6 September 2019

Son Nguyen, Edward Golas, William Zywiak and Kristin Kennedy

Abstract

Bankruptcy prediction has attracted a great deal of research in the data mining/machine learning community, due to its significance in the world of accounting, finance, and investment. This chapter examines the influence of different dimension reduction techniques on a decision tree model applied to the bankruptcy prediction problem. The studied techniques are principal component analysis (PCA), sliced inverse regression (SIR), sliced average variance estimation (SAVE), and factor analysis (FA). To focus on the impact of the dimension reduction techniques, we chose to use only a decision tree as our predictive model and “undersampling” as the solution to the issue of data imbalance. Our computation shows that the choice of dimension reduction technique greatly affects the performance of predictive models and that one could use dimension reduction techniques to improve the predictive power of the decision tree model. Also, in this study, we propose a method to estimate the true dimension of the data.

Details

Advances in Business and Management Forecasting
Type: Book
ISBN: 978-1-78754-290-7

Content available
Book part
Publication date: 6 September 2019

Details

Advances in Business and Management Forecasting
Type: Book
ISBN: 978-1-78754-290-7

Content available
Book part
Publication date: 1 September 2021

Details

Advances in Business and Management Forecasting
Type: Book
ISBN: 978-1-83982-091-5

Content available
Book part
Publication date: 23 August 2018

Details

Voluntary and Involuntary Childlessness
Type: Book
ISBN: 978-1-78754-362-1

Book part
Publication date: 1 September 2021

Alicia T. Lamere, Son Nguyen, Gao Niu, Alan Olinsky and John Quinn

Abstract

Predicting a patient's length of stay (LOS) in a hospital setting has been widely researched. Accurately predicting an individual's LOS can have a significant impact on a healthcare provider's ability to care for individuals by allowing them to properly prepare and manage resources. A hospital's productivity requires a delicate balance of maintaining enough staffing and resources without being overly equipped or wasteful. This has become even more important in light of the current COVID-19 pandemic, during which emergency departments around the globe have been inundated with patients and are struggling to manage their resources.

In this study, the authors focus on the prediction of LOS at the time of admission in emergency departments at Rhode Island hospitals, using discharge data obtained from the Rhode Island Department of Health for 2012 and 2013. This work also explores the distribution of discharge dispositions in an effort to better characterize the resources patients require upon leaving the emergency department.

Book part
Publication date: 22 July 2021

Chien-Hung Chang

Abstract

This chapter introduces a risk control framework for credit card fraud rather than a purely binary classifier model. An anomaly detection approach is adopted to identify fraud events as outliers in the reconstruction error of a trained autoencoder (AE). The trained AE shows fitness and robustness on normal transactions and heterogeneous behavior on fraud activities. The cost of false-positive normal transactions is controlled, and the loss from false-negative frauds can be evaluated, using thresholds taken from percentiles of the trained AE's reconstruction error on normal transactions. To align the risk assessment with the economic and financial situation, the risk manager can adjust the threshold to meet risk control requirements. With the 95th percentile as the threshold, the false positive rate is controlled at 5% and the true positive rate is 86%. With the 99th percentile as the threshold, the false positive rate is around 1% and the true positive rate is 83%. These false positive and true positive rates are competitive with those of other supervised learning algorithms.
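The percentile thresholding described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the reconstruction errors are simulated from hypothetical normal distributions rather than produced by a trained autoencoder, and the `percentile` helper is an illustrative linear-interpolation implementation, not the chapter's code.

```python
import random


def percentile(values, q):
    """Linear-interpolation percentile of values, with q in [0, 100]."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


rng = random.Random(0)
# Hypothetical reconstruction errors; a real system would compute these
# from a trained autoencoder applied to each transaction.
normal_errors = [rng.gauss(1.0, 0.2) for _ in range(10_000)]
fraud_errors = [rng.gauss(2.5, 0.8) for _ in range(200)]

# The threshold is set only from normal transactions, which fixes the
# false positive rate by construction.
threshold = percentile(normal_errors, 95)
false_positive_rate = sum(e > threshold for e in normal_errors) / len(normal_errors)
true_positive_rate = sum(e > threshold for e in fraud_errors) / len(fraud_errors)
print(threshold, false_positive_rate, true_positive_rate)
```

Raising the percentile lowers the false positive rate at the cost of missing more fraud, which is the adjustment the abstract leaves to the risk manager.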

Details

Advances in Pacific Basin Business, Economics and Finance
Type: Book
ISBN: 978-1-80043-870-5

Book part
Publication date: 25 November 2003

Donna J Brogan

Abstract

Health care insurance companies often conduct sample surveys of health plan members. Survey purposes include: consumer satisfaction with the plan and members’ health status, functional status, health literacy and/or health services utilization outside of the plan. Vendors or contractors typically conduct these surveys for insurers. Survey results may be used for plans’ accreditation, evaluation, quality improvement and/or marketing. This article describes typical sampling plans and data analysis strategies used in these surveys, showing how these methods may result in biased estimators of population parameters (e.g. percentage of plan members who are satisfied). Practical suggestions are given to improve these surveys: alternate sampling plans, increasing the response rate, component calculation for the survey response rate, weighted analyses, and adjustments for unit non-response. Since policy, regulation, accreditation, management and marketing decisions are based, in part, on results from these member surveys, these important and numerous surveys need to be of higher quality.
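To illustrate why weighted analyses and adjustments for unit non-response matter, the sketch below applies a standard weighting-class adjustment to a stratified sample. It is not the article's own procedure; the function name `weighted_satisfaction`, the strata, and all counts are hypothetical.

```python
def weighted_satisfaction(strata):
    """Estimate the share of satisfied members using adjusted survey weights."""
    numerator = 0.0
    denominator = 0.0
    for s in strata:
        base_weight = s["population"] / s["sampled"]      # inverse sampling fraction
        nonresponse_adj = s["sampled"] / s["responded"]   # inverse response rate
        weight = base_weight * nonresponse_adj
        numerator += weight * s["satisfied"]
        denominator += weight * s["responded"]
    return numerator / denominator


# Hypothetical plan with two strata that differ in size and response rate.
strata = [
    {"population": 8000, "sampled": 400, "responded": 200, "satisfied": 160},
    {"population": 2000, "sampled": 400, "responded": 100, "satisfied": 50},
]
unweighted = sum(s["satisfied"] for s in strata) / sum(s["responded"] for s in strata)
weighted = weighted_satisfaction(strata)
print(unweighted, weighted)
```

In this toy case the unweighted estimate understates satisfaction because the smaller, less satisfied stratum is over-represented among respondents relative to its share of the plan.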

Details

Reorganizing Health Care Delivery Systems: Problems of Managed
Type: Book
ISBN: 978-1-84950-247-4

Book part
Publication date: 18 July 2022

Yakub Kayode Saheed, Usman Ahmad Baba and Mustafa Ayobami Raji

Abstract

Purpose: This chapter aims to examine machine learning (ML) models for predicting credit card fraud (CCF).

Need for the study: With the advance of technology, the world is increasingly relying on credit cards rather than cash in daily life. This creates a slew of new opportunities for fraudulent individuals to abuse these cards. As of December 2020, global card losses reached $28.65 billion, up 2.9% from $27.85 billion in 2018, according to the Nilson 2019 research. To safeguard credit card users, the credit card issuer should include a service that protects customers from potential risks. CCF has become a severe threat as internet buying has grown. To this end, various studies in the field of automatic and real-time fraud detection are required. The most recent ones employ a variety of ML algorithms and techniques, chosen for their advantageous properties, to construct a well-fitting model to detect fraudulent transactions. When it comes to recognising credit card risk in huge and high-dimensional data, feature selection (FS) is critical for improving classification accuracy and fraud detection.

Methodology/design/approach: The objectives of this chapter are to construct a new model for credit card fraud detection (CCFD) based on principal component analysis (PCA) for FS and using supervised ML techniques such as K-nearest neighbour (KNN), ridge classifier, gradient boosting, quadratic discriminant analysis, AdaBoost, and random forest for classification of fraudulent and legitimate transactions. When compared to earlier experiments, the suggested approach demonstrates a high capacity for detecting fraudulent transactions. To be more precise, our model’s resilience is constructed by integrating the power of PCA for determining the most useful predictive features. The experimental analysis was performed on German credit card and Taiwan credit card data sets.

Findings: The experimental findings revealed that KNN was the best performing model on the German data set, with an accuracy of 96.29%, recall of 100%, and precision of 96.29%. The ridge classifier was the best performing model on the Taiwan credit data, with an accuracy of 81.75%, recall of 34.89%, and precision of 66.61%.

Practical implications: The poor performance of the models on the Taiwan data revealed that it is an imbalanced credit card data set. The comparison of our proposed models with state-of-the-art credit card ML models showed that our results were competitive.
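The accuracy, recall, and precision figures reported above follow the standard confusion-matrix definitions. The sketch below shows how accuracy can stay high while recall is low on an imbalanced data set; the function name `classification_report` and the counts are hypothetical and are not the chapter's data.

```python
def classification_report(tp, fn, fp, tn):
    """Compute accuracy, recall, and precision from confusion-matrix counts."""
    total = tp + fn + fp + tn
    return {
        "accuracy": (tp + tn) / total,  # share of all cases classified correctly
        "recall": tp / (tp + fn),       # share of actual positives that were found
        "precision": tp / (tp + fp),    # share of predicted positives that were right
    }


# Hypothetical imbalanced case: 100 positive cases among 1,000 records.
report = classification_report(tp=30, fn=70, fp=20, tn=880)
print(report)
```

Because the negative class dominates, correctly labelling most negatives is enough to push accuracy up even though most positives are missed.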
