Search results

Abstract

Across disciplines, researchers and practitioners employ decision tree ensembles such as random forests and XGBoost with great success. What explains their popularity? This chapter showcases how marketing scholars and decision-makers can harness the power of decision tree ensembles for academic and practical applications. The author discusses the origin of decision tree ensembles, explains their theoretical underpinnings, and illustrates them empirically using a real-world telemarketing case, with the objective of predicting customer conversions. Readers unfamiliar with decision tree ensembles will learn to appreciate them for their versatility, competitive accuracy, ease of application, and computational efficiency, and will gain a comprehensive understanding of why decision tree ensembles belong in every data scientist's methodological toolbox.

Details

The Machine Age of Customer Insight

Type: Book

ISBN: 978-1-83909-697-6

Book part

Publication date: 6 September 2019

Detecting Non-injured Passengers and Drivers in Car Accidents: A New Under-resampling Method for Imbalanced Classification

Son Nguyen, Gao Niu, John Quinn, Alan Olinsky, Jonathan Ormsbee, Richard M. Smith and James Bishop

Abstract

In recent years, the problem of classification with imbalanced data has attracted growing attention in the data-mining and machine-learning communities due to the emergence of an abundance of imbalanced data in many fields. In this chapter, we compare the performance of six classification methods on an imbalanced dataset under the influence of four resampling techniques. These classification methods are the random forest, the support vector machine, logistic regression, k-nearest neighbor (KNN), the decision tree, and AdaBoost. Our study shows that all of the classification methods have difficulty when working with the imbalanced data, with the KNN performing the worst, detecting only 27.4% of the minority class. However, with the help of resampling techniques, all of the classification methods improve in overall performance. In particular, the random forest, in combination with the random over-sampling technique, performs the best, achieving 82.8% balanced accuracy (the average of the true-positive rate and true-negative rate).

We then propose a new procedure to resample the data. Our method is based on the idea of eliminating “easy” majority observations before under-sampling them. It further improves the balanced accuracy of the random forest to 83.7%, making it the best approach for the imbalanced data.
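
The balanced accuracy quoted in this abstract is the average of the true-positive rate and the true-negative rate. A minimal sketch of that calculation follows. The function name `balanced_accuracy` and the toy labels are illustrative assumptions, not code or data from the chapter.

```python
def balanced_accuracy(y_true, y_pred, positive=1):
    """Average of the true-positive rate and the true-negative rate.

    Assumes both classes occur in y_true (otherwise a rate is undefined).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    tpr = tp / (tp + fn)  # share of minority (positive) cases detected
    tnr = tn / (tn + fp)  # share of majority (negative) cases detected
    return (tpr + tnr) / 2


# Hypothetical toy labels: 2 positives (minority), 8 negatives (majority).
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]

score = balanced_accuracy(y_true, y_pred)
plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
```

On these toy labels plain accuracy is 0.8 while balanced accuracy is 0.6875, because only half of the minority class is detected. This gap is the reason balanced accuracy is often preferred when one class dominates.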

Details

Advances in Business and Management Forecasting

Type: Book

ISBN: 978-1-78754-290-7

Book part

Publication date: 30 September 2020

Use of Classification Algorithms in Health Care

Hera Khan, Ayush Srivastav and Amit Kumar Mishra

Abstract

This chapter provides a detailed description of the classification algorithms that have been widely used in medical science. It lays the foundation with a comprehensive overview of the background and history of classification algorithms, followed by an extensive discussion of classification techniques in machine learning (ML), and concludes with their applications to data analysis in medical science and health care. The chapter begins with the fundamentals required to understand classification techniques in ML, including the differences between unsupervised and supervised learning, the basic terminology of classification, and its history. It then covers the types of classification algorithms, ranging from linear classifiers such as logistic regression and naïve Bayes to nearest neighbour, support vector machines, tree-based classifiers, and neural networks, along with their respective mathematics. Ensemble algorithms such as majority voting, boosting, bagging, and stacking are also discussed at length, along with their applications. The chapter further explains the areas in which such classification algorithms are applied in biomedicine and health care, and their contribution to decision-making systems and predictive analysis. In conclusion, the chapter aims to contribute to research and development by providing thorough insight into classification algorithms and their applications in the healthcare development sector.
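
Of the ensemble methods this abstract names, majority voting is the simplest to illustrate: each classifier casts one vote per sample and the most frequent label wins. The sketch below is a generic illustration, not the chapter's code. The function name `majority_vote` and the prediction lists are hypothetical.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-classifier label lists into one label per sample.

    predictions: a list with one inner list of predicted labels per
    classifier, all of the same length. An odd number of classifiers
    avoids ties for binary labels.
    """
    combined = []
    for votes in zip(*predictions):  # votes for a single sample
        label, _count = Counter(votes).most_common(1)[0]
        combined.append(label)
    return combined


# Hypothetical predictions from three classifiers on three samples.
clf_a = [1, 0, 1]
clf_b = [1, 1, 0]
clf_c = [0, 0, 1]

ensemble = majority_vote([clf_a, clf_b, clf_c])
```

Here every classifier disagrees with the others on one sample, yet the combined prediction follows the two-to-one majority each time.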

Details

Big Data Analytics and Intelligence: A Perspective for Health Care

Type: Book

ISBN: 978-1-83909-099-8

Book part

Publication date: 1 September 2021

Effects of Resampling Techniques on Imbalanced Data Classification: A New Under-resampling Method

Son Nguyen, Phyllis Schumacher, Alan Olinsky and John Quinn

Abstract

We study the performance of various predictive models, including decision trees, random forests, neural networks, and linear discriminant analysis, on an imbalanced data set of home loan applications. In the process, we propose our own undersampling algorithm to cope with the issues created by the imbalance of the data. Our technique is shown to work competitively against popular resampling techniques such as random oversampling, undersampling, synthetic minority oversampling technique (SMOTE), and random oversampling examples (ROSE). We also investigate the relation between the true positive rate, true negative rate, and the imbalance of the data.
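
The two baseline techniques named in this abstract, random oversampling and random undersampling, can be sketched with the standard library alone. This is a generic illustration under stated assumptions. It does not reproduce the authors' proposed algorithm, SMOTE, or ROSE, and the function names and toy labels are hypothetical.

```python
import random
from collections import Counter


def split_indices(labels, majority):
    """Return (majority indices, minority indices) for a binary label list."""
    maj = [i for i, y in enumerate(labels) if y == majority]
    mino = [i for i, y in enumerate(labels) if y != majority]
    return maj, mino


def random_undersample(labels, majority, seed=0):
    """Keep every minority row and an equal-sized random subset of majority rows."""
    rng = random.Random(seed)
    maj, mino = split_indices(labels, majority)
    return sorted(rng.sample(maj, len(mino)) + mino)


def random_oversample(labels, majority, seed=0):
    """Keep every row and duplicate random minority rows until classes match."""
    rng = random.Random(seed)
    maj, mino = split_indices(labels, majority)
    extra = [rng.choice(mino) for _ in range(len(maj) - len(mino))]
    return sorted(maj + mino + extra)


# Hypothetical labels: 8 majority rows (0) and 2 minority rows (1).
labels = [0] * 8 + [1] * 2

under = random_undersample(labels, majority=0)
over = random_oversample(labels, majority=0)
under_counts = Counter(labels[i] for i in under)
over_counts = Counter(labels[i] for i in over)
```

Both functions return row indices rather than copies of the data, so the same indices can select features and labels together. Undersampling shrinks the toy set to 2 rows per class, while oversampling grows it to 8 rows per class.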

Details

Advances in Business and Management Forecasting

Type: Book

ISBN: 978-1-83982-091-5

Book part

Publication date: 28 September 2023

A Study on the Impact of COVID-19 on the Stock Market in BRIC Countries

M Anand Shankar Raja, Keerthana Shekar, B Harshith and Purvi Rastogi

Abstract

The COVID-19 pandemic has recently had an impact on stock markets all over the globe. A thorough review of the literature, covering the most cited articles and articles from well-known databases, revealed that earlier research in the field had not specifically addressed how the BRIC stock markets responded to the COVID-19 pandemic. The data regarding COVID-19 were collected from the World Health Organization (WHO) website, and the stock market data were collected from Yahoo Finance and the respective country’s stock exchange. A random forest regression algorithm takes the closing prices of the respective stock indices as target variables and the COVID-19 variables as input variables. Using this algorithm, a model is fit to the data and visualised using line plots. The study’s findings highlight a relationship between the COVID-19 variables and stock market indices. In addition, the stock markets of the BRIC countries showed a high correlation, especially with the Shanghai Composite Stock Index, with correlation values of 0.7 and above. Brazil took the worst hit over the studied period, declining approximately 45.99%, followed by India at 37.76%. Finally, the random forest model fit to the data set produced R² values of 0.972, 0.005, 0.997, and 0.983 and mean percentage errors of 1.4, 0.8, 0.9, and 0.8 for Brazil, Russia, India, and China (BRIC), respectively. Even now, two years after the coronavirus pandemic started, the Brazilian stock index has not yet returned to its pre-pandemic level.
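
For readers unfamiliar with the R² values reported in this abstract, the sketch below computes the usual coefficient of determination, one minus the ratio of the residual sum of squares to the total sum of squares. It is assumed, not confirmed by the abstract, that the chapter uses this standard definition. The function name `r_squared` and the numbers are hypothetical and unrelated to the BRIC data.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Hypothetical index levels and model predictions.
actual = [1.0, 2.0, 3.0, 4.0]
close_fit = [1.1, 1.9, 3.2, 3.8]
mean_only = [2.5, 2.5, 2.5, 2.5]  # always predicts the mean of `actual`

r2_close = r_squared(actual, close_fit)
r2_mean = r_squared(actual, mean_only)
```

A model that tracks the target closely scores near 1 (about 0.98 here), while one that always predicts the mean scores 0. Under this definition, a value such as 0.005 means the fit explains almost none of the variance in the target.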

Details

Digital Transformation, Strategic Resilience, Cyber Security and Risk Management

Type: Book

ISBN: 978-1-83797-009-4

Book part

Publication date: 8 November 2021

Do Machine Learning Models Hold the Key to Better Money Demand Forecasting?

Taniya Ghosh and Sakshi Agarwal

Abstract

Significant evidence in the literature points to money demand instability and therefore inaccurate forecasting. In view of this issue, this chapter uses a method that is new to the money demand literature, namely machine learning, to predict money demand. Specifically, the chapter uses Random Forest Regression to predict money demand in the Indian context, using monthly data from April 1996 to December 2018 and the variables usually used in the literature. The chapter finds that Random Forest Regression performs fairly well in money demand prediction. The results are also compared to those of traditional models, and the Random Forest Regression model is found to have the potential to improve the prediction of money demand over what traditional models predict.

Details

Environmental, Social, and Governance Perspectives on Economic Development in Asia

Type: Book

ISBN: 978-1-80117-594-4

Book part

Publication date: 1 September 2021

Predicting the Length of Stay in Hospital Emergency Rooms in Rhode Island

Alicia T. Lamere, Son Nguyen, Gao Niu, Alan Olinsky and John Quinn

Abstract

Predicting a patient's length of stay (LOS) in a hospital setting has been widely researched. Accurately predicting an individual's LOS can have a significant impact on a healthcare provider's ability to care for individuals by allowing them to properly prepare and manage resources. A hospital's productivity requires a delicate balance of maintaining enough staffing and resources without being overly equipped or wasteful. This has become even more important in light of the current COVID-19 pandemic, during which emergency departments around the globe have been inundated with patients and are struggling to manage their resources.

In this study, the authors focus on the prediction of LOS at the time of admission in emergency departments at Rhode Island hospitals through discharge data obtained from the Rhode Island Department of Health over the time period of 2012 and 2013. This work also explores the distribution of discharge dispositions in an effort to better characterize the resources patients require upon leaving the emergency department.

Details

Advances in Business and Management Forecasting

Type: Book

ISBN: 978-1-83982-091-5

Book part

Publication date: 4 December 2020

Predictive Analysis: Comprehensive Study of Popular Open-Source Tools

Gauri Rajendra Virkar and Supriya Sunil Shinde

Abstract

Predictive analytics is the science of decision-making that removes guesswork from the decision-making process and applies proven scientific procedures to find the right solutions. Predictive analytics provides insight into future downtimes and rejections, thereby helping users take preventive actions before abnormalities occur. Given these advantages, predictive analytics has been adopted in diverse fields such as health care, finance, education, marketing, and automotive. Predictive analytics tools can be used to predict various behaviors and patterns, thereby saving their users time and money. Many open-source predictive analysis tools, such as R, scikit-learn, Konstanz Information Miner (KNIME), Orange, RapidMiner, and Waikato Environment for Knowledge Analysis (WEKA), are freely available to users. This chapter aims to identify the most accurate tools and techniques for classification tasks that aid in decision-making. Our experimental results show that no specific tool provides the best results in all scenarios; rather, the outcome depends on the dataset and the classifier.

Details

Data Science and Analytics

Type: Book

ISBN: 978-1-80043-877-4

Book part

Publication date: 10 November 2017

Exploring Rural Public Library Assets for Asset-Based Community Development

Karen Miller

Abstract

This chapter explores differences in fringe, distant, and remote rural public library assets for asset-based community development (ABCD) and the relationships of those assets to geographic regions, governance structures, and demographics.

The author analyzes 2013 data from the Institute of Museum and Library Services (IMLS) and U.S. Department of Agriculture using nonparametric statistics and data mining random forest supervised classification algorithms.

There are statistically significant differences between fringe, distant, and remote library assets. Unexpectedly, median per capita outlets (along with service hours and staff) increase as distances from urban areas increase. The Southeast region ranks high in unemployment and poverty and low in median household income, which aligns with the Southeast’s low median per capita library expenditures, staff, hours, inventory, and programs. However, the Southeast’s relatively high percentage of rural libraries with at least one staff member with a Master of Library and Information Science promises future asset growth in those libraries. State and federal contributions to Alaska libraries propelled the remote Far West to the number one ranking in median per capita staff, inventory, and programs.

This study is based on IMLS library system-wide data and does not include rural library branches operated by nonrural central libraries.

State and federal contributions to rural libraries increase economic, cultural, and social capital creation in the most remote communities. On a per capita basis, economic capital from state and federal agencies assists small, remote rural libraries in providing infrastructure and services that are more closely aligned with libraries in more populated areas and increases library assets available for ABCD initiatives in otherwise underserved communities.

Even the smallest rural library can contribute to ABCD initiatives by connecting their communities to outside resources and creating new economic, cultural, and social assets.

Analyzing rural public library assets within their geographic, political, and demographic contexts highlights their potential contributions to ABCD initiatives.

Details

Rural and Small Public Libraries: Challenges and Opportunities

Type: Book

ISBN: 978-1-78743-112-6

1 – 10 of 203
