Search results

1 – 10 of over 3000
Book part
Publication date: 15 March 2021

Jochen Hartmann

Abstract

Across disciplines, researchers and practitioners employ decision tree ensembles such as random forests and XGBoost with great success. What explains their popularity? This chapter showcases how marketing scholars and decision-makers can harness the power of decision tree ensembles for academic and practical applications. The author discusses the origin of decision tree ensembles, explains their theoretical underpinnings, and illustrates them empirically using a real-world telemarketing case, with the objective of predicting customer conversions. Readers unfamiliar with decision tree ensembles will learn to appreciate them for their versatility, competitive accuracy, ease of application, and computational efficiency, and will gain a comprehensive understanding of why decision tree ensembles contribute to every data scientist's methodological toolbox.

Details

The Machine Age of Customer Insight
Type: Book
ISBN: 978-1-83909-697-6

Book part
Publication date: 31 January 2015

Davy Janssens and Geert Wets

Abstract

Several activity-based transportation models are now becoming operational and are entering the stage of application for the modelling of travel demand. In our application, we use decision rules to support the decision-making of the model instead of principles of utility maximization, which means our work can be interpreted as an application of the concept of bounded rationality in the transportation domain. In this chapter, we explore a novel idea of combining decision trees and Bayesian networks to improve decision-making while maintaining the potential advantages of both techniques. The results of this study suggest that integrated Bayesian networks and decision trees can be used for modelling the different choice facets of a travel demand model with better predictive power than CHAID decision trees. There are also initial indications that the new way of integrating decision trees and Bayesian networks produces a decision tree that is structurally more stable.

Details

Bounded Rational Choice Behaviour: Applications in Transport
Type: Book
ISBN: 978-1-78441-071-1

Details

Machine Learning and Artificial Intelligence in Marketing and Sales
Type: Book
ISBN: 978-1-80043-881-1

Book part
Publication date: 16 October 2020

Dawn Anderson and Donald (Don) Wengler

Abstract

Auditing textbooks include summary-level coverage of the American Institute of Certified Public Accountants (AICPA) Code of Professional Conduct, but textbook coverage is too brief to support a strong understanding of auditor independence. Independence rules have the force of professional law for the independent auditor (PCAOB, 2015). Threats to firm independence can arise from events and circumstances such as investments in the client, loans from the client, past-due fees, contingent fees, deposits in the client, gifts and job offers. Student test results from a five-year rotation of alternative auditor independence lecture support materials demonstrate that using the actual AICPA Code of Professional Conduct reduces student performance. However, this drag on student performance was mostly offset by the positive impacts of simultaneous use of an independence decision tree developed for this chapter and tested as a teaching material for classroom use.

Details

Research on Professional Responsibility and Ethics in Accounting
Type: Book
ISBN: 978-1-83867-669-8

Book part
Publication date: 30 September 2020

Hera Khan, Ayush Srivastav and Amit Kumar Mishra

Abstract

This chapter provides a detailed description of the classification algorithms that have been widely used in medical science. It first lays the foundation with a comprehensive overview of the background and history of classification algorithms. This is followed by an extensive discussion of the various classification techniques in machine learning (ML), concluding with their relevant applications in data analysis in medical science and health care. The opening sections cover the fundamentals required for a sound understanding of classification techniques in ML, including the underlying differences between unsupervised and supervised learning, followed by the basic terminology of classification and its history. The chapter then covers the types of classification algorithms, ranging from linear classifiers such as Logistic Regression and Naïve Bayes to Nearest Neighbour, Support Vector Machine, Tree-based Classifiers, and Neural Networks, together with their respective mathematics. Ensemble algorithms such as Majority Voting, Boosting, Bagging, and Stacking are also discussed at length, along with their relevant applications. Furthermore, the chapter explains the areas of application of such classification algorithms in biomedicine and health care and their contribution to decision-making systems and predictive analysis. In conclusion, the chapter aims to contribute to research and development by providing thorough insight into classification algorithms and their relevant applications in the healthcare development sector.

Details

Big Data Analytics and Intelligence: A Perspective for Health Care
Type: Book
ISBN: 978-1-83909-099-8

Book part
Publication date: 11 September 2020

D. K. Malhotra, Kunal Malhotra and Rashmi Malhotra

Abstract

Traditionally, loan officers use different credit scoring models to complement judgmental methods to classify consumer loan applications. This study explores the use of decision trees, AdaBoost, and support vector machines (SVMs) to identify potential bad loans. Our results show that AdaBoost provides an improvement over simple decision trees as well as SVM models in predicting good credit clients and bad credit clients. To validate our results, we use k-fold cross-validation.
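
The k-fold validation mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not give the value of k, the dataset, or the software used, so the function name `k_fold_indices`, the sample count, and the choice of k below are hypothetical. The sketch only shows how the observations are partitioned so that each one is held out exactly once; it does not shuffle or stratify the data, which is usually advisable in practice.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of k contiguous folds.

    Hypothetical helper for illustration; not the chapter's procedure.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


# Assumed toy setting: 10 observations split into 3 folds.
folds = list(k_fold_indices(10, 3))
for train_idx, test_idx in folds:
    print("train:", train_idx, "test:", test_idx)
```

A model would be fitted on each training part and scored on the matching held-out part, and the k scores would then be averaged.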

Details

Fundamentals of HR Analytics
Type: Book
ISBN: 978-1-78973-964-0

Book part
Publication date: 6 September 2019

Son Nguyen, Gao Niu, John Quinn, Alan Olinsky, Jonathan Ormsbee, Richard M. Smith and James Bishop

Abstract

In recent years, the problem of classification with imbalanced data has been growing in popularity in the data-mining and machine-learning communities due to the emergence of an abundance of imbalanced data in many fields. In this chapter, we compare the performance of six classification methods on an imbalanced dataset under the influence of four resampling techniques. These classification methods are the random forest, the support vector machine, logistic regression, k-nearest neighbor (KNN), the decision tree, and AdaBoost. Our study has shown that all of the classification methods have difficulty when working with the imbalanced data, with the KNN performing the worst, detecting only 27.4% of the minority class. However, with the help of resampling techniques, all of the classification methods improve in overall performance. In particular, the Random Forest, in combination with the random over-sampling technique, performs the best, achieving 82.8% balanced accuracy (the average of the true-positive rate and true-negative rate).

We then propose a new procedure to resample the data. Our method is based on the idea of eliminating “easy” majority observations before under-sampling them. It has further improved the balanced accuracy of the Random Forest to 83.7%, making it the best approach for the imbalanced data.
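
Balanced accuracy, defined in this abstract as the average of the true-positive rate and the true-negative rate, can be computed as in the following minimal sketch. The function name `balanced_accuracy` and the toy labels are hypothetical and are not taken from the chapter; they only illustrate why this metric is more informative than plain accuracy on imbalanced data.

```python
def balanced_accuracy(y_true, y_pred, positive=1):
    """Return (TPR + TNR) / 2 for binary labels."""
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1  # minority observation correctly detected
            else:
                fn += 1  # minority observation missed
        else:
            if pred == positive:
                fp += 1  # majority observation wrongly flagged
            else:
                tn += 1  # majority observation correctly rejected
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    tpr = tp / (tp + fn)  # true-positive rate
    tnr = tn / (tn + fp)  # true-negative rate
    return (tpr + tnr) / 2


# Hypothetical toy labels: 2 minority (1) and 8 majority (0) observations.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]

score = balanced_accuracy(y_true, y_pred)
plain = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(score, plain)
```

On these toy labels, plain accuracy is 0.8 while balanced accuracy is 0.6875, because missing one of the two minority observations halves the true-positive rate.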

Details

Advances in Business and Management Forecasting
Type: Book
ISBN: 978-1-78754-290-7

Book part
Publication date: 1 September 2021

Son Nguyen, Phyllis Schumacher, Alan Olinsky and John Quinn

Abstract

We study the performance of various predictive models, including decision trees, random forests, neural networks, and linear discriminant analysis, on an imbalanced data set of home loan applications. During the process, we propose our own undersampling algorithm to cope with the issues created by the imbalance of the data. Our technique is shown to work competitively against popular resampling techniques such as random oversampling, undersampling, synthetic minority oversampling technique (SMOTE), and random oversampling examples (ROSE). We also investigate the relation between the true positive rate, true negative rate, and the imbalance of the data.

Book part
Publication date: 18 October 2019

Edward George, Purushottam Laud, Brent Logan, Robert McCulloch and Rodney Sparapani

Abstract

Bayesian additive regression trees (BART) is a fully Bayesian approach to modeling with ensembles of trees. BART can uncover complex regression functions with high-dimensional regressors in a fairly automatic way and provide Bayesian quantification of the uncertainty through the posterior. However, BART assumes independent and identically distributed (i.i.d.) normal errors. This strong parametric assumption can lead to misleading inference and uncertainty quantification. In this chapter, we use the classic Dirichlet process mixture (DPM) mechanism to nonparametrically model the error distribution. A key strength of BART is that default prior settings work reasonably well in a variety of problems. The challenge in extending BART is to choose the parameters of the DPM so that the strengths of the standard BART approach are not lost when the errors are close to normal, while the DPM retains the ability to adapt to non-normal errors.

Details

Topics in Identification, Limited Dependent Variables, Partial Observability, Experimentation, and Flexible Modeling: Part B
Type: Book
ISBN: 978-1-83867-419-9
