Search results

1 – 10 of over 33000
Book part
Publication date: 30 September 2020

Suryakanthi Tangirala

Abstract

With the advent of Big Data, storing and using an unprecedented amount of clinical information is now feasible via Electronic Health Records (EHRs). The massive collection of clinical data by health care systems and treatment centers can be productively used to perform predictive analytics on treatment plans to improve patient health outcomes. These massive data sets have stimulated opportunities to adapt computational algorithms to track and identify target areas for quality improvement in health care.

According to a report from the Association of American Medical Colleges, there will be an alarming gap between the demand for and supply of the health care workforce in the near future. The projections show that by 2032 there will be a shortfall of between 46,900 and 121,900 physicians in the US (AAMC, 2019). Therefore, early prediction of health care risks is needed to improve health care quality and reduce health care costs. Predictive analytics uses historical data and algorithms based on either statistics or machine learning to develop predictive models that capture important trends. These models can predict the likelihood of future events. Predictive models developed using supervised machine learning approaches are commonly applied to various health care problems such as disease diagnosis, treatment selection, and treatment personalization.

This chapter provides an overview of various machine learning and statistical techniques for developing predictive models. Case examples from the extant literature are provided to illustrate the role of predictive modeling in health care research. The adaptation of these predictive modeling techniques to Big Data analytics underscores the need for standardization and transparency while recognizing the opportunities and challenges ahead.

Details

Big Data Analytics and Intelligence: A Perspective for Health Care
Type: Book
ISBN: 978-1-83909-099-8

Article
Publication date: 14 July 2022

Pratyush N. Sharma, Benjamin D. Liengaard, Joseph F. Hair, Marko Sarstedt and Christian M. Ringle

Abstract

Purpose

Researchers often stress the predictive goals of their partial least squares structural equation modeling (PLS-SEM) analyses. However, the method has long lacked a statistical test to compare different models in terms of their predictive accuracy and to establish whether a proposed model offers a significantly better out-of-sample predictive accuracy than a naïve benchmark. This paper aims to address this methodological research gap in predictive model assessment and selection in composite-based modeling.

Design/methodology/approach

Recent research has proposed the cross-validated predictive ability test (CVPAT) to compare theoretically established models. This paper proposes several extensions that broaden the scope of CVPAT and explains the key choices researchers must make when using them. A popular marketing model is used to illustrate the CVPAT extensions’ use and to make recommendations for the interpretation and benchmarking of the results.
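The general idea behind a cross-validated comparison against a naïve benchmark can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' CVPAT procedure: it uses a simple one-predictor linear regression in place of a PLS-SEM model, a training-mean predictor as the naïve benchmark, squared error as the loss, and a paired t-statistic on the per-observation loss differences. All function names and the simulated data are hypothetical.

```python
import math
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b


def cv_loss_differences(xs, ys, k=5, seed=0):
    """Out-of-sample squared-error differences (benchmark loss minus model loss)."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k disjoint hold-out folds
    diffs = []
    for fold in folds:
        held = set(fold)
        train = [i for i in idx if i not in held]
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        mean_y = statistics.fmean(ys[i] for i in train)  # naive benchmark
        for i in fold:
            model_loss = (ys[i] - (a + b * xs[i])) ** 2
            bench_loss = (ys[i] - mean_y) ** 2
            diffs.append(bench_loss - model_loss)
    return diffs


def paired_t(diffs):
    """t-statistic for the hypothesis that the mean loss difference is zero."""
    n = len(diffs)
    return statistics.fmean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))


# Simulated data (assumption): y depends linearly on x plus noise.
rng = random.Random(42)
xs = [rng.uniform(0, 10) for _ in range(200)]
ys = [2.0 + 1.5 * x + rng.gauss(0, 1) for x in xs]

diffs = cv_loss_differences(xs, ys)
t_stat = paired_t(diffs)
print(f"mean loss difference: {statistics.fmean(diffs):.2f}, t = {t_stat:.2f}")
```

A large positive t-statistic suggests that the model predicts held-out cases better than the benchmark; the same loss-difference logic extends to comparing two competing models.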

Findings

This research asserts that prediction-oriented model assessments and comparisons are essential for theory development and validation. It recommends that researchers routinely consider the application of CVPAT and its extensions when analyzing their theoretical models.

Research limitations/implications

The findings offer several avenues for future research to extend and strengthen prediction-oriented model assessment and comparison in PLS-SEM.

Practical implications

Guidelines are provided for applying CVPAT extensions and reporting the results to help researchers substantiate their models’ predictive capabilities.

Originality/value

This research contributes to strengthening the predictive model validation practice in PLS-SEM, which is essential to derive managerial implications that are typically predictive in nature.

Details

European Journal of Marketing, vol. 57 no. 6
Type: Research Article
ISSN: 0309-0566

Article
Publication date: 6 August 2020

Wynne Chin, Jun-Hwa Cheah, Yide Liu, Hiram Ting, Xin-Jean Lim and Tat Huei Cham

Abstract

Purpose

Partial least squares structural equation modeling (PLS-SEM) has become popular in the information systems (IS) field for modeling structural relationships between latent variables as measured by manifest variables. However, while researchers using PLS-SEM routinely stress the causal-predictive nature of their analyses, the model evaluation assessment relies exclusively on criteria designed to assess the path model's explanatory power. To take full advantage of the purpose of causal prediction in PLS-SEM, it is imperative for researchers to comprehend the efficacy of various quality criteria, such as traditional PLS-SEM criteria, model fit, PLSpredict, cross-validated predictive ability test (CVPAT) and model selection criteria.

Design/methodology/approach

A systematic review was conducted to understand empirical studies employing the use of the causal prediction criteria available for PLS-SEM in the database of Industrial Management and Data Systems (IMDS) and Management Information Systems Quarterly (MISQ). Furthermore, this study discusses the details of each of the procedures for the causal prediction criteria available for PLS-SEM, as well as how these criteria should be interpreted. While the focus of the paper is on demystifying the role of causal prediction modeling in PLS-SEM, the overarching aim is to compare the performance of different quality criteria and to select the appropriate causal-predictive model from a cohort of competing models in the IS field.

Findings

The study found that the traditional PLS-SEM criteria (goodness of fit (GoF) by Tenenhaus, R2 and Q2) and model fit have difficulty determining the appropriate causal-predictive model. In contrast, PLSpredict, CVPAT and model selection criteria (i.e. Bayesian information criterion (BIC), BIC weight, Geweke–Meese criterion (GM), GM weight, HQ and HQC) were found to outperform the traditional criteria in determining the appropriate causal-predictive model, because these criteria provided both in-sample and out-of-sample predictions in PLS-SEM.
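For readers unfamiliar with the criteria named above, BIC and BIC weights are commonly written as follows for a regression-type model with Gaussian errors. This is the textbook form, given here as an assumption; the abstract does not state the exact expressions the study used, and the symbols $n$, $p_k$, $SS_{\mathrm{error},k}$ and $\Delta_k$ are introduced here for illustration only.

```latex
\mathrm{BIC}_k = n \ln\!\left(\frac{SS_{\mathrm{error},k}}{n}\right) + p_k \ln(n),
\qquad
\Delta_k = \mathrm{BIC}_k - \min_j \mathrm{BIC}_j,
\qquad
w_k = \frac{\exp(-\Delta_k / 2)}{\sum_j \exp(-\Delta_j / 2)}
```

Here $n$ is the sample size, $p_k$ is the number of estimated parameters in candidate model $k$, and $SS_{\mathrm{error},k}$ is its residual sum of squares. The model with the lowest BIC is preferred, and the weights $w_k$ sum to one across the competing models.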

Originality/value

This research substantiates the use of the PLSpredict, CVPAT and the model selection criteria (i.e. BIC, BIC weight, GM, GM weight, HQ and HQC). It provides IS researchers and practitioners with the knowledge they need to properly assess, report on and interpret PLS-SEM results when the goal is only causal prediction, thereby contributing to safeguarding the goal of using PLS-SEM in IS studies.

Details

Industrial Management & Data Systems, vol. 120 no. 12
Type: Research Article
ISSN: 0263-5577

Article
Publication date: 15 March 2023

Indranil Ghosh, Rabin K. Jana and Mohammad Zoynul Abedin

Abstract

Purpose

The prediction of Airbnb listing prices predominantly uses a set of amenity-driven features. Choosing an appropriate set of features from thousands of available amenity-driven features makes the prediction task difficult. This paper aims to propose a scalable, robust framework to predict listing prices of Airbnb units without using amenity-driven features.

Design/methodology/approach

The authors propose an artificial intelligence (AI)-based framework to predict Airbnb listing prices. The authors consider 75 thousand Airbnb listings from five US cities with more than 1.9 million observations. The proposed framework integrates (i) feature screening, (ii) stacking that combines gradient boosting, bagging and random forest, (iii) particle swarm optimization and (iv) explainable AI to accomplish the research objective.

Findings

The key findings have three aspects: prediction accuracy, homogeneity and identification of the most and least predictable cities. The proposed framework yields highly precise predictions. The predictability of listing prices varies significantly across cities. Listing prices are most predictable for Boston and least predictable for Chicago.

Practical implications

Hosts can leverage the framework to determine rental prices and the findings to augment their service offerings by emphasizing key features.

Originality/value

Although individual components are known, the way they have been integrated into the proposed framework to derive a high-quality forecast of Airbnb listing prices is unique. It is scalable. The Airbnb listing price modeling literature rarely witnesses such a framework.

Details

International Journal of Contemporary Hospitality Management, vol. 35 no. 10
Type: Research Article
ISSN: 0959-6119

Article
Publication date: 7 November 2023

Christian Nnaemeka Egwim, Hafiz Alaka, Youlu Pan, Habeeb Balogun, Saheed Ajayi, Abdul Hye and Oluwapelumi Oluwaseun Egunjobi

Abstract

Purpose

The study aims to develop a multilayer high-effective ensemble of ensembles predictive model (stacking ensemble) using several hyperparameter optimized ensemble machine learning (ML) methods (bagging and boosting ensembles) trained with high-volume data points retrieved from Internet of Things (IoT) emission sensors, time-corresponding meteorology and traffic data.

Design/methodology/approach

First, the study tested the big data hypothesis by developing sample ensemble predictive models on different data sample sizes and comparing their results. Second, it developed a standalone model and several bagging and boosting ensemble models and compared their results. Finally, it used the best performing bagging and boosting predictive models as input estimators to develop a novel multilayer high-effective stacking ensemble predictive model.
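The stacking step described above can be illustrated with a minimal sketch. It is not the authors' model: two deliberately simple base learners (a linear fit and a k-nearest-neighbour average) stand in for the bagging and boosting ensembles, the meta-learner is a single blending weight fitted on out-of-fold predictions, and the data are simulated. All names are hypothetical.

```python
import math
import random
import statistics


def fit_linear(xs, ys):
    """Base learner 1: least-squares line, returned as a predict function."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    a = my - b * mx
    return lambda x: a + b * x


def fit_knn(xs, ys, k=5):
    """Base learner 2: mean target of the k nearest training points."""
    pairs = list(zip(xs, ys))

    def predict(x):
        nearest = sorted(pairs, key=lambda p: abs(p[0] - x))[:k]
        return statistics.fmean(y for _, y in nearest)

    return predict


def out_of_fold(fit, xs, ys, folds):
    """Predict each point with a model that never saw it during training."""
    preds = [0.0] * len(xs)
    for fold in folds:
        held = set(fold)
        train = [i for i in range(len(xs)) if i not in held]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        for i in fold:
            preds[i] = model(xs[i])
    return preds


def fit_stack(xs, ys, k=5, seed=0):
    """Blend the two base learners with a weight learned from out-of-fold predictions."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    p1 = out_of_fold(fit_linear, xs, ys, folds)
    p2 = out_of_fold(fit_knn, xs, ys, folds)
    # Meta-learner: w in [0, 1] minimizing the squared error of w*p1 + (1-w)*p2.
    num = sum((a - b) * (y - b) for a, b, y in zip(p1, p2, ys))
    den = sum((a - b) ** 2 for a, b in zip(p1, p2))
    w = min(1.0, max(0.0, num / den)) if den else 0.5
    m1, m2 = fit_linear(xs, ys), fit_knn(xs, ys)  # refit on all training data
    return (lambda x: w * m1(x) + (1 - w) * m2(x)), w


rng = random.Random(1)


def make_data(n):
    """Simulated nonlinear signal with noise (assumption for the demo)."""
    xs = [rng.uniform(0, 10) for _ in range(n)]
    ys = [math.sin(x) + 0.3 * x + rng.gauss(0, 0.2) for x in xs]
    return xs, ys


train_x, train_y = make_data(200)
test_x, test_y = make_data(100)
stack, w = fit_stack(train_x, train_y)


def mse(model):
    return statistics.fmean((y - model(x)) ** 2 for x, y in zip(test_x, test_y))


mse_linear = mse(fit_linear(train_x, train_y))
mse_knn = mse(fit_knn(train_x, train_y))
mse_stack = mse(stack)
print(f"w={w:.2f} linear={mse_linear:.3f} knn={mse_knn:.3f} stack={mse_stack:.3f}")
```

Fitting the blending weight on out-of-fold predictions, rather than on in-sample fits, keeps the meta-learner from simply rewarding whichever base learner overfits the training data most.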

Findings

First, the results showed data size to be one of the main determinants of ensemble ML predictive power. Second, they showed that, compared with using a single algorithm, the cumulative result from ensemble ML algorithms is usually better in terms of predictive accuracy. Finally, they showed the stacking ensemble to be a better model for predicting PM2.5 concentration level than bagging and boosting ensemble models.

Research limitations/implications

A limitation of this study is the trade-off between the performance of this novel model and the computational time required to train it. Whether this gap can be closed remains an open question that future research should address. Future studies can also integrate this novel model into a personal air quality messaging system to inform the public of pollution levels and improve public access to air quality forecasts.

Practical implications

When integrated into an air pollution monitoring system, the outcome of this study will help the public to proactively identify highly polluted areas, thus potentially reducing pollution-associated or pollution-triggered deaths, complications and transmission of COVID-19 (and other lung diseases) by encouraging avoidance behavior, and will support informed lockdown decisions by government bodies.

Originality/value

This study fills a gap in the literature by providing a justification for selecting appropriate ensemble ML algorithms for PM2.5 concentration level predictive modeling. Second, it contributes to the big data hypothesis theory, which suggests that data size is one of the most important factors of ML predictive capability. Third, it supports the premise that when using ensemble ML algorithms, the cumulative output is usually better in terms of predictive accuracy than using a single algorithm. Finally, it develops a novel multilayer high-performant hyperparameter optimized ensemble of ensembles predictive model that can accurately predict PM2.5 concentration levels with improved model interpretability and enhanced generalizability, and it provides a novel databank of historic pollution data from IoT emission sensors that can be purchased for research, consultancy and policymaking.

Details

Journal of Engineering, Design and Technology, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 1726-0531

Article
Publication date: 27 November 2017

Serhat Peker, Altan Kocyigit and P. Erhan Eren

Abstract

Purpose

Predicting customers’ purchase behaviors is a challenging task. The literature has introduced the individual-level and the segment-based predictive modeling approaches for this purpose. Each method has its own advantages and drawbacks, and performs well only in certain cases. The purpose of this paper is to propose a hybrid approach which predicts customers’ individual purchase behaviors and reduces the limitations of these two methods by combining their advantages.

Design/methodology/approach

The proposed hybrid approach is established based on individual-level and segment-based approaches and utilizes the historical transactional data and predictive algorithms to generate predictions. The effectiveness of the proposed approach is experimentally evaluated in the domain of supermarket shopping by using real-world data and using five popular machine learning classification algorithms including logistic regression, decision trees, support vector machines, neural networks and random forests.

Findings

A comparison of results shows that the proposed hybrid approach substantially outperforms the individual-level and the segment-based approaches in terms of prediction coverage while maintaining roughly comparable prediction accuracy to the individual-level method. Moreover, the experimental results demonstrate that logistic regression performs better than the other classifiers in predicting customer purchase behavior.

Practical implications

The study concludes that the proposed approach would be beneficial for enterprises in terms of designing customized services and one-to-one marketing strategies.

Originality/value

This study is the first attempt to adopt a hybrid approach combining individual-level and segment-based approaches to predict customers’ individual purchase behaviors.

Article
Publication date: 28 March 2022

Gyeongcheol Cho, Sunmee Kim, Jonathan Lee, Heungsun Hwang, Marko Sarstedt and Christian M. Ringle

Abstract

Purpose

Generalized structured component analysis (GSCA) and partial least squares path modeling (PLSPM) are two key component-based approaches to structural equation modeling that facilitate the analysis of theoretically established models in terms of both explanation and prediction. This study aims to offer a comparative evaluation of GSCA and PLSPM in a predictive modeling framework.

Design/methodology/approach

A simulation study compares the predictive performance of GSCA and PLSPM under various simulation conditions and different prediction types of correctly specified and misspecified models.

Findings

The results suggest that GSCA with reflective composite indicators (GSCAR) is the most versatile approach. For observed prediction, which uses the component scores to generate prediction for the indicators, GSCAR performs slightly better than PLSPM with mode A. For operative prediction, which considers all parameter estimates to generate predictions, both methods perform equally well. GSCA with formative composite indicators and PLSPM with mode B generally lag behind the other methods.

Research limitations/implications

Future research may further assess the methods’ prediction precision, considering more experimental factors with a wider range of levels, including more extreme ones.

Practical implications

When prediction is the primary study aim, researchers should generally revert to GSCAR, considering its performance for observed and operative prediction together.

Originality/value

This research is the first to compare the relative efficacy of GSCA and PLSPM in terms of predictive power.

Details

European Journal of Marketing, vol. 57 no. 6
Type: Research Article
ISSN: 0309-0566

Article
Publication date: 2 April 2019

Ying Cui, Fu Chen, Ali Shiri and Yaqin Fan

Abstract

Purpose

Many higher education institutions are investigating the possibility of developing predictive student success models that use different sources of data available to identify students that might be at risk of failing a course or program. The purpose of this paper is to review the methodological components related to the predictive models that have been developed or are currently implemented in learning analytics applications in higher education.

Design/methodology/approach

The literature review was completed in three stages. First, the authors conducted searches and collected related full-text documents using various search terms and keywords. Second, they developed inclusion and exclusion criteria to identify the most relevant citations for the purpose of the current review. Third, they reviewed each document from the final compiled bibliography and focused on identifying information that was needed to answer the research questions.

Findings

In this review, the authors identify methodological strengths and weaknesses of current predictive learning analytics applications and provide the most up-to-date recommendations on predictive model development, use and evaluation. The review results can inform important future areas of research that could strengthen the development of predictive learning analytics for the purpose of generating valuable feedback to students to help them succeed in higher education.

Originality/value

This review provides an overview of the methodological considerations for researchers and practitioners who are planning to develop or are currently in the process of developing predictive student success models in the context of higher education.

Details

Information and Learning Sciences, vol. 120 no. 3/4
Type: Research Article
ISSN: 2398-5348

Article
Publication date: 8 January 2024

Indranil Ghosh, Rabin K. Jana and Dinesh K. Sharma

Abstract

Purpose

Owing to highly volatile and chaotic external events, predicting future movements of cryptocurrencies is a challenging task. This paper advances a granular hybrid predictive modeling framework for predicting the future figures of Bitcoin (BTC), Litecoin (LTC), Ethereum (ETH), Stellar (XLM) and Tether (USDT) during normal and pandemic regimes.

Design/methodology/approach

Initially, the major temporal characteristics of the price series are examined. In the second stage, ensemble empirical mode decomposition (EEMD) and maximal overlap discrete wavelet transformation (MODWT) are used to decompose the original time series into two distinct sets of granular subseries. In the third stage, long short-term memory (LSTM) networks and extreme gradient boosting (XGB) are applied to the decomposed subseries to estimate the initial forecasts. Lastly, sequential quadratic programming (SQP) is used to obtain the final forecast by combining the initial forecasts.

Findings

Rigorous performance assessment and the outcome of the Diebold-Mariano pairwise statistical test demonstrate the efficacy of the suggested predictive framework. The framework also yields commendable predictive performance specifically during the COVID-19 pandemic timeline. Future trends of BTC and ETH are found to be relatively easier to predict, while USDT is relatively difficult to predict.

Originality/value

The robustness of the proposed framework can be leveraged for practical trading and for managing investment in the crypto market. Empirical properties of the temporal dynamics of chosen cryptocurrencies provide deeper insights.

Details

China Finance Review International, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 2044-1398

Open Access
Article
Publication date: 13 April 2022

Florian Schuberth, Manuel E. Rademaker and Jörg Henseler

Abstract

Purpose

This study aims to examine the role of an overall model fit assessment in the context of partial least squares path modeling (PLS-PM). In doing so, it will explain when it is important to assess the overall model fit and provide ways of assessing the fit of composite models. Moreover, it will resolve major concerns about model fit assessment that have been raised in the literature on PLS-PM.

Design/methodology/approach

This paper explains when and how to assess the fit of PLS path models. Furthermore, it discusses the concerns raised in the PLS-PM literature about the overall model fit assessment and provides concise guidelines on assessing the overall fit of composite models.

Findings

This study explains that the model fit assessment is as important for composite models as it is for common factor models. To assess the overall fit of composite models, researchers can use a statistical test and several fit indices known through structural equation modeling (SEM) with latent variables.

Research limitations/implications

Researchers who use PLS-PM to assess composite models that aim to understand the mechanism of an underlying population and draw statistical inferences should take the concept of the overall model fit seriously.

Practical implications

To facilitate the overall fit assessment of composite models, this study presents a two-step procedure adopted from the literature on SEM with latent variables.

Originality/value

This paper clarifies that the necessity to assess model fit is not a question of which estimator will be used (PLS-PM, maximum likelihood, etc.) but of the purpose of statistical modeling. Whereas model fit assessment is paramount in explanatory modeling, it is not imperative in predictive modeling.

Details

European Journal of Marketing, vol. 57 no. 6
Type: Research Article
ISSN: 0309-0566
