Search results

1 – 10 of 260
Article
Publication date: 4 October 2019

Rahul Priyadarshi, Akash Panigrahi, Srikanta Routroy and Girish Kant Garg

Abstract

Purpose

The purpose of this study is to select the appropriate forecasting model at the retail stage for selected vegetables on the basis of performance analysis.

Design/methodology/approach

Various forecasting models such as the Box–Jenkins-based auto-regressive integrated moving average model and machine learning-based algorithms such as long short-term memory (LSTM) networks, support vector regression (SVR), random forest regression, gradient boosting regression (GBR) and extreme GBR (XGBoost/XGBR) were proposed and applied (i.e. modeling, training, testing and predicting) at the retail stage for selected vegetables to forecast demand. The performance analysis (i.e. forecasting error analysis) was carried out to select the appropriate forecasting model at the retail stage for selected vegetables.
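The selection-by-forecast-error procedure described above can be sketched as follows. This is a minimal illustration only: two simple stand-in forecasters (a naive forecast and a moving average) replace the ARIMA, LSTM, SVR and boosting models of the study, and the demand series, function names and the choice of RMSE as the selection criterion are all assumptions made for the example.

```python
import math

def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def naive_forecast(history, horizon):
    """Stand-in model 1: repeat the last observed value."""
    return [history[-1]] * horizon

def moving_average_forecast(history, horizon, window=3):
    """Stand-in model 2: repeat the mean of the last `window` observations."""
    level = sum(history[-window:]) / window
    return [level] * horizon

def select_model(train, test, candidates):
    """Score every candidate on the held-out test period and return (best_name, scores)."""
    scores = {}
    for name, forecaster in candidates.items():
        scores[name] = rmse(test, forecaster(train, len(test)))
    best = min(scores, key=scores.get)  # lowest forecast error wins
    return best, scores

# Hypothetical daily demand (kg) for one vegetable; not data from the study.
demand = [52, 48, 55, 60, 47, 51, 58, 62, 49, 53, 57, 61]
train, test = demand[:9], demand[9:]
candidates = {"naive": naive_forecast, "moving_average": moving_average_forecast}
best, scores = select_model(train, test, candidates)
print(best, scores)
```

Any model that exposes the same `forecaster(train, horizon)` interface could be added to `candidates` and compared on the same held-out period.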

Findings

For the case environment, the machine learning algorithms LSTM and SVR produced better results than the other demand forecasting models.

Research limitations/implications

The results obtained from the case environment cannot be generalized. However, the approach may be used to forecast other agricultural produce at the retail stage by capturing their demand environment.

Practical implications

The implementation of LSTM and SVR for the case situation at the retail stage will reduce the forecast error, daily retail inventory and fresh produce wastage and will increase the daily revenue.

Originality/value

Selecting a demand forecasting model for agricultural produce at the retail stage on the basis of performance analysis is a unique study in which both traditional and non-traditional models were analyzed and compared.

Article
Publication date: 13 April 2023

Ian Lenaers, Kris Boudt and Lieven De Moor

Abstract

Purpose

The purpose is twofold. First, this study aims to establish that black box tree-based machine learning (ML) models have better predictive performance than a standard linear regression (LR) hedonic model for rent prediction. Second, it shows the added value of analyzing tree-based ML models with interpretable machine learning (IML) techniques.

Design/methodology/approach

Data on Belgian residential rental properties were collected. Tree-based ML models, random forest regression and eXtreme gradient boosting regression were applied to derive rent prediction models to compare predictive performance with an LR model. Interpretations of the tree-based models regarding important factors in predicting rent were made using SHapley Additive exPlanations (SHAP) feature importance (FI) plots and SHAP summary plots.
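To make the idea behind SHAP concrete, the sketch below computes exact Shapley values for a tiny toy model by enumerating all feature coalitions. It is not the algorithm the SHAP library uses for tree ensembles (which relies on faster tree-specific methods), and the rent function, feature names and baseline values are hypothetical, chosen only so the arithmetic can be checked by hand.

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, instance, baseline):
    """Exact Shapley values of `predict` at `instance` relative to `baseline`.

    Features outside a coalition take their baseline value. The cost grows
    as 2**n, so this is only practical for a handful of features.
    """
    names = list(instance)
    n = len(names)

    def value(coalition):
        x = {k: (instance[k] if k in coalition else baseline[k]) for k in names}
        return predict(x)

    phi = {}
    for name in names:
        others = [k for k in names if k != name]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                members = set(subset)
                total += weight * (value(members | {name}) - value(members))
        phi[name] = total
    return phi

def toy_rent(x):
    """Hypothetical rent model with an interaction term; not the study's model."""
    return 300 + 8 * x["surface_m2"] + 50 * x["bedrooms"] + 0.5 * x["surface_m2"] * x["bedrooms"]

instance = {"surface_m2": 100, "bedrooms": 3}
baseline = {"surface_m2": 60, "bedrooms": 1}
phi = shapley_values(toy_rent, instance, baseline)
print(phi)
```

The contributions sum to the difference between the prediction for the instance and the prediction for the baseline, which is the additivity property that SHAP feature importance and summary plots build on.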

Findings

Results indicate that tree-based models perform better than an LR model for Belgian residential rent prediction. The SHAP FI plots agree that asking price, cadastral income, livable surface area, number of bedrooms, number of bathrooms and variables measuring the proximity to points of interest are dominant predictors. The direction of relationships between rent and its factors is determined with SHAP summary plots. In addition to linear relationships, it emerges that nonlinear relationships exist.

Originality/value

Rent prediction using ML is relatively less studied than house price prediction. In addition, studying prediction models using IML techniques is relatively new in real estate economics. Moreover, to the best of the authors’ knowledge, this study is the first to derive insights of driving determinants of predicted rents from SHAP FI and SHAP summary plots.

Details

International Journal of Housing Markets and Analysis, vol. 17 no. 1
Type: Research Article
ISSN: 1753-8270

Article
Publication date: 27 March 2023

Jinghui Deng, Qiyou Cheng and Xing Lu

Abstract

Purpose

Helicopter fuselage vibration prediction is important for a safe and comfortable flight. Helicopter vibration mechanism models struggle to meet the demand for accurate vibration prediction. Thus, the purpose of this paper is to develop an intelligent algorithm for accurate helicopter fuselage vibration analysis.

Design/methodology/approach

In this research, a novel weighted variational mode decomposition (VMD)-extreme gradient boosting (xgboost) helicopter fuselage vibration prediction model is proposed. The vibration data are decomposed and then reconstructed according to the signal clustering results. The vibration response is predicted by the xgboost algorithm from the reconstructed data. The information transfer order between the controllable flight data and flight attitude is analyzed.

Findings

The mean absolute percentage error (MAPE), root mean square error (RMSE) and mean absolute error (MAE) of the proposed weighted VMD-xgboost model are 6.8%, 31.5% and 32.8% lower, respectively, than those of the xgboost model. The weighted VMD-xgboost model has the highest prediction accuracy, with the lowest mean MAPE, RMSE and MAE of 4.54%, 0.0162 and 0.0131, respectively. The attitude of the horizontal tail and the cyclic pitch are the key factors affecting vibration.
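For reference, the three error measures and the "percentage lower than a baseline" comparison reported above can be computed as in this short sketch. The function names and the sample numbers are illustrative assumptions, not values from the paper.

```python
import math

def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean square error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def relative_decrease(baseline_error, new_error):
    """Percentage by which `new_error` is lower than `baseline_error`."""
    return 100.0 * (baseline_error - new_error) / baseline_error

# Made-up vibration amplitudes and predictions, for illustration only.
actual = [1.0, 2.0, 4.0]
predicted = [1.1, 1.8, 4.4]
print(mape(actual, predicted), rmse(actual, predicted), mae(actual, predicted))
```

MAPE is scale-free, while RMSE and MAE are in the units of the predicted quantity, which is why the abstract reports the first as a percentage and the other two as raw values.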

Originality/value

A novel weighted VMD-xgboost intelligent prediction method is proposed. The prediction performance of the xgboost model is substantially improved by the signal-weighted reconstruction technique. In addition, the data set was collected from actual helicopter flights.

Details

Aircraft Engineering and Aerospace Technology, vol. 95 no. 7
Type: Research Article
ISSN: 1748-8842

Article
Publication date: 5 September 2020

Cheng Zhang and Zehao Ye

Abstract

Purpose

Developing physical pipe prediction models consumes considerable resources, and statistical models cannot fit the failure records perfectly. The purpose of this paper is therefore to use a data mining method to analyze and predict the risk of water pipe failure by considering the attributes and locations of pipes in historical failure records. The tree-based pipeline optimization technique (TPOT), an automated machine learning (AutoML) method, was used as the key data mining technique in this research.

Design/methodology/approach

By considering pipeline attributes, environmental factors and historical pipeline break records, a water pipeline failure prediction method is proposed in this research. Regression analysis, genetic algorithms, machine learning and data mining approaches are used to analyze and predict the probability of pipeline failure. TPOT was used as the key data mining technique. A case study was carried out in a specific area in China to investigate the relationships between pipeline breaks and relevant parameters such as pipeline age, material, diameter and pipeline density.

Findings

By integrating the prediction models for individual pipelines and small research regions, a prediction model is developed to describe the probability of water pipe failures and is validated with real data. A high degree of fit is achieved, which indicates good potential for using the proposed method in practice as a guideline for water supply companies to identify high-risk areas, take proactive measures and optimize resource allocation.

Originality/value

Different models are developed to improve prediction for regions and for individual pipelines. A comparison between the predicted values and real records shows that a preliminary model has good potential for predicting future failure risks.

Details

Facilities, vol. 39 no. 1/2
Type: Research Article
ISSN: 0263-2772

Article
Publication date: 24 April 2020

Wei Kang Loo

Abstract

Purpose

The purpose of this study is to evaluate the performance of the ensemble learning models, such as the Random Forest and Extreme Gradient Boosting models, in predicting the direction of the Japan real estate investment trusts (J-REITs) at different return horizons, based on input obtained from various technical indicators.

Design/methodology/approach

This study measures the predictability of J-REITs with technical indicators by using different horizons of REITs' return and machine learning models. The ensemble learning models include the Random Forest and Extreme Gradient Boosting models, while the return horizons of the REITs range from 1 to 300 days. The results were further split into individual years to check the consistency of performance across time.
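One way to set up a direction-prediction target at different return horizons is sketched below. The abstract does not give the exact label definition, so the rule used here (1 if the price is higher `horizon` days later, otherwise 0) is an assumption, as are the function names and the price series.

```python
def direction_labels(prices, horizon):
    """Label each day 1 if the price is higher `horizon` days later, else 0.

    The last `horizon` days have no future price and are omitted.
    """
    return [1 if prices[t + horizon] > prices[t] else 0
            for t in range(len(prices) - horizon)]

def hit_rate(predicted, actual):
    """Share of days on which the predicted direction matches the realized one."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)

# Hypothetical closing prices; not J-REIT data.
prices = [100, 101, 99, 102, 104, 103, 105, 107]
labels_1d = direction_labels(prices, 1)
labels_3d = direction_labels(prices, 3)
print(labels_1d, labels_3d)
```

In this toy series the three-day labels are less noisy than the one-day labels, which illustrates why the choice of horizon changes the classification problem a model is asked to solve.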

Findings

Extreme Gradient Boosting appears to be the best method for improving forecast accuracy, but not trading return. Wider return horizons seemed to deliver relatively better performance in both forecast accuracy and trading return than a return horizon of one day.

Practical implications

It is recommended that practitioners consider the Extreme Gradient Boosting and Random Forest models for back-testing trading models. In addition, practitioners should consider selecting different return horizons to achieve better trading/investment performance.

Originality/value

The predictability of J-REITs using technical indicators was compared across different return horizons and models (Extreme Gradient Boosting and Random Forest).

Details

Journal of Property Investment & Finance, vol. 38 no. 6
Type: Research Article
ISSN: 1463-578X

Article
Publication date: 30 December 2020

Suraj Kulkarni, Suhas Suresh Ambekar and Manoj Hudnurkar

Abstract

Purpose

Increasing health-care costs are a major concern, especially in the USA. The purpose of this paper is to predict the hospital charges of a patient before admission. This will help a patient who is being admitted “electively” to plan his/her finances. It can also be used as a tool by payers (insurance companies) to better forecast the amount that a patient might claim.

Design/methodology/approach

This research uses secondary data collected from New York state’s patient discharges of 2017. A stratified sampling technique is used to sample the data from the population, and feature engineering is done on categorical variables. Different regression techniques are used to predict the target value “total charges.”
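A proportional stratified sample can be drawn as in the sketch below. The abstract does not say which variable the data were stratified on, so the `age_group` key, the 10% fraction and the record layout are hypothetical choices for illustration.

```python
import random
from collections import defaultdict

def stratified_sample(records, key, fraction, seed=0):
    """Draw the same fraction from every stratum defined by `record[key]`.

    Each stratum contributes at least one record so that small groups
    are not dropped from the sample.
    """
    rng = random.Random(seed)  # fixed seed for a reproducible sample
    strata = defaultdict(list)
    for record in records:
        strata[record[key]].append(record)
    sample = []
    for group in strata.values():
        k = max(1, round(len(group) * fraction))
        sample.extend(rng.sample(group, k))
    return sample

# Hypothetical discharge records: 40 in one age group, 60 in another.
records = [{"id": i, "age_group": "18-29"} for i in range(40)]
records += [{"id": 40 + i, "age_group": "70+"} for i in range(60)]
sample = stratified_sample(records, key="age_group", fraction=0.1)
print(len(sample))
```

Sampling within each stratum keeps the group proportions of the population in the sample, which a simple random draw of the same size does not guarantee.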

Findings

Total cost varies linearly with the length of stay. Among all the machine learning algorithms considered, namely, random forest, stochastic gradient descent (SGD) regressor, K nearest neighbors regressor, extreme gradient boosting regressor and gradient boosting regressor, the random forest regressor had the best accuracy, with an R² value of 0.7753. “Age group” was the most important predictor among all the features.
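The R² (coefficient of determination) used above to rank the regressors can be computed as in this minimal sketch. The function name and the charge values are made up for illustration.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))  # unexplained variation
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)           # total variation
    return 1.0 - ss_res / ss_tot

# Hypothetical total charges (thousand USD) and model predictions.
actual = [10, 20, 30, 40]
predicted = [12, 18, 33, 37]
print(r_squared(actual, predicted))
```

A value of 1 means the predictions match the observed charges exactly, and a value of 0 means the model does no better than always predicting the mean charge.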

Practical implications

This model can be helpful for patients who want to compare the cost at different hospitals and can plan their finances accordingly in case of “elective” admission. Insurance companies can predict how much a patient with a particular medical condition might claim by getting admitted to the hospital.

Originality/value

Health care can be a costly affair if not planned properly. This research gives patients and insurance companies a better prediction of the total cost that they might incur.

Details

International Journal of Innovation Science, vol. 13 no. 1
Type: Research Article
ISSN: 1757-2223

Article
Publication date: 30 November 2021

Julián Martínez-Vargas, Pedro Carmona and Pol Torrelles

Abstract

Purpose

The purpose of this paper is to study the influence of different quantitative (traditionally used) and qualitative variables, such as the possible negative effect in determined periods of certain socio-political factors on share price formation.

Design/methodology/approach

We first analyse descriptively the evolution of the Ibex-35 in recent years and compare it with other international benchmark indices. Next, two techniques are compared: a classic linear regression statistical model (GLM) and a machine learning method called Extreme Gradient Boosting (XGBoost).

Findings

XGBoost yields a very accurate market value prediction model that clearly outperforms the other, with a coefficient of determination close to 90%, calculated on validation sets.

Practical implications

According to our analysis, individual accounts are equally or more important than consolidated information in predicting the behaviour of share prices. This would justify Spain maintaining the obligation to present individual interim financial statements, which does not happen in other European Union countries because IAS 34 only stipulates consolidated interim financial statements.

Social implications

The descriptive analysis allows us to see how the Ibex-35 has moved away from international trends, especially in periods in which some relevant socio-political events occurred, such as the independence referendum in Catalonia, the double elections of 2019 or the early handling of the Covid-19 pandemic in 2020.

Originality/value

Compared to other variables, the XGBoost model assigns little importance to socio-political factors when it comes to share price formation; however, this model explains 89.33% of its variance.

Details

Academia Revista Latinoamericana de Administración, vol. 35 no. 1
Type: Research Article
ISSN: 1012-8255

Article
Publication date: 26 December 2023

Farshad Peiman, Mohammad Khalilzadeh, Nasser Shahsavari-Pour and Mehdi Ravanshadnia

Abstract

Purpose

Earned value management (EVM)–based models for estimating project actual duration (AD) and cost at completion using various methods are continuously developed to improve the accuracy and actualization of predicted values. This study primarily aimed to examine natural gradient boosting (NGBoost-2020) with the classification and regression trees (CART) base model (base learner). To the best of the authors' knowledge, this concept has never been applied to EVM AD forecasting problem. Consequently, the authors compared this method to the single K-nearest neighbor (KNN) method, the ensemble method of extreme gradient boosting (XGBoost-2016) with the CART base model and the optimal equation of EVM, the earned schedule (ES) equation with the performance factor equal to 1 (ES1). The paper also sought to determine the extent to which the World Bank's two legal factors affect countries and how the two legal causes of delay (related to institutional flaws) influence AD prediction models.

Design/methodology/approach

In this paper, data from 30 construction projects of various building types in Iran, Pakistan, India, Turkey, Malaysia and Nigeria (due to the high number of delayed projects and the detrimental effects of these delays in these countries) were used to develop three models. The target variable of the models was a dimensionless output, the ratio of estimated duration to completion (ETC(t)) to planned duration (PD). Furthermore, 426 tracking periods were used to build the three models, with 353 samples and 23 projects in the training set, 73 patterns (17% of the total) and six projects (21% of the total) in the testing set. Furthermore, 17 dimensionless input variables were used, including ten variables based on the main variables and performance indices of EVM and several other variables detailed in the study. The three models were subsequently created using Python and several GitHub-hosted codes.
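As a rough illustration of the dimensionless target ETC(t)/PD, the sketch below evaluates it for the earned schedule baseline with performance factor 1 (ES1). It assumes the standard earned schedule formulation, in which ES is found by linear interpolation on the cumulative planned value curve and ETC(t) = PD - ES. The abstract does not spell out these formulas, so treat them, the function names and the planned value curve as assumptions.

```python
def earned_schedule(planned_value, earned_value):
    """Earned schedule ES: the time at which the current earned value was planned.

    planned_value[i] is the cumulative planned value at the end of period i,
    non-decreasing, with planned_value[0] == 0 at project start.
    """
    if earned_value >= planned_value[-1]:
        return float(len(planned_value) - 1)
    # Last period whose planned value does not exceed the earned value
    c = max(i for i, pv in enumerate(planned_value) if pv <= earned_value)
    # Linear interpolation inside the following period
    return c + (earned_value - planned_value[c]) / (planned_value[c + 1] - planned_value[c])

def es1_target_ratio(planned_value, earned_value):
    """ETC(t) / PD under ES1, taking ETC(t) = PD - ES."""
    planned_duration = len(planned_value) - 1
    es = earned_schedule(planned_value, earned_value)
    return (planned_duration - es) / planned_duration

# Hypothetical cumulative planned value for a four-period project.
pv_curve = [0, 10, 30, 60, 100]
es = earned_schedule(pv_curve, 45)
ratio = es1_target_ratio(pv_curve, 45)
print(es, ratio)
```

Under the same assumption, the estimated duration at completion is the actual time elapsed plus ETC(t), so an earned value of 45 observed at period 3 would give an estimate of 3 + 1.5 = 4.5 periods.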

Findings

For the testing set of the optimal model (NGBoost), the better percentage mean (better%) of the prediction error (based on projects with a lower error percentage) of the NGBoost compared to two KNN and ES1 single models, as well as the total mean absolute percentage error (MAPE) and mean lags (MeLa) (indicating model stability) were 100, 83.33, 5.62 and 3.17%, respectively. Notably, the total MAPE and MeLa for the NGBoost model testing set, which had ten EVM-based input variables, were 6.74 and 5.20%, respectively. The ensemble artificial intelligence (AI) models exhibited a much lower MAPE than ES1. Additionally, ES1 was less stable in prediction than NGBoost. The possibility of excessive and unusual MAPE and MeLa values occurred only in the two single models. However, on some data sets, ES1 outperformed AI models. NGBoost also outperformed other models, especially single models for most developing countries, and was more accurate than previously presented optimized models. In addition, sensitivity analysis was conducted on the NGBoost predicted outputs of 30 projects using the SHapley Additive exPlanations (SHAP) method. All variables demonstrated an effect on ETC(t)/PD. The results revealed that the most influential input variables in order of importance were actual time (AT) to PD, regulatory quality (RQ), earned duration (ED) to PD, schedule cost index (SCI), planned complete percentage, rule of law (RL), actual complete percentage (ACP) and ETC(t) of the ES optimal equation to PD. The probabilistic hybrid model was selected based on the outputs predicted by the NGBoost and XGBoost models and the MAPE values from three AI models. The 95% prediction interval of the NGBoost–XGBoost model revealed that 96.10 and 98.60% of the actual output values of the testing and training sets are within this interval, respectively.
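The coverage of a 95% prediction interval, as reported at the end of the findings, can be checked with a calculation like the one below. It assumes each prediction comes with a mean and a standard deviation from a normal predictive distribution, so that the interval is mean ± 1.96 standard deviations. How the study's hybrid NGBoost–XGBoost interval was actually built is not stated in the abstract, and all numbers here are invented.

```python
def interval_coverage(actual, means, stds, z=1.96):
    """Percentage of actual values inside mean ± z * std (z = 1.96 for about 95%)."""
    inside = sum(
        1 for y, m, s in zip(actual, means, stds)
        if m - z * s <= y <= m + z * s
    )
    return 100.0 * inside / len(actual)

# Hypothetical ETC(t)/PD observations with predicted means and standard deviations.
actual = [0.50, 0.42, 0.30, 0.10]
means = [0.48, 0.45, 0.20, 0.12]
stds = [0.02, 0.02, 0.03, 0.02]
coverage = interval_coverage(actual, means, stds)
print(coverage)
```

A well-calibrated 95% interval should contain roughly 95% of held-out observations, which is the sense in which the reported testing and training coverages can be read.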

Research limitations/implications

Due to the use of projects performed in different countries, it was not possible to distribute the questionnaire to the managers and stakeholders of 30 projects in six developing countries. Due to the low number of EVM-based projects in various references, it was unfeasible to utilize other types of projects. Future prospects include evaluating the accuracy and stability of NGBoost for timely and non-fluctuating projects (mostly in developed countries), considering a greater number of legal/institutional variables as input, using legal/institutional/internal/inflation inputs for complex projects with extremely high uncertainty (such as bridge and road construction) and integrating these inputs and NGBoost with new technologies (such as blockchain, radio frequency identification (RFID) systems, building information modeling (BIM) and Internet of things (IoT)).

Practical implications

The legal/intuitive recommendations made to governments are strict control of prices, adequate supervision, removal of additional rules, removal of unfair regulations, clarification of the future trend of a law change, strict monitoring of property rights, simplification of the processes for obtaining permits and elimination of unnecessary changes particularly in developing countries and at the onset of irregular projects with limited information and numerous uncertainties. Furthermore, the managers and stakeholders of this group of projects were informed of the significance of seven construction variables (institutional/legal external risks, internal factors and inflation) at an early stage, using time series (dynamic) models to predict AD, accurate calculation of progress percentage variables, the effectiveness of building type in non-residential projects, regular updating inflation during implementation, effectiveness of employer type in the early stage of public projects in addition to the late stage of private projects, and allocating reserve duration (buffer) in order to respond to institutional/legal risks.

Originality/value

Ensemble methods were optimized in 70% of references. To the authors' knowledge, NGBoost from the set of ensemble methods has not been used to estimate construction project duration and delays. NGBoost is an effective method for considering uncertainties in irregular projects and is often implemented in developing countries. Furthermore, AD estimation models fail to incorporate RQ and RL from the World Bank's worldwide governance indicators (WGI) as risk-based inputs. In addition, the various WGI, EVM and inflation variables are not combined with substantial degrees of delay institutional risks as inputs. Consequently, due to the existence of critical and complex risks in different countries, it is vital to consider legal and institutional factors. This is especially recommended if an in-depth, accurate and reality-based method like SHAP is used for analysis.

Details

Engineering, Construction and Architectural Management, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 0969-9988

Abstract

Details

Machine Learning and Artificial Intelligence in Marketing and Sales
Type: Book
ISBN: 978-1-80043-881-1

Open Access
Article
Publication date: 25 January 2023

Mikko Ranta and Mika Ylinen

Abstract

Purpose

This study aims to examine the association between board gender diversity (BGD) and workplace diversity and the relative importance of various board and firm characteristics in predicting diversity.

Design/methodology/approach

With a novel machine learning (ML) approach, this study models the association between three workplace diversity variables and BGD using a social media data set of approximately 250,000 employee reviews. Using the tools of explainable artificial intelligence, the authors interpret the results of the ML model.

Findings

The results show that BGD has a strong positive association with the gender equality and inclusiveness dimensions of corporate diversity culture. However, BGD is found to have a weak negative association with age diversity in a company. Furthermore, the authors find that workplace diversity is an important predictor of firm value, indicating a possible channel on how BGD affects firm performance.

Originality/value

The effects of BGD on workplace diversity below management levels are mainly omitted in the current corporate governance literature. Furthermore, existing research has not considered different dimensions of this diversity and has mainly focused on its gender aspects. In this study, the authors address this research problem and examine how BGD affects different dimensions of diversity at the overall company level. This study reveals important associations and identifies key variables that should be included as a part of theoretical causal models in future research.

Details

Corporate Governance: The International Journal of Business in Society, vol. 23 no. 5
Type: Research Article
ISSN: 1472-0701
