Search results

1 – 10 of 411

Article

Publication date: 24 April 2020

Performing technical analysis to predict Japan REITs' movement through ensemble learning

Downloads

239

Abstract

Purpose

The purpose of this study is to evaluate the performance of the ensemble learning models, such as the Random Forest and Extreme Gradient Boosting models, in predicting the direction of the Japan real estate investment trusts (J-REITs) at different return horizons, based on input obtained from various technical indicators.

Design/methodology/approach

This study measures the predictability of J-REITs from technical indicators, using different REIT return horizons and machine learning models. The ensemble learning models include Random Forest and Extreme Gradient Boosting, and the return horizons range from 1 to 300 days. The results were further split into individual years to check the consistency of performance over time.
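
As a rough illustration of what predicting direction at different return horizons involves, the sketch below labels each day 1 if the price some number of days later is higher, and 0 otherwise. The abstract does not give the study's exact labelling rule, so this rule, the function name `direction_labels` and the sample prices are all assumptions made for illustration.

```python
def direction_labels(prices, horizon):
    """Label each day 1 if the price `horizon` days ahead is higher, else 0.

    Hypothetical helper: a simple "future price above current price" rule
    is assumed, since the abstract does not state the exact definition.
    """
    if horizon < 1:
        raise ValueError("horizon must be at least 1")
    # The last `horizon` days have no future price, so they get no label.
    return [
        1 if prices[t + horizon] > prices[t] else 0
        for t in range(len(prices) - horizon)
    ]


# Made-up closing prices, for illustration only.
closes = [100.0, 101.0, 99.0, 102.0, 103.0]
labels_1d = direction_labels(closes, 1)  # one-day horizon
labels_2d = direction_labels(closes, 2)  # two-day horizon
```

A longer horizon yields fewer labelled days from the same price series, which is one practical trade-off when comparing horizons.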

Findings

Extreme Gradient Boosting appears to be the best method for improving forecast accuracy, but not trading return. Wider return horizons seemed to deliver relatively better forecast accuracy and trading return than a return horizon of one day.

Practical implications

It is recommended that practitioners consider the Extreme Gradient Boosting and Random Forest models when back-testing trading models. Selecting different return horizons to achieve better trading/investment performance should also be considered.

Originality/value

The predictability of J-REITs using technical indicators was compared across different return horizons and models (Extreme Gradient Boosting and Random Forest).

Details

Journal of Property Investment & Finance, vol. 38 no. 6

Type: Research Article

ISSN: 1463-578X

Article

Publication date: 30 December 2020

Predicting the inpatient hospital cost using a machine learning approach

Suraj Kulkarni, Suhas Suresh Ambekar and Manoj Hudnurkar

Downloads

390

Abstract

Purpose

Increasing health-care costs are a major concern, especially in the USA. The purpose of this paper is to predict a patient's hospital charges before admission. This will help patients who are admitted “electively” to plan their finances. It can also be used as a tool by payers (insurance companies) to better forecast the amount that a patient might claim.

Design/methodology/approach

This research uses secondary data collected from New York state’s patient discharges of 2017. A stratified sampling technique is used to sample the data from the population, and feature engineering is done on categorical variables. Different regression techniques are used to predict the target value “total charges.”
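
Stratified sampling of this kind can be sketched as below, under the assumption that the same fraction is drawn from every stratum. The abstract does not say which variable was used for stratification, so the `admission_type` field, the `stratified_sample` helper and the records are hypothetical.

```python
import random
from collections import defaultdict


def stratified_sample(records, key, fraction, seed=0):
    """Draw the same fraction of records from every stratum defined by `key`."""
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    strata = defaultdict(list)
    for record in records:
        strata[record[key]].append(record)
    sample = []
    for group in strata.values():
        # Keep at least one record so that small strata are not dropped.
        k = max(1, round(len(group) * fraction))
        sample.extend(rng.sample(group, k))
    return sample


# Made-up records; "admission_type" is a hypothetical stratification field.
records = (
    [{"admission_type": "Elective", "id": i} for i in range(20)]
    + [{"admission_type": "Emergency", "id": i} for i in range(20, 100)]
)
subset = stratified_sample(records, "admission_type", 0.1)
```

Sampling within each stratum keeps the 20/80 split of the made-up population intact in the subset, which simple random sampling would only achieve on average.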

Findings

Total cost varies linearly with the length of stay. Among the machine learning algorithms considered, namely random forest, stochastic gradient descent (SGD) regressor, K nearest neighbors regressor, extreme gradient boosting regressor and gradient boosting regressor, the random forest regressor had the best accuracy, with an R² value of 0.7753. “Age group” was the most important predictor among all the features.
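
For readers unfamiliar with the metric, R² (the coefficient of determination) is one minus the ratio of the residual sum of squares to the total sum of squares. The sketch below computes it in plain Python; the `r_squared` helper and the numbers are made up for illustration and are not the study's data.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    Undefined (division by zero) when every value in y_true is identical.
    """
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up charges and predictions, for illustration only.
charges = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]
score = r_squared(charges, predicted)
```

A value of 1 means the predictions match the targets exactly, while 0 means the model does no better than always predicting the mean.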

Practical implications

This model can be helpful for patients who want to compare the cost at different hospitals and can plan their finances accordingly in case of “elective” admission. Insurance companies can predict how much a patient with a particular medical condition might claim by getting admitted to the hospital.

Originality/value

Health care can be a costly affair if not planned properly. This research gives patients and insurance companies a better prediction of the total cost that they might incur.

Details

International Journal of Innovation Science, vol. 13 no. 1

Type: Research Article

ISSN: 1757-2223

Article

Publication date: 30 November 2021

Evolution of the IBEX-35 vs other international indices: determinants of market value according to XGBOOST and GLM models

Julián Martínez-Vargas, Pedro Carmona and Pol Torrelles

Downloads

131

Abstract

Purpose

The purpose of this paper is to study the influence of different quantitative (traditionally used) and qualitative variables, such as the possible negative effect in determined periods of certain socio-political factors on share price formation.

Design/methodology/approach

We first descriptively analyse the evolution of the Ibex-35 in recent years and compare it with other international benchmark indices. Next, two techniques are compared: a classic linear regression statistical model (GLM) and a machine learning method called Extreme Gradient Boosting (XGBoost).

Findings

XGBoost yields a very accurate market value prediction model that clearly outperforms the other, with a coefficient of determination close to 90%, calculated on validation sets.

Practical implications

According to our analysis, individual accounts are equally or more important than consolidated information in predicting the behaviour of share prices. This would justify Spain maintaining the obligation to present individual interim financial statements, which does not happen in other European Union countries because IAS 34 only stipulates consolidated interim financial statements.

Social implications

The descriptive analysis allows us to see how the Ibex-35 has moved away from international trends, especially in periods in which some relevant socio-political events occurred, such as the independence referendum in Catalonia, the double elections of 2019 or the early handling of the Covid-19 pandemic in 2020.

Originality/value

Compared to other variables, the XGBoost model assigns little importance to socio-political factors when it comes to share price formation; however, this model explains 89.33% of its variance.

Details

Academia Revista Latinoamericana de Administración, vol. 35 no. 1

Type: Research Article

ISSN: 1012-8255

Open Access

Article

Publication date: 25 January 2023

Board gender diversity and workplace diversity: a machine learning approach

Mikko Ranta and Mika Ylinen

Downloads

4915

Abstract

Purpose

This study aims to examine the association between board gender diversity (BGD) and workplace diversity and the relative importance of various board and firm characteristics in predicting diversity.

Design/methodology/approach

With a novel machine learning (ML) approach, this study models the association between three workplace diversity variables and BGD using a social media data set of approximately 250,000 employee reviews. Using the tools of explainable artificial intelligence, the authors interpret the results of the ML model.

Findings

The results show that BGD has a strong positive association with the gender equality and inclusiveness dimensions of corporate diversity culture. However, BGD is found to have a weak negative association with age diversity in a company. Furthermore, the authors find that workplace diversity is an important predictor of firm value, indicating a possible channel on how BGD affects firm performance.

Originality/value

The effects of BGD on workplace diversity below management levels are mainly omitted in the current corporate governance literature. Furthermore, existing research has not considered different dimensions of this diversity and has mainly focused on its gender aspects. In this study, the authors address this research problem and examine how BGD affects different dimensions of diversity at the overall company level. This study reveals important associations and identifies key variables that should be included as a part of theoretical causal models in future research.

Details

Corporate Governance: The International Journal of Business in Society, vol. 23 no. 5

Type: Research Article

ISSN: 1472-0701

Article

Publication date: 26 December 2023

Estimation of building project completion duration using a natural gradient boosting ensemble model and legal and institutional variables

Farshad Peiman, Mohammad Khalilzadeh, Nasser Shahsavari-Pour and Mehdi Ravanshadnia

Downloads

124

Abstract

Purpose

Earned value management (EVM)–based models for estimating project actual duration (AD) and cost at completion using various methods are continuously developed to improve the accuracy and actualization of predicted values. This study primarily aimed to examine natural gradient boosting (NGBoost-2020) with the classification and regression trees (CART) base model (base learner). To the best of the authors' knowledge, this concept has never been applied to the EVM AD forecasting problem. Consequently, the authors compared this method to the single K-nearest neighbor (KNN) method, the ensemble method of extreme gradient boosting (XGBoost-2016) with the CART base model and the optimal equation of EVM, the earned schedule (ES) equation with the performance factor equal to 1 (ES1). The paper also sought to determine the extent to which the World Bank's two legal factors affect countries and how the two legal causes of delay (related to institutional flaws) influence AD prediction models.

Design/methodology/approach

In this paper, data from 30 construction projects of various building types in Iran, Pakistan, India, Turkey, Malaysia and Nigeria (due to the high number of delayed projects and the detrimental effects of these delays in these countries) were used to develop three models. The target variable of the models was a dimensionless output, the ratio of estimated duration to completion (ETC(t)) to planned duration (PD). Furthermore, 426 tracking periods were used to build the three models, with 353 samples and 23 projects in the training set, 73 patterns (17% of the total) and six projects (21% of the total) in the testing set. Furthermore, 17 dimensionless input variables were used, including ten variables based on the main variables and performance indices of EVM and several other variables detailed in the study. The three models were subsequently created using Python and several GitHub-hosted codes.

Findings

For the testing set of the optimal model (NGBoost), the better percentage mean (better%) of the prediction error (based on projects with a lower error percentage) of the NGBoost compared to two KNN and ES1 single models, as well as the total mean absolute percentage error (MAPE) and mean lags (MeLa) (indicating model stability) were 100, 83.33, 5.62 and 3.17%, respectively. Notably, the total MAPE and MeLa for the NGBoost model testing set, which had ten EVM-based input variables, were 6.74 and 5.20%, respectively. The ensemble artificial intelligence (AI) models exhibited a much lower MAPE than ES1. Additionally, ES1 was less stable in prediction than NGBoost. The possibility of excessive and unusual MAPE and MeLa values occurred only in the two single models. However, on some data sets, ES1 outperformed AI models. NGBoost also outperformed other models, especially single models for most developing countries, and was more accurate than previously presented optimized models. In addition, sensitivity analysis was conducted on the NGBoost predicted outputs of 30 projects using the SHapley Additive exPlanations (SHAP) method. All variables demonstrated an effect on ETC(t)/PD. The results revealed that the most influential input variables in order of importance were actual time (AT) to PD, regulatory quality (RQ), earned duration (ED) to PD, schedule cost index (SCI), planned complete percentage, rule of law (RL), actual complete percentage (ACP) and ETC(t) of the ES optimal equation to PD. The probabilistic hybrid model was selected based on the outputs predicted by the NGBoost and XGBoost models and the MAPE values from three AI models. The 95% prediction interval of the NGBoost–XGBoost model revealed that 96.10 and 98.60% of the actual output values of the testing and training sets are within this interval, respectively.
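
Two of the quantities reported above, the mean absolute percentage error (MAPE) and the share of actual values falling inside a prediction interval, can be computed as in the following sketch. It uses the standard textbook definitions; the helper names and sample numbers are hypothetical, and the study's own implementation may differ.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    errors = [abs(a - p) / abs(a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)


def interval_coverage(actual, lower, upper):
    """Percentage of actual values lying inside [lower, upper] (bounds inclusive)."""
    inside = sum(1 for a, lo, hi in zip(actual, lower, upper) if lo <= a <= hi)
    return 100.0 * inside / len(actual)


# Made-up values, for illustration only.
error_pct = mape([100.0, 200.0], [110.0, 180.0])
coverage_pct = interval_coverage(
    [1.0, 2.0, 3.0, 4.0],  # actual outputs
    [0.0, 0.0, 0.0, 0.0],  # lower interval bounds
    [2.0, 2.0, 2.0, 5.0],  # upper interval bounds
)
```

For a well-calibrated 95% prediction interval, the coverage on held-out data should be close to 95%.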

Research limitations/implications

Due to the use of projects performed in different countries, it was not possible to distribute the questionnaire to the managers and stakeholders of 30 projects in six developing countries. Due to the low number of EVM-based projects in various references, it was unfeasible to utilize other types of projects. Future prospects include evaluating the accuracy and stability of NGBoost for timely and non-fluctuating projects (mostly in developed countries), considering a greater number of legal/institutional variables as input, using legal/institutional/internal/inflation inputs for complex projects with extremely high uncertainty (such as bridge and road construction) and integrating these inputs and NGBoost with new technologies (such as blockchain, radio frequency identification (RFID) systems, building information modeling (BIM) and Internet of things (IoT)).

Practical implications

The legal/intuitive recommendations made to governments are strict control of prices, adequate supervision, removal of additional rules, removal of unfair regulations, clarification of the future trend of a law change, strict monitoring of property rights, simplification of the processes for obtaining permits and elimination of unnecessary changes particularly in developing countries and at the onset of irregular projects with limited information and numerous uncertainties. Furthermore, the managers and stakeholders of this group of projects were informed of the significance of seven construction variables (institutional/legal external risks, internal factors and inflation) at an early stage, using time series (dynamic) models to predict AD, accurate calculation of progress percentage variables, the effectiveness of building type in non-residential projects, regular updating inflation during implementation, effectiveness of employer type in the early stage of public projects in addition to the late stage of private projects, and allocating reserve duration (buffer) in order to respond to institutional/legal risks.

Originality/value

Ensemble methods were optimized in 70% of references. To the authors' knowledge, NGBoost from the set of ensemble methods was not used to estimate construction project duration and delays. NGBoost is an effective method for considering uncertainties in irregular projects and is often implemented in developing countries. Furthermore, AD estimation models do fail to incorporate RQ and RL from the World Bank's worldwide governance indicators (WGI) as risk-based inputs. In addition, the various WGI, EVM and inflation variables are not combined with substantial degrees of delay institutional risks as inputs. Consequently, due to the existence of critical and complex risks in different countries, it is vital to consider legal and institutional factors. This is especially recommended if an in-depth, accurate and reality-based method like SHAP is used for analysis.

Details

Engineering, Construction and Architectural Management, vol. ahead-of-print no. ahead-of-print

Type: Research Article

ISSN: 0969-9988

Article

Publication date: 4 October 2019

Demand forecasting at retail stage for selected vegetables: a performance analysis

Rahul Priyadarshi, Akash Panigrahi, Srikanta Routroy and Girish Kant Garg

Downloads

1822

Abstract

Purpose

The purpose of this study is to select the appropriate forecasting model at the retail stage for selected vegetables on the basis of performance analysis.

Design/methodology/approach

Various forecasting models such as the Box–Jenkins-based auto-regressive integrated moving average model and machine learning-based algorithms such as long short-term memory (LSTM) networks, support vector regression (SVR), random forest regression, gradient boosting regression (GBR) and extreme GBR (XGBoost/XGBR) were proposed and applied (i.e. modeling, training, testing and predicting) at the retail stage for selected vegetables to forecast demand. The performance analysis (i.e. forecasting error analysis) was carried out to select the appropriate forecasting model at the retail stage for selected vegetables.

Findings

From the results obtained for a case environment, it was observed that the machine learning algorithms LSTM and SVR produced better results than the other demand forecasting models.

Research limitations/implications

The results obtained from the case environment cannot be generalized. However, the approach may be used to forecast different agricultural produce at the retail stage, capturing their demand environment.

Practical implications

The implementation of LSTM and SVR for the case situation at the retail stage will reduce the forecast error, daily retail inventory and fresh produce wastage and will increase the daily revenue.

Originality/value

The demand forecasting model selection for agriculture produce at the retail stage on the basis of performance analysis is a unique study where both traditional and non-traditional models were analyzed and compared.

Details

Journal of Modelling in Management, vol. 14 no. 4

Type: Research Article

ISSN: 1746-5664

Article

Publication date: 30 March 2023

A stacked ensemble learning method for customer lifetime value prediction

Nader Asadi Ejgerdi and Mehrdad Kazerooni

Downloads

200

Abstract

Purpose

With the growth of organizations and businesses, customer acquisition and retention processes have become more complex in the long run. That is why customer lifetime value (CLV) has become crucial to sales managers. Predicting CLV is a strategic weapon and competitive advantage for increasing profitability and identifying more profitable customers, and CLV is one of the essential key performance indicators (KPIs) used in customer segmentation. Thus, this paper proposes a stacked ensemble learning method, a combination of multiple machine learning methods, for CLV prediction.

Design/methodology/approach

To use customers’ behavioral features for predicting each customer’s CLV, the data of a textile sales company was used as a case study. The proposed stacked ensemble learning method is compared with several popular predictive methods, namely deep neural networks, bagging support vector regression, light gradient boosting machine, random forest and extreme gradient boosting.

Findings

Empirical results indicate that the stacked ensemble learning method outperformed the other methods in terms of normalized root mean squared error, normalized mean absolute error and coefficient of determination, at 0.248, 0.364 and 0.848, respectively. In addition, the prediction capability of the proposed method improved significantly after optimizing its hyperparameters.

Originality/value

This paper proposes a stacked ensemble learning method as a new method for accurate CLV prediction. The results and comparisons support the robustness and efficiency of the proposed method for CLV prediction.

Details

Kybernetes, vol. ahead-of-print no. ahead-of-print

Type: Research Article

ISSN: 0368-492X

Article

Publication date: 17 July 2023

Quantifying the drivers of residential housing demand – an interpretable machine learning approach

Marcelo Cajias and Joseph-Alexander Zeitler

Downloads

133

Abstract

Purpose

The paper employs a unique online user-generated housing search dataset and introduces a novel measure for housing demand, namely “contacts per listing” as explained by hedonic, geographic and socioeconomic variables.

Design/methodology/approach

The authors explore housing demand by employing an extensive Internet search dataset from a German housing market platform. The authors apply state-of-the-art artificial intelligence, the eXtreme Gradient Boosting, to quantify factors that lead an apartment to be in demand.

Findings

The authors compare the results to alternative parametric models and find evidence of the superiority of the nonparametric model. The authors use eXplainable artificial intelligence (XAI) techniques to show economic meanings and inferences of the results. The results suggest that hedonic, socioeconomic and spatial aspects influence search intensity. The authors further find differences in temporal dynamics and geographical variations.

Originality/value

To the best of the authors’ knowledge, it is the first study of its kind. The statistical model of housing search draws on insights from decision theory, AI and qualitative studies on housing search. The econometric approach employed is new as it considers standard regression models and an eXtreme Gradient Boosting (XGB or XGBoost) approach followed by a model-agnostic interpretation of the underlying effects.

Details

Journal of European Real Estate Research, vol. 16 no. 2

Type: Research Article

ISSN: 1753-9269

Article

Publication date: 26 September 2022

Comparison of machine learning algorithms for evaluating building energy efficiency using big data analytics

Christian Nnaemeka Egwim, Hafiz Alaka, Oluwapelumi Oluwaseun Egunjobi, Alvaro Gomes and Iosif Mporas

Downloads

249

Abstract

Purpose

This study aims to compare and evaluate the application of commonly used machine learning (ML) algorithms used to develop models for assessing energy efficiency of buildings.

Design/methodology/approach

This study first combined building energy efficiency ratings from several data sources and used them to create predictive models using a variety of ML methods. Second, to test the hypothesis of ensemble techniques, this study designed a hybrid stacking ensemble approach based on the best-performing bagging and boosting ensemble methods generated from its predictive analytics.

Findings

Based on performance evaluation metric scores, the extra trees model was shown to be the best predictive model. More importantly, this study demonstrated that the cumulative result of ensemble ML algorithms is usually better in terms of predictive accuracy than a single method. Finally, it was found that stacking is superior to bagging and boosting as an ensemble approach for analysing building energy efficiency.

Research limitations/implications

While the proposed contemporary method of analysis is assumed to be applicable in assessing energy efficiency of buildings within the sector, the unique data transformation used in this study may not, as is typical of any data-driven model, be transferable to data from regions other than the UK.

Practical implications

This study aids in the initial selection of appropriate and high-performing ML algorithms for future analysis. This study also assists building managers, residents, government agencies and other stakeholders in better understanding contributing factors and making better decisions about building energy performance. Furthermore, this study will assist the general public in proactively identifying buildings with high energy demands, potentially lowering energy costs by promoting avoidance behaviour and assisting government agencies in making informed decisions about energy tariffs when this novel model is integrated into an energy monitoring system.

Originality/value

This study addresses the lack of a rationale for selecting appropriate ML algorithms for assessing building energy efficiency. More importantly, this study demonstrated that the cumulative result of ensemble ML algorithms is usually better in terms of predictive accuracy than a single method.

Details

Journal of Engineering, Design and Technology, vol. ahead-of-print no. ahead-of-print

Type: Research Article

ISSN: 1726-0531

Article

Publication date: 5 December 2023

Investor sentiment and the NFT hype index: to buy or not to buy?

Valeriia Baklanova, Aleksei Kurkin and Tamara Teplova

Abstract

Purpose

The primary objective of this research is to provide a precise interpretation of the constructed machine learning model and produce definitive summaries that can evaluate the influence of investor sentiment on the overall sales of non-fungible token (NFT) assets. To achieve this objective, the NFT hype index was constructed, and several XAI approaches were employed to interpret black-box models and assess the magnitude and direction of the impact of the features used.

Design/methodology/approach

The research involved the construction of a sentiment index termed the NFT hype index, which aims to measure the influence of market actors within the NFT industry. This index was created by analyzing written content posted by 62 high-profile individuals and opinion leaders on the social media platform Twitter. The authors collected posts from these Twitter accounts, which were then classified by tonality with the help of the natural language processing model VADER. Machine learning methods and XAI approaches (feature importance, permutation importance and SHAP) were then applied to explain the results.

Findings

The built index was subjected to rigorous analysis using the gradient boosting regressor model and explainable AI techniques, which confirmed its significant explanatory power. Remarkably, the NFT hype index exhibited a higher degree of predictive accuracy compared to the well-known sentiment indices.

Practical implications

The NFT hype index, constructed from Twitter textual data, functions as an innovative, sentiment-based indicator for investment decision-making in the NFT market. It offers investors unique insights into the market sentiment that can be used alongside conventional financial analysis techniques to enhance risk management, portfolio optimization and overall investment outcomes within the rapidly evolving NFT ecosystem. Thus, the index plays a crucial role in facilitating well-informed, data-driven investment decisions and ensuring a competitive edge in the digital assets market.

Originality/value

The authors developed a novel index of investor interest for NFT assets (NFT hype index) based on text messages posted by market influencers and compared it to conventional sentiment indices in terms of their explanatory power. With the application of explainable AI, it was shown that sentiment indices may perform as significant predictors for NFT sales and that the NFT hype index works best among all sentiment indices considered.

Details

China Finance Review International, vol. ahead-of-print no. ahead-of-print

Type: Research Article

ISSN: 2044-1398
