Search results

1 – 10 of 212
Article
Publication date: 30 December 2020

Suraj Kulkarni, Suhas Suresh Ambekar and Manoj Hudnurkar

Abstract

Purpose

Increasing health-care costs are a major concern, especially in the USA. The purpose of this paper is to predict the hospital charges of a patient before admission. This will help patients who are admitted electively to plan their finances. It can also be used as a tool by payers (insurance companies) to better forecast the amount that a patient might claim.

Design/methodology/approach

This research uses secondary data collected from New York state’s patient discharges of 2017. A stratified sampling technique is used to sample the data from the population, and feature engineering is done on the categorical variables. Different regression techniques are used to predict the target value “total charges.”
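As an illustration of the stratified sampling step, here is a minimal sketch in plain Python. The stratification column (`age_group`), the sampling fraction and the toy records are assumptions made for the example; the abstract does not say which variable was used for stratification or how large the sample was.

```python
import random
from collections import defaultdict


def stratified_sample(records, key, fraction, seed=0):
    """Draw roughly `fraction` of the records from every stratum.

    `key` maps a record to its stratum label. Each stratum contributes
    at least one record so that small groups are not lost.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for record in records:
        strata[key(record)].append(record)

    sample = []
    for group in strata.values():
        k = max(1, round(len(group) * fraction))
        sample.extend(rng.sample(group, k))
    return sample


# Hypothetical population: 20 records in one age group, 80 in another.
records = [
    {"id": i, "age_group": "70 or older" if i < 20 else "30 to 49"}
    for i in range(100)
]
sample = stratified_sample(records, key=lambda r: r["age_group"], fraction=0.1)
```

The sample keeps the 20:80 proportion of the population, which is the property that distinguishes stratified sampling from a simple random draw.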

Findings

Total cost varies linearly with the length of stay. Among all the machine learning algorithms considered, namely, random forest, stochastic gradient descent (SGD) regressor, K nearest neighbors regressor, extreme gradient boosting regressor and gradient boosting regressor, the random forest regressor had the best accuracy, with an R² value of 0.7753. “Age group” was the most important predictor among all the features.
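For readers unfamiliar with the metric, the R² value reported above is the coefficient of determination, 1 − SS_res / SS_tot. The sketch below computes it in plain Python; the charge figures are invented for illustration and are not taken from the paper.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_actual = sum(actual) / len(actual)
    # Residual sum of squares: error left after the model's predictions.
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all actual values are equal")
    return 1.0 - ss_res / ss_tot


# Hypothetical total charges (actual vs. model output), in dollars.
actual_charges = [12000.0, 8500.0, 23000.0, 15500.0]
predicted_charges = [11000.0, 9000.0, 21000.0, 16000.0]
score = r_squared(actual_charges, predicted_charges)
```

A score of 1.0 means perfect predictions, and 0.0 means the model does no better than predicting the mean charge for every patient.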

Practical implications

This model can be helpful for patients who want to compare the cost at different hospitals and plan their finances accordingly in the case of “elective” admission. Insurance companies can predict how much a patient with a particular medical condition might claim on being admitted to the hospital.

Originality/value

Health care can be a costly affair if not planned properly. This research gives patients and insurance companies a better prediction of the total cost that they might incur.

Details

International Journal of Innovation Science, vol. 13 no. 1
Type: Research Article
ISSN: 1757-2223

Article
Publication date: 12 November 2019

Sasanka Choudhury, Dhirendra Nath Thatoi, Jhalak Hota and Mohan D. Rao

Abstract

Purpose

To avoid structural defects, early crack detection is one of the important aspects of recent research. The purpose of this paper is to detect a crack before failure by identifying its position and severity.

Design/methodology/approach

This paper uses two tree-based regressors, namely, the decision tree (DT) regressor and the random forest (RF) regressor, for their ability to accommodate different types of parameters and to generate simple rules. These rules make it possible to predict crack parameters, such as location and depth, with better accuracy before the beam fails.

Findings

The crack parameters can be predicted if the relationship between vibration and crack parameters can be established. This relationship yields the beam natural frequencies, which are used as the input values for the regression techniques. It is observed that the RF regressor predicts the parameters with better accuracy than the DT regressor.

Originality/value

The idea is to use the developed regression techniques to identify the crack parameters, which the authors find more effective than other developed methods because the task is one of prediction, for which regression is the natural tool. The authors have used the DT regressor and the RF regressor to achieve the target. In this paper, care has been given to the generalization of the model, so that its adaptability can be ensured. The robustness of the proposed methods has been verified by numerical and experimental analysis.

Details

International Journal of Structural Integrity, vol. 11 no. 6
Type: Research Article
ISSN: 1757-9864

Article
Publication date: 26 May 2022

Ismail Abiodun Sulaimon, Hafiz Alaka, Razak Olu-Ajayi, Mubashir Ahmad, Saheed Ajayi and Abdul Hye

Abstract

Purpose

Road traffic emissions are generally believed to contribute immensely to air pollution, but the effect of road traffic data sets on air quality (AQ) predictions has not been fully investigated. This paper aims to investigate the effects that traffic data sets have on the performance of machine learning (ML) predictive models in AQ prediction.

Design/methodology/approach

To achieve this, the authors have set up an experiment with the control data set having only the AQ data set and meteorological (Met) data set, while the experimental data set is made up of the AQ data set, Met data set and traffic data set. Several ML models (such as extra trees regressor, eXtreme gradient boosting regressor, random forest regressor, K-neighbors regressor and two others) were trained, tested and compared on these individual combinations of data sets to predict the volume of PM2.5, PM10, NO2 and O3 in the atmosphere at various times of the day.

Findings

The results showed that the ML algorithms react differently to the traffic data set, although it generally improved the performance of all the ML algorithms considered in this study by at least 20% and reduced error by at least 18.97%.

Research limitations/implications

This research is limited in terms of the study area, and the result cannot be generalized outside of the UK as some of the inherent conditions may not be similar elsewhere. Additionally, only the ML algorithms commonly used in literature are considered in this research, therefore, leaving out a few other ML algorithms.

Practical implications

This study reinforces the belief that the traffic data set has a significant effect on improving the performance of air pollution ML prediction models. Hence, there is an indication that ML algorithms behave differently when trained with a form of traffic data set in the development of an AQ prediction model. This implies that developers and researchers in AQ prediction need to identify the ML algorithms that behave in their best interest before implementation.

Originality/value

The result of this study will enable researchers to focus more on algorithms of benefit when using traffic data sets in AQ prediction.

Details

Journal of Engineering, Design and Technology, vol. 22 no. 3
Type: Research Article
ISSN: 1726-0531

Article
Publication date: 16 February 2024

Hossam Mohamed Toma, Ahmed H. Abdeen and Ahmed Ibrahim

Abstract

Purpose

The equipment resale price plays an important role in calculating the optimum time for equipment replacement. Some of the existing models that predict the equipment resale price do not take many of the factors that influence the resale price into account. Other models consider more of these factors, but they still have low accuracy because of the modeling techniques that were used. An easy tool is required to help forecast the resale price and support efficient decisions for equipment replacement. This research presents a machine learning (ML) computer model that helps forecast the equipment resale price accurately.

Design/methodology/approach

A method for measuring the factors that influence the equipment resale price was determined. The values of those factors were measured for 1,700 pieces of equipment, together with their corresponding resale prices. The data were used to develop an ML model that covers three types of equipment (loaders, excavators and bulldozers). The methodology used to develop the model applied three ML algorithms, namely the random forest regressor, extra trees regressor and decision tree regressor, to find an accurate model for the equipment resale price. The three algorithms were verified and tested with data on 340 pieces of equipment.

Findings

Using a large amount of data to train the ML model resulted in a high-accuracy predictive model. The extra trees regressor was the most accurate of the three algorithms used to develop the ML model. The accuracy of the model is 98%. A computer interface was designed to make the model easier to use.

Originality/value

The proposed model is accurate and makes it easy to predict the equipment resale price. The predicted resale price can be used to calculate equipment elements that are essential for developing a dependable equipment replacement plan. The proposed model was developed based on the factors that most influence the equipment resale price, and those factors were evaluated using reliable methods. The model was developed using ML, which has proved its accuracy in modeling. The accuracy of the model, which is 98%, enhances its value.

Details

Engineering, Construction and Architectural Management, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 0969-9988

Article
Publication date: 28 October 2022

Elena Fedorova, Pavel Chertsov and Anna Kuzmina

Abstract

Purpose

The purpose of this study is to assess how the information disclosed in prospectuses impacted the initial public offering (IPO) underpricing at a time of high government interference amid the ongoing pandemic.

Design/methodology/approach

The design of this study has several tracks, namely, a macro-level track, which is represented by the government measures to halt the pandemic; a micro-level track, which is followed by textual analysis of IPO prospectuses; and, finally, a machine learning track, in which the authors use state-of-the-art tools to improve their linear regression model.

Findings

The authors found that strict government anti-COVID-19 measures indeed contribute to the reduction of the IPO underpricing. Interestingly, the mere fact of such measures taking place is enough to affect financial markets, regardless of the resulting efficiency of such measures. At the micro-level, the authors show that prospectus sentiments and their significance differ across prospectus sections. Using linear regression and machine learning models, the authors find robust evidence that such sections as “Risk factors”, “Prospectus summary”, “Financial Information” and “Business” play a crucial role in explaining the underpricing. Their effects differ: the more negative the “Risk factors” and “Financial Information” sentiment, the higher the resulting underpricing. Conversely, the more positive the “Prospectus summary” and “Business” sentiments, the lower the resulting underpricing. In addition, the authors used machine learning methods. The study sample, consisting of more than 580 IPO prospectuses, required modern and powerful machine learning tools such as Isolation Forest for pre-processing and Random Forest Regressor and Light Gradient Boosting Model for modelling, which enabled the authors to obtain better results than the classic linear regression model.

Originality/value

At the micro level, this study is not confined to 2020, but also embraces 2021, the year of the record number of IPOs held. Moreover, in this paper, these were prospectuses that served as a source of management sentiment. In addition, the authors used a tailor-made government stringency index. At the micro level, basing the study on behavioural finance hypotheses, the authors conducted both separate and holistic analysis of prospectuses to assess investors’ reaction to different aspects of IPO companies as well as to the characteristics of the IPOs themselves. Lastly, the authors introduced a few innovations to the research methodology. Textual analysis was conducted on a corpus of prospectuses included in a study sample. However, the authors did not use pre-trained dictionaries, but instead opted for FLAIR, a modern open-source framework for natural language processing.

Details

Journal of Financial Reporting and Accounting, vol. 21 no. 4
Type: Research Article
ISSN: 1985-2517

Article
Publication date: 27 August 2024

Ali Albada, Eimad Eldin Abusham, Chui Zi Ong and Khalid Al Qatiti

Abstract

Purpose

Empirical examinations of initial public offering (IPO) initial returns often rely heavily on linear regression models. However, these models can prove inefficient owing to their susceptibility to outliers, a common occurrence in IPO data. This study introduces a machine learning method, known as random forest, to address issues that linear regression may struggle to resolve.

Design/methodology/approach

The study’s sample comprises 352 fixed-priced IPOs from the year 2004 until 2021. A unique aspect of this research is its application of the random forest method. The accuracy of random forest in comparison to other methods is evaluated. The findings indicate that the random forest model significantly outperforms other methods in all of the evaluated aspects.

Findings

The variable importance measure indicates that investors’ demand, divergence of opinion among investors and offer price are the most crucial predictors of IPO initial returns. These determinants hold particular significance due to the widespread use of the fixed-price method in Malaysia, as this method amplifies the information asymmetry in the IPO market.

Originality/value

To the best of the authors’ knowledge, this study is among the pioneering works in Malaysian literature to apply the random forest method to address the constraints of conventional linear regression models. This is achieved by considering a more extensive array of factors and acknowledging the influence of outliers. Additionally, this study adds value to Malaysian literature by ranking and identifying the ex-ante information that best signals the issuing firm’s quality. This contribution facilitates prospective investors’ decision-making processes and provides issuing firms with effective means to communicate their value and quality to the IPO market.

Details

Managerial Finance, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 0307-4358

Article
Publication date: 1 March 2023

Farouq Sammour, Heba Alkailani, Ghaleb J. Sweis, Rateb J. Sweis, Wasan Maaitah and Abdulla Alashkar

Abstract

Purpose

Demand forecasts are a key component of planning efforts and are crucial for managing core operations. This study aims to evaluate the use of several machine learning (ML) algorithms to forecast demand for residential construction in Jordan.

Design/methodology/approach

The variables and ML algorithms related to the demand for residential construction were identified and selected using a literature review. Feature selection was done using stepwise backward elimination. The accuracy of the developed algorithms was demonstrated by comparing the ML predictions with real residual values, and the algorithms were compared based on the coefficient of determination.
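The backward elimination step can be sketched generically as follows. This is not the authors' procedure: the stopping rule (stop when removing any feature lowers the score), the toy scoring function and the indicator names are all assumptions made for illustration. In practice `score` would fit a model on the given subset and return a validation metric such as the coefficient of determination.

```python
def backward_elimination(features, score, min_features=1):
    """Greedy stepwise backward elimination.

    Starts from all features and repeatedly drops the one whose removal
    gives the best score, stopping when every removal makes it worse.
    `score` takes a list of feature names and returns a number where
    higher is better.
    """
    selected = list(features)
    best = score(selected)
    while len(selected) > min_features:
        # Score every subset obtained by dropping exactly one feature.
        candidates = [
            (score([f for f in selected if f != drop]), drop)
            for drop in selected
        ]
        candidate_score, drop = max(candidates)
        if candidate_score < best:
            break  # every removal hurts, so keep the current subset
        selected.remove(drop)
        best = candidate_score
    return selected, best


# Hypothetical scoring: two informative indicators plus a small
# penalty per feature, standing in for a real fit-and-validate step.
USEFUL = {"gdp": 0.5, "interest_rate": 0.3}


def toy_score(subset):
    return sum(USEFUL.get(f, 0.0) for f in subset) - 0.01 * len(subset)


selected, best_score = backward_elimination(
    ["gdp", "interest_rate", "noise_a", "noise_b"], toy_score
)
```

With this toy score the two uninformative features are dropped first, and the loop stops once removing either remaining indicator would lower the score.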

Findings

Nine economic indicators were selected to develop the demand models. Elastic-Net showed the highest accuracy (0.838), followed by the artificial neural network (0.727), Eureqa (0.715) and Extra Trees (0.703). According to the forecast of the best-performing model, Jordan’s 2023 first-quarter demand for residential construction is anticipated to rise by 11.5% from the same quarter of 2022.

Originality/value

The results of this study extend the existing body of knowledge through the identification of the most influential variables in the Jordanian residential construction industry. In addition, the models developed will enable users in the fields of construction engineering to make reliable demand forecasts while also assisting in effective financial decision-making.

Details

Construction Innovation, vol. 24 no. 5
Type: Research Article
ISSN: 1471-4175

Article
Publication date: 12 July 2022

Maitri Mistry, Rahul Gupta, Swati Jain, Jaiprakash V. Verma and Daehan Won

Abstract

Purpose

The purpose of this paper is to develop a machine learning model that predicts the component self-alignment offsets along the length and width of the component and in the angular direction. To find the best-performing model, various algorithms such as random forest regressor (RFR), support vector regressor (SVR), neural networks (NN), gradient boost (GB) and K-nearest neighbors (KNN) were applied and analyzed. The models were implemented using input features that can be categorized as solder paste volume, paste-pad offset, component-pad offset, angular offset and orientation.

Design/methodology/approach

Surface-mount technology (SMT) is the technology behind the production of printed circuit boards, which is used in several types of commercial equipment such as communication devices, home appliances, medical imaging systems and sensors. In SMT, components undergo movement known as self-alignment during the reflow process. Although self-alignment is used to decrease the misalignment, it may not work for smaller size chipsets. If the solder paste depositions are not well-aligned, the self-alignment might deteriorate the final alignment of the component.

Findings

The models were trained on their targets. Results obtained by each method for each target variable were compared to find the algorithm that gives the best performance. It was found that RFR gives the best performance in predicting offsets along the length and width of the component, whereas SVR does so in predicting offsets in the angular direction. The scope of this study can be extended by developing the model further to predict defects that can occur during the reflow process. It could also be developed to optimize the placement process in SMT.

Originality/value

This paper proposes a predictive model that predicts the component self-alignment offsets along the length and width of the component and in the angular direction. To find the best-performing model, various algorithms such as RFR, SVR, NN, GB and KNN were applied and analyzed for predicting the component self-alignment offsets. This helps to achieve the research objective of identifying the best machine learning model for the prediction of component self-alignment offsets. This model can be used to optimize the mounting process in SMT, reducing the occurrence of defects and making the process more efficient.

Details

Soldering & Surface Mount Technology, vol. 35 no. 2
Type: Research Article
ISSN: 0954-0911

Article
Publication date: 5 December 2023

Valeriia Baklanova, Aleksei Kurkin and Tamara Teplova

Abstract

Purpose

The primary objective of this research is to provide a precise interpretation of the constructed machine learning model and produce definitive summaries that can evaluate the influence of investor sentiment on the overall sales of non-fungible token (NFT) assets. To achieve this objective, the NFT hype index was constructed, and several explainable AI (XAI) approaches were employed to interpret black-box models and assess the magnitude and direction of the impact of the features used.

Design/methodology/approach

The research involved the construction of a sentiment index termed the NFT hype index, which aims to measure the influence of market actors within the NFT industry. This index was created by analyzing written content posted by 62 high-profile individuals and opinion leaders on the social media platform Twitter. The authors collected posts from these Twitter accounts, which were then classified by tonality with the help of the natural language processing model VADER. Machine learning methods and XAI approaches (feature importance, permutation importance and SHAP) were then applied to explain the obtained results.
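Of the XAI approaches listed, permutation importance is the simplest to show in code: shuffle one feature column, re-score the model and record how much the error grows. The sketch below is a plain-Python illustration under assumed inputs (a toy model and toy data), not the authors' implementation or the API of any particular library.

```python
import random


def mean_squared_error(actual, predicted):
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def permutation_importance(predict, rows, targets, n_repeats=20, seed=0):
    """Average increase in MSE when each feature column is shuffled.

    `predict` maps one row (a list of feature values) to a prediction.
    A larger increase means the model relies more on that feature.
    """
    rng = random.Random(seed)
    baseline = mean_squared_error(targets, [predict(r) for r in rows])
    importances = []
    for j in range(len(rows[0])):
        increases = []
        for _ in range(n_repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)
            # Rebuild the rows with only column j permuted.
            shuffled = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
            error = mean_squared_error(targets, [predict(r) for r in shuffled])
            increases.append(error - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


# Toy setup: the model uses feature 0 only, so feature 1 should score 0.
rows = [[float(i), float((i * 7) % 5)] for i in range(10)]
targets = [3.0 * r[0] for r in rows]
importances = permutation_importance(lambda r: 3.0 * r[0], rows, targets)
```

Because the toy model ignores the second feature, shuffling it leaves the error unchanged, while shuffling the first feature increases it.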

Findings

The built index was subjected to rigorous analysis using the gradient boosting regressor model and explainable AI techniques, which confirmed its significant explanatory power. Remarkably, the NFT hype index exhibited a higher degree of predictive accuracy compared to the well-known sentiment indices.

Practical implications

The NFT hype index, constructed from Twitter textual data, functions as an innovative, sentiment-based indicator for investment decision-making in the NFT market. It offers investors unique insights into the market sentiment that can be used alongside conventional financial analysis techniques to enhance risk management, portfolio optimization and overall investment outcomes within the rapidly evolving NFT ecosystem. Thus, the index plays a crucial role in facilitating well-informed, data-driven investment decisions and ensuring a competitive edge in the digital assets market.

Originality/value

The authors developed a novel index of investor interest for NFT assets (NFT hype index) based on text messages posted by market influencers and compared it to conventional sentiment indices in terms of their explanatory power. With the application of explainable AI, it was shown that sentiment indices may perform as significant predictors for NFT sales and that the NFT hype index works best among all sentiment indices considered.

Details

China Finance Review International, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 2044-1398

Article
Publication date: 24 December 2021

Neetika Jain and Sangeeta Mittal

Abstract

Purpose

A cost-effective way to achieve fuel economy is to reinforce positive driving behaviour. Driving behaviour can be controlled if drivers can be alerted for behaviour that results in poor fuel economy. Fuel consumption must be tracked and monitored instantaneously rather than tracking average fuel economy for the entire trip duration. A single-step application of machine learning (ML) is not sufficient to model prediction of instantaneous fuel consumption and detection of anomalous fuel economy. The study designs an ML pipeline to track and monitor instantaneous fuel economy and detect anomalies.

Design/methodology/approach

This research iteratively applies different variations of a two-step ML pipeline to the driving dataset for hatchback cars. The first step addresses the problem of accurate measurement and prediction of fuel economy using time series driving data, and the second step detects abnormal fuel economy in relation to contextual information. The long short-term memory autoencoder learns and uses the most salient features of the time series data to build a regression model. Contextual anomalies are detected using two approaches: a kernel quantile estimator and a one-class support vector machine. The kernel quantile estimator sets a dynamic threshold for detecting anomalous behaviour; any error beyond the threshold is classified as an anomaly. The one-class support vector machine learns the training error pattern and applies the model to test data for anomaly detection. The two-step ML pipeline is further modified by replacing the long short-term memory autoencoder with a gated recurrent network autoencoder, and the performance of both models is compared. Speed recommendations and feedback are issued to the driver based on detected anomalies to control aggressive behaviour.
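The thresholding idea in the second step can be illustrated with a simplified stand-in. The sketch below uses a plain empirical quantile with linear interpolation, not the kernel quantile estimator named in the abstract, and the error values and the 0.95 quantile level are assumptions chosen for the example.

```python
def empirical_quantile(values, q):
    """Quantile of `values` at level q in [0, 1], linearly interpolated."""
    ordered = sorted(values)
    position = q * (len(ordered) - 1)
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])


def flag_anomalies(errors, training_errors, q=0.95):
    """Flag each prediction error that exceeds the q-quantile of the
    errors observed on training data."""
    threshold = empirical_quantile(training_errors, q)
    return [error > threshold for error in errors]


# Hypothetical absolute prediction errors for fuel consumption.
training_errors = [1.0, 2.0, 3.0, 4.0, 5.0]
flags = flag_anomalies([0.2, 9.0], training_errors, q=0.95)
```

Recomputing the threshold on a sliding window of recent errors, rather than on a fixed training set, would be one way to make it dynamic in the sense the abstract describes.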

Findings

A composite long short-term memory autoencoder was compared with a gated recurrent unit autoencoder. Both models achieve prediction accuracy within a range of 98%–100% in the first step. Recall and accuracy metrics for anomaly detection using the kernel quantile estimator remain within 98%–100%, whereas the one-class support vector machine approach performs within the range of 99.3%–100%.

Research limitations/implications

The proposed approach does not consider socio-demographic or physiological information of drivers due to privacy concerns. However, it can be extended to correlate the driver's physiological state, such as fatigue, sleep and stress, with driving behaviour and fuel economy. The anomaly detection approach here is limited to providing feedback to the driver, but it can be extended to give contextual feedback to the steering controller or throttle controller. In the future, a controller-based system can be associated with an anomaly detection approach to control the acceleration and braking action of the driver.

Practical implications

The suggested approach is helpful in monitoring and reinforcing fuel-economical driving behaviour among fleet drivers as per different environmental contexts. It can also be used as a training tool for improving driving efficiency for new drivers. It keeps drivers engaged positively by issuing a relevant warning for significant contextual anomalies and avoids issuing a warning for minor operational errors.

Originality/value

This paper contributes to the existing literature by providing an ML pipeline approach to track and monitor instantaneous fuel economy rather than relying on average fuel economy values. The approach is further extended to detect contextual driving behaviour anomalies and optimises fuel economy. The main contributions for this approach are as follows: (1) a prediction model is applied to fine-grained time series driving data to predict instantaneous fuel consumption. (2) Anomalous fuel economy is detected by comparing prediction error against a threshold and analysing error patterns based on contextual information.

Details

International Journal of Intelligent Computing and Cybernetics, vol. 15 no. 4
Type: Research Article
ISSN: 1756-378X
