Search results
1 – 10 of 486
Amirhessam Tahmassebi, Mehrtash Motamedi, Amir H. Alavi and Amir H. Gandomi
Abstract
Purpose
Engineering design and operational decisions depend largely on deep understanding of applications that requires assumptions for simplification of the problems in order to find proper solutions. Cutting-edge machine learning algorithms can be used as one of the emerging tools to simplify this process. In this paper, we propose a novel scalable and interpretable machine learning framework to automate this process and fill the current gap.
Design/methodology/approach
The essential principles of the proposed pipeline are (1) scalability, (2) interpretability and (3) robust probabilistic performance across engineering problems. The lack of interpretability of complex machine learning models prevents their use in various problems, including engineering computation assessments. Many consumers of machine learning models would not trust the results if they cannot understand the method. Thus, the SHapley Additive exPlanations (SHAP) approach is employed to interpret the developed machine learning models.
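The additive-explanation idea behind SHAP can be illustrated with an exact Shapley-value computation on a toy model (a minimal pure-Python sketch, not the authors' pipeline; the linear "model", its weights and the baseline below are hypothetical):

```python
from itertools import combinations
from math import factorial

def shapley_values(f, x, baseline):
    """Exact Shapley values of each feature for model f at point x.

    Features outside a coalition are set to their baseline value.
    The values sum to f(x) - f(baseline), the additivity property
    that SHAP is built on."""
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            for S in combinations(others, size):
                # Shapley kernel weight for a coalition of this size.
                w = factorial(size) * factorial(n - size - 1) / factorial(n)
                with_i = [x[j] if (j in S or j == i) else baseline[j] for j in range(n)]
                without_i = [x[j] if j in S else baseline[j] for j in range(n)]
                phi[i] += w * (f(with_i) - f(without_i))
    return phi

# Hypothetical linear "damage score" with three input features.
weights = [2.0, -1.0, 0.5]
f = lambda v: sum(wi * vi for wi, vi in zip(weights, v))
x, base = [1.0, 3.0, 4.0], [0.0, 1.0, 2.0]
phi = shapley_values(f, x, base)
```

For a linear model the Shapley value of feature i reduces to w_i·(x_i − baseline_i); SHAP libraries approximate the same quantity efficiently for complex models, where this exhaustive enumeration would be intractable.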
Findings
The proposed framework can be applied to a variety of engineering problems, including seismic damage assessment of structures. The performance of the proposed framework is investigated using two case studies of failure identification in reinforced concrete (RC) columns and shear walls. In addition, the reproducibility, reliability and generalizability of the results were validated, and the results of the framework were compared to benchmark studies. The results of the proposed framework outperformed the benchmark results with high statistical significance.
Originality/value
Although the current study reveals that the geometric input features and reinforcement indices are the most important variables in failure mode detection, a better model could be achieved by employing more robust strategies to establish a proper database and decrease the errors in identifying some of the failure modes.
Vinay Singh, Iuliia Konovalova and Arpan Kumar Kar
Abstract
Purpose
Explainable artificial intelligence (XAI) has importance in several industrial applications. The study aims to provide a comparison of two important methods used for explainable AI algorithms.
Design/methodology/approach
In this study, multiple criteria have been used to compare the explainable Ranked Area Integrals (xRAI) and integrated gradient (IG) methods for the explainability of AI algorithms, based on a multimethod phase-wise analysis research design.
Findings
The theoretical part includes a comparison of the frameworks of the two methods. From a practical point of view, the methods have been compared across five dimensions: functional, operational, usability, safety and validation.
Research limitations/implications
A comparison has been made by combining criteria from theoretical and practical points of view, which demonstrates tradeoffs in terms of choices for the user.
Originality/value
Our results show that the xRAI method performs better from a theoretical point of view. However, the IG method shows a good result with both model accuracy and prediction quality.
Abstract
Purpose
Research into the interpretability and explainability of data analytics and artificial intelligence (AI) systems is on the rise. However, most recent studies either solely promote the benefits of explainability or criticize it due to its counterproductive effects. This study addresses this polarized space and aims to identify opposing effects of the explainability of AI and the tensions between them and propose how to manage this tension to optimize AI system performance and trustworthiness.
Design/methodology/approach
The author systematically reviews the literature and synthesizes it using a contingency theory lens to develop a framework for managing the opposing effects of AI explainability.
Findings
The author finds five opposing effects of explainability: comprehensibility, conduct, confidentiality, completeness and confidence in AI (5Cs). The author also proposes six perspectives on managing the tensions between the 5Cs: pragmatism in explanation, contextualization of the explanation, cohabitation of human agency and AI agency, metrics and standardization, regulatory and ethical principles, and other emerging solutions (i.e. AI enveloping, blockchain and AI fuzzy systems).
Research limitations/implications
As in other systematic literature review studies, the results are limited by the content of the selected papers.
Practical implications
The findings show how AI owners and developers can manage tensions between profitability, prediction accuracy and system performance via visibility, accountability and maintaining the “social goodness” of AI. The results guide practitioners in developing metrics and standards for AI explainability, with the context of AI operation as the focus.
Originality/value
This study addresses polarized beliefs amongst scholars and practitioners about the benefits of AI explainability versus its counterproductive effects. It poses that there is no single best way to maximize AI explainability. Instead, the co-existence of enabling and constraining effects must be managed.
Indranil Ghosh, Rabin K. Jana and Mohammad Zoynul Abedin
Abstract
Purpose
The prediction of Airbnb listing prices predominantly uses a set of amenity-driven features. Choosing an appropriate set of features from thousands of available amenity-driven features makes the prediction task difficult. This paper aims to propose a scalable, robust framework to predict listing prices of Airbnb units without using amenity-driven features.
Design/methodology/approach
The authors propose an artificial intelligence (AI)-based framework to predict Airbnb listing prices. The authors consider 75,000 Airbnb listings from five US cities with more than 1.9 million observations. The proposed framework integrates (i) feature screening, (ii) stacking that combines gradient boosting, bagging and random forest, (iii) particle swarm optimization and (iv) explainable AI to accomplish the research objective.
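As a rough sketch of how particle swarm optimization can tune a stacking stage like the one described above (illustrative only; a real pipeline would optimize many hyperparameters, and the names and toy data here are hypothetical), a minimal PSO can search for the weight that blends two base-model prediction vectors:

```python
import random

def blend_error(w, a, b, y):
    # Squared error of blending two base-model prediction vectors.
    return sum((w * ai + (1 - w) * bi - yi) ** 2 for ai, bi, yi in zip(a, b, y))

def pso_minimize(f, lo, hi, n_particles=20, n_iters=100, seed=0):
    """Minimal 1-D particle swarm: inertia plus pulls toward each
    particle's personal best and the swarm's global best."""
    rng = random.Random(seed)
    pos = [rng.uniform(lo, hi) for _ in range(n_particles)]
    vel = [0.0] * n_particles
    best = pos[:]
    best_f = [f(p) for p in pos]
    g = min(range(n_particles), key=lambda i: best_f[i])
    gbest, gbest_f = best[g], best_f[g]
    for _ in range(n_iters):
        for i in range(n_particles):
            r1, r2 = rng.random(), rng.random()
            vel[i] = 0.7 * vel[i] + 1.5 * r1 * (best[i] - pos[i]) + 1.5 * r2 * (gbest - pos[i])
            pos[i] = min(hi, max(lo, pos[i] + vel[i]))  # clamp to bounds
            fi = f(pos[i])
            if fi < best_f[i]:
                best[i], best_f[i] = pos[i], fi
                if fi < gbest_f:
                    gbest, gbest_f = pos[i], fi
    return gbest

# Toy base-model predictions and targets; analytically the optimal blend is w = 0.5.
a, b, y = [1.0, 2.0, 3.0], [3.0, 2.0, 1.0], [2.0, 2.0, 2.0]
w_opt = pso_minimize(lambda w: blend_error(w, a, b, y), 0.0, 1.0)
```

The same swarm update generalizes to higher-dimensional hyperparameter vectors; only the position and velocity bookkeeping changes.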
Findings
The key findings have three aspects: prediction accuracy, homogeneity and identification of the most and least predictable cities. The proposed framework yields highly precise predictions. The predictability of listing prices varies significantly across cities; listing prices are most predictable for Boston and least predictable for Chicago.
Practical implications
The framework and findings of the research can be leveraged by the hosts to determine rental prices and augment the service offerings by emphasizing key features, respectively.
Originality/value
Although individual components are known, the way they have been integrated into the proposed framework to derive a high-quality forecast of Airbnb listing prices is unique. It is scalable. The Airbnb listing price modeling literature rarely witnesses such a framework.
Ean Zou Teoh, Wei-Chuen Yau, Thian Song Ong and Tee Connie
Abstract
Purpose
This study aims to develop a regression-based machine learning model to predict housing price, determine and interpret factors that contribute to housing prices using different data sets available publicly. The significant determinants that affect housing prices will be first identified by using multinomial logistics regression (MLR) based on the level of relative importance. A comprehensive study is then conducted by using SHapley Additive exPlanations (SHAP) analysis to examine the features that cause the major changes in housing prices.
Design/methodology/approach
Predictive analytics is an effective way to deal with uncertainties in process modelling and improve decision-making for housing price prediction. The focus of this paper is two-fold; the authors first apply regression analysis to investigate how well the housing independent variables contribute to the housing price prediction. Two data sets are used for this study, namely, the Ames Housing dataset and the Melbourne Housing dataset. For both data sets, random forest regression performs the best, achieving an average R2 of 86% for the Ames dataset and 85% for the Melbourne dataset. Second, multinomial logistic regression is adopted to investigate and identify the factor determinants of housing sales price. For the Ames dataset, the authors find that the top three most significant factor variables in determining the housing price are the general living area, basement size and age of remodelling. As for the Melbourne dataset, properties having more rooms/bathrooms, larger land size and closer distance to the central business district (CBD) are higher priced. This is followed by a comprehensive analysis of how these determinants contribute to the predictability of the selected regression model by using explainable SHAP values. These prominent factors can be used to determine the optimal price range of a property, which is useful for decision-making for both buyers and sellers.
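The general notion of ranking variables by relative importance can be sketched with a simple permutation-style importance measure (a deterministic toy illustration, not the MLR-based ranking the authors use; the feature column is "scrambled" by reversing it so the example is reproducible, and the model and data are hypothetical):

```python
def r2_score(y_true, y_pred):
    # Coefficient of determination R^2.
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

def permutation_importance(model, X, y, feature_idx):
    """Drop in R^2 when one feature's column is scrambled
    (reversed here, to keep the example deterministic)."""
    base = r2_score(y, [model(row) for row in X])
    col = [row[feature_idx] for row in X][::-1]
    X_perm = [row[:feature_idx] + [v] + row[feature_idx + 1:] for row, v in zip(X, col)]
    return base - r2_score(y, [model(row) for row in X_perm])

# Hypothetical model that uses only feature 0, so feature 1 is unimportant.
model = lambda row: 3 * row[0]
X, y = [[1, 0], [2, 5], [3, 1]], [3, 6, 9]
imp0 = permutation_importance(model, X, y, 0)
imp1 = permutation_importance(model, X, y, 1)
```

A feature the model ignores shows zero importance, while scrambling a feature the model relies on sharply degrades R^2; importance rankings of this kind complement the SHAP analysis described above.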
Findings
By using the combination of MLR and SHAP analysis, it is noticeable that general living area, basement size and age of remodelling are the top three most important variables in determining the house’s price in the Ames dataset, while properties with more rooms/bathrooms, larger land area and closer proximity to the CBD or to the South of Melbourne are more expensive in the Melbourne dataset. These important factors can be used to estimate the best price range for a housing property for better decision-making.
Research limitations/implications
A limitation of this study is that the distribution of the housing prices is highly skewed, although it is normal for property prices to cluster at the lower end with only a few houses being highly priced. As mentioned before, MLR can effectively help in evaluating the likelihood ratio of each variable towards these categories. However, housing price is originally continuous, and there is a need to convert the price to a categorical type. Nonetheless, the most effective method to categorize the data remains an open question.
Originality/value
The key point of this paper is the use of an explainable machine learning approach to identify the prominent factors of housing price determination, which could be used to determine the optimal price range of a property, useful for decision-making for both buyers and sellers.
Abstract
Purpose
This study aims to examine the association between board gender diversity (BGD) and workplace diversity and the relative importance of various board and firm characteristics in predicting diversity.
Design/methodology/approach
With a novel machine learning (ML) approach, this study models the association between three workplace diversity variables and BGD using a social media data set of approximately 250,000 employee reviews. Using the tools of explainable artificial intelligence, the authors interpret the results of the ML model.
Findings
The results show that BGD has a strong positive association with the gender equality and inclusiveness dimensions of corporate diversity culture. However, BGD is found to have a weak negative association with age diversity in a company. Furthermore, the authors find that workplace diversity is an important predictor of firm value, indicating a possible channel on how BGD affects firm performance.
Originality/value
The effects of BGD on workplace diversity below management levels are mainly omitted in the current corporate governance literature. Furthermore, existing research has not considered different dimensions of this diversity and has mainly focused on its gender aspects. In this study, the authors address this research problem and examine how BGD affects different dimensions of diversity at the overall company level. This study reveals important associations and identifies key variables that should be included as a part of theoretical causal models in future research.
Abstract
Purpose
In recent years, there has been growing interest in the use of stainless steel (SS) in reinforced concrete (RC) structures due to its distinctive corrosion resistance and excellent mechanical properties. To ensure effective synergy between SS and concrete, it is necessary to develop a time-saving approach to accurately determine the ultimate bond strength τu between the two materials in RC structures.
Design/methodology/approach
Three robust machine learning (ML) models, including support vector regression (SVR), random forest (RF) and extreme gradient boosting (XGBoost), are employed to predict τu between ribbed SS and concrete. Model hyperparameters are fine-tuned using Bayesian optimization (BO) with 10-fold cross-validation. Interpretability techniques, including partial dependence plots (PDPs) and SHapley Additive exPlanations (SHAP), are also utilized to examine the relationship between input features and the output for the best model.
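The partial dependence plots mentioned here have a simple definition that can be sketched in a few lines (a pure-Python illustration under a hypothetical additive model, not the BO-XGBoost pipeline): for each grid value of a feature, fix that feature across the whole data set and average the model's predictions.

```python
def partial_dependence(model, X, feature_idx, grid):
    """1-D partial dependence: for each grid value, fix the chosen
    feature at that value in every row and average the predictions."""
    curve = []
    for v in grid:
        preds = []
        for row in X:
            row_mod = list(row)
            row_mod[feature_idx] = v  # override the feature of interest
            preds.append(model(row_mod))
        curve.append(sum(preds) / len(preds))
    return curve

# Hypothetical bond-strength "model": 2*feature0 + feature1.
model = lambda r: 2 * r[0] + r[1]
X = [[0, 1], [0, 3]]
curve = partial_dependence(model, X, 0, [0, 1, 2])
```

For this additive toy model the curve is linear in the grid values; for a fitted XGBoost model the same averaging reveals how τu responds to, say, concrete cover thickness while other features follow their observed distribution.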
Findings
Among the three ML models, BO-XGBoost exhibits the strongest generalization and highest accuracy in estimating τu. According to SHAP value-based feature importance, compressive strength of concrete fc emerges as the most prominent feature, followed by concrete cover thickness c, while the embedment length to diameter ratio l/d, and the diameter d for SS are deemed less important features. Properly increasing c and fc can enhance τu between ribbed SS and concrete.
Originality/value
An online graphical user interface (GUI) has been developed based on BO-XGBoost to estimate τu. This tool can be utilized in structural design of RC structures with ribbed SS as reinforcement.
Abstract
Purpose
This study aims to explain the state-of-the-art machine learning models used in intrusion detection in a human-understandable way and to study the relationship between the explainability and the performance of the models.
Design/methodology/approach
The authors study a recent intrusion data set collected from real-world scenarios and use state-of-the-art machine learning algorithms to detect the intrusion. The authors apply several novel techniques to explain the models, then evaluate the explanations manually. The authors then compare the performance of the model before and after explainability-based feature selection.
Findings
The authors confirm their hypothesis and claim that by enforcing explainability, the model becomes more robust, requires less computational power and achieves better predictive performance.
Originality/value
The authors draw their conclusions based on their own research and experimental work.
Javiera M. Guedes, Akinbami Akinwale and María Requemán Fontecha
Abstract
Content marketing is a crucial aspect of digital marketing in modern firms. By generating content that is interesting and engaging, companies have the two-fold advantage of promoting their products in a relatable way, while increasing familiarity and engagement with the brand. As data scientists at Credit Suisse, we value our content teams because their voice is the bank's voice. We strive to provide them with the best tools to increase their articles' success. With the help of machine learning, we have created digital products that allow them to improve articles before publication, recommend them to the most interested readers, and track their performance. The chapter begins with a brief introduction to content marketing, followed by an overview of our data, a review of the business challenges we have encountered, and the machine learning solutions we have developed in order to provide the best data insights to our internal and external stakeholders. We close the chapter with a brief summary of our work.
Samar Shilbayeh and Rihab Grassa
Abstract
Purpose
Bank creditworthiness refers to the evaluation of a bank’s ability to meet its financial obligations. It is an assessment of the bank’s financial health, stability and capacity to manage risks. This paper aims to investigate the credit rating patterns that are crucial for assessing creditworthiness of the Islamic banks, thereby evaluating the stability of their industry.
Design/methodology/approach
Three distinct machine learning algorithms are exploited and evaluated for the desired objective. This research initially uses the decision tree machine learning algorithm as a base learner, conducting an in-depth comparison with the ensemble decision tree and random forest. Subsequently, the Apriori algorithm is deployed to uncover the most significant attributes impacting a bank’s credit rating. To appraise the previously elucidated models, a ten-fold cross-validation method is applied: the data sets are segmented into ten folds, with nine folds used for training and one for testing, rotating through all ten folds so that each serves once as the test set. This approach aims to mitigate any potential biases that could arise during the learning and training phases. Following this process, the accuracy is assessed and depicted in a confusion matrix as outlined in the methodology section.
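The ten-fold scheme described above amounts to the following index bookkeeping (a stdlib-only sketch, not the authors' implementation; the fold assignment here is a simple stride-based split):

```python
def k_fold_indices(n, k=10):
    """Yield (train, test) index lists: each of the k folds serves
    exactly once as the test set while the rest form the training set."""
    folds = [list(range(i, n, k)) for i in range(k)]
    for i, test in enumerate(folds):
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, test

# 20 samples, 10 folds: each split trains on 18 samples and tests on 2.
splits = list(k_fold_indices(20, 10))
```

Because every sample appears in exactly one test fold, averaging accuracy across the ten splits uses all the data for evaluation without ever testing on a sample the model was trained on.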
Findings
The findings of this investigation reveal that the Random Forest machine learning algorithm outperforms the others, achieving an impressive 90.5% accuracy in predicting credit ratings. Notably, the research sheds light on the significance of the loan-to-deposit ratio as the primary attribute affecting credit rating predictions. Moreover, this study uncovers additional pivotal banking features that strongly impact the measurements under study. The paper’s findings provide evidence that the loan-to-deposit ratio appears to be the single bank attribute most strongly affecting credit rating prediction. In addition, the deposit-to-assets ratio and profit-sharing investment account ratio criteria are found to be effective in credit rating prediction, and the ownership structure criterion emerges as one of the essential bank attributes in credit rating prediction.
Originality/value
These findings contribute significant evidence to the understanding of attributes that strongly influence credit rating predictions within the banking sector. This study uniquely contributes by uncovering patterns that have not been previously documented in the literature, broadening our understanding in this field.