Search results

1 – 10 of over 2000
Article
Publication date: 3 November 2020

Jagroop Kaur and Jaswinder Singh

Abstract

Purpose

Normalization is an important step in all natural language processing applications that handle social media text. Text from social media poses problems that are not present in regular text. Recently, a considerable amount of work has been done in this direction, but mostly for the English language. People who do not speak English code-mix text with their native language and post it on social media using the Roman script. This kind of text further aggravates the normalization problem. This paper aims to discuss the concept of normalization with respect to code-mixed social media text, and a model is proposed to normalize such text.

Design/methodology/approach

The system is divided into two phases – candidate generation and most probable sentence selection. The candidate generation task is treated as a machine translation task, with Roman text as the source language and Gurmukhi text as the target language. A character-based translation system is proposed to generate candidate tokens. Once candidates are generated, the second phase uses beam search to select the most probable sentence based on a hidden Markov model.
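
To illustrate the selection phase, the following minimal sketch (not the authors' implementation) runs a beam search over per-token candidate lists, scoring hypotheses with HMM-style transition and emission probabilities. The candidate lists and probability functions are assumed to come from the character-based translation system and a language model trained on Gurmukhi text.

```python
# Minimal sketch (not the authors' code): beam search over per-token
# candidate lists, scored with HMM-style bigram probabilities.
import math

def beam_search(candidates, trans_prob, emit_prob, beam_width=5):
    """candidates: one list of candidate tokens per input token.
    trans_prob(prev, cur) and emit_prob(tok) return probabilities, assumed
    to come from a language model trained on Gurmukhi text."""
    beams = [([], 0.0)]  # (partial sentence, log-probability)
    for cand_list in candidates:
        scored = []
        for sent, logp in beams:
            prev = sent[-1] if sent else "<s>"
            for tok in cand_list:
                score = logp + math.log(trans_prob(prev, tok) + 1e-12) \
                             + math.log(emit_prob(tok) + 1e-12)
                scored.append((sent + [tok], score))
        scored.sort(key=lambda x: x[1], reverse=True)
        beams = scored[:beam_width]   # keep only the best hypotheses
    return beams[0][0]                # most probable sentence
```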

Findings

Character error rate (CER) and bilingual evaluation understudy (BLEU) score are reported. The proposed system has been compared with the Akhar software and the RB_R2G system, which are also capable of transliterating Roman text to Gurmukhi. The proposed system outperforms the Akhar software. The CER and BLEU scores are 0.268121 and 0.6807939, respectively, for ill-formed text.

Research limitations/implications

It was observed that the system produces dialectal variations of a word or words with minor errors such as missing diacritics. A spell checker can improve the output of the system by correcting these minor errors. Extensive experimentation is needed to optimize the language identifier, which will further help in improving the output. The language model also merits further exploration. Inclusion of wider context, particularly from social media text, is an important area that deserves further investigation.

Practical implications

The practical implications of this study are: (1) development of a parallel dataset containing Roman and Gurmukhi text; (2) development of a dataset annotated with language tags; (3) development of the normalization system, which is the first of its kind and proposes a translation-based solution for normalizing noisy social media text from Roman to Gurmukhi script, and which can be extended to any pair of scripts; and (4) the proposed system can be used for better analysis of social media text. Theoretically, this study helps in better understanding text normalization in a social media context and opens the door for further research in multilingual social media text normalization.

Originality/value

Existing research focuses on normalizing monolingual text. This study contributes towards the development of a normalization system for multilingual text.

Details

International Journal of Intelligent Computing and Cybernetics, vol. 13 no. 4
Type: Research Article
ISSN: 1756-378X

Open Access
Article
Publication date: 21 December 2021

Vahid Badeli, Sascha Ranftl, Gian Marco Melito, Alice Reinbacher-Köstinger, Wolfgang Von Der Linden, Katrin Ellermann and Oszkar Biro

Abstract

Purpose

This paper aims to introduce a non-invasive and convenient method to detect a life-threatening disease called aortic dissection. A Bayesian inference scheme based on an enhanced multi-sensor impedance cardiography (ICG) method is applied to classify signals from healthy and sick patients.

Design/methodology/approach

A 3D numerical model consisting of simplified organ geometries is used to simulate the electrical impedance changes in the ICG-relevant domain of the human torso. Bayesian probability theory is used for detecting an aortic dissection, providing the probabilities for both cases, a dissected and a healthy aorta. The reliability and uncertainty of the disease identification are thus quantified and may indicate the need for further diagnostic clarification.
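
As a sketch of the classification idea (the likelihoods and feature below are placeholders, not the simulated ICG model), Bayes' rule yields the posterior probability of a dissected versus a healthy aorta from a measured feature; the closeness of the two posteriors expresses the uncertainty of the identification.

```python
# Minimal sketch (assumed setup, not the authors' model): Bayes' rule for
# two hypotheses -- dissected vs. healthy aorta -- given one ICG feature.
from scipy.stats import norm

# Placeholder class-conditional likelihoods for one impedance-derived feature.
likelihood = {
    "healthy":   norm(loc=0.0, scale=1.0),
    "dissected": norm(loc=1.5, scale=1.2),
}
prior = {"healthy": 0.5, "dissected": 0.5}   # uninformative priors

def posterior(x):
    """Return P(class | x) for both classes; similar values indicate an
    uncertain identification that may need further diagnostic clarification."""
    evidence = sum(prior[c] * likelihood[c].pdf(x) for c in prior)
    return {c: prior[c] * likelihood[c].pdf(x) / evidence for c in prior}

print(posterior(1.1))   # posterior probability of each case
```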

Findings

The Bayesian classification shows that the enhanced multi-sensor ICG is more reliable in detecting aortic dissection than conventional ICG. Bayesian probability theory allows a rigorous quantification of all uncertainties in order to draw reliable conclusions for the medical treatment of aortic dissection.

Originality/value

This paper presents a non-invasive and reliable method based on a numerical simulation that could be beneficial for the medical management of aortic dissection patients. With this method, clinicians would be able to monitor the patient’s status and make better decisions in the treatment procedure of each patient.

Details

COMPEL - The international journal for computation and mathematics in electrical and electronic engineering, vol. 41 no. 3
Type: Research Article
ISSN: 0332-1649

Article
Publication date: 7 March 2023

Sedat Metlek

Abstract

Purpose

The purpose of this study is to develop and test a new deep learning model to predict aircraft fuel consumption. For this purpose, real data obtained from different landings and take-offs were used. As a result, a new hybrid convolutional neural network (CNN)-bidirectional long short-term memory (BiLSTM) model was developed.

Design/methodology/approach

The data were divided into training and test sets using five-fold (k = 5) cross-validation. In this study, 13 different parameters were used together as input parameters, with fuel consumption as the output parameter. Thus, the effect of many input parameters on fuel flow was modeled simultaneously using a deep learning method. In addition, the developed hybrid model was compared with the existing deep learning models long short-term memory (LSTM) and BiLSTM.
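
A minimal sketch of such a hybrid CNN-BiLSTM regressor is shown below using the Keras API; the window length, layer sizes and filter counts are assumptions for illustration, not the architecture reported in the paper.

```python
# Minimal sketch (layer sizes and shapes are assumptions, not the paper's
# architecture): a hybrid CNN-BiLSTM regressor for fuel-flow prediction.
import tensorflow as tf
from tensorflow.keras import layers

TIMESTEPS, N_FEATURES = 30, 13   # 13 input parameters per time step (assumed window)

model = tf.keras.Sequential([
    layers.Input(shape=(TIMESTEPS, N_FEATURES)),
    layers.Conv1D(64, kernel_size=3, activation="relu"),  # local feature extraction
    layers.MaxPooling1D(pool_size=2),
    layers.Bidirectional(layers.LSTM(64)),                # temporal dependencies, both directions
    layers.Dense(32, activation="relu"),
    layers.Dense(1),                                      # fuel consumption (regression)
])
model.compile(optimizer="adam", loss="mse")
# model.fit(X_train, y_train, validation_data=(X_val, y_val), epochs=50)
```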

Findings

When tested with LSTM, one of the existing deep learning models, values of 0.9162, 6.476 and 5.76 were obtained for R2, root mean square error (RMSE) and mean absolute percentage error (MAPE), respectively. When the BiLSTM model was tested, values of 0.9471, 5.847 and 4.62 were obtained for R2, RMSE and MAPE, respectively. When the proposed hybrid model was tested, values of 0.9743, 2.539 and 1.62 were obtained for R2, RMSE and MAPE, respectively. Compared with the LSTM and BiLSTM models, the results of the proposed hybrid model are much closer to the actual fuel consumption values. The model errors were verified against the actual fuel flow reports, and an average absolute percentage error of less than 2% was obtained.
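
For reference, the reported metrics are conventionally computed as follows (a sketch; y_true and y_pred stand for the actual and predicted fuel-flow values).

```python
# Conventional definitions of the reported metrics (sketch).
import numpy as np

def regression_metrics(y_true, y_pred):
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot                              # coefficient of determination
    rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))         # root mean square error
    mape = 100.0 * np.mean(np.abs((y_true - y_pred) / y_true))  # mean absolute percentage error
    return r2, rmse, mape
```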

Originality/value

In this study, a new hybrid CNN-BiLSTM model is proposed. The proposed model is trained and tested with real flight data for fuel consumption estimation. The tests show that it gives much better results than the LSTM and BiLSTM methods found in the literature. It can therefore be used with many different engine types and applications in different fields, beyond the turboprop engine considered in this study, and can be easily integrated into many simulation models.

Details

Aircraft Engineering and Aerospace Technology, vol. 95 no. 5
Type: Research Article
ISSN: 1748-8842

Article
Publication date: 8 February 2022

K. Arunkumar and S. Vasundra

Abstract

Purpose

Patient treatment trajectory data are used in this research to predict the outcome of treatment for a particular disease. Existing methodologies do not consider how the disease evolves in the patient or how the patient's health changes due to treatment. Hence, deep learning models applied to trajectory data mining can be employed for disease prediction with high accuracy and low computation cost.

Design/methodology/approach

Multifocus deep neural network classifiers have been utilized to detect novel disease and comorbidity classes, so that changes in the genome pattern of the patient trajectory data can be identified across the layers of the architecture. The classifier learns the extracted feature set with activation and weight functions and is then merged on many aspects to classify an undetermined sequence of diseases as a new variant. Disease progression learning exploits the precision of the constituent classifiers, which usually yields larger generalization benefits than individually optimized classifiers.

Findings

The deep learning architecture uses weight and bias functions on the input layers together with max pooling. The output of the input layer is applied to the hidden layers to generate the multifocus characteristics of the disease, which are processed through ReLU activation with hyperparameter tuning to produce the final outcome in the output layer of a fully connected network. Experimental results using cross-validation show that the proposed model outperforms existing methodologies in terms of computation time and accuracy.
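
Since the multifocus architecture is only described at a high level, the following is merely a generic illustration of the named building blocks (weighted inputs, max pooling, ReLU activations and a fully connected softmax output), not the authors' model; all shapes and sizes are assumptions.

```python
# Generic illustration only (not the authors' multifocus architecture):
# weighted input features, max pooling, ReLU layers, softmax class output.
import tensorflow as tf
from tensorflow.keras import layers

SEQ_LEN, N_CLASSES = 128, 4     # assumed trajectory length and disease classes

model = tf.keras.Sequential([
    layers.Input(shape=(SEQ_LEN, 1)),
    layers.Conv1D(32, kernel_size=5, activation="relu"),  # weighted input features
    layers.MaxPooling1D(pool_size=2),                     # max pooling as described
    layers.Flatten(),
    layers.Dense(64, activation="relu"),                  # ReLU hidden layer
    layers.Dense(N_CLASSES, activation="softmax"),        # class distribution output
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```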

Originality/value

The proposed evolving classifier is a robust architecture that uses an objective function to map the data sequence into a class distribution of the evolving disease class for the patient trajectory. The generative output layer of the proposed model then produces the disease progression outcome for the particular patient trajectory. The model aims to produce accurate prognosis outcomes by employing a conditional probability function over the data. The originality of the work is assessed at 70%, and in comparison with previous methods the obtained values are accurate and the analysis of the predictions is improved.

Details

International Journal of Intelligent Computing and Cybernetics, vol. 15 no. 4
Type: Research Article
ISSN: 1756-378X

Article
Publication date: 19 July 2023

Gaurav Kumar, Molla Ramizur Rahman, Abhinav Rajverma and Arun Kumar Misra

Abstract

Purpose

This study aims to analyse the systemic risk emitted by all publicly listed commercial banks in a key emerging economy, India.

Design/methodology/approach

The study makes use of the Tobias and Brunnermeier (2016) estimator to quantify the systemic risk (ΔCoVaR) that banks contribute to the system. The methodology addresses a classification problem based on the probability that a particular bank will emit high systemic risk or moderate systemic risk. The study applies machine learning models such as logistic regression, random forest (RF), neural networks and gradient boosting machines (GBM), and addresses the issue of imbalanced data sets, to investigate which balance sheet and stock features of banks may determine systemic risk emission.
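
A sketch of this classification setup, on synthetic placeholder data rather than the authors' bank sample, would combine class weighting for the imbalanced labels with cross-validated RF and GBM models and then inspect feature importances.

```python
# Sketch on synthetic data (not the authors' dataset): classify banks into
# high vs. moderate systemic-risk emitters and inspect feature importances.
import numpy as np
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
feature_names = ["lag_dCoVaR", "stock_beta", "stock_volatility", "roe"]  # illustrative names
X = rng.normal(size=(400, len(feature_names)))            # placeholder bank-level features
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=400) > 1.5).astype(int)  # rare "high risk" class

rf = RandomForestClassifier(n_estimators=300, class_weight="balanced", random_state=0)
gbm = GradientBoostingClassifier(random_state=0)
for name, clf in [("RF", rf), ("GBM", gbm)]:
    auc = cross_val_score(clf, X, y, cv=5, scoring="roc_auc").mean()
    print(name, round(auc, 3))

rf.fit(X, y)
for f, imp in sorted(zip(feature_names, rf.feature_importances_), key=lambda t: -t[1]):
    print(f, round(imp, 3))      # which features drive systemic-risk emission
```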

Findings

The study reports that, across various performance metrics, two specifications are preferred: RF and GBM. The study identifies the lag of the systemic risk estimator, stock beta, stock volatility and return on equity as important features for explaining the emission of systemic risk.

Practical implications

The findings provide banks and regulators with the key features that can be used to formulate policy decisions.

Originality/value

This study contributes to the existing literature by suggesting classification algorithms that can be used to model the probability of systemic risk emission in a classification problem setting. Further, the study identifies the features responsible for the likelihood of systemic risk.

Details

Journal of Modelling in Management, vol. 19 no. 2
Type: Research Article
ISSN: 1746-5664

Article
Publication date: 7 April 2015

Jie Sun, Hui Li, Pei-Chann Chang and Qing-Hua Huang

Abstract

Purpose

Previous research on credit scoring has mainly focused on static modeling of panel sample data sets over a certain period of time and has not paid enough attention to dynamic incremental modeling. The purpose of this paper is to address the integration of the branch and bound algorithm with an incremental support vector machine (SVM) ensemble for dynamic credit scoring modeling.

Design/methodology/approach

The new model hybridizes the support vectors of old data with incremental corporate financial data in a process of dynamic ensemble modeling based on bagged SVMs. In the incremental stage, multiple base SVM models are dynamically adjusted according to the bagged newly updated information for credit scoring. These updated base models are further combined to generate a dynamic credit score. In the empirical experiment, the new method was compared with the traditional non-incremental SVM ensemble model for credit scoring.
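
The following sketch captures the idea rather than the authors' implementation: the support vectors of the previous period summarize the old sample, are merged with newly arrived data, and a bagged SVM ensemble is retrained, with scores averaged across base models.

```python
# Sketch of the idea (not the authors' implementation): incremental bagged
# SVM ensemble that carries forward only the old sample's support vectors.
import numpy as np
from sklearn.svm import SVC
from sklearn.utils import resample

def shrink_to_support_vectors(X_old, y_old):
    """Fit an SVM on the old sample and keep only its support vectors,
    which summarize the old information for the next increment."""
    sv = SVC(kernel="rbf").fit(X_old, y_old).support_
    return X_old[sv], y_old[sv]

def fit_bagged_ensemble(X, y, n_models=5):
    models = []
    for i in range(n_models):                    # bagging over the merged sample
        Xb, yb = resample(X, y, random_state=i)
        models.append(SVC(kernel="rbf", probability=True).fit(Xb, yb))
    return models

def credit_score(models, X):
    # ensemble score: average probability of the "good credit" class
    return np.mean([m.predict_proba(X)[:, 1] for m in models], axis=0)

# incremental step (illustrative): merge old support vectors with new data
# X_keep, y_keep = shrink_to_support_vectors(X_old, y_old)
# models = fit_bagged_ensemble(np.vstack([X_keep, X_new]),
#                              np.concatenate([y_keep, y_new]))
```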

Findings

The results show that the new model is able to continuously and dynamically adjust credit scoring according to incremental corporate information, which helps it achieve better evaluation ability than the traditional model.

Originality/value

This research pioneers dynamic modeling for credit scoring with an incremental SVM ensemble. As time passes, new incremental samples are combined with the support vectors of old samples to construct the SVM ensemble credit scoring model. The incremental model continuously adjusts itself to keep good evaluation performance.

Details

Kybernetes, vol. 44 no. 4
Type: Research Article
ISSN: 0368-492X

Article
Publication date: 21 June 2019

Kelvin Balcombe, Iain Fraser and Abhijit Sharma

Abstract

Purpose

The purpose of this paper is to re-examine the long-run relationship between radiative forcing (including emissions of carbon dioxide, sulphur oxides and methane, as well as solar radiation) and temperatures from a structural time series modelling perspective. The authors assess whether forcing measures are cointegrated with global temperatures using the structural time series approach.

Design/methodology/approach

A Bayesian approach is used to obtain estimates that represent the uncertainty regarding this relationship. The estimated structural time series model enables alternative model specifications to be consistently compared by evaluating model performance.
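
For illustration only, a standard Engle-Granger two-step check on simulated series shows what a cointegration test between forcing and temperature looks like; the paper itself uses a Bayesian structural time series model, which this sketch does not reproduce.

```python
# Illustration only: Engle-Granger two-step cointegration check on
# simulated series, not the paper's Bayesian structural time series model.
import numpy as np
import statsmodels.api as sm
from statsmodels.tsa.stattools import adfuller

rng = np.random.default_rng(1)
forcing = np.cumsum(rng.normal(size=200))                  # placeholder I(1) series
temperature = 0.4 * forcing + rng.normal(scale=0.5, size=200)

ols = sm.OLS(temperature, sm.add_constant(forcing)).fit()  # long-run relationship
adf_stat, p_value, *_ = adfuller(ols.resid)                # stationary residuals indicate cointegration
print(f"ADF statistic {adf_stat:.2f}, p-value {p_value:.3f}")
```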

Findings

The results confirm that cointegration between radiative forcing and temperatures is consistent with the data. However, they provide less support for cointegration between forcing and temperature data than has been found previously.

Research limitations/implications

Given the considerable debate within the literature on the “best” way to statistically model this relationship and to explain the resulting estimates and model performance, there is uncertainty regarding our understanding of this relationship and the resulting policy design and implementation. There is a need for further modelling and for the use of more data.

Practical implications

There is divergence of views as to how best to statistically capture, explain and model this relationship. Researchers should avoid being too strident in their claims about model performance and better appreciate the role of uncertainty.

Originality/value

The results of this study make a contribution to the literature by employing a theoretically motivated framework in which a number of plausible alternatives are considered in detail, as opposed to simply employing a standard cointegration framework.

Details

Management of Environmental Quality: An International Journal, vol. 30 no. 5
Type: Research Article
ISSN: 1477-7835

Article
Publication date: 5 March 2021

Mayank Kumar Jha, Yogesh Mani Tripathi and Sanku Dey

Abstract

Purpose

The purpose of this article is to derive inference for multicomponent reliability where stress and strength variables follow unit generalized Rayleigh (GR) distributions with a common scale parameter.

Design/methodology/approach

The authors derive inference for the unknown parametric function using classical and Bayesian approaches. In the sequel, (weighted) least squares (LS) and maximum product of spacings methods are used to estimate the reliability. Bootstrapping is also considered for this purpose. Bayesian inference is derived under gamma prior distributions, and credible intervals are constructed accordingly. For a known common scale parameter, an unbiased estimator is obtained and compared with the corresponding exact Bayes estimate.
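
As a sketch of the quantity being estimated, multicomponent reliability R_{s,k} = P(at least s of k strength components exceed the common stress) can be approximated by Monte Carlo simulation; a Beta distribution is used below only as a bounded-support stand-in for the unit GR model.

```python
# Illustration only: Monte Carlo estimate of multicomponent stress-strength
# reliability R_{s,k}. A Beta distribution stands in for the paper's unit
# generalized Rayleigh model, purely for demonstration.
import numpy as np

def reliability_mc(s, k, n_sim=100_000, seed=0):
    rng = np.random.default_rng(seed)
    strength = rng.beta(3.0, 2.0, size=(n_sim, k))   # placeholder strength model
    stress = rng.beta(2.0, 3.0, size=(n_sim, 1))     # placeholder common stress
    exceed = (strength > stress).sum(axis=1)         # strengths exceeding the stress
    return np.mean(exceed >= s)

print(reliability_mc(s=2, k=4))   # estimated R_{2,4}
```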

Findings

Different point and interval estimators of the reliability are examined using Monte Carlo simulations for different sample sizes. In summary, the authors observe that Bayes estimators obtained using gamma prior distributions perform well compared with the other studied estimators. The average length (AL) of the highest posterior density (HPD) interval remains shorter than that of the other proposed intervals, and the coverage probabilities of all the intervals are reasonably satisfactory. A data analysis is also presented in support of the studied estimation methods. It is noted that the proposed methods work well for the considered estimation problem.

Originality/value

In the literature, the probability distributions that are often analyzed in life test studies are mostly unbounded in nature, that is, their support lies on an infinite interval. This class of distributions includes the generalized exponential, Burr family, gamma, lognormal and Weibull models, among others. In many situations one needs to analyze data that lie in a bounded interval, such as the average height of individuals, survival time from a disease or per-capita income. Thus, the use of probability models with support on finite intervals becomes inevitable. The authors investigate stress-strength reliability based on the unit GR distribution. Useful comments are obtained based on the numerical study.

Details

International Journal of Quality & Reliability Management, vol. 38 no. 10
Type: Research Article
ISSN: 0265-671X

Open Access
Article
Publication date: 4 November 2022

Bianca Caiazzo, Teresa Murino, Alberto Petrillo, Gianluca Piccirillo and Stefania Santini

Abstract

Purpose

This work aims at proposing a novel Internet of Things (IoT)-based and cloud-assisted monitoring architecture for smart manufacturing systems, able to evaluate their overall status and detect any anomalies occurring in production. A novel artificial intelligence (AI)-based technique, able to identify the specific anomalous event and the related risk classification for possible intervention, is hence proposed.

Design/methodology/approach

The proposed solution is a five-layer scalable and modular platform in an Industry 5.0 perspective, where the crucial layer is the Cloud Cyber layer. This embeds a novel anomaly detection solution designed by leveraging control charts, long short-term memory (LSTM) autoencoders (AE) and a fuzzy inference system (FIS). The proper combination of these methods allows not only detecting product defects but also recognizing their causes.
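
A minimal sketch of the detection idea follows (shapes and thresholds are assumptions, and the FIS-based risk classification is omitted): an LSTM autoencoder is trained on anomaly-free windows, and windows whose reconstruction error exceeds a control-chart style limit are flagged.

```python
# Sketch under assumed shapes (not the platform's code): LSTM autoencoder
# whose reconstruction error, thresholded at mean + 3*sigma on normal data,
# flags anomalous production windows.
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

TIMESTEPS, N_SENSORS = 50, 8      # assumed window length and sensor count

ae = tf.keras.Sequential([
    layers.Input(shape=(TIMESTEPS, N_SENSORS)),
    layers.LSTM(32),                              # encoder
    layers.RepeatVector(TIMESTEPS),
    layers.LSTM(32, return_sequences=True),       # decoder
    layers.TimeDistributed(layers.Dense(N_SENSORS)),
])
ae.compile(optimizer="adam", loss="mse")
# ae.fit(X_normal, X_normal, epochs=30)           # train on anomaly-free windows

def anomaly_flags(X, X_normal):
    err = lambda A: np.mean((A - ae.predict(A)) ** 2, axis=(1, 2))
    e_norm = err(X_normal)
    threshold = e_norm.mean() + 3 * e_norm.std()  # control-chart style limit
    return err(X) > threshold                     # True where a window is anomalous
```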

Findings

The proposed architecture, experimentally validated on a manufacturing system involved in the production of a solar thermal high-vacuum flat panel, provides human operators with information about anomalous events, where they occur and their risk levels.

Practical implications

Thanks to the abnormal risk panel, human operators and business managers are able not only to remotely visualize the real-time status of each production parameter, but also to properly deal with anomalous events, and only when necessary. This is especially relevant in an emergency situation such as the COVID-19 pandemic.

Originality/value

The monitoring platform is one of the first attempts at leading modern manufacturing systems toward the Industry 5.0 concept. Indeed, it combines human strengths, IoT technology on machines, cloud-based solutions with AI and zero-defect manufacturing strategies in a unified framework, so as to detect causalities in complex dynamic systems and to enable the avoidance of product waste.

Details

Journal of Manufacturing Technology Management, vol. 34 no. 4
Type: Research Article
ISSN: 1741-038X

Article
Publication date: 19 December 2023

Guilherme Dayrell Mendonça, Stanley Robson de Medeiros Oliveira, Orlando Fontes Lima Jr and Paulo Tarso Vilela de Resende

Abstract

Purpose

The objective of this paper is to evaluate whether the data from consignors, logistics service providers (LSPs) and consignees contribute to the prediction of air transport shipment delays in a machine learning application.

Design/methodology/approach

The research database contained 2,244 intercontinental air freight shipments to four automotive production plants in Latin America. Different algorithm classes were tested in the knowledge discovery in databases (KDD) process: support vector machine (SVM), random forest (RF), artificial neural networks (ANN) and k-nearest neighbors (KNN).
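
A sketch of this comparison, on synthetic data rather than the study's shipment records, evaluates the four algorithm classes with five-fold cross-validation after a simple minority-class oversampling step standing in for the class balancing procedure; in practice the balancing should be applied inside each fold to avoid leakage.

```python
# Sketch with synthetic data (not the study's shipments): compare the four
# tested algorithm classes with cross-validation after naive class balancing.
import numpy as np
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score
from sklearn.utils import resample

rng = np.random.default_rng(0)
X = rng.normal(size=(600, 12))                           # placeholder shipment attributes
y = (X[:, 0] + rng.normal(size=600) > 1.0).astype(int)   # imbalanced "delayed" label

# naive oversampling of the minority (delayed) class up to the majority count
n_major = int((y == 0).sum())
X_up = resample(X[y == 1], n_samples=n_major, random_state=0)
X_bal = np.vstack([X[y == 0], X_up])
y_bal = np.concatenate([np.zeros(n_major, dtype=int), np.ones(n_major, dtype=int)])

models = {
    "SVM": SVC(),
    "RF": RandomForestClassifier(random_state=0),
    "ANN": MLPClassifier(max_iter=1000, random_state=0),
    "KNN": KNeighborsClassifier(),
}
for name, clf in models.items():
    acc = cross_val_score(clf, X_bal, y_bal, cv=5, scoring="accuracy").mean()
    print(name, round(acc, 3))
```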

Findings

Shipper, consignee and LSP data attribute selection achieved 86% accuracy through the RF algorithm in a cross-validation scenario after a combined class balancing procedure.

Originality/value

These findings expand the current literature on machine learning applied to air freight delay management, which has mostly focused on weather, airport structure, flight schedule, ground delay and congestion as explanatory attributes.

Details

International Journal of Physical Distribution & Logistics Management, vol. 54 no. 1
Type: Research Article
ISSN: 0960-0035
