Search results

1 – 5 of 5
Article
Publication date: 7 November 2016

Ismail Hmeidi, Mahmoud Al-Ayyoub, Nizar A. Mahyoub and Mohammed A. Shehab

Abstract

Purpose

Multi-label Text Classification (MTC) is one of the most recent research trends in the data mining and information retrieval domains, for reasons such as the rapid growth of online data and the increasing tendency of internet users to be more comfortable assigning multiple labels/tags to describe documents, emails, posts, etc. The dimensionality of labels makes MTC more difficult and challenging than traditional single-label text classification (TC). Because MTC is a natural extension of TC, several ways have been proposed to benefit from the rich TC literature through what are called problem transformation (PT) methods. Basically, PT methods transform the multi-label data into single-label data suitable for traditional single-label classification algorithms. Another approach is to design novel classification algorithms customized for MTC. Over the past decade, several works have appeared on both approaches, focusing mainly on the English language. This work aims to present an elaborate study of MTC of Arabic articles.
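
The problem transformation idea can be illustrated with binary relevance, one common PT method. The abstract does not say which PT methods the study evaluates, so the choice of method, the function name and the toy corpus below are assumptions made purely for illustration.

```python
from typing import Dict, List, Set, Tuple


def binary_relevance(
    docs: List[Tuple[str, Set[str]]]
) -> Dict[str, List[Tuple[str, int]]]:
    """Split one multi-label data set into one binary data set per label."""
    labels = sorted({label for _, doc_labels in docs for label in doc_labels})
    return {
        # Each document becomes a (text, 0/1) pair for this label.
        label: [(text, int(label in doc_labels)) for text, doc_labels in docs]
        for label in labels
    }


# Hypothetical toy corpus: each document carries a set of labels.
corpus = [
    ("election results announced", {"politics"}),
    ("minister opens new stadium", {"politics", "sport"}),
    ("team wins the cup final", {"sport"}),
]

per_label = binary_relevance(corpus)
```

Each entry of `per_label` is an ordinary single-label (binary) data set, so any traditional TC algorithm could be trained on it independently.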

Design/methodology/approach

This paper presents a novel lexicon-based method for MTC, where the keywords that are most associated with each label are extracted from the training data along with a threshold that can later be used to determine whether each test document belongs to a certain label.
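
A minimal sketch of a lexicon-based classifier in this spirit is shown below. The abstract does not give the association measure or how the threshold is derived, so both are assumptions here: words are ranked by the share of their documents that carry the label, and the threshold is the smallest number of lexicon hits seen among the training documents of that label. All function names and the toy data are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Set, Tuple

Doc = Tuple[str, Set[str]]


def tokens(text: str) -> Set[str]:
    """Lower-case whitespace tokenization (a stand-in for real preprocessing)."""
    return set(text.lower().split())


def build_lexicons(train: List[Doc], top_k: int = 2) -> Dict[str, Set[str]]:
    """Keep, per label, the top_k words most associated with that label."""
    overall: Counter = Counter()
    per_label: Dict[str, Counter] = {}
    for text, labels in train:
        words = tokens(text)
        overall.update(words)
        for label in labels:
            per_label.setdefault(label, Counter()).update(words)
    lexicons: Dict[str, Set[str]] = {}
    for label, counts in per_label.items():
        # Assumed score: fraction of the word's documents that carry the label;
        # ties are broken by in-label count, then alphabetically.
        ranked = sorted(counts, key=lambda w: (-counts[w] / overall[w], -counts[w], w))
        lexicons[label] = set(ranked[:top_k])
    return lexicons


def learn_thresholds(train: List[Doc], lexicons: Dict[str, Set[str]]) -> Dict[str, int]:
    """Assumed rule: the minimum hit count among the label's training documents."""
    thresholds: Dict[str, int] = {}
    for label, lexicon in lexicons.items():
        hits = [len(tokens(text) & lexicon) for text, labels in train if label in labels]
        thresholds[label] = max(1, min(hits))
    return thresholds


def predict(text: str, lexicons: Dict[str, Set[str]], thresholds: Dict[str, int]) -> Set[str]:
    """Assign every label whose lexicon is hit at least `threshold` times."""
    words = tokens(text)
    return {
        label for label, lexicon in lexicons.items()
        if len(words & lexicon) >= thresholds[label]
    }


# Hypothetical training data.
train = [
    ("parliament votes on budget", {"politics"}),
    ("minister attends cup final", {"politics", "sport"}),
    ("striker scores in cup final", {"sport"}),
]
lexicons = build_lexicons(train)
thresholds = learn_thresholds(train, lexicons)
predicted = predict("cup final budget debate", lexicons, thresholds)
```

Because each label is tested independently against its own lexicon and threshold, a document can receive zero, one or several labels, which is what makes the scheme multi-label.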

Findings

The experiments show that the presented approach outperforms the currently available approaches. Specifically, the best accuracy obtained from existing approaches is only 18 per cent, whereas the presented lexicon-based approach can reach an accuracy of 31 per cent.

Originality/value

Although there exist some tools that can be customized to address the MTC problem for Arabic text, their accuracies are very low when applied to Arabic articles. This paper presents a novel method for MTC. The experiments show that the presented approach outperforms the currently available approaches.

Details

International Journal of Web Information Systems, vol. 12 no. 4
Type: Research Article
ISSN: 1744-0084

Article
Publication date: 16 February 2024

Hossam Mohamed Toma, Ahmed H. Abdeen and Ahmed Ibrahim

Abstract

Purpose

The equipment resale price plays an important role in calculating the optimum time for equipment replacement. Some of the existing models that predict the equipment resale price do not take into account many of the factors that influence it. Other models consider more of these factors, but they still have low accuracy because of the modeling techniques used. An easy-to-use tool is required to help forecast the resale price and support efficient decisions on equipment replacement. This research presents a machine learning (ML) computer model that helps forecast the equipment resale price accurately.

Design/methodology/approach

A method was determined for measuring the factors that influence the equipment resale price. The values of those factors were measured for 1,700 pieces of equipment, together with their corresponding resale prices. The data were used to develop an ML model that covers three types of equipment (loaders, excavators and bulldozers). The methodology used to develop the model applied three ML algorithms (the random forest regressor, the extra trees regressor and the decision tree regressor) to find an accurate model for the equipment resale price. The three algorithms were verified and tested with data from 340 pieces of equipment.
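
The evaluation step can be sketched as a hold-out split followed by a goodness-of-fit score. This is only an illustration under stated assumptions: the abstract does not say whether the 340 test records are drawn from the 1,700 or collected separately (the sketch assumes the former), nor which accuracy measure is used (the sketch assumes the coefficient of determination, R²). The function names are hypothetical, and the regressors themselves are not reproduced here.

```python
import random
from typing import List, Sequence, Tuple


def holdout_split(n_records: int, n_test: int, seed: int = 0) -> Tuple[List[int], List[int]]:
    """Shuffle record indices and reserve n_test of them for testing."""
    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)
    return indices[n_test:], indices[:n_test]


def r2_score(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 minus residual over total sum of squares."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Assumed split: 340 of the 1,700 records held out, the rest used for training.
train_idx, test_idx = holdout_split(n_records=1700, n_test=340)
```

Each candidate regressor would be fitted on the `train_idx` records and scored on the `test_idx` records with the same metric, so that the three algorithms are compared on identical unseen data.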

Findings

Using a large amount of data to train the ML model resulted in a highly accurate prediction model. Of the three algorithms used to develop the ML model, the extra trees regressor achieved the highest accuracy. The accuracy of the model is 98%. A computer interface was designed to make the model easier to use.

Originality/value

The proposed model is accurate and makes it easy to predict the equipment resale price. The predicted resale price can be used to calculate equipment elements that are essential for developing a dependable equipment replacement plan. The proposed model was developed based on the factors that most influence the equipment resale price, and those factors were evaluated using reliable methods. The model was developed using ML, a technique that has proved accurate in modeling. The model's accuracy of 98% enhances its value.

Details

Engineering, Construction and Architectural Management, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 0969-9988

Article
Publication date: 25 October 2021

Enna Hirata and Takuma Matsuda

Abstract

Purpose

This research aims to uncover coronavirus disease 2019’s (COVID-19's) impact on shipping and logistics using Internet articles as the source.

Design/methodology/approach

This research applies web mining to collect information on COVID-19's impact on shipping and logistics from Internet articles. The information extracted is then analyzed through machine learning algorithms for useful insights.

Findings

The research results indicate that the recovery of the global supply chain in China could potentially drive the global supply chain to return to normalcy. In addition, researchers and policymakers should prioritize two aspects: (1) Ease of cross-border trade and logistics. Digitization of the supply chain and the application of breakthrough technologies like blockchain and IoT are needed more than ever before. (2) Supply chain resilience. The high dependency of the global supply chain on China sounds an alarm for supply chain resilience. It calls for a framework to increase global supply chain resilience that enables quick recovery from disruptions in the long term.

Originality/value

Differing from other studies taking the natural language processing (NLP) approach, this research uses Internet articles as the data source. The findings reveal significant components of COVID-19's impact on shipping and logistics, highlighting crucial agendas for scholars to research.

Details

Maritime Business Review, vol. 7 no. 4
Type: Research Article
ISSN: 2397-3757

Article
Publication date: 19 July 2023

Gaurav Kumar, Molla Ramizur Rahman, Abhinav Rajverma and Arun Kumar Misra

Abstract

Purpose

This study aims to analyse the systemic risk emitted by all publicly listed commercial banks in a key emerging economy, India.

Design/methodology/approach

The study uses the Adrian and Brunnermeier (2016) estimator to quantify the systemic risk (ΔCoVaR) that banks contribute to the system. The methodology addresses a classification problem based on the probability that a particular bank will emit high or moderate systemic risk. The study applies machine learning models such as logistic regression, random forest (RF), neural networks and gradient boosting machine (GBM), and addresses the issue of imbalanced data sets, to investigate which of the banks’ balance sheet features and stock features may determine systemic risk emission.
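
For reference, the ΔCoVaR measure is usually written as below. The notation is the standard presentation of the estimator rather than anything taken from this abstract: q is the quantile level, X^i is the return of bank i, and C(X^i) is a conditioning event on that bank.

```latex
% CoVaR: the q-quantile of the system's return, conditional on an event for bank i
\Pr\left( X^{\text{system}} \le \mathrm{CoVaR}_q^{\text{system}\mid C(X^i)} \;\middle|\; C(X^i) \right) = q

% Delta-CoVaR: the change in CoVaR when bank i moves from its median state to distress
\Delta\mathrm{CoVaR}_q^{\text{system}\mid i}
  = \mathrm{CoVaR}_q^{\text{system}\mid X^i = \mathrm{VaR}_q^i}
  - \mathrm{CoVaR}_q^{\text{system}\mid X^i = \mathrm{VaR}_{50}^i}
```

A bank whose distress shifts the system's conditional quantile further from its median-state value contributes more systemic risk, which is the quantity the classification labels (high versus moderate) are built on.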

Findings

Across various performance metrics, two specifications are preferred: RF and GBM. The study identifies the lag of the systemic risk estimator, stock beta, stock volatility and return on equity as important features for explaining the emission of systemic risk.

Practical implications

The findings will help banks and regulators identify the key features that can be used to formulate policy decisions.

Originality/value

This study contributes to the existing literature by suggesting classification algorithms that can be used to model the probability of systemic risk emission in a classification problem setting. Further, the study identifies the features responsible for the likelihood of systemic risk.

Details

Journal of Modelling in Management, vol. 19 no. 2
Type: Research Article
ISSN: 1746-5664

Article
Publication date: 19 November 2021

Swathi Kailasam, Sampath Dakshina Murthy Achanta, P. Rama Koteswara Rao, Ramesh Vatambeti and Saikumar Kayam

Abstract

Purpose

In cultivation, early harvest offers farmers an opportunity to increase production while reducing the chance of lower crop production rates, helping the economy remain balanced. The main aim is to predict disease in plants and distinguish the type of disease with the help of segmentation and random forest optimization classification. In this investigation, real-time early-stage crop images were collected from several data sources, such as the crop_science, yes_modes and nelson_wisc datasets.

Design/methodology/approach

This research presents random forest machine learning-based persuasive plant healthcare computing. If proper ecological care is not applied to early harvesting, it can cause diseases in plants, reduce the cropping rate and lower production. Different methods have been developed for early-stage crop analysis, but more advanced techniques are still needed. This investigation therefore detects plant diseases using threshold segmentation and random forest classification. The implemented design is verified through simulation analysis in Python 3.7.8.
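
The segmentation step can be sketched as a simple global threshold on a grayscale image. This is an illustration only: the abstract does not describe the thresholding rule, so the fixed threshold, the assumption that diseased regions are darker than healthy tissue, the function names and the toy image are all hypothetical.

```python
from typing import List

Image = List[List[int]]


def threshold_segment(gray: Image, threshold: int) -> Image:
    """Return a binary mask: 1 where the pixel is darker than the threshold."""
    return [[1 if pixel < threshold else 0 for pixel in row] for row in gray]


def lesion_fraction(mask: Image) -> float:
    """Share of pixels flagged by the mask; usable as a classifier feature."""
    total = sum(len(row) for row in mask)
    return sum(sum(row) for row in mask) / total


# Hypothetical 3x3 grayscale leaf patch (0 = black, 255 = white).
leaf = [
    [200, 190, 60],
    [210, 50, 40],
    [205, 195, 185],
]
mask = threshold_segment(leaf, threshold=100)
fraction = lesion_fraction(mask)
```

Features derived from the mask, such as the flagged fraction, are the kind of input a random forest classifier could then use to separate healthy from diseased samples.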

Findings

Although different methods have been developed for early-stage crop analysis, more are needed for early-stage crop harvesting; a disease detection system has therefore been implemented. Threshold segmentation combined with the RFO classifier yields 97.8% identification precision and a 99.3% true positive rate, with a peak signal-to-noise ratio (PSNR) of 59.823, a structural similarity index (SSIM) of 0.99894 and a mean squared error (MSE) of 0.00812.
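
The reported image-quality and detection metrics have standard definitions, sketched below for MSE, PSNR, precision and true positive rate (SSIM is omitted because it is considerably longer). The abstract does not state the pixel range or how the counts were obtained, so the default `max_value` of 255 and the function names are assumptions, and the sketch makes no attempt to reproduce the reported figures.

```python
import math
from typing import List

Image = List[List[float]]


def mse(a: Image, b: Image) -> float:
    """Mean squared error between two images of identical shape."""
    diffs = [(x - y) ** 2 for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b)]
    return sum(diffs) / len(diffs)


def psnr(a: Image, b: Image, max_value: float = 255.0) -> float:
    """Peak signal-to-noise ratio in decibels: 10 * log10(MAX^2 / MSE)."""
    error = mse(a, b)
    if error == 0:
        return math.inf  # identical images
    return 10 * math.log10(max_value ** 2 / error)


def precision(tp: int, fp: int) -> float:
    """Share of positive predictions that are correct."""
    return tp / (tp + fp)


def true_positive_rate(tp: int, fn: int) -> float:
    """Share of actual positives that are detected (recall)."""
    return tp / (tp + fn)
```

PSNR and MSE compare a processed image with its reference, whereas precision and true positive rate are computed from the classifier's confusion counts, so the two groups of figures measure different stages of the pipeline.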

Originality/value

The implemented machine learning design outperforms other methodologies and achieves a good detection rate.

Details

International Journal of Intelligent Computing and Cybernetics, vol. 15 no. 2
Type: Research Article
ISSN: 1756-378X
