Search results

1 – 4 of 4
Article
Publication date: 27 October 2020

Lokesh Singh, Rekh Ram Janghel and Satya Prakash Sahu

Abstract

Purpose

The study addresses the problems posed by skin lesion datasets with little training data for melanoma classification. The central challenge is the shortage of training data available when classifying lesions as melanoma or non-melanoma.

Design/methodology/approach

In this work, a transfer learning (TL) framework, Transfer Constituent Support Vector Machine (TrCSVM), is designed for melanoma classification. It is based on feature-based domain adaptation (FBDA) and leverages the support vector machine (SVM) and Transfer AdaBoost (TrAdaBoost). The framework works in two stages. First, SVM is used for domain adaptation to learn a more transferable representation between the source and target domains. This stage has two phases: for homogeneous domain adaptation, it augments features by transforming the data from source and target (different but related) domains into a shared subspace; for heterogeneous domain adaptation, it leverages knowledge by augmenting features from source to target (different and not related) domains in a shared subspace. Second, TrAdaBoost is used to adjust the weights of wrongly classified data in the newly generated source and target datasets.
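The reweighting step mentioned above can be illustrated with the update rule commonly described for TrAdaBoost: misclassified source instances are down-weighted by a fixed factor, while misclassified target instances are up-weighted according to the weak learner's error on the target data. The following is a minimal sketch under that assumption, not the TrCSVM implementation; the function name `tradaboost_update` and its parameters are hypothetical.

```python
import math


def tradaboost_update(src_w, tgt_w, src_wrong, tgt_wrong, n_rounds):
    """One TrAdaBoost-style reweighting round (illustrative sketch).

    src_w, tgt_w: current instance weights (lists of positive floats).
    src_wrong, tgt_wrong: booleans, True where the weak learner erred.
    n_rounds: total number of boosting rounds planned.
    Assumes at least one source and one target instance.
    """
    n = len(src_w)
    # Fixed multiplier (< 1 when n > 1) applied to misclassified source instances.
    beta_src = 1.0 / (1.0 + math.sqrt(2.0 * math.log(n) / n_rounds))

    # Weighted error of the weak learner, measured on the target data only.
    eps = sum(w for w, bad in zip(tgt_w, tgt_wrong) if bad) / sum(tgt_w)
    eps = min(max(eps, 1e-10), 0.499)  # keep beta_tgt finite and below 1
    beta_tgt = eps / (1.0 - eps)

    # Source mistakes shrink; target mistakes grow (division by beta_tgt < 1).
    new_src = [w * beta_src if bad else w for w, bad in zip(src_w, src_wrong)]
    new_tgt = [w / beta_tgt if bad else w for w, bad in zip(tgt_w, tgt_wrong)]

    # Normalize so that all weights sum to 1.
    total = sum(new_src) + sum(new_tgt)
    return [w / total for w in new_src], [w / total for w in new_tgt]
```

In a full boosting loop this function would be called once per round, after fitting the weak learner on the combined, weighted source and target data.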

Findings

The experimental results show that TrCSVM outperforms state-of-the-art TL methods on small datasets, with an accuracy of 98.82%.

Originality/value

Experiments are conducted on six skin lesion datasets, and performance is compared in terms of accuracy, precision, sensitivity and specificity. The effectiveness of TrCSVM is also evaluated on ten other datasets to test its generalization behavior. Its performance is further compared with two existing TL frameworks (TrResampling, TrAdaBoost) for the classification of melanoma.
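For reference, the four reported measures follow directly from the counts of a binary confusion matrix. The sketch below uses the standard definitions; the function name `binary_metrics` and the choice of `1` as the positive (melanoma) label are assumptions made for illustration.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, sensitivity and specificity for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    def ratio(num, den):
        # Return 0.0 instead of failing when a class is absent.
        return num / den if den else 0.0

    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "precision": ratio(tp, tp + fp),      # correct share of predicted positives
        "sensitivity": ratio(tp, tp + fn),    # recall on the positive class
        "specificity": ratio(tn, tn + fp),    # recall on the negative class
    }
```

On imbalanced data, sensitivity and specificity are usually more informative than accuracy alone, since a classifier that always predicts the majority class can still score a high accuracy.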

Details

Data Technologies and Applications, vol. 55 no. 1
Type: Research Article
ISSN: 2514-9288

Article
Publication date: 20 June 2022

Lokesh Singh, Rekh Ram Janghel and Satya Prakash Sahu

Abstract

Purpose

Automated skin lesion analysis plays a vital role in early detection. Relatively small, imbalanced skin lesion datasets impede learning and are a dominant concern in research on automated skin lesion analysis. The lack of adequate data makes it difficult to develop classification methods because of the skewed class distribution.

Design/methodology/approach

Boosting-based transfer learning (TL) paradigms such as the Transfer AdaBoost algorithm can compensate for such a lack of samples by taking advantage of auxiliary data. However, in such methods, the weights of beneficial source instances that represent the target converge quickly and stochastically, which results in a “weight-drift” that negates transfer. In this paper, a framework is designed using “Rare-Transfer” (RT), a boosting-based TL algorithm that prevents “weight-drift” and simultaneously addresses absolute-rarity in skin lesion datasets. RT prevents the weights of source samples from converging too quickly. It addresses absolute-rarity using an instance transfer approach that incorporates the best-fit set of auxiliary examples, which improves balanced error minimization. It compensates for class imbalance and the scarcity of training samples in absolute-rarity simultaneously to induce balanced error optimization.
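The weight-drift effect can be seen with simple arithmetic. Under a TrAdaBoost-style update, if the weak learner has weighted target error eps, the misclassified target mass is scaled by (1 - eps) / eps, so the total target mass grows by a factor of 2(1 - eps) each round; after normalization, even source instances that are always classified correctly lose weight. The sketch below simulates this, and also shows that scaling the source mass by the same factor keeps its share constant. This is an illustration of the phenomenon under those simplifying assumptions, not the RT algorithm itself; the function name and parameters are hypothetical.

```python
def source_share_after(rounds, eps, correct_drift=False, src_mass=0.5):
    """Share of total weight held by always-correct source instances.

    eps: constant weighted target error per round, assumed in (0, 0.5).
    correct_drift: if True, scale source mass by 2 * (1 - eps) each round.
    """
    tgt_mass = 1.0 - src_mass
    for _ in range(rounds):
        # Correct target mass is unchanged; misclassified mass (eps share)
        # is multiplied by (1 - eps) / eps. The sum equals 2 * (1 - eps) * tgt_mass.
        tgt_mass = tgt_mass * (1 - eps) + tgt_mass * eps * (1 - eps) / eps
        if correct_drift:
            # Illustrative correction: match the growth of the target mass.
            src_mass *= 2 * (1 - eps)
        # Renormalize so that the two masses sum to 1.
        total = src_mass + tgt_mass
        src_mass, tgt_mass = src_mass / total, tgt_mass / total
    return src_mass
```

For example, with `eps=0.2` the uncorrected source share falls below its starting value after a single round, while the corrected share stays at its starting value.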

Findings

Promising results are obtained with RT compared with state-of-the-art techniques on absolute-rare skin lesion datasets, with an accuracy of 92.5%. A Wilcoxon signed-rank test is used to examine whether the differences between the proposed RT algorithm and the conventional algorithms used in the experiment are significant.
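As a reminder of how the Wilcoxon signed-rank test compares two algorithms on paired scores, the sketch below computes the positive and negative rank sums from per-dataset differences. It follows the standard procedure (drop zero differences, rank absolute differences, average tied ranks); the p-value lookup is omitted, and the function name `wilcoxon_rank_sums` is hypothetical.

```python
def wilcoxon_rank_sums(a, b):
    """Return (W_plus, W_minus) for paired samples a and b."""
    # Zero differences carry no sign information and are dropped.
    diffs = [x - y for x, y in zip(a, b) if x != y]
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))

    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        # Find the run of tied absolute differences starting at position i.
        j = i
        while j + 1 < len(order) and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return w_plus, w_minus
```

The test statistic is usually taken as the smaller of the two sums and compared against a critical value for the number of non-zero differences.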

Originality/value

Experiments are performed on four absolute-rare skin lesion datasets, and the effectiveness of RT is assessed based on accuracy, sensitivity, specificity and area under the curve. The performance is compared with existing ensemble and boosting-based TL methods.

Details

Data Technologies and Applications, vol. 57 no. 1
Type: Research Article
ISSN: 2514-9288

Article
Publication date: 3 January 2018

Lei La, Shuyan Cao and Liangjuan Qin

Abstract

Purpose

As a foundational issue of social mining, sentiment classification suffers from a lack of labeled data. To enhance classification accuracy with few labeled data, many semi-supervised algorithms have been proposed. These algorithms improve classification performance when labeled data are insufficient. However, many semi-supervised methods struggle to ensure precision and efficiency at the same time. This paper aims to present a novel method for using unlabeled data more accurately and more efficiently.

Design/methodology/approach

First, the authors designed a boosting-based method for unlabeled data selection. The improved boosting-based method can choose unlabeled data that have the same distribution as the labeled data. The authors then proposed a novel strategy that combines weak classifiers into strong classifiers in a more rational way. Finally, a semi-supervised sentiment classification algorithm is given.
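The abstract does not describe the new combination strategy, so the sketch below shows only the standard AdaBoost baseline that such strategies typically modify: each weak classifier votes with weight alpha = 0.5 * ln((1 - eps) / eps), where eps is its weighted training error. The function names are hypothetical, and labels are assumed to be +1 or -1.

```python
import math


def adaboost_alpha(eps):
    """Standard AdaBoost vote weight for a weak classifier with error eps in (0, 1)."""
    return 0.5 * math.log((1.0 - eps) / eps)


def strong_predict(votes, errors):
    """Combine weak votes (+1 or -1) into one label by alpha-weighted voting."""
    score = sum(adaboost_alpha(e) * v for v, e in zip(votes, errors))
    return 1 if score >= 0 else -1
```

With votes `[+1, -1, -1]` and errors `[0.1, 0.4, 0.4]`, the single accurate classifier outweighs the two weaker ones, so the combined label differs from a simple majority vote.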

Findings

Experimental results demonstrate that the novel algorithm can achieve high accuracy with low time consumption. It is helpful for building high-performance social network-related applications.

Research limitations/implications

The novel method needs a small labeled data set for semi-supervised learning. In future work, the authors may extend it to an unsupervised method.

Practical implications

The method can be used in text mining, image classification, audio processing and other fields related to unstructured data mining. It overcomes the problem of insufficient labeled data and achieves high precision using less computational time.

Social implications

Sentiment mining has wide applications in public opinion management, public security, market analysis, social network and related fields. Sentiment classification is the basis of sentiment mining.

Originality/value

To the authors' knowledge, this is the first time transfer learning has been introduced to AdaBoost for semi-supervised learning. Moreover, the improved AdaBoost uses a completely new weighting mechanism.

Details

Kybernetes, vol. 47 no. 3
Type: Research Article
ISSN: 0368-492X

Article
Publication date: 20 July 2023

Mu Shengdong, Liu Yunjie and Gu Jijian

Abstract

Purpose

By introducing the Stacking algorithm to solve the underfitting problem caused by insufficient data in traditional machine learning, this paper provides a new solution to the cold start problem of entrepreneurial borrowing risk control.

Design/methodology/approach

The authors introduce semi-supervised learning and ensemble learning into the field of migration learning and propose Stacking model migration learning. This approach trains models independently on entrepreneurial borrowing credit data, then treats the migration strategy itself as the learning object and uses the Stacking algorithm to combine the prediction results of the source domain model and the target domain model.
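The combination step can be sketched as a small meta-learner that takes the predicted probabilities of a source domain model and a target domain model as its two input features. The example below fits a logistic regression meta-learner by plain gradient descent; it is a minimal illustration of stacking in general, not the authors' model, and the function names, learning rate and step count are assumptions.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_meta_learner(rows, labels, lr=1.0, steps=5000):
    """Fit a logistic regression on base-model outputs.

    rows: (p_source, p_target) probability pairs from the two base models.
    labels: true 0/1 labels for the same samples.
    Returns (w_src, w_tgt, bias).
    """
    w_src = w_tgt = bias = 0.0
    n = len(rows)
    for _ in range(steps):
        g_src = g_tgt = g_bias = 0.0
        for (p_src, p_tgt), y in zip(rows, labels):
            # Gradient of the log loss with respect to the linear score.
            err = sigmoid(w_src * p_src + w_tgt * p_tgt + bias) - y
            g_src += err * p_src
            g_tgt += err * p_tgt
            g_bias += err
        w_src -= lr * g_src / n
        w_tgt -= lr * g_tgt / n
        bias -= lr * g_bias / n
    return w_src, w_tgt, bias


def stacked_predict(params, p_src, p_tgt):
    """Probability of the positive class from the stacked model."""
    w_src, w_tgt, bias = params
    return sigmoid(w_src * p_src + w_tgt * p_tgt + bias)
```

In practice the meta-learner is usually trained on out-of-fold predictions of the base models, so that it does not simply learn to trust whichever base model overfits the training data.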

Findings

The effectiveness of the two migration learning models is evaluated with real entrepreneurial borrowing data. The Stacking-based model migration learning outperforms the benchmark model without migration learning techniques, with the model's area under the curve rising to 0.8. Comparing the two migration learning models reveals that the model-based migration learning approach performs better. The reason is that the sample-based migration learning approach only eliminates the noisy samples that are relatively less similar to the entrepreneurial borrowing data. However, the calculation and weighting of similarity are subjective, and there is no unified standard or procedure for them, so there is no guarantee that the retained traditional credit samples have the same sample distribution and feature structure as the entrepreneurial borrowing data.

Practical implications

From a practical standpoint, this paper provides a new solution to the cold start problem of entrepreneurial borrowing risk control. The small number of labeled high-quality samples cannot support the learning and deployment of big data risk control models, which is the cold start problem of the entrepreneurial borrowing risk control system. By extending the training sample set with auxiliary domain data through suitable migration learning methods, the prediction performance of the model can be improved to a certain extent and more general patterns can be learned.

Originality/value

This paper introduces the migration learning approach to the entrepreneurial borrowing scenario, provides a new solution to the cold start problem of the entrepreneurial borrowing risk control system and uses empirical data to verify the feasibility and effectiveness of applying migration learning in the risk control field.

Details

Management Decision, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 0025-1747
