Search results

1 – 10 of over 3000
Article
Publication date: 16 April 2020

Mohammad Mahdi Ershadi and Abbas Seifi

Abstract

Purpose

This study aims to support effective medical treatment through the differential diagnosis of several diseases using classification methods. To this end, classification methods based on data, on experts’ knowledge and, in some cases, on both are considered. In addition, feature reduction and clustering methods are used to improve their performance.

Design/methodology/approach

First, the performances of classification methods are evaluated for differential diagnosis of different diseases. Then, experts' knowledge is utilized to modify the Bayesian networks' structures. Analyses of the results show that using experts' knowledge is more effective than other algorithms for increasing the accuracy of Bayesian network classification. A total of ten different diseases are used for testing, taken from the Machine Learning Repository datasets of the University of California at Irvine (UCI).

Findings

The proposed method improves both the computation time and the accuracy of the classification methods used in this paper. Among all classification methods, Bayesian networks based on experts' knowledge achieve the highest average accuracy (87 percent) and the lowest average standard deviation (0.04) over the sample datasets.

Practical implications

The proposed methodology can be applied to perform disease differential diagnosis analysis.

Originality/value

This study demonstrates the usefulness of experts' knowledge in diagnosis and proposes an adopted improvement method for classification. In addition, the Bayesian network based on experts' knowledge proves useful for several diseases neglected by previous papers.

Details

International Journal of Intelligent Computing and Cybernetics, vol. 13 no. 1
Type: Research Article
ISSN: 1756-378X

Article
Publication date: 14 November 2016

Shrawan Kumar Trivedi and Shubhamoy Dey

Abstract

Purpose

Email is an important medium for sharing information rapidly. However, spam is a nuisance in such communication and motivates the building of a robust filtering system with high classification accuracy and good sensitivity towards false positives. In that context, this paper presents a combined classifier technique using a committee selection mechanism, where the main objective is to identify a set of classifiers whose individual decisions can be combined by a committee selection procedure for accurate detection of spam.

Design/methodology/approach

For training and testing of the relevant machine learning classifiers, text mining approaches are used in this research. Three data sets (Enron, SpamAssassin and LingSpam) have been used to test the classifiers. Initially, pre-processing is performed to extract the features associated with the email files. In the next step, the extracted features are taken through a dimensionality reduction method where non-informative features are removed. Subsequently, an informative feature subset is selected using genetic feature search. Thereafter, the proposed classifiers are tested on those informative features and the results compared with those of other classifiers.

Findings

For building the proposed combined classifier, three different studies were performed. The first study identifies the effect of boosting algorithms on two probabilistic classifiers: Bayesian and Naïve Bayes. In that study, AdaBoost was found to be the best algorithm for performance boosting. The second study examined the effect of different kernel functions on the support vector machine (SVM) classifier, where SVM with the normalized polynomial (NP) kernel was observed to be the best. The last study combined classifiers through committee selection, where the committee members were the best classifiers identified by the first study, i.e. Bayesian and Naïve Bayes with AdaBoost, and the committee president was selected from the second study, i.e. SVM with the NP kernel. Results show that combining the identified classifiers to form a committee machine gives excellent accuracy with a low false positive rate.
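The committee arrangement described above can be illustrated with a minimal sketch. The abstract does not state the exact combination rule, so the rule used here (unanimous members decide; otherwise the president decides) is an assumption made for illustration, as are the function name `committee_decision`, the label strings and the sample predictions.

```python
from typing import Sequence


def committee_decision(member_votes: Sequence[str], president_vote: str) -> str:
    """Combine member votes, deferring to the president on disagreement.

    Assumed rule for illustration only; the paper's procedure may differ.
    """
    if not member_votes:
        # No members available: the president's vote is the only opinion.
        return president_vote
    first = member_votes[0]
    if all(vote == first for vote in member_votes):
        # Members agree, so their shared label is returned.
        return first
    # Members disagree, so the president casts the deciding vote.
    return president_vote


# Hypothetical per-email predictions from two boosted probabilistic members
# (e.g. Bayesian and Naive Bayes with AdaBoost) and an SVM president.
members = [["spam", "ham", "spam"], ["spam", "spam", "ham"]]
president = ["spam", "ham", "ham"]

combined = [
    committee_decision(votes, p)
    for votes, p in zip(zip(*members), president)
]
print(combined)
```

In this sketch the classifiers are represented only by their predicted labels, so any trained models that output one label per email could be plugged in.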

Research limitations/implications

This research focuses on the classification of spam emails written in English. Only the body (text) of each email is used; image spam is not included. The work is restricted to email messages: other message types, such as short message service or multimedia messaging service messages, were not part of this study.

Practical implications

This research proposes a method of dealing with the issues and challenges faced by internet service providers and organizations that use email. The proposed model provides not only better classification accuracy but also a low false positive rate.

Originality/value

The proposed combined classifier is a novel classifier designed for accurate classification of email spam.

Details

VINE Journal of Information and Knowledge Management Systems, vol. 46 no. 4
Type: Research Article
ISSN: 2059-5891

Article
Publication date: 26 July 2019

Ayalapogu Ratna Raju, Suresh Pabboju and Ramisetty Rajeswara Rao

Abstract

Purpose

Brain tumor segmentation and classification aim to differentiate tumorous from non-tumorous cells in the brain and to classify the tumorous cells in order to identify the tumor level. The methods developed so far lack automatic classification and therefore consume considerable time. In this work, a novel brain tumor classification approach, namely, the harmony cuckoo search-based deep belief network (HCS-DBN), is proposed. The images in the database are segmented using the newly developed hybrid active contour (HAC) segmentation model, which integrates Bayesian fuzzy clustering (BFC) and the active contour model. The proposed HCS-DBN algorithm is trained with the features obtained from the segmented images. Finally, the classifier provides the tumor class for each slice available in the database. The proposed HAC and HCS-DBN algorithm are evaluated on the MRI images available in the BRATS database. The simulation results show that the proposed HAC and HCS-DBN algorithm achieve better overall performance, with values of 0.945, 0.9695 and 0.99348 for accuracy, sensitivity and specificity, respectively.

Design/methodology/approach

The proposed HAC segmentation approach integrates the properties of the AC model and BFC. Initially, the brain image with different modalities is subjected to segmentation with the BFC and AC models. Then, the Laplacian correction is applied to fuse the segmented outputs from each model. Finally, the proposed HAC segmentation provides the error-free segments of the brain tumor regions prevailing in the MRI image. The next step is to extract the useful features, based on scattering transform, wavelet transform and local Gabor binary pattern, from the segmented brain image. Finally, the extracted features from each segment are provided to the DBN for the training, and the HCS algorithm chooses the optimal weights for DBN training.

Findings

The proposed HAC with the HCS-DBN algorithm is evaluated on the standard BRATS database using metrics such as accuracy, sensitivity and specificity. The simulation results are compared against existing methods such as k-NN, NN, multi-SVM and multi-SVNN. The proposed HAC with the HCS-DBN algorithm outperforms the existing methods, with values of 0.945, 0.9695 and 0.99348 for accuracy, sensitivity and specificity, respectively.
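For readers unfamiliar with the three metrics named above, the following minimal sketch computes them from binary confusion-matrix counts using the standard definitions. The function name `classification_metrics` and the counts are hypothetical and are not taken from the BRATS experiments.

```python
def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Accuracy, sensitivity and specificity from confusion-matrix counts."""
    total = tp + fp + tn + fn
    if total == 0 or (tp + fn) == 0 or (tn + fp) == 0:
        raise ValueError("each class needs at least one sample")
    return {
        # Share of all samples that were labelled correctly.
        "accuracy": (tp + tn) / total,
        # Share of actual positives that were detected (true positive rate).
        "sensitivity": tp / (tp + fn),
        # Share of actual negatives that were rejected (true negative rate).
        "specificity": tn / (tn + fp),
    }


# Hypothetical counts for illustration only.
metrics = classification_metrics(tp=90, fp=5, tn=95, fn=10)
print(metrics)
```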

Originality/value

This work presents the brain tumor segmentation and the classification scheme by introducing the HAC-based segmentation model. The proposed HAC model combines the BFC and the active contour model through a fusion process, using the Laplacian correction probability for segmenting the slices in the database.

Details

Sensor Review, vol. 39 no. 4
Type: Research Article
ISSN: 0260-2288

Article
Publication date: 7 August 2017

Shianghau Wu and Jiannjong Guo

Abstract

Purpose

In this paper, the authors aim to identify the variables that affect the Taiwanese general public’s level of satisfaction with the government.

Design/methodology/approach

The authors use Bayesian quantile regression to explore the variables that affect public satisfaction at specific quantiles of Taiwanese satisfaction with the government, and rough set classification to explore key variables related to the satisfaction level. They then compare the classification performance of the two methods.
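As background on what "at specific quantiles" means, quantile regression is built around the check (pinball) loss, and Bayesian formulations commonly link this loss to an asymmetric Laplace likelihood. The abstract does not describe the authors' formulation, so the sketch below is only a generic illustration: it fits a constant (no covariates) and picks the observed value that minimizes the total check loss. The function names and the scores are hypothetical.

```python
def check_loss(residual: float, tau: float) -> float:
    """Check loss: tau * r when r >= 0, (tau - 1) * r when r < 0."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def best_constant_quantile(values, tau):
    """Return the observed value that minimizes the total check loss."""
    return min(
        values,
        key=lambda q: sum(check_loss(y - q, tau) for y in values),
    )


# Hypothetical satisfaction scores with one extreme value.
scores = [1, 2, 3, 4, 100]

median_fit = best_constant_quantile(scores, tau=0.5)  # central quantile
upper_fit = best_constant_quantile(scores, tau=0.9)   # upper quantile
print(median_fit, upper_fit)
```

Changing `tau` shifts the fit towards lower or higher parts of the distribution, which is why the method can describe respondents at different satisfaction levels separately.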

Findings

The experimental results identify the major factors that are positively related to higher satisfaction with the central government. These factors are: satisfaction with the uncorrupted performance of the central government; the expected economic condition of the household one year from the present; satisfaction with the Taiwanese central government’s measures on food safety; and satisfaction with the 12-year primary education reform.

Originality/value

The study’s originality lies in the application of Bayesian quantile regression and rough set classification to the analysis of Taiwanese satisfaction with the government. It offers further insight into the key variables related to different satisfaction levels and into the relative classification performance of the two methods.

Details

Kybernetes, vol. 46 no. 7
Type: Research Article
ISSN: 0368-492X

Article
Publication date: 25 October 2018

Shrawan Kumar Trivedi, Shubhamoy Dey and Anil Kumar

Abstract

Purpose

Sentiment analysis and opinion mining are emerging areas of research for analyzing Web data and capturing users’ sentiments. This research aims to present sentiment analysis of an Indian movie review corpus using natural language processing and various machine learning classifiers.

Design/methodology/approach

In this paper, a comparative study between three machine learning classifiers (Bayesian, naïve Bayesian and support vector machine [SVM]) was performed. All the classifiers were trained on the words/features of the corpus extracted, using five different feature selection algorithms (Chi-square, info-gain, gain ratio, one-R and relief-F [RF] attributes), and a comparative study was performed between them. The classifiers and feature selection approaches were evaluated using different metrics (F-value, false-positive [FP] rate and training time).
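To make the feature selection step more concrete, the sketch below scores words with the standard chi-square statistic for a 2x2 word-by-class contingency table, the first of the five selection criteria named above. It is a generic illustration rather than the authors' pipeline; the function name `chi_square_2x2`, the words and the document counts are all hypothetical.

```python
def chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Chi-square statistic for a 2x2 word/class contingency table.

    a: positive reviews containing the word
    b: negative reviews containing the word
    c: positive reviews without the word
    d: negative reviews without the word
    """
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        # A row or column is empty, so the word carries no class signal.
        return 0.0
    return n * (a * d - b * c) ** 2 / denominator


# Hypothetical document counts (a, b, c, d) for three candidate words.
word_counts = {
    "superb": (30, 10, 20, 40),
    "boring": (5, 35, 45, 15),
    "movie": (25, 25, 25, 25),
}

# Rank words from most to least class-dependent.
ranked = sorted(
    word_counts,
    key=lambda w: chi_square_2x2(*word_counts[w]),
    reverse=True,
)
print(ranked)
```

A word that appears equally often in both classes (here "movie") scores zero and would be among the first features dropped.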

Findings

The results of this study show that, for the maximum number of features, the RF feature selection approach was found to be the best, with better F-values, a low FP rate and less time needed to train the classifiers, whereas for the least number of features, one-R was better than RF. When the evaluation was performed for machine learning classifiers, SVM was found to be superior, although the Bayesian classifier was comparable with SVM.

Originality/value

This research is novel in that Indian review data were collected and a classification model for sentiment polarity (positive/negative) was then constructed.

Details

The Electronic Library, vol. 36 no. 4
Type: Research Article
ISSN: 0264-0473

Open Access
Article
Publication date: 22 November 2022

Kedong Yin, Yun Cao, Shiwei Zhou and Xinman Lv

Abstract

Purpose

The purposes of this research are to study the theory and method of multi-attribute index system design and establish a set of systematic, standardized, scientific index systems for the design optimization and inspection process. The research may form the basis for a rational, comprehensive evaluation and provide the most effective way of improving the quality of management decision-making. It is of practical significance to improve the rationality and reliability of the index system and provide standardized, scientific reference standards and theoretical guidance for the design and construction of the index system.

Design/methodology/approach

Using modern methods such as complex networks and machine learning, a system for the quality diagnosis of index data and the classification and stratification of index systems is designed. This guarantees the quality of the index data, realizes the scientific classification and stratification of the index system, reduces the subjectivity and randomness of the design of the index system, enhances its objectivity and rationality and lays a solid foundation for the optimal design of the index system.

Findings

Based on ideas from statistics, system theory, machine learning and data mining, the present research focuses on “data quality diagnosis” and “index classification and stratification” and clarifies the classification standards and data quality characteristics of index data. A data-quality diagnosis system of “data review – data cleaning – data conversion – data inspection” is established. Using a decision tree, explanatory structural model, cluster analysis, K-means clustering and other methods, a classification and stratification method system for indicators is designed to reduce the redundancy of indicator data and improve the quality of the data used. Finally, a scientific and standardized classification and hierarchical design of the index system can be realized.

Originality/value

The innovative contributions and research value of the paper are reflected in three aspects. First, a method system for index data quality diagnosis is designed, and multi-source data fusion technology is adopted to ensure the quality of multi-source, heterogeneous and mixed-frequency data of the index system. Second, a systematic quality-inspection process for missing data is designed, based on systematic thinking about the whole and the individual. To ensure the accuracy, reliability and feasibility of the patched data, a quality-inspection method for patched data based on inversion thought and a unified representation method for data fusion based on a tensor model are proposed. Third, modern unsupervised learning methods are used to classify and stratify the index system, which reduces the subjectivity and randomness of its design and enhances its objectivity and rationality.

Details

Marine Economics and Management, vol. 5 no. 2
Type: Research Article
ISSN: 2516-158X

Article
Publication date: 29 October 2018

Shrawan Kumar Trivedi and Shubhamoy Dey

Abstract

Purpose

To be sustainable and competitive in the current business environment, it is useful to understand users’ sentiment towards products and services. This critical task can be achieved via natural language processing and machine learning classifiers. This paper aims to propose a novel probabilistic committee selection classifier (PCC) to analyse and classify the sentiment polarities of movie reviews.

Design/methodology/approach

An Indian movie review corpus is assembled for this study. Another publicly available movie review polarity corpus is also involved with regard to validating the results. The greedy stepwise search method is used to extract the features/words of the reviews. The performance of the proposed classifier is measured using different metrics, such as F-measure, false positive rate, receiver operating characteristic (ROC) curve and training time. Further, the proposed classifier is compared with other popular machine-learning classifiers, such as Bayesian, Naïve Bayes, Decision Tree (J48), Support Vector Machine and Random Forest.

Findings

The results of this study show that the proposed classifier is good at predicting the positive or negative polarity of movie reviews. The accuracy and ROC curve value of the PCC are found to be the best of all the classifiers tested in this study. This classifier is also found to be efficient at identifying positive sentiments of reviews, giving low false positive rates for both the Indian Movie Review and Review Polarity corpora used in this study. The training time of the proposed classifier is found to be slightly higher than that of Bayesian, Naïve Bayes and J48.

Research limitations/implications

Only movie review sentiments written in English are considered. In addition, the proposed committee selection classifier is prepared only using the committee of probabilistic classifiers; however, other classifier committees can also be built, tested and compared with the present experiment scenario.

Practical implications

In this paper, a novel probabilistic approach is proposed and used for classifying movie reviews, and is found to be highly effective in comparison with other state-of-the-art classifiers. This classifier may be tested for different applications and may provide new insights for developers and researchers.

Social implications

The proposed PCC may be used to classify different product reviews, and hence may be beneficial to organizations to justify users’ reviews about specific products or services. By using authentic positive and negative sentiments of users, the credibility of the specific product, service or event may be enhanced. PCC may also be applied to other applications, such as spam detection, blog mining, news mining and various other data-mining applications.

Originality/value

The constructed PCC is novel and was tested on Indian movie review data.

Article
Publication date: 5 February 2018

Loukas K. Tsironis

Abstract

Purpose

The purpose of this paper is to propose a way of implementing data mining (DM) techniques and algorithms to apply quality improvement (QI) approaches in order to resolve quality issues (Rokach and Maimon, 2006; Köksal et al., 2011; Kahraman and Yanik, 2016). The effectiveness of the proposed methodologies is demonstrated through their application results. The goal of this paper is to develop a DM system based on the seven new QI tools in order to discover useful knowledge, in the form of rules, that are hidden in a vast amount of data and to propose solutions and actions that will lead an organization to improve its quality through the evaluation of the results.

Design/methodology/approach

Four popular data-mining approaches (rough sets, association rules, classification rules and Bayesian networks) are applied on a set of 12,477 case records concerning vehicle damages. The set of rules and patterns that is produced by each algorithm is used as an input in order to dynamically form each of the seven new quality tools (QTs).

Findings

The proposed approach enables the creation of the QTs starting from the raw data and passing through the DM process.

Originality/value

The present paper proposes an innovative approach to forming the seven new QTs of quality management using popular DM algorithms. The resulting seven DM QTs were used to identify and understand patterns, so that even non-experts can draw useful conclusions and make decisions.

Details

Benchmarking: An International Journal, vol. 25 no. 1
Type: Research Article
ISSN: 1463-5771

Article
Publication date: 1 September 2021

Yuting Jiang, Shengli Deng, Hongxiu Li and Yong Liu

Abstract

Purpose

The purposes of this paper are to (1) explore how personality traits pertaining to the dominance influence steadiness compliance model manifest themselves in terms of user interaction behavior on social media and (2) examine whether social interaction data on social media platforms can predict user personality.

Design/methodology/approach

Social interaction data was collected from 198 users of Sina Weibo, a popular social media platform in China. Their personality traits were also measured via questionnaire. Machine learning techniques were applied to predict the personality traits based on the social interaction data.

Findings

The results demonstrated that the proposed classifiers had high prediction accuracy, indicating that our approach is reliable and can be used with social interaction data on social media platforms to predict user personality. “Reposting,” “being reposted,” “commenting” and “being commented on” were found to be the key interaction features that reflected Weibo users' personalities, whereas “liking” was not found to be a key feature.

Originality/value

The findings of this study are expected to enrich personality prediction research based on social media data and to provide insights into the potential of employing social media data for the purpose of personality prediction in the context of the Weibo social media platform in China.

Details

Aslib Journal of Information Management, vol. 73 no. 6
Type: Research Article
ISSN: 2050-3806

Details

Operations Research for Libraries and Information Agencies: Techniques for the Evaluation of Management Decision Alternatives
Type: Book
ISBN: 978-0-12424-520-4
