Search results

Open Access
Article
Publication date: 22 November 2022

Kedong Yin, Yun Cao, Shiwei Zhou and Xinman Lv

Abstract

Purpose

The purposes of this research are to study the theory and method of multi-attribute index system design and establish a set of systematic, standardized, scientific index systems for the design optimization and inspection process. The research may form the basis for a rational, comprehensive evaluation and provide the most effective way of improving the quality of management decision-making. It is of practical significance to improve the rationality and reliability of the index system and provide standardized, scientific reference standards and theoretical guidance for the design and construction of the index system.

Design/methodology/approach

Using modern methods such as complex networks and machine learning, a system for the quality diagnosis of index data and the classification and stratification of index systems is designed. This guarantees the quality of the index data, realizes the scientific classification and stratification of the index system, reduces the subjectivity and randomness of the design of the index system, enhances its objectivity and rationality and lays a solid foundation for the optimal design of the index system.

Findings

Drawing on statistics, system theory, machine learning and data mining, the research focuses on “data quality diagnosis” and “index classification and stratification”. It clarifies the classification standards and data quality characteristics of index data and establishes a data-quality diagnosis system of “data review – data cleaning – data conversion – data inspection”. Using decision trees, an explanatory structural model, cluster analysis, K-means clustering and other methods, a method system for classifying and stratifying indicators is designed to reduce the redundancy of indicator data and improve the quality of the data used. As a result, the classification and hierarchical design of the index system can be carried out in a scientific and standardized way.

Originality/value

The innovative contributions and research value of the paper are reflected in three aspects. First, a method system for index data quality diagnosis is designed, and multi-source data fusion technology is adopted to ensure the quality of the multi-source, heterogeneous and mixed-frequency data of the index system. Second, a systematic quality-inspection process for missing data is designed, based on systematic thinking about the whole and the individual. To address the accuracy, reliability and feasibility of the patched data, a quality-inspection method for patched data based on the idea of inversion and a unified representation method for data fusion based on a tensor model are proposed. Third, unsupervised learning is used to classify and stratify the index system, which reduces the subjectivity and randomness of the design of the index system and enhances its objectivity and rationality.

Details

Marine Economics and Management, vol. 5 no. 2
Type: Research Article
ISSN: 2516-158X

Article
Publication date: 28 February 2023

Meltem Aksoy, Seda Yanık and Mehmet Fatih Amasyali

Abstract

Purpose

When a large number of project proposals are evaluated to allocate available funds, grouping them based on their similarities is beneficial. Current approaches to group proposals are primarily based on manual matching of similar topics, discipline areas and keywords declared by project applicants. When the number of proposals increases, this task becomes complex and requires excessive time. This paper aims to demonstrate how to effectively use the rich information in the titles and abstracts of Turkish project proposals to group them automatically.

Design/methodology/approach

This study proposes a model that effectively groups Turkish project proposals by combining word embedding, clustering and classification techniques. The proposed model uses FastText, BERT and term frequency/inverse document frequency (TF/IDF) word-embedding techniques to extract terms from the titles and abstracts of project proposals in Turkish. The extracted terms were grouped using both the clustering and classification techniques. Natural groups contained within the corpus were discovered using k-means, k-means++, k-medoids and agglomerative clustering algorithms. Additionally, this study employs classification approaches to predict the target class for each document in the corpus. To classify project proposals, various classifiers, including k-nearest neighbors (KNN), support vector machines (SVM), artificial neural networks (ANN), classification and regression trees (CART) and random forest (RF), are used. Empirical experiments were conducted to validate the effectiveness of the proposed method by using real data from the Istanbul Development Agency.
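As a rough illustration of the frequency-based end of this pipeline, the sketch below turns short texts into TF/IDF vectors and compares them by cosine similarity, which is the kind of input a clustering or classification step would consume. It is a minimal standard-library sketch, not the authors' implementation: the whitespace tokenizer, the `tf * log(N / df)` weighting variant, the function names and the three English sample titles are all assumptions made for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document, weighting by tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in docs]  # naive whitespace tokenizer (assumption)
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append(
            {t: (c / len(tokens)) * math.log(n_docs / df[t]) for t, c in tf.items()}
        )
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical proposal titles, invented for this example.
docs = [
    "solar energy storage pilot",
    "solar energy grid pilot",
    "youth employment training program",
]
vecs = tfidf_vectors(docs)
print(cosine(vecs[0], vecs[1]), cosine(vecs[0], vecs[2]))
```

The two energy-related titles share terms and therefore score a higher similarity than the unrelated pair, which shares none. Dense embeddings such as FastText or BERT would replace `tfidf_vectors` while the downstream similarity-based grouping stays conceptually the same.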

Findings

The results show that the generated word embeddings can effectively represent proposal texts as vectors, and can be used as inputs for clustering or classification algorithms. Using clustering algorithms, the document corpus is divided into five groups. In addition, the results demonstrate that the proposals can easily be categorized into predefined categories using classification algorithms. SVM-Linear achieved the highest prediction accuracy (89.2%) with the FastText word embedding method. A comparison of manual grouping with automatic classification and clustering results revealed that both classification and clustering techniques have a high success rate.

Research limitations/implications

The proposed model automatically benefits from the rich information in project proposals and significantly reduces numerous time-consuming tasks that managers must perform manually. Thus, it eliminates the drawbacks of the current manual methods and yields significantly more accurate results. In the future, additional experiments should be conducted to validate the proposed method using data from other funding organizations.

Originality/value

This study applies word embedding methods to make effective use of the rich information in the titles and abstracts of Turkish project proposals. Existing studies on the automatic grouping of proposals use traditional frequency-based methods for feature extraction to represent project proposals. Unlike previous research, this study employs two high-performing neural network-based textual feature extraction techniques to obtain terms representing the proposals: BERT as a contextual word embedding method and FastText as a static word embedding method. Moreover, to the best of our knowledge, no research has been conducted on the grouping of project proposals in Turkish.

Details

International Journal of Intelligent Computing and Cybernetics, vol. 16 no. 3
Type: Research Article
ISSN: 1756-378X

Article
Publication date: 16 April 2020

Mohammad Mahdi Ershadi and Abbas Seifi

Abstract

Purpose

This study aims to perform differential diagnosis of several diseases using classification methods to support effective medical treatment. For this purpose, different classification methods based on data, on experts’ knowledge and on both are considered in some cases. In addition, feature reduction and some clustering methods are used to improve their performance.

Design/methodology/approach

First, the performances of classification methods are evaluated for differential diagnosis of different diseases. Then, experts' knowledge is utilized to modify the Bayesian networks' structures. Analyses of the results show that using experts' knowledge is more effective than other algorithms for increasing the accuracy of Bayesian network classification. A total of ten different diseases are used for testing, taken from the Machine Learning Repository datasets of the University of California at Irvine (UCI).

Findings

The proposed method improves both the computation time and the accuracy of the classification methods used in this paper. Among all classification methods, Bayesian networks based on experts' knowledge achieve the highest average accuracy (87 percent) and the lowest average standard deviation (0.04) over the sample datasets.

Practical implications

The proposed methodology can be applied to perform disease differential diagnosis analysis.

Originality/value

This study demonstrates the usefulness of experts' knowledge in diagnosis and proposes an adapted improvement method for classification. In addition, the Bayesian network based on experts' knowledge is useful for different diseases that previous papers neglected.

Details

International Journal of Intelligent Computing and Cybernetics, vol. 13 no. 1
Type: Research Article
ISSN: 1756-378X

Article
Publication date: 19 December 2023

Jinchao Huang

Abstract

Purpose

Single-shot multi-category clothing recognition and retrieval play a crucial role in online searching and offline settlement scenarios. Existing clothing recognition methods based on RGBD clothing images often suffer from high-dimensional feature representations, leading to compromised performance and efficiency.

Design/methodology/approach

To address this issue, this paper proposes a novel method called Manifold Embedded Discriminative Feature Selection (MEDFS) to select global and local features, thereby reducing the dimensionality of the feature representation and improving performance. Specifically, by combining three global features and three local features, a low-dimensional embedding is constructed to capture the correlations between features and categories. The MEDFS method designs an optimization framework utilizing manifold mapping and sparse regularization to achieve feature selection. The optimization objective is solved using an alternating iterative strategy, ensuring convergence.

Findings

Empirical studies conducted on a publicly available RGBD clothing image dataset demonstrate that the proposed MEDFS method achieves highly competitive clothing classification performance while maintaining efficiency in clothing recognition and retrieval.

Originality/value

This paper introduces a novel approach for multi-category clothing recognition and retrieval, incorporating the selection of global and local features. The proposed method holds potential for practical applications in real-world clothing scenarios.

Details

International Journal of Intelligent Computing and Cybernetics, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 1756-378X

Book part
Publication date: 6 September 2019

Son Nguyen, Gao Niu, John Quinn, Alan Olinsky, Jonathan Ormsbee, Richard M. Smith and James Bishop

Abstract

In recent years, the problem of classification with imbalanced data has been growing in popularity in the data-mining and machine-learning communities due to the emergence of an abundance of imbalanced data in many fields. In this chapter, we compare the performance of six classification methods on an imbalanced dataset under the influence of four resampling techniques. These classification methods are the random forest, the support vector machine, logistic regression, k-nearest neighbor (KNN), the decision tree, and AdaBoost. Our study has shown that all of the classification methods have difficulty when working with the imbalanced data, with the KNN performing the worst, detecting only 27.4% of the minority class. However, with the help of resampling techniques, all of the classification methods improve in overall performance. In particular, the random forest, in combination with the random over-sampling technique, performs the best, achieving 82.8% balanced accuracy (the average of the true-positive rate and true-negative rate).

We then propose a new procedure to resample the data. Our method is based on the idea of eliminating “easy” majority observations before under-sampling them. It has further improved the balanced accuracy of the Random Forest to 83.7%, making it the best approach for the imbalanced data.
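The balanced accuracy metric quoted in this abstract (the average of the true-positive and true-negative rates) and plain random over-sampling can be sketched in a few lines. This is an illustrative sketch only: it assumes a binary problem with labels 0 and 1, the function names and toy labels are hypothetical, and it does not reproduce the chapter's proposed procedure of eliminating "easy" majority observations.

```python
import random


def balanced_accuracy(y_true, y_pred, positive=1):
    """Average of the true-positive rate and the true-negative rate (binary labels)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    tpr = tp / (tp + fn)
    tnr = tn / (tn + fp)
    return (tpr + tnr) / 2


def random_oversample(rows, labels, minority=1, seed=0):
    """Duplicate randomly chosen minority rows until both classes have equal size."""
    rng = random.Random(seed)
    minority_idx = [i for i, y in enumerate(labels) if y == minority]
    majority_count = len(labels) - len(minority_idx)
    extra = [rng.choice(minority_idx) for _ in range(majority_count - len(minority_idx))]
    keep = list(range(len(labels))) + extra
    return [rows[i] for i in keep], [labels[i] for i in keep]


# Toy data: 2 minority samples, 8 majority samples, and a classifier that
# always predicts the majority class.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [0] * 10
plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(plain_accuracy, balanced_accuracy(y_true, y_pred))

rows = [[i] for i in range(10)]
new_rows, new_labels = random_oversample(rows, y_true)
print(new_labels.count(1), new_labels.count(0))
```

The always-majority classifier looks acceptable on plain accuracy but scores only 0.5 on balanced accuracy, which is why the latter is the more informative metric for imbalanced data.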

Details

Advances in Business and Management Forecasting
Type: Book
ISBN: 978-1-78754-290-7

Article
Publication date: 6 October 2023

Vahide Bulut

Abstract

Purpose

Feature extraction from 3D datasets is a current problem. Machine learning is an important tool for classification of complex 3D datasets. Machine learning classification techniques are widely used in various fields, such as text classification, pattern recognition, medical disease analysis, etc. The aim of this study is to apply the most popular classification and regression methods to determine the best classification and regression method based on the geodesics.

Design/methodology/approach

The feature vector is determined by the unit normal vector and the unit principal vector at each point of the 3D surface along with the point coordinates themselves. Moreover, different examples are compared according to the classification methods in terms of accuracy and the regression algorithms in terms of R-squared value.

Findings

Several surface examples are analyzed for the feature vector using classification (31 methods) and regression (23 methods) machine learning algorithms. In addition, two ensemble methods, XGBoost and LightGBM, are used for classification and regression. The scores for each surface example are also compared.

Originality/value

To the best of the author’s knowledge, this is the first study to analyze datasets based on geodesics using machine learning algorithms for classification and regression.

Details

Engineering Computations, vol. 40 no. 9/10
Type: Research Article
ISSN: 0264-4401

Article
Publication date: 31 July 2019

Zhe Zhang and Yue Dai

Abstract

Purpose

For classification problems of customer relationship management (CRM), the purpose of this paper is to propose a method with interpretability of the classification results that combines multiple decision trees based on a genetic algorithm.

Design/methodology/approach

In the proposed method, multiple decision trees are combined in parallel. Subsequently, a genetic algorithm is used to optimize the weight matrix in the combination algorithm.
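One way to picture this combination step is a weighted vote over the trees' predictions, with a small genetic algorithm searching for the weights. The sketch below is an assumption-laden simplification rather than the paper's algorithm: it uses one weight per tree instead of a weight matrix, takes the trees' 0/1 predictions as given, and the selection, crossover and mutation choices, the function names and the toy data are all hypothetical.

```python
import random


def weighted_vote(weights, votes):
    """Return 1 if the weighted share of trees voting 1 is at least one half, else 0."""
    total = sum(weights)
    score = sum(w for w, v in zip(weights, votes) if v == 1)
    return 1 if total > 0 and score / total >= 0.5 else 0


def accuracy(weights, tree_preds, labels):
    """tree_preds[i][j] is the 0/1 prediction of tree j for sample i."""
    hits = sum(weighted_vote(weights, votes) == y for votes, y in zip(tree_preds, labels))
    return hits / len(labels)


def evolve_weights(tree_preds, labels, pop_size=20, generations=30, seed=0):
    """Search for tree weights with a tiny genetic algorithm (needs at least 2 trees)."""
    rng = random.Random(seed)
    n_trees = len(tree_preds[0])

    def fitness(w):
        return accuracy(w, tree_preds, labels)

    population = [[rng.random() for _ in range(n_trees)] for _ in range(pop_size)]
    population[0] = [1.0] * n_trees  # include the equal-weight vote as a baseline individual
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]  # truncation selection; the best individual survives
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_trees)  # one-point crossover
            child = a[:cut] + b[cut:]
            i = rng.randrange(n_trees)  # mutate one weight, clamped to [0, 1]
            child[i] = min(1.0, max(0.0, child[i] + rng.gauss(0, 0.1)))
            children.append(child)
        population = parents + children
    return max(population, key=fitness)


# Toy predictions from three hypothetical trees on six samples.
tree_preds = [[1, 0, 0], [1, 1, 0], [0, 0, 1], [0, 1, 1], [1, 0, 1], [0, 1, 0]]
labels = [1, 1, 0, 0, 1, 0]
best = evolve_weights(tree_preds, labels)
print(accuracy([1.0, 1.0, 1.0], tree_preds, labels), accuracy(best, tree_preds, labels))
```

Because the best individual is carried over unchanged each generation and the equal-weight vote is part of the initial population, the evolved weights can never score below the unweighted majority vote on the data used for fitness. Each tree's own rules remain readable, which is the interpretability argument the abstract makes for combining trees this way.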

Findings

The method is applied to customer credit rating assessment and customer response behavior pattern recognition. The results demonstrate that compared to a single decision tree, the proposed combination method improves the predictive accuracy and optimizes the classification rules, while maintaining interpretability of the classification results.

Originality/value

The findings of this study contribute to research methodologies in CRM. It specifically focuses on a new method with interpretability by combining multiple decision trees based on genetic algorithms for customer classification.

Details

Asia Pacific Journal of Marketing and Logistics, vol. 32 no. 5
Type: Research Article
ISSN: 1355-5855

Article
Publication date: 17 August 2018

Youlong Lv, Wei Qin, Jungang Yang and Jie Zhang

Abstract

Purpose

Three adjustment modes are alternatives for mixed-model assembly lines (MMALs) to improve their production plans according to constantly changing customer requirements. The purpose of this paper is to deal with the decision-making problem between these modes by proposing a novel multi-classification method. This method recommends appropriate adjustment modes for the assembly lines faced with different customer orders through machine learning from historical data.

Design/methodology/approach

The decision-making method uses the classification model composed of an input layer, two intermediate layers and an output layer. The input layer describes the assembly line in a knowledge-intensive manner by presenting the impact degrees of production parameters on line performances. The first intermediate layer provides the support vector data description (SVDD) of each adjustment mode through historical data training. The second intermediate layer employs the Dempster–Shafer (D–S) theory to combine the posterior classification possibilities generated from different SVDDs. The output layer gives the adjustment mode with the maximum posterior possibility as the classification result according to Bayesian decision theory.
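The evidence-combination step in the second intermediate layer relies on Dempster's rule of combination, which can be sketched as follows. This is a generic textbook version of the rule, not the paper's model: how the SVDD outputs are converted into mass functions is not described in the abstract, so the two mass functions below, the mode names `A`, `B`, `C` and the function name are assumptions for illustration.

```python
from itertools import product


def dempster_combine(m1, m2):
    """Combine two mass functions whose keys are frozensets of hypotheses."""
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            # Agreeing evidence: mass goes to the intersection of the two focal sets.
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            # Disjoint focal sets: their joint mass counts as conflict.
            conflict += wa * wb
    if 1.0 - conflict < 1e-12:
        raise ValueError("total conflict: Dempster's rule is undefined")
    # Renormalize by the non-conflicting mass.
    return {h: w / (1.0 - conflict) for h, w in combined.items()}


# Hypothetical masses from two sources over three adjustment modes A, B, C.
A, B, C = frozenset("A"), frozenset("B"), frozenset("C")
ANY = frozenset("ABC")  # mass left on "any mode" expresses uncertainty
m1 = {A: 0.6, B: 0.3, ANY: 0.1}
m2 = {A: 0.5, C: 0.2, ANY: 0.3}

fused = dempster_combine(m1, m2)
decision = max(fused, key=fused.get)
print(sorted(decision), round(fused[decision], 3))
```

Picking the hypothesis with the largest fused mass mirrors the output layer's choice of the adjustment mode with the maximum posterior possibility.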

Findings

The proposed method achieves higher classification accuracies than the support vector machine methods and the traditional SVDD method in the numerical test consisting of data sets from the machine-learning repository and the case study of a diesel engine assembly line.

Practical implications

This research recommends appropriate adjustment modes for MMALs in response to customer demand changes. According to the suggested adjustment mode, the managers can improve the line performance more effectively by using the well-designed optimization methods for a specific scope.

Originality/value

The adjustment mode decision is a multi-classification problem characterized by limited historical data. Although traditional SVDD methods can solve these problems by providing the posterior possibility of each classification result, they might have poor classification accuracies owing to the conflicts and uncertainties of these possibilities. This paper develops a novel classification model that integrates the SVDD method with the D–S theory. By handling the conflicts and uncertainties appropriately, this model achieves higher classification accuracies than traditional methods.

Details

Industrial Management & Data Systems, vol. 118 no. 8
Type: Research Article
ISSN: 0263-5577

Article
Publication date: 1 April 2006

Janita F.J. Vos and Marjolein C. Achterkamp

Abstract

Purpose

The management of stakeholder involvement within innovation projects is a task of growing importance. The purpose of this paper is to present a method for the first challenge in stakeholder management: the identification of those stakeholders to be involved in innovation projects.

Design/methodology/approach

Analysis of stakeholder literature leads to the conclusion that stakeholder identification is considered a problem of classification. Although the availability of a classification model is necessary, it is argued that for a classification model to be of use in identifying stakeholders, such a model needs to be supplemented with an identification procedure for identifying real world parties. Furthermore, a classification model should fit the context the stakeholders are identified for, in this case for innovation projects. These insights have led to the development of a classification model fitting the innovation context, and to the embedding of this model, along with a matching identification procedure, in an identification method.

Findings

A partial and integral evaluation of the method on four cases showed its efficacy in the managerial practice of identifying stakeholders within innovation projects.

Originality/value

The method as proposed in the paper can be used for identifying stakeholders in innovation projects. The method can be considered a first step in managing stakeholder involvement.

Details

European Journal of Innovation Management, vol. 9 no. 2
Type: Research Article
ISSN: 1460-1060

Article
Publication date: 25 February 2021

Baohua Yang, Junming Jiang and Jinshuai Zhao

Abstract

Purpose

The purpose of this study is to construct a gray relational model based on information diffusion to avoid rank reversal when the available decision information is insufficient, or the decision objects vary.

Design/methodology/approach

Considering that the sample dependence of the ideal sequence selection in gray relational decision-making is based on case sampling, which causes the phenomenon of rank reversal, this study designs an ideal point diffusion method based on the development trend and distribution skewness of the sample information. In this method, a gray relational model for sample classification is constructed using a virtual-ideal sequence. Subsequently, an optimization model is established to obtain the criteria weights and classification radius values that minimize the deviation between the comprehensive relational degree of the classification object and the critical value.
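For context, the classic gray relational grade that such models build on can be sketched as below. This is the standard Deng formulation with equal criteria weights and a distinguishing coefficient `rho = 0.5`, not the paper's information-diffusion model: the virtual-ideal sequence, the optimized weights and the classification radii are not reproduced, and the function name and sample values are assumptions.

```python
def grey_relational_grades(ideal, alternatives, rho=0.5):
    """Classic (Deng) gray relational grade of each alternative against an ideal sequence."""
    # Absolute deviation of every alternative from the ideal, criterion by criterion.
    deltas = [[abs(r - x) for r, x in zip(ideal, alt)] for alt in alternatives]
    flat = [d for row in deltas for d in row]
    d_min, d_max = min(flat), max(flat)
    if d_max == 0:
        return [1.0 for _ in alternatives]  # every alternative coincides with the ideal
    # Relational coefficient per criterion, averaged with equal weights.
    return [
        sum((d_min + rho * d_max) / (d + rho * d_max) for d in row) / len(row)
        for row in deltas
    ]


# Hypothetical normalized scores on three criteria (larger is better).
ideal = [1.0, 1.0, 1.0]
alternatives = [
    [1.0, 1.0, 1.0],
    [0.5, 0.8, 0.6],
    [0.2, 0.4, 0.9],
]
grades = grey_relational_grades(ideal, alternatives)
print([round(g, 3) for g in grades])
```

In this classic form, `d_min` and `d_max` are taken over the whole sample, and the ideal sequence is often built from the sample's best values, so adding or removing an alternative can change every grade. That sample dependence is the source of the rank reversal the study sets out to avoid.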

Findings

The rank-reversal problem in gray relational models could drive decision-makers away from using this method. The results of this study demonstrate that the proposed gray relational model based on information diffusion and virtual-ideal sequencing can effectively avoid rank reversal. The method is applied to classify 31 brownfield redevelopment projects based on available interval gray information. The case analysis verifies the rationality and feasibility of the model.

Originality/value

This study proposes a robust method for ideal point choice when the decision information is limited or dynamic. This method can reduce the influence of ideal sequence changes in gray relational models on decision-making results considerably better than other approaches.

Details

Grey Systems: Theory and Application, vol. 12 no. 1
Type: Research Article
ISSN: 2043-9377
