Search results

1 – 10 of over 46000
Article
Publication date: 6 March 2017

Michael J. Brusco, Renu Singh, J. Dennis Cradit and Douglas Steinley

Abstract

Purpose

The purpose of this paper is twofold. First, the authors provide a survey of operations management (OM) research applications of traditional hierarchical and nonhierarchical clustering methods with respect to key decisions that are central to a valid analysis. Second, the authors offer recommendations for practice with respect to these decisions.

Design/methodology/approach

A coding study was conducted for 97 cluster analyses reported in six OM journals during the period spanning 1994-2015. Data were collected with respect to: variable selection, variable standardization, method, selection of the number of clusters, consistency/stability of the clustering solution, and profiling of the clusters based on exogenous variables. Recommended practices for validation of clustering solutions are provided within the context of this framework.

Findings

There is considerable variability across clustering applications with respect to the components of validation, as well as a mix of productive and undesirable practices. This justifies the importance of the authors’ provision of a schema for conducting a cluster analysis.

Research limitations/implications

Certain aspects of the coding study required some degree of subjectivity with respect to interpretation or classification. However, in light of the sheer magnitude of the coding study (97 articles), the authors are confident that an accurate picture of empirical OM clustering applications has been presented.

Practical implications

The paper provides a critique and synthesis of the practice of cluster analysis in OM research. The coding study provides a thorough foundation for how the key decisions of a cluster analysis have been previously handled in the literature. Both researchers and practitioners are provided with guidelines for performing a valid cluster analysis.

Originality/value

To the best of the authors’ knowledge, no study of this type has been reported in the OM literature. The authors’ recommendations for cluster validation draw from recent studies in other disciplines that are apt to be unfamiliar to many OM researchers.

Details

International Journal of Operations & Production Management, vol. 37 no. 3
Type: Research Article
ISSN: 0144-3577

Article
Publication date: 15 May 2017

Young Wook Seo, Kun Chang Lee and Sangjae Lee

Abstract

Purpose

For those who plan research funds and assess the research performance resulting from those funds, it is necessary to overcome the limitations of the conventional classification of evaluated papers published with the research funds. They also need to promote objective, fair clustering of papers and analysis of research performance. Therefore, the purpose of this paper is to find the optimum clustering algorithm, using MATLAB tools, by comparing the performance of hybrid particle swarm optimization algorithms that combine the particle swarm optimization (PSO) algorithm with the conventional K-means clustering method.

Design/methodology/approach

The clustering analysis experiment for each of the three fields of study – health and medicine, physics, and chemistry – used the following three algorithms: “K-means+Simulated annealing (SA)+Adjustment of parameters+PSO” (KASA-PSO clustering), “K-means+SA+PSO” clustering, “K-means+PSO” clustering.

Findings

The clustering analyses of all the three fields showed that KASA-PSO is the best method for the minimization of fitness value. Furthermore, this study administered the surveys intended for the “performance measurement of decision-making process” with 13 members of the research fund organization to compare the group clustering by the clustering analysis method of KASA-PSO algorithm and the group clustering by research funds. The results statistically demonstrated that the group clustering by the clustering analysis method of KASA-PSO algorithm was better than the group clustering by research funds.

Practical implications

This study examined the impact of bibliometric indicators on research impact of papers. The results showed that research period, the number of authors, and the number of participating researchers had positive effects on the impact factor (IF) of the papers; the IF that indicates the qualitative level of papers had a positive effect on the primary times cited; and the primary times cited had a positive effect on the secondary times cited. Furthermore, this study clearly showed the decision quality perceived by those who are working for the research fund organization.

Originality/value

There are still too few studies that assess research project evaluation mechanisms and their effectiveness as perceived by research fund managers. To fill this research void, this study proposes a PSO-based approach and demonstrates its validity.

Article
Publication date: 16 July 2019

Yong Liu, Jun-liang Du, Ren-Shi Zhang and Jeffrey Yi-Lin Forrest

Abstract

Purpose

This paper aims to establish a novel three-way decisions-based grey incidence analysis clustering approach and exploit it to extract information and rules implied in panel data.

Design/methodology/approach

Because panel data have spatiotemporal characteristics, they can describe the systematic and dynamic behaviour of decision objects well. However, it is difficult for traditional panel data analysis methods to efficiently extract the information and rules implied in panel data. To deal with the panel data clustering problem, the authors define a comprehensive distance between decision objects along three dimensions that reflect the spatiotemporal characteristics of panel data: absolute amount level, increasing amount level and volatility level. They then construct a novel grey incidence analysis clustering approach for panel data and study how its threshold value is computed, drawing on the ideas and methods of three-way decisions. Finally, the authors use a case of clustering regional high-tech industrialization in China to illustrate the validity and rationality of the proposed model.

Findings

The results show that the proposed model can objectively determine the threshold value of clustering and extract the information and rules inherent in the panel data.

Practical implications

The novel model proposed in the paper can describe and resolve the panel data clustering problem well and efficiently extract the information and rules implied in panel data.

Originality/value

The proposed model can deal with the panel data clustering problem and extract the information and rules inherent in the panel data.

Details

Kybernetes, vol. 48 no. 9
Type: Research Article
ISSN: 0368-492X

Article
Publication date: 23 August 2022

Kamlesh Kumar Pandey and Diwakar Shukla

Abstract

Purpose

The K-means (KM) clustering algorithm is highly sensitive to the selection of initial centroids, since the initial centroids of the clusters determine computational effectiveness, efficiency and local optima issues. Numerous initialization strategies have been proposed to overcome these problems through the random or deterministic selection of initial centroids. The random initialization strategy suffers from local optima and, in the worst case, poor clustering performance, while the deterministic initialization strategy incurs a high computational cost. Big data clustering aims to reduce computation costs and improve cluster efficiency. The objective of this study is to obtain better initial centroids for big data clustering on business management data without using random or deterministic initialization, so as to avoid local optima and improve clustering efficiency and effectiveness in terms of cluster quality, computation cost, data comparisons and iterations on a single machine.

Design/methodology/approach

This study presents the Normal Distribution Probability Density (NDPD) algorithm for big data clustering on a single machine to solve business management-related clustering issues. The NDPDKM algorithm addresses the KM clustering problem using the probability density of each data point. It first identifies the data points with the highest probability density, using the mean and standard deviation of the dataset in the normal probability density. Thereafter, it determines the K initial centroids by using sorting and linear systematic sampling heuristics.
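
The steps described above (score by normal density, sort, then sample systematically) can be illustrated with a loose sketch. This is not the authors' NDPDKM implementation: the function names, the restriction to one-dimensional data and the choice of sampling step `n // k` are all assumptions made for illustration.

```python
import math
from statistics import mean, pstdev


def normal_pdf(x, mu, sigma):
    """Normal probability density of x for mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def density_seeded_centroids(points, k):
    """Pick k initial centroids from 1-D data (hypothetical helper).

    1. Score each point by its normal density under the data's mean and std.
    2. Sort points by that score, highest density first.
    3. Take every (n // k)-th point of the sorted list (linear systematic sampling).
    """
    mu, sigma = mean(points), pstdev(points)
    if sigma == 0:
        raise ValueError("data has zero variance; density ranking is undefined")
    step = len(points) // k
    if step == 0:
        raise ValueError("k must not exceed the number of points")
    ranked = sorted(points, key=lambda x: normal_pdf(x, mu, sigma), reverse=True)
    return [ranked[i * step] for i in range(k)]


seeds = density_seeded_centroids([1, 2, 3, 4, 5, 6, 7, 8, 9], k=3)
```

Because the selection involves no random draws, repeated runs on the same data return the same seeds, which is the property that separates this family of heuristics from random initialization.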

Findings

The performance of the proposed algorithm is compared with the KM, KM++, Var-Part, Murat-KM, Mean-KM and Sort-KM algorithms using the Davies Bouldin score, Silhouette coefficient, SD Validity, S_Dbw Validity, Number of Iterations and CPU time validation indices on eight real business datasets. The experimental evaluation demonstrates that the NDPDKM algorithm reduces iterations, local optima and computing costs, and improves cluster performance, effectiveness and efficiency with stable convergence, compared with the other algorithms. The NDPDKM algorithm reduces the average computing time by up to 34.83%, 90.28%, 71.83%, 92.67%, 69.53% and 76.03%, and the average iterations by up to 40.32%, 44.06%, 32.02%, 62.78%, 19.07% and 36.74%, relative to the KM, KM++, Var-Part, Murat-KM, Mean-KM and Sort-KM algorithms, respectively.

Originality/value

The KM algorithm is the most widely used partitional clustering approach in data mining techniques that extract hidden knowledge, patterns and trends for decision-making strategies in business data. Business analytics is one of the applications of big data clustering where KM clustering is useful for the various subcategories of business analytics such as customer segmentation analysis, employee salary and performance analysis, document searching, delivery optimization, discount and offer analysis, chaplain management, manufacturing analysis, productivity analysis, specialized employee and investor searching and other decision-making strategies in business.

Details

Journal of Advances in Management Research, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 0972-7981

Article
Publication date: 30 July 2019

Hossein Abbasimehr and Mostafa Shabani

Abstract

Purpose

The purpose of this paper is to propose a new methodology that handles the issue of the dynamic behavior of customers over time.

Design/methodology/approach

A new methodology is presented based on time series clustering to extract dominant behavioral patterns of customers over time. This methodology is implemented using bank customers’ transaction data, which are in the form of time series data. The data include the recency (R), frequency (F) and monetary (M) attributes of businesses that are using the point-of-sale (POS) data of a bank. These data were obtained from the data analysis department of the bank.
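
As a minimal sketch of how the R, F and M attributes mentioned above can be derived from raw transactions, the snippet below summarizes each customer up to a reference date. The `(customer_id, date, amount)` record layout and the function name `rfm` are assumptions for illustration; the abstract does not describe the bank's actual data schema.

```python
from datetime import date


def rfm(transactions, as_of):
    """Return {customer: (recency_days, frequency, monetary)}.

    transactions: iterable of (customer_id, date, amount) tuples (assumed layout).
    as_of: reference date from which recency is measured.
    """
    summary = {}
    for customer, day, amount in transactions:
        last, count, total = summary.get(customer, (day, 0, 0.0))
        # Track the most recent purchase date, the purchase count and the spend.
        summary[customer] = (max(last, day), count + 1, total + amount)
    return {
        customer: ((as_of - last).days, count, total)
        for customer, (last, count, total) in summary.items()
    }


sample = [
    ("a", date(2019, 1, 1), 10.0),
    ("a", date(2019, 1, 10), 5.0),
    ("b", date(2019, 1, 5), 20.0),
]
scores = rfm(sample, as_of=date(2019, 1, 31))
```

Calling such a function once per period (for example, per month) on that period's transactions would give each customer a sequence of R, F and M values, which is the kind of time-ordered input that time series clustering works on.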

Findings

After carrying out an empirical study on the acquired transaction data of 2,531 business customers that are using POS devices of the bank, the dominant trends of behavior are discovered using the proposed methodology. The obtained trends were analyzed from the marketing viewpoint. Based on the analysis of the monetary attribute, customers were divided into four main segments: high-value growing customers, middle-value growing customers, prone to churn and churners. For each resulting group of customers with a distinctive trend, effective and practical marketing recommendations were devised to improve the bank's relationship with that group. The prone-to-churn segment contains most of the customers; therefore, the bank should run attractive promotions to retain this segment.

Practical implications

The discovered trends of customer behavior and proposed marketing recommendations can be helpful for banks in devising segment-specific marketing strategies as they illustrate the dynamic behavior of customers over time. The obtained trends are visualized so that they can be easily interpreted and used by banks. This paper contributes to the literature on customer relationship management (CRM) as the proposed methodology can be effectively applied to different businesses to reveal trends in customer behavior.

Originality/value

In the current business condition, customer behavior is changing continually over time and customers are churning due to the reduced switching costs. Therefore, choosing an effective customer segmentation methodology which can consider the dynamic behaviors of customers is essential for every business. This paper proposes a new methodology to capture customer dynamic behavior using time series clustering on time-ordered data. This is an improvement over previous studies, in which static segmentation approaches have often been adopted. To the best of the authors’ knowledge, this is the first study that combines the recency, frequency, and monetary model and time series clustering to reveal trends in customer behavior.

Article
Publication date: 1 August 2016

Peiman Alipour Sarvari, Alp Ustundag and Hidayet Takci

Abstract

Purpose

The purpose of this paper is to determine the best approach to customer segmentation and to extrapolate associated rules for this based on recency, frequency and monetary (RFM) considerations as well as demographic factors. In this study, the impacts of RFM and demographic attributes have been challenged in order to enrich factors that lend comprehension to customer segmentation. Different types of scenario were designed, performed and evaluated meticulously under uniform test conditions. The data for this study were extracted from the database of a global pizza restaurant chain in Turkey. This paper summarizes the findings of the study and also provides evidence of its empirical implications to improve the performance of customer segmentation as well as achieving extracted rule perfection via effective model factors and variations. Accordingly, marketing and service processes will work more effectively and efficiently for customers and society. The implication of this study is that it explains a clear concept for interaction between producers and consumers.

Design/methodology/approach

Customer relationship management, which aims to manage, record and evaluate customer interactions, is generally regarded as a vital tool for companies that wish to be successful in the rapidly changing global market. The prediction of customer behaviors is a strategically important and difficult issue because of the high variance and wide range of customer orders and preferences. Therefore, to have an effective tool for extracting rules based on customer purchasing behavior, it is important to consider both tangible and intangible criteria. To overcome the challenges imposed by the multifaceted nature of this problem, the authors utilized artificial intelligence methods, including k-means clustering, Apriori association rule mining (ARM) and neural networks. The main idea was that customer clusters are improved when segmentation processes are based on RFM analysis accompanied by demographic data. Weighted RFM (WRFM) and unweighted RFM values/scores were applied with and without demographic factors and utilized to compose different types and numbers of clusters. The Apriori algorithm was used to extract rules of association. The performance analyses of scenarios were conducted based on these extracted rules. The number of rules, elapsed time and prediction accuracy were used to evaluate the different scenarios. The results of the evaluations were compared with the outputs of another available technique.

Findings

The results showed that having an appropriate segmentation approach is vital if there are to be strong association rules. Also, it has been determined from the results that the weights of RFM attributes affect rule association performance positively. Moreover, to capture more accurate customer segments, a combination of RFM and demographic attributes is recommended for clustering. The results’ analyses indicate the undeniable importance of demographic data merged with WRFM. Above all, this challenge introduced the best possible sequence of factors for an analysis of clustering and ARM based on RFM and demographic data.

Originality/value

The work compared k-means and Kohonen clustering methods in its segmentation phase to prove the superiority of the adopted segmentation techniques. In addition, this study indicated that customer segments containing WRFM scores and demographic data in the same clusters brought about stronger and more accurate association rules for the understanding of customer behavior. These results were compared with the results of classical approaches in order to support the credibility of the proposed methodology. Based on previous works, classical methods for customer segmentation have overlooked any combination of demographic data with WRFM during clustering before proceeding to their rule extraction stages.

Article
Publication date: 17 October 2008

Rui Xu and Donald C. Wunsch

Abstract

Purpose

The purpose of this paper is to provide a review of the issues related to cluster analysis, one of the most important and primitive activities of human beings, and of the advances made in recent years.

Design/methodology/approach

The paper investigates the clustering algorithms rooted in machine learning, computer science, statistics, and computational intelligence.

Findings

The paper reviews the basic issues of cluster analysis and discusses the recent advances of clustering algorithms in scalability, robustness, visualization, irregular cluster shape detection, and so on.

Originality/value

The paper presents a comprehensive and systematic survey of cluster analysis and emphasizes its recent efforts in order to meet the challenges caused by the glut of complicated data from a wide variety of communities.

Details

International Journal of Intelligent Computing and Cybernetics, vol. 1 no. 4
Type: Research Article
ISSN: 1756-378X

Article
Publication date: 6 August 2018

Fenyi Dong, Bing Qi and Yuyang Jie

Abstract

Purpose

The purpose of this paper is to cluster and analyse the level of agricultural science and technology in China’s provinces by using grey clustering model, to have an overall understanding of the current situation of agricultural science and technology development in these provinces, and to offer a reference for decision-making departments to draw up agricultural science and technology development plans.

Design/methodology/approach

First of all, the grey clustering assessment is used to evaluate the clustering of agricultural science and technology level in China’s provinces in 2011, 2013 and 2015. Also a comparative static analysis is made. Then, based on the prediction data of GM (1,1) model, the provincial agricultural science and technology levels in 2017 and 2019 are analysed by grey clustering. Finally, some suggestions are put forward, such as adjusting the allocation of agricultural science and technology resources and providing policy preferences to backward areas, so as to promote the coordinated development of agricultural science and technology in China.

Findings

The development of agricultural science and technology across China's provinces and regions is unbalanced, with a large gap in the level of agricultural science and technology between provinces. Moreover, the level of agricultural science and technology in remote areas has been developing, but slowly, and continues to lag behind. The grey clustering analysis of the provincial agricultural science and technology level in 2017 and 2019 leads to the conclusion that the level of agricultural science and technology will rise as a whole, but the gap between different provinces and cities will widen.

Research limitations/implications

This paper comprehensively studies the current situation and future development trends of agricultural science and technology in China’s provinces in recent years, and preliminarily analyses the reasons for the transformation of the agricultural science and technology level, but does not examine those reasons further. Further research on them is needed.

Practical implications

This paper will provide overall understanding of the current situation of agricultural science and technology development in China’s provinces and cities, and put forward relevant suggestions for the future development of agricultural science and technology in China’s provinces and cities, and provide references for decision-making departments to draw up agricultural science and technology development plans.

Originality/value

For the first time, the grey clustering method is applied to research on the provincial level of agricultural science and technology. The paper analyses and evaluates the past and present situation and predicts the future development trend of the provincial agricultural science and technology level using grey clustering analysis, which makes it a complete study.

Details

Grey Systems: Theory and Application, vol. 8 no. 4
Type: Research Article
ISSN: 2043-9377

Article
Publication date: 3 November 2022

Reza Edris Abadi, Mohammad Javad Ershadi and Seyed Taghi Akhavan Niaki

Abstract

Purpose

The overall goal of the data mining process is to extract information from an extensive data set and make it understandable for further use. When working with large volumes of unstructured data in research information systems, it is necessary to divide the information into logical groupings after examining their quality before attempting to analyze it. On the other hand, data quality results are valuable resources for defining quality excellence programs of any information system. Hence, the purpose of this study is to discover and extract knowledge to evaluate and improve data quality in research information systems.

Design/methodology/approach

Clustering in data analysis and exploiting the outputs allows practitioners to gain an in-depth and extensive look at their information to form some logical structures based on what they have found. In this study, data extracted from an information system are used in the first stage. Then, the data quality results are classified into an organized structure based on data quality dimension standards. Next, clustering algorithms (K-Means), density-based clustering (density-based spatial clustering of applications with noise [DBSCAN]) and hierarchical clustering (balanced iterative reducing and clustering using hierarchies [BIRCH]) are applied to compare and find the most appropriate clustering algorithms in the research information system.

Findings

This paper showed that quality control results of an information system could be categorized through well-known data quality dimensions, including precision, accuracy, completeness, consistency, reputation and timeliness. Furthermore, among different well-known clustering approaches, the BIRCH algorithm of hierarchical clustering methods performs better in data clustering and gives the highest silhouette coefficient value. Next in line is the DBSCAN method, which performs better than the K-Means method.
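
The silhouette coefficient used above to rank the algorithms can be computed directly from a labelled point set. The sketch below is a plain-Python illustration using Euclidean distance; it is not the study's code, and the function name and the toy data are assumptions.

```python
import math


def silhouette_score(points, labels):
    """Mean silhouette coefficient for points (tuples) and their cluster labels."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    total = 0.0
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            continue  # a singleton cluster contributes 0 by convention
        # a: mean distance to the other members of the point's own cluster
        # (the point's zero distance to itself does not affect the sum).
        a = sum(math.dist(point, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(math.dist(point, q) for q in other) / len(other)
            for other_label, other in clusters.items()
            if other_label != label
        )
        denom = max(a, b)
        if denom > 0:
            total += (b - a) / denom
    return total / len(points)


pts = [(0, 0), (0, 1), (10, 0), (10, 1)]
good = silhouette_score(pts, [0, 0, 1, 1])  # compact, well-separated clusters
bad = silhouette_score(pts, [0, 1, 0, 1])   # clusters that mix the two groups
```

Values close to 1 indicate compact, well-separated clusters, and negative values indicate points that sit closer to another cluster than to their own, so comparing this score across labelings from different algorithms is one way to rank them on the same data.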

Research limitations/implications

In the data quality assessment process, the discrepancies identified and the lack of proper classification for inconsistent data have led to unstructured reports, making the statistical analysis of qualitative metadata problems difficult and making it impossible to root out the observed errors. Therefore, in this study, the evaluation results of data quality have been categorized into various data quality dimensions, based on which multiple analyses have been performed in the form of data mining methods.

Originality/value

Although several pieces of research have been conducted to assess data quality results of research information systems, knowledge extraction from obtained data quality scores is a crucial work that has rarely been studied in the literature. Besides, clustering in data quality analysis and exploiting the outputs allows practitioners to gain an in-depth and extensive look at their information to form some logical structures based on what they have found.

Details

Information Discovery and Delivery, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 2398-6247

Article
Publication date: 19 September 2008

George Menexes and Stamatis Angelopoulos

Abstract

Purpose

The aim of the study is to propose certain agricultural policy measures for the financing and development of Greek farms, established by young farmers, based on the results of a clustering method suitable for handling socio‐economic categorical data.

Design/methodology/approach

The clustering method was applied to categorical data collected from 110 randomly selected investment plans of Greek agricultural farms. The investment plans were submitted to the “Region of Central Macedonia” administrative office, in the framework of the Operational Programme “Agricultural Development – Reform of the Countryside 2000‐2006” and refer to agricultural investments by “Young Farmers”, according to the terms and conditions of Priority Axis III: “Improvement of the Age Composition of the Agricultural Population”. The input variables for the analyses were the farmers' gender, age class, education level and permanent place of residence, the farms' agricultural activity, Human Labour Units (HLU) and farms' viability level. All these variables were measured on nominal or ordinal scales. The available data were analyzed by means of a hierarchical cluster analysis method applied to the rows of an appropriate matrix in complete disjunctive form with dummy coding (0 or 1). The similarities were measured through Benzécri's χ² distance (metric), while Ward's method was used as the criterion for cluster formation.
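
To make the complete disjunctive coding and the χ² distance concrete, the following sketch one-hot encodes a few categorical records and computes the χ² distance between two row profiles, using the standard correspondence-analysis definition (squared profile differences weighted by the inverse column mass). The toy records and function names are assumptions, and the Ward agglomeration step itself is not shown.

```python
def disjunctive_table(records):
    """One-hot encode categorical records into a complete disjunctive 0/1 matrix."""
    n_vars = len(records[0])
    # One column per (variable index, category) pair observed in the data.
    columns = sorted({(v, rec[v]) for rec in records for v in range(n_vars)})
    index = {col: j for j, col in enumerate(columns)}
    table = [[0] * len(columns) for _ in records]
    for i, rec in enumerate(records):
        for v, category in enumerate(rec):
            table[i][index[(v, category)]] = 1
    return table, columns


def chi2_distance(table, i, k):
    """Chi-square distance between the profiles of rows i and k."""
    grand = sum(map(sum, table))
    n_cols = len(table[0])
    col_mass = [sum(row[j] for row in table) / grand for j in range(n_cols)]
    row_i, row_k = sum(table[i]), sum(table[k])
    squared = sum(
        (table[i][j] / row_i - table[k][j] / row_k) ** 2 / col_mass[j]
        for j in range(n_cols)
    )
    return squared ** 0.5


# Hypothetical records: (gender, age class).
records = [("F", "young"), ("F", "old"), ("M", "old")]
table, columns = disjunctive_table(records)
d01 = chi2_distance(table, 0, 1)  # records differing in one variable
d02 = chi2_distance(table, 0, 2)  # records differing in both variables
```

Dividing by the column mass gives rare categories more weight than common ones, so two farms that differ on an unusual category end up further apart than two that differ on a frequent one.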

Findings

Five clusters of farms emerged, with statistically significant diverse socio‐economic profiles. The most important impact on the formation of the groups of farms was found to be related to the number of HLU, the farmers' level of education and gender. This derived typology allows for the determination of a flexible development and funding policy for the agricultural farms, based on the socio‐economic profile of the formulated clusters.

Research limitations/implications

One of the limitations of the current study derives from the fact that the clustering method used is suitable only for categorical, non‐metric data. Another limitation comes from the fact that a relatively small number of investment plans was used in the analysis. A larger sample that also covers other geographical regions is needed in order to confirm the current results and make nation‐wide comparisons and “tailor‐made” proposals for financing and development. Finally, it would be interesting to conduct longitudinal surveys in order to evaluate the effectiveness of the funding policy of the corresponding programme.

Originality/value

The study's results could be useful to practitioners and academics because certain agricultural policy measures for the financing and development of Greek farms established by young farmers are proposed. Additionally, the data analysis method used in this study offers an alternative way for clustering categorical data.

Details

EuroMed Journal of Business, vol. 3 no. 3
Type: Research Article
ISSN: 1450-2194
