Search results

1 – 10 of 379
Article
Publication date: 1 January 1989

EDIE M. RASMUSSEN and PETER WILLETT

Abstract

The implementation of hierarchic agglomerative methods of cluster analysis for large datasets is very demanding of computational resources when implemented on conventional computers. The ICL Distributed Array Processor (DAP) allows many of the scanning and matching operations required in clustering to be carried out in parallel. Experiments are described using the single linkage and Ward's hierarchical agglomerative clustering methods on both real and simulated datasets. Clustering runs on the DAP are compared with the most efficient algorithms currently available implemented on an IBM 3083 BX. The DAP is found to be 2.9–7.9 times as fast as the IBM, the exact degree of speed‐up depending on the size of the dataset, the clustering method, and the serial clustering algorithm that is used. An analysis of the cycle times of the two machines is presented which suggests that further, very substantial speed‐ups could be obtained from array processors of this type if they were to be based on more powerful processing elements.

Details

Journal of Documentation, vol. 45 no. 1
Type: Research Article
ISSN: 0022-0418

Article
Publication date: 1 March 1984

ALAN GRIFFITHS, LESLEY A. ROBINSON and PETER WILLETT

Abstract

This paper considers the classifications produced by application of the single linkage, complete linkage, group average and Ward clustering methods to the Keen and Cranfield document test collections. Experiments were carried out to study the structure of the hierarchies produced by the different methods, the extent to which the methods distort the input similarity matrices during the generation of a classification, and the retrieval effectiveness obtainable in cluster-based retrieval. The results suggest that the single linkage method, which has been used extensively in previous work on document clustering, is not the most effective procedure of those tested, although it should be emphasized that the experiments have used only small document test collections.

Details

Journal of Documentation, vol. 40 no. 3
Type: Research Article
ISSN: 0022-0418

Article
Publication date: 19 April 2022

Prosenjit Ghosh and Sabyasachi Mukherjee

Abstract

Purpose

The study aims to cluster travellers based on their social media interactions and to find the different segments with similar and dissimilar categories according to travellers' choices. The study also aims to understand the behaviour of the traveller clusters towards destination selection and to design tour packages accordingly, in order to improve tourists' satisfaction and gain viable benefits.

Design/methodology/approach

Agglomerative hierarchical clustering with Ward's minimum variance linkage algorithm and model-based clustering with parameterized finite Gaussian mixture models have been implemented to achieve the respective goals. A dimension reduction (DR) technique was introduced to better visualize the clustering structure obtained from a finite mixture of Gaussian densities.

Findings

A total of 980 travellers were clustered into 8 different interest groups according to their selection of tourism destinations across East Asia, based on individual social media feedback. The two methods showed remarkable similarities, both in the optimal number of clusters selected and in the behaviour of the traveller interest groups. The DR technique reduced the dimensionality to seven directions, of which the first two explained 95% of the total variability.

Practical implications

Tourism organizations focus their marketing efforts on promoting the most attractive benefits to the clusters of travellers. By segmenting travellers of East Asia into homogeneous groups, it is feasible to choose a similar area to test different marketing techniques. Finally, the segments to which new respondents or potential clients belong can be identified; consequently, tourism organizations can design the tour packages.

Originality/value

The study is unique in two respects. Firstly, it empirically reveals tourists' experience and behavioural intention to select tourism destinations; secondly, it provides quantifiable insights into the tourism phenomenon in East Asia, which help tourism organizations to understand the buying behaviours of tourist segments. Finally, the application of clustering algorithms for this purpose, and the resulting findings, are new to the tourism literature on understanding tourist behaviour towards destination selection based on social media reviews.

Details

Journal of Hospitality and Tourism Insights, vol. 6 no. 2
Type: Research Article
ISSN: 2514-9792

Article
Publication date: 9 May 2016

Chao-Lung Yang and Thi Phuong Quyen Nguyen

Abstract

Purpose

Class-based storage has been studied extensively and proved to be an efficient storage policy. However, few studies have addressed how to cluster stored items for class-based storage. The purpose of this paper is to develop a constrained clustering method integrated with principal component analysis (PCA) to meet the need of clustering stored items with the consideration of practical storage constraints.

Design/methodology/approach

In order to consider item characteristics and the associated storage restrictions, must-link and cannot-link constraints were constructed to meet the storage requirements. The cube-per-order index (COI), which has been used for location assignment in class-based warehouses, was analyzed by PCA. The proposed constrained clustering method utilizes the principal component loadings as item sub-group features to identify the COI distribution of item sub-groups. The clustering results are then used for allocating storage by using the heuristic assignment model based on COI.

Findings

The clustering results showed that the proposed method was able to provide better compactness among item clusters. The simulation results also showed that the new location assignment produced by the proposed method improved retrieval efficiency by 33 percent.

Practical implications

When the number of items in a warehouse is very large, human intervention to reveal storage constraints becomes impossible. The developed method can readily be applied to this problem regardless of the size of the data.

Originality/value

The case study demonstrated an example of a practical location assignment problem with constraints. This paper also sheds light on developing a data clustering method that can be directly applied to solving practical data analysis issues.

Details

Industrial Management & Data Systems, vol. 116 no. 4
Type: Research Article
ISSN: 0263-5577

Article
Publication date: 17 October 2008

Rui Xu and Donald C. Wunsch

Abstract

Purpose

The purpose of this paper is to provide a review of the issues related to cluster analysis, one of the most important and primitive activities of human beings, and of the advances made in recent years.

Design/methodology/approach

The paper investigates the clustering algorithms rooted in machine learning, computer science, statistics, and computational intelligence.

Findings

The paper reviews the basic issues of cluster analysis and discusses the recent advances of clustering algorithms in scalability, robustness, visualization, irregular cluster shape detection, and so on.

Originality/value

The paper presents a comprehensive and systematic survey of cluster analysis and emphasizes its recent efforts in order to meet the challenges caused by the glut of complicated data from a wide variety of communities.

Details

International Journal of Intelligent Computing and Cybernetics, vol. 1 no. 4
Type: Research Article
ISSN: 1756-378X

Article
Publication date: 19 June 2017

Khai Tan Huynh, Tho Thanh Quan and Thang Hoai Bui

Abstract

Purpose

Service-oriented architecture is an emerging software architecture in which web services (WSs) play a crucial role. In this architecture, WS composition and verification are required when handling complex service requirements from users. When the number of WSs becomes very large in practice, the complexity of composition and verification is correspondingly high. In this paper, the authors aim to propose a logic-based clustering approach to solve this problem by separating the original repository of WSs into clusters. Moreover, they also propose a so-called quality-controlled clustering approach to ensure the quality of the generated clusters in a reasonable execution time.

Design/methodology/approach

The approach represents WSs as logical formulas on which the authors conduct the clustering task. They also combine two of the most popular clustering approaches, hierarchical agglomerative clustering (HAC) and k-means, to ensure the quality of the generated clusters.

Findings

This logic-based clustering approach significantly increases the performance of WS composition and verification. Furthermore, the logic-based approach helps to maintain the soundness and completeness of the composition solution. Finally, the quality-controlled strategy can ensure the quality of the generated clusters with low time complexity.

Research limitations/implications

The work discussed in this paper is implemented only as a research tool known as WSCOVER. More work is needed to make it a practical and usable system for real-life applications.

Originality/value

In this paper, the authors propose a logic-based paradigm to represent and cluster WSs. Moreover, they also propose a quality-controlled clustering approach which combines and takes advantage of two of the most popular clustering approaches, HAC and k-means.

Article
Publication date: 30 October 2007

I.C. Mogotsi

Abstract

Purpose

This paper seeks to provide a tangible example of the use of text‐mining techniques in a real world setting, i.e. using real, as opposed to test, data.

Design/methodology/approach

News stories are modeled using the vector space model, with the similarity between documents quantified using the cosine measure. For data analysis, three clustering algorithms are used, and the results from the best‐performing algorithm retained.
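As an illustration of the cosine measure mentioned in this abstract, the following is a minimal sketch, not the study's own code. It assumes raw term counts as vector weights and whitespace tokenization; the function names and the three sample texts are hypothetical.

```python
import math
from collections import Counter


def term_vector(text):
    """Map a text to raw term counts (a very simple bag-of-words vector)."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-count vectors."""
    # Only terms present in both vectors contribute to the dot product.
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is treated as similar to nothing
    return dot / (norm_a * norm_b)


# Hypothetical headlines, invented purely for illustration.
doc1 = term_vector("bank raises interest rates")
doc2 = term_vector("bank cuts interest rates")
doc3 = term_vector("team wins cup final")

print(cosine_similarity(doc1, doc2))  # three of four terms shared
print(cosine_similarity(doc1, doc3))  # no terms shared
```

A real text-mining pipeline would normally add stop-word removal and a weighting scheme such as tf-idf; the abstract does not say which weighting the study used.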

Findings

Agglomerative clustering performed poorly, while direct k‐way clustering and k‐way clustering through repeated bisections yielded similar results, with the former performing marginally better in terms of external isolation and internal cohesion of the clusters produced. A number of themes that dominated news coverage during the period under consideration were identified, some of which were noticeably only topical during certain parts of the year.

Research limitations/implications

Text mining holds much promise for businesses, particularly if integrated into a well‐orchestrated competitive intelligence function. However, more publicly accessible studies need to be undertaken if businesses are to derive maximum value from it.

Originality/value

There is a growing body of literature devoted to both data and text mining. However, much of this literature focuses on the development of new algorithms, with scant attention paid to the practical application of these techniques in business settings, possibly because of the strategic sensitivity of project findings. This study helps fill this yawning void.

Details

VINE, vol. 37 no. 4
Type: Research Article
ISSN: 0305-5728

Article
Publication date: 13 July 2015

Hülya Güçdemir and Hasan Selim

Abstract

Purpose

The purpose of this paper is to develop a systematic approach for business customer segmentation.

Design/methodology/approach

This study proposes an approach for business customer segmentation that integrates clustering and multi-criteria decision making (MCDM). First, proper segmentation variables are identified and then customers are grouped by using hierarchical and partitional clustering algorithms. The approach extends the recency-frequency-monetary (RFM) model by proposing five novel segmentation variables for business markets. To confirm the viability of the proposed approach, a real-world application is presented. Three agglomerative hierarchical clustering algorithms, namely “Ward’s method,” “single linkage” and “complete linkage,” and a partitional clustering algorithm, “k-means,” are used in segmentation. In the implementation, the fuzzy analytic hierarchy process is employed to determine the importance of the segments.

Findings

Business customers of an international original equipment manufacturer (OEM) are segmented in the application. In this regard, 317 business customers of the OEM are segmented as “best,” “valuable,” “average,” “potential valuable” and “potential invaluable” according to the cluster ranks obtained in this study. The results of the application reveal that the proposed approach can effectively be used in practice for business customer segmentation.

Research limitations/implications

The success of the proposed approach relies on the availability and quality of customers’ data. Therefore, design of an extensive customer database management system is the foundation for any successful customer relationship management (CRM) solution offered by the proposed approach. Such a database management system may entail a noteworthy level of investment.

Practical implications

The results of the application reveal that the proposed approach can effectively be used in practice for business customer segmentation. By supporting customer segmentation decisions, the proposed approach can provide firms with a basis for the development of effective loyalty programs and the design of customized strategies for their customers.

Social implications

The proposed segmentation approach may help firms gain a sustainable competitive advantage in the market by increasing the effectiveness of CRM strategies.

Originality/value

This study proposes an integrated approach for business customer segmentation. The proposed approach differentiates itself from its counterparts by combining MCDM and clustering in business customer segmentation. In addition, it extends the traditional RFM model by including five novel segmentation variables for business markets.

Details

Industrial Management & Data Systems, vol. 115 no. 6
Type: Research Article
ISSN: 0263-5577

Article
Publication date: 30 April 2021

Faruk Bulut, Melike Bektaş and Abdullah Yavuz

Abstract

Purpose

This study establishes a system for supervising and controlling possible problems among people over a large area using a limited number of drone cameras and security staff.

Design/methodology/approach

These drones, namely unmanned aerial vehicles (UAVs), are adaptively and automatically distributed over the crowds by the proposed system to control and track the communities. Since crowds are mobile, the drone clusters are re-organized simultaneously according to the densities and distributions of people. An adaptive and dynamic distribution and routing mechanism of UAV fleets for crowds is implemented to control a specific given region. Nine popular clustering algorithms were used and tested in the presented mechanism to gain better performance.

Findings

The aggregated model delivered better clustering performance than any single clustering method across five different test cases of human crowd distributions.

Originality/value

This study has three basic components. The first one is to divide the human crowds into clusters. The second one is to determine an optimum route of UAVs over clusters. The last one is to direct the most appropriate security personnel to the events that occurred.

Details

International Journal of Intelligent Unmanned Systems, vol. 12 no. 1
Type: Research Article
ISSN: 2049-6427

Article
Publication date: 19 September 2008

George Menexes and Stamatis Angelopoulos

Abstract

Purpose

The aim of the study is to propose certain agricultural policy measures for the financing and development of Greek farms, established by young farmers, based on the results of a clustering method suitable for handling socio‐economic categorical data.

Design/methodology/approach

The clustering method was applied to categorical data collected from 110 randomly selected investment plans of Greek agricultural farms. The investment plans were submitted to the “Region of Central Macedonia” administrative office, in the framework of the Operational Programme “Agricultural Development – Reform of the Countryside 2000‐2006” and refer to agricultural investments by “Young Farmers”, according to the terms and conditions of Priority Axis III: “Improvement of the Age Composition of the Agricultural Population”. The input variables for the analyses were the farmers' gender, age class, education level and permanent place of residence, the farms' agricultural activity, Human Labour Units (HLU) and farms' viability level. All these variables were measured on nominal or ordinal scales. The available data were analyzed by means of a hierarchical cluster analysis method applied on the rows of an appropriate matrix of a complete disjunctive form with a dummy coding 0 or 1. The similarities were measured through Benzécri's χ² distance (metric), while Ward's method was used as a criterion for cluster formation.
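To make the χ² distance on a 0/1 disjunctive matrix concrete, here is a minimal sketch under stated assumptions, not the authors' implementation. It uses the standard correspondence-analysis form of the distance (squared differences between row profiles, each weighted by the inverse of the column mass); the abstract does not give the exact variant used. The function name and the three-farm table are hypothetical, and Ward's criterion itself is not implemented here.

```python
import math


def chi2_distance(table, i, k):
    """Chi-squared distance between rows i and k of a 0/1 indicator table."""
    total = sum(sum(row) for row in table)
    n_cols = len(table[0])
    # Column mass: share of the grand total that falls in each column.
    col_mass = [sum(row[j] for row in table) / total for j in range(n_cols)]
    row_i, row_k = table[i], table[k]
    sum_i, sum_k = sum(row_i), sum(row_k)
    squared = 0.0
    for j, mass in enumerate(col_mass):
        if mass == 0:
            continue  # a category nobody selected carries no information
        # Compare row profiles (each row divided by its own total).
        diff = row_i[j] / sum_i - row_k[j] / sum_k
        squared += diff * diff / mass
    return math.sqrt(squared)


# Hypothetical data: 3 farms, 2 variables with 2 categories each.
# Columns: gender=male, gender=female, education=low, education=high.
farms = [
    [1, 0, 1, 0],
    [1, 0, 0, 1],
    [0, 1, 0, 1],
]

print(chi2_distance(farms, 0, 1))  # farms differ on education only
print(chi2_distance(farms, 0, 2))  # farms differ on both variables
```

Because rare categories have small column masses, a disagreement on a rare category adds more to the distance than a disagreement on a common one. A hierarchical method such as Ward's would then be run on the resulting pairwise distances.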

Findings

Five clusters of farms emerged, with statistically significant diverse socio‐economic profiles. The most important impact on the formation of the groups of farms was found to be related to the number of HLU, the farmers' level of education and gender. This derived typology allows for the determination of a flexible development and funding policy for the agricultural farms, based on the socio‐economic profile of the formulated clusters.

Research limitations/implications

One of the limitations of the current study derives from the fact that the clustering method used is suitable only for categorical, non‐metric data. Another limitation comes from the fact that a relatively small number of investment plans were used in the analysis. A larger sample also covering other geographical regions is needed in order to confirm the current results and make nation‐wide comparisons and “tailor‐made” proposals for financing and development. Finally, it would be interesting to conduct longitudinal surveys in order to evaluate the effectiveness of the funding policy of the corresponding programme.

Originality/value

The study's results could be useful to practitioners and academics because certain agricultural policy measures for the financing and development of Greek farms established by young farmers are proposed. Additionally, the data analysis method used in this study offers an alternative way for clustering categorical data.

Details

EuroMed Journal of Business, vol. 3 no. 3
Type: Research Article
ISSN: 1450-2194
