Improved initial cluster center selection in K-means clustering

Minchen Zhu (College of Mathematics and Computer Science, Fuzhou University, Fuzhou, China)
Weizhi Wang (College of Civil Engineering, Fuzhou University, Fuzhou, China)
Jingshan Huang (School of Computing, University of South Alabama, Mobile, Alabama, USA)

Engineering Computations

ISSN: 0264-4401

Article publication date: 28 October 2014

Abstract

Purpose

It is well known that the selection of initial cluster centers can significantly affect K-means clustering results. The purpose of this paper is to propose an improved, efficient methodology to address this challenge.
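The sensitivity to initial centers can be illustrated with a minimal sketch. The one-dimensional data, the helper name `lloyd_1d`, and the two starting configurations below are assumptions chosen for illustration and do not come from the paper; the code runs the standard K-means (Lloyd) iteration from two different sets of initial centers.

```python
def lloyd_1d(points, centers, iters=50):
    """Standard K-means (Lloyd) iteration on 1-D data; returns (centers, SSE)."""
    centers = list(centers)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda k: abs(p - centers[k]))
            groups[nearest].append(p)
        # Update step: each center moves to the mean of its group
        # (an empty group keeps its previous center).
        new_centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
        if new_centers == centers:
            break
        centers = new_centers
    sse = sum(min((p - c) ** 2 for c in centers) for p in points)
    return centers, sse


# Illustrative data: three well-separated groups (assumed, not from the paper).
points = [0, 1, 2, 10, 11, 12, 20, 21, 22]

# All three initial centers fall inside the first group.
bad_centers, bad_sse = lloyd_1d(points, [0, 1, 2])
# One initial center per group.
good_centers, good_sse = lloyd_1d(points, [1, 11, 21])

print(bad_centers, bad_sse)    # [0.0, 1.5, 16.0] 154.5
print(good_centers, good_sse)  # [1, 11, 21] 6
```

In this toy case the first start converges to a local optimum that merges two groups, while the second recovers all three, which is the kind of outcome that motivates adjusting the initial centers.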

Design/methodology/approach

Because the intra-class distance among samples within the same cluster should be smaller than the inter-class distance between clusters, the algorithm dynamically adjusts the randomly selected initial cluster centers. The adjusted centers are highly representative in the sense that they are distributed among as many samples as possible, which effectively reduces the local optima that are common in K-means clustering. In addition, the algorithm obtains all initial cluster centers simultaneously (instead of one center at a time) during the dynamic adjustment.
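The abstract does not spell out the adjustment rule, so the following is only a stand-in sketch of the general idea, not the authors' procedure. It assumes a simple greedy dispersion heuristic: while two centers sit closer together than they need to, one of them is moved to the sample that is farthest from every current center, provided the move increases the smallest distance between centers. The function names `min_pair` and `adjust_centers`, the sample data, and the starting centers are all hypothetical.

```python
import math


def min_pair(centers):
    """Return (distance, i, j) for the closest pair of centers."""
    best = (math.inf, -1, -1)
    for i in range(len(centers)):
        for j in range(i + 1, len(centers)):
            d = math.dist(centers[i], centers[j])
            if d < best[0]:
                best = (d, i, j)
    return best


def adjust_centers(samples, centers, max_rounds=100):
    """Greedy dispersion heuristic (an assumption, not the paper's rule)."""
    centers = list(centers)
    for _ in range(max_rounds):
        d_min, i, j = min_pair(centers)
        # Sample farthest from its nearest center: likely in an uncovered cluster.
        far = max(samples, key=lambda s: min(math.dist(s, c) for c in centers))
        # Try moving either center of the closest pair to that sample and keep
        # the move only if it enlarges the smallest center-to-center distance.
        best_d, best_idx = d_min, None
        for idx in (i, j):
            trial = centers[:idx] + [far] + centers[idx + 1:]
            d = min_pair(trial)[0]
            if d > best_d:
                best_d, best_idx = d, idx
        if best_idx is None:
            break  # no move improves the spread
        centers[best_idx] = far
    return centers


# Illustrative 2-D data with three groups (assumed, not from the paper).
samples = [(0, 0), (1, 0), (0, 1),
           (10, 0), (11, 0), (10, 1),
           (0, 10), (1, 10), (0, 11)]

# A deliberately poor draw: all three centers come from the first group.
initial = [(0, 0), (1, 0), (0, 1)]
adjusted = adjust_centers(samples, initial)
print(adjusted)  # [(11, 0), (0, 11), (0, 1)]
```

Each round strictly increases the smallest distance between centers, so the loop terminates; the adjusted centers here land in three different groups and could then be passed to an ordinary K-means run as its starting point.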

Findings

Experimental results demonstrate that the proposed algorithm greatly improves the accuracy of traditional K-means clustering results and does so in a more efficient manner.

Originality/value

This paper presents an efficient algorithm that dynamically adjusts randomly selected initial cluster centers. The adjusted centers are highly representative, i.e. they are distributed among as many samples as possible. As a result, the local optima that are common in K-means clustering can be effectively reduced, which improves clustering accuracy. In addition, the algorithm is cost-efficient: the enhanced clustering accuracy is obtained more efficiently than with the traditional K-means algorithm.

Acknowledgements

This research was supported by the Project of Fujian Province under Grant Nos. 2012J01263 and 2011Y0040.

Citation

Zhu, M., Wang, W. and Huang, J. (2014), "Improved initial cluster center selection in K-means clustering", Engineering Computations, Vol. 31 No. 8, pp. 1661-1667. https://doi.org/10.1108/EC-11-2012-0288

Publisher

Emerald Group Publishing Limited

Copyright © 2014, Emerald Group Publishing Limited