Efficiency issues of evolutionary k-means

Naldi, M.C., Campello, R.J.G.B., Hruschka, E.R., and Carvalho, A.C.P.L.F. (2011) Efficiency issues of evolutionary k-means. Applied Soft Computing, 11 (2). pp. 1938-1952.


DOI: 10.1016/j.asoc.2010.06.010

View at Publisher Website: http://dx.doi.org/10.1016/j.asoc.2010.06...

Abstract

One of the top ten most influential data mining algorithms, k-means, is known for being simple and scalable. However, it is sensitive to initialization of prototypes and requires that the number of clusters be specified in advance. This paper shows that evolutionary techniques conceived to guide the application of k-means can be more computationally efficient than systematic (i.e., repetitive) approaches that try to get around the above-mentioned drawbacks by repeatedly running the algorithm from different configurations for the number of clusters and initial positions of prototypes. To do so, a modified version of a (k-means based) fast evolutionary algorithm for clustering is employed. Theoretical complexity analyses for the systematic and evolutionary algorithms under interest are provided. Computational experiments and statistical analyses of the results are presented for artificial and text mining data sets.
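The systematic (repetitive) approach mentioned in the abstract can be illustrated with a minimal Python sketch: k-means is run from several random initializations for each candidate number of clusters, and the best partition is kept. This is not the authors' implementation. The function names, the restart count, and the centroid-based silhouette used to compare partitions across different k are all assumptions made for illustration, since the abstract does not name the quality criterion. The evolutionary algorithm itself is not reproduced here because the abstract does not describe its operators.

```python
import math
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, rng, max_iter=100):
    """Plain Lloyd-style k-means; returns (centroids, labels)."""
    centroids = rng.sample(points, k)  # initial prototypes drawn from the data
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, labels


def simplified_silhouette(points, centroids, labels):
    """Centroid-based silhouette (assumed criterion); requires k >= 2."""
    total = 0.0
    for p, lab in zip(points, labels):
        a = math.sqrt(sq_dist(p, centroids[lab]))
        b = min(
            math.sqrt(sq_dist(p, c)) for j, c in enumerate(centroids) if j != lab
        )
        total += 0.0 if max(a, b) == 0 else (b - a) / max(a, b)
    return total / len(points)


def systematic_search(points, k_values, n_restarts, seed=0):
    """Run k-means n_restarts times for every k; return (score, k, labels)."""
    rng = random.Random(seed)
    best = None
    for k in k_values:
        for _ in range(n_restarts):
            centroids, labels = kmeans(points, k, rng)
            score = simplified_silhouette(points, centroids, labels)
            if best is None or score > best[0]:
                best = (score, k, labels)
    return best


if __name__ == "__main__":
    gen = random.Random(42)
    # Hypothetical toy data: three well-separated 2-D blobs.
    data = [
        (cx + gen.gauss(0, 0.5), cy + gen.gauss(0, 0.5))
        for cx, cy in [(0, 0), (10, 10), (20, 0)]
        for _ in range(20)
    ]
    score, k, _ = systematic_search(data, range(2, 6), n_restarts=50)
    print(k, round(score, 3))
```

In this sketch the number of k-means runs is the number of candidate k values times the number of restarts, which is the cost that grows quickly when both the cluster count and the initial prototypes have to be searched.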


Item ID:	47616
Item Type:	Article (Research - C1)
ISSN:	1872-9681
Keywords:	k-means, Evolutionary clustering, Data mining
Funders:	Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES), Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq), São Paulo Research Foundation (FAPESP)
Date Deposited:	08 Mar 2017 07:40
FoR Codes:	01 MATHEMATICAL SCIENCES > 0104 Statistics > 010401 Applied Statistics @ 100%
SEO Codes:	97 EXPANDING KNOWLEDGE > 970101 Expanding Knowledge in the Mathematical Sciences @ 100%