Clustering error rate, or equivalently clustering accuracy, is used as the evaluation metric to measure the performance of the k-means algorithm. When the assignment process is over, a new centroid is calculated for each cluster using the pixels in it. Further enhancements will include the study of higher-dimensional and larger data sets for clustering. The initial points affect both the clustering process and its results.
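As an illustration of the evaluation metric, the sketch below computes clustering accuracy and error rate against known class labels. The text does not say how clusters are matched to classes, so this sketch assumes a majority-vote mapping (each cluster is credited with its most frequent true class); the function name `clustering_accuracy` is hypothetical.

```python
from collections import Counter


def clustering_accuracy(true_labels, cluster_ids):
    """Fraction of points that belong to the majority class of their cluster.

    Assumption: clusters are mapped to classes by majority vote.
    """
    by_cluster = {}
    for truth, cluster in zip(true_labels, cluster_ids):
        by_cluster.setdefault(cluster, Counter())[truth] += 1
    # For each cluster, count the points that carry its most common true label.
    correct = sum(counts.most_common(1)[0][1] for counts in by_cluster.values())
    return correct / len(true_labels)


true_labels = ["a", "a", "a", "b", "b", "b"]
cluster_ids = [0, 0, 1, 1, 1, 1]
accuracy = clustering_accuracy(true_labels, cluster_ids)
error_rate = 1.0 - accuracy
```

In this small example one of the six points ends up in a cluster dominated by the other class, so the error rate is one sixth.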

A genetic algorithm has been used for optimal centroid selection. In this paper, we present a brief survey of ant-based clustering algorithms and an overview of some of their applications.

Because randomness is one of the techniques used to initialize many clustering methods, giving every point an equal opportunity to be an initial one, it is considered the main weakness that has to be addressed.

The initialization phase randomly generates the initial population P0 of Z solutions, some of which might be illegal strings.

When applied to the data clustering problem, IGA performs better than K-means on all data sets studied in this paper.

According to Figure 2, class1 and class2 have the greatest similarity (the smallest distance) and are therefore merged at the first level.

A simple approach is to compare the results of multiple runs with different values of k and choose the best one according to a given criterion.
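The multiple-run approach can be sketched as follows. The text leaves the criterion open, so this sketch assumes the mean silhouette score (higher is better); the helper names `kmeans`, `silhouette`, and `choose_k`, the number of restarts, and the fixed seed are all illustrative choices rather than part of the original method.

```python
import math
import random


def kmeans(points, k, rng, iters=50):
    """Plain k-means with random initial centers; returns one label per point."""
    centers = rng.sample(points, k)
    labels = []
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points]
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centers.append(centers[j])  # keep the old center if a cluster empties
        if new_centers == centers:
            break
        centers = new_centers
    return labels


def silhouette(points, labels):
    """Mean silhouette score; singleton clusters contribute 0."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        return -1.0
    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for m, other in clusters.items() if m != lab)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


def choose_k(points, candidates, restarts=30, seed=0):
    """Run k-means several times per candidate k and keep the best-scoring k."""
    rng = random.Random(seed)
    best_k, best_score = None, -math.inf
    for k in candidates:
        for _ in range(restarts):
            score = silhouette(points, kmeans(points, k, rng))
            if score > best_score:
                best_k, best_score = k, score
    return best_k, best_score


# Three well-separated square blobs of four points each.
points = [(x + dx, y + dy) for x, y in [(0, 0), (10, 10), (20, 0)]
          for dx in (0, 1) for dy in (0, 1)]
best_k, best_score = choose_k(points, [2, 3, 4, 5])
```

Several restarts per k are used because a single run can stop in a poor local optimum, which would otherwise bias the comparison between values of k.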

One drawback of K-means is that it is sensitive to the initially selected points, so it does not always produce the same output. Clustering is used in important fields such as pattern recognition, image analysis, bioinformatics, earthquake studies, and insurance.
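The sensitivity to the initial points can be seen on a very small example. The sketch below is a minimal k-means that takes its initial centers as an argument (the function name `kmeans_from` and the four-point data set are made up for illustration); two different starting positions converge to two different partitions with very different squared errors.

```python
def kmeans_from(points, centers, iters=100):
    """k-means started from the given centers; returns (labels, centers, sse)."""
    def sq_dist(p, c):
        return sum((a - b) ** 2 for a, b in zip(p, c))

    labels = []
    for _ in range(iters):
        # Assignment step: each point goes to its closest center.
        labels = [min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
                  for p in points]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for j, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centers.append(old)
        if new_centers == list(centers):
            break
        centers = new_centers
    sse = sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))
    return labels, centers, sse


# Corners of a 6 x 2 rectangle: the natural split is left pair / right pair.
points = [(0, 0), (0, 2), (6, 0), (6, 2)]
good = kmeans_from(points, [(0, 1), (6, 1)])  # converges to left / right
bad = kmeans_from(points, [(3, 0), (3, 2)])   # stays at bottom / top
```

Both runs are stable under the k-means update, yet the second one is stuck in a local optimum whose squared error is nine times larger.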

It generates the initial division by AP partition.

The algorithm attempts to determine K partitions that minimize the squared-error function. The first is the seed generation problem, the second is generating the right number of clusters, and the third is the content validation problem.
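For reference, the squared-error function that the algorithm minimizes is commonly taken to be the sum, over all points, of the squared Euclidean distance between the point and the center of its cluster. The sketch below assumes that definition; the function name `squared_error` and the sample data are illustrative.

```python
def squared_error(points, labels, centers):
    """Sum of squared Euclidean distances from each point to its cluster center."""
    return sum(
        sum((x - c) ** 2 for x, c in zip(p, centers[lab]))
        for p, lab in zip(points, labels)
    )


points = [(0, 0), (2, 0), (10, 0), (12, 0)]

# Natural partition: two tight pairs, each one unit from its center.
tight = squared_error(points, [0, 0, 1, 1], [(1, 0), (11, 0)])

# Poor partition: each cluster mixes a left point with a right point.
loose = squared_error(points, [0, 1, 0, 1], [(5, 0), (7, 0)])
```

A lower value indicates more compact clusters, which is why the tight partition is preferred over the loose one here.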

In the k-means algorithm, each cluster is represented by the mean value of the objects in the cluster.

Four widely used measures for distance between clusters are as follows, where |p - p'| is the distance between two objects or points p and p', m_i is the mean for cluster C_i, and n_i is the number of objects in C_i [5].
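The four measures themselves are not listed in the text. Given the symbols it defines, the sketch below assumes they are the standard minimum, maximum, mean, and average inter-cluster distances; the function names are illustrative.

```python
import math
from itertools import product


def d_min(ci, cj):
    """Minimum distance: smallest |p - p'| over p in C_i, p' in C_j."""
    return min(math.dist(p, q) for p, q in product(ci, cj))


def d_max(ci, cj):
    """Maximum distance: largest |p - p'| over p in C_i, p' in C_j."""
    return max(math.dist(p, q) for p, q in product(ci, cj))


def mean_point(cluster):
    """Coordinate-wise mean m_i of a cluster."""
    return tuple(sum(xs) / len(cluster) for xs in zip(*cluster))


def d_mean(ci, cj):
    """Mean distance: |m_i - m_j|, the distance between the cluster means."""
    return math.dist(mean_point(ci), mean_point(cj))


def d_avg(ci, cj):
    """Average distance: sum of all |p - p'| divided by n_i * n_j."""
    return sum(math.dist(p, q) for p, q in product(ci, cj)) / (len(ci) * len(cj))


ci = [(0, 0), (1, 0)]
cj = [(3, 0), (5, 0)]
distances = (d_min(ci, cj), d_max(ci, cj), d_mean(ci, cj), d_avg(ci, cj))
```

In hierarchical clustering, using the minimum distance corresponds to single linkage and the maximum distance to complete linkage.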

Each input data point is then allocated to the closest existing cluster, as measured by the squared Euclidean distance to the cluster center.
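This allocation step can be sketched as a small function. The name `assign` and the sample points are illustrative; ties are broken in favor of the lower cluster index, which is an arbitrary choice made for this sketch.

```python
def assign(points, centers):
    """Return, for each point, the index of the center with the smallest
    squared Euclidean distance."""
    def sq_dist(p, c):
        return sum((a - b) ** 2 for a, b in zip(p, c))

    return [
        min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
        for p in points
    ]


labels = assign([(0, 0), (1, 1), (9, 9)], [(0, 0), (10, 10)])
```

Using the squared distance avoids a square root and picks the same cluster as the plain Euclidean distance, since squaring preserves the ordering of non-negative values.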

For each cluster, the mean of the coordinates of all the points in that cluster is calculated and set as the coordinates of the new center.
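The center update can be sketched in the same style. The function name `update_centers` is illustrative, and keeping the previous center when a cluster has no points is an assumption of this sketch, since the text does not cover that case.

```python
def update_centers(points, labels, old_centers):
    """Move each center to the coordinate-wise mean of the points assigned to it."""
    new_centers = []
    for j, old in enumerate(old_centers):
        members = [p for p, lab in zip(points, labels) if lab == j]
        if members:
            new_centers.append(tuple(sum(xs) / len(members) for xs in zip(*members)))
        else:
            # Assumption: an empty cluster keeps its previous center.
            new_centers.append(old)
    return new_centers


centers = update_centers(
    [(0, 0), (2, 2), (10, 10)],
    [0, 0, 1],
    [(0, 0), (10, 10)],
)
```

Alternating this update with the assignment of points to their closest center, until the centers stop moving, gives the basic k-means iteration.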

