Summary
This explains clustering and K-means algorithm in an efficient way using a live demo in Silverlight. The demo can be used to understand the working of k-means algorithm through user-defined data points.
Silverlight Screenshot
Source Code Explanation
The following data-structure classes are created. The
Point class represents a point in 2D space. The
PointCollection represents a set of points and/or cluster.
public class Point
{
public int Id { get; set; }
public double X { get; set; }
public double Y { get; set; }
}
public class PointCollection : List<Point>
{
public Point Centroid { get; set; }
}
The following code implements the K-means algorithm, using the data-structures defined above.
public static List<PointCollection> DoKMeans(PointCollection points, int clusterCount)
{
//divide points into equal clusters
List<PointCollection> allClusters = new List<PointCollection>();
List<List<Point>> allGroups = ListUtility.SplitList<Point>(points, clusterCount);
foreach (List<Point> group in allGroups)
{
PointCollection cluster = new PointCollection();
cluster.AddRange(group);
allClusters.Add(cluster);
}
//start k-means clustering
int movements = 1;
while (movements > 0)
{
movements = 0;
foreach (PointCollection cluster in allClusters) //for all clusters
{
for (int pointIndex = 0; pointIndex < cluster.Count; pointIndex++) //for all points in each cluster
{
Point point = cluster[pointIndex];
int nearestCluster = FindNearestCluster(allClusters, point);
if (nearestCluster != allClusters.IndexOf(cluster)) //if point has moved
{
if (cluster.Count > 1) //each cluster shall have minimum one point
{
Point removedPoint = cluster.RemovePoint(point);
allClusters[nearestCluster].AddPoint(removedPoint);
movements += 1;
}
}
}
}
}
return (allClusters);
}