What have we done so far

So far we have extracted the points from the image and created a few helper functions for things like calculating the center of a cluster of points and calculating the distance between two points.

The next step -- running the algorithm

The code is in the next cell; the algorithm is k-means.

Our goal is to find where the points tend to form “clumps”. Since we want to group the points into k clusters, we’ll pick k points at random from the data to use as the initial cluster centers.
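As a rough illustration of this initialization step, here is a minimal sketch. The function name `pick_initial_centers` and the sample pixel values are made up for this example and are not taken from the notebook's own code; points are assumed to be plain RGB tuples.

```python
import random

def pick_initial_centers(points, k, seed=None):
    """Pick k points at random (without reusing a list position) as starting centers.

    Hypothetical helper for illustration; `seed` only makes the choice repeatable.
    """
    rng = random.Random(seed)
    return rng.sample(points, k)

# Made-up RGB points standing in for the pixels extracted from the image.
points = [(255, 0, 0), (250, 5, 5), (0, 0, 255), (5, 5, 250), (0, 255, 0)]
centers = pick_initial_centers(points, k=3, seed=0)
print(centers)
```

Because the starting centers are random, different runs can settle on slightly different clusters, which is why a seed is handy when experimenting.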

We’ll iterate over every point in the data and calculate its distance to each of the k cluster centers, then assign the point to the nearest one. Once we’ve iterated over all the points, each one is assigned to exactly one cluster. Then, for each cluster, we recalculate its center by averaging the coordinates of all its associated points, and start over.
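A single pass of this assign-then-recenter step could be sketched as follows. The names `assign_points` and `recompute_centers` are hypothetical, distance is assumed to be plain Euclidean distance, and a cluster's new center is taken to be the per-coordinate mean of its points. Keeping the old center for a cluster that ends up empty is one possible choice, not necessarily what the notebook does.

```python
import math

def assign_points(points, centers):
    """Group each point with the index of its nearest center (Euclidean distance)."""
    clusters = [[] for _ in centers]
    for p in points:
        nearest = min(range(len(centers)), key=lambda i: math.dist(p, centers[i]))
        clusters[nearest].append(p)
    return clusters

def recompute_centers(clusters, old_centers):
    """Return each cluster's new center: the mean of its points, coordinate by coordinate."""
    new_centers = []
    for members, old in zip(clusters, old_centers):
        if not members:
            # Assumption: an empty cluster simply keeps its previous center.
            new_centers.append(old)
        else:
            n = len(members)
            new_centers.append(tuple(sum(axis) / n for axis in zip(*members)))
    return new_centers

# Toy 2-D data with two obvious clumps.
toy_points = [(0, 0), (0, 2), (10, 10), (10, 12)]
toy_centers = [(0, 0), (10, 10)]
toy_clusters = assign_points(toy_points, toy_centers)
print(recompute_centers(toy_clusters, toy_centers))
```

`math.dist` requires Python 3.8 or newer; on older versions the distance can be computed by hand with `math.sqrt`.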

When the centers stop moving by more than a small threshold between iterations, we can stop looping. To find the dominant colors, simply take the centers of the clusters!
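Putting the pieces together, the whole loop with its stopping rule might look like the compact sketch below. This is an illustration under assumptions rather than the notebook's implementation: the function name `kmeans`, the `min_shift` threshold, its default of 1.0, and the sample pixels are all invented here.

```python
import math
import random

def kmeans(points, k, min_shift=1.0, seed=None):
    """Return k cluster centers; stop once no center moves min_shift or more."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # random initial centers
    while True:
        # Assignment step: attach every point to its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            i = min(range(k), key=lambda j: math.dist(p, centers[j]))
            clusters[i].append(p)
        # Update step: mean of each cluster; an empty cluster keeps its old center.
        new_centers = [
            tuple(sum(axis) / len(members) for axis in zip(*members)) if members else old
            for members, old in zip(clusters, centers)
        ]
        # Largest distance any center moved during this iteration.
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < min_shift:
            return centers

# Made-up pixels: three reddish and three bluish.
pixels = [(250, 10, 10), (255, 0, 0), (245, 5, 5),
          (10, 10, 250), (0, 0, 255), (5, 5, 245)]
dominant = kmeans(pixels, k=2, seed=1)
print(dominant)
```

The returned centers are floating-point averages, so they would typically be rounded to integers before being displayed as colors.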