Image Compression and Color-Based Segmentation with K-Means

This problem appears in an exercise of the Coursera ML course by Andrew Ng.

A couple of colored images are used as input 3-D datasets, and k-means clustering is run with k arbitrary cluster centroids. After the algorithm converges, each pixel x is assigned the color of the nearest cluster centroid, \(\arg\min_{i} d(x, c_i)\).

If the input image contains n different colors, the output image can be represented using only \(k \ll n\) colors, achieving a high level of compression by grouping regions with similar colors into a single cluster.
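A minimal sketch of this color quantization, assuming scikit-learn is available (the tiny synthetic image below stands in for a real photo):

```python
import numpy as np
from sklearn.cluster import KMeans

def quantize_colors(image, k, seed=0):
    """Compress an H x W x 3 image to k colors with k-means.

    Each pixel is treated as a point in RGB space; after clustering,
    every pixel is replaced by the color of its nearest centroid.
    """
    h, w, c = image.shape
    pixels = image.reshape(-1, c).astype(float)
    km = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(pixels)
    compressed = km.cluster_centers_[km.labels_]
    return compressed.reshape(h, w, c)

# Tiny synthetic "image" with two dominant colors
img = np.zeros((4, 4, 3))
img[:, :2] = [1.0, 0.0, 0.0]   # left half red
img[:, 2:] = [0.0, 0.0, 1.0]   # right half blue
out = quantize_colors(img, k=2)
print(len(np.unique(out.reshape(-1, 3), axis=0)))  # 2 distinct colors remain
```

The compressed image needs only the k centroid colors plus one small label per pixel, which is where the compression comes from.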

The following animations show the result of the clustering on two different images, for a few different values of k.

Next, a bird image is segmented with k=16.

Next, a face image is segmented with k=16.

Next, the same face image is segmented with k=8. As can be noticed, the image quality becomes poorer as fewer colors are used.

Next, the same face image is segmented with k=2 to obtain a binary image. Convergence is also achieved much faster (within the first two iterations) in this case.

Using PCA and MDS for Visualization in the Reduced Dimension

This problem also appears in an exercise of the Coursera ML course by Andrew Ng.

First, the 3-D RGB representation of the bird image is compressed with k-means clustering. Then PCA, followed by MDS, is used to reduce the dimension (to n=2) of the compressed image, which is then visualized in the reduced dimension. As can be seen, although the spatial neighborhood is not preserved in this representation of the image, some structure of the bird is still preserved even in the reduced dimensions.

Next, the face dataset is used (with 5000 images of size 32×32), and PCA is applied to find the top k dominant eigenvectors for different values of k. The eigenvectors look like faces, so they are called eigenfaces. The following image shows 100 of those faces and a few of the principal-component eigenfaces. The next figure / animation shows 100 of those normalized eigenfaces and the faces reconstructed with the top k eigenfaces, using \(X_{approx} \approx X\, e_{[1:k]}\, (e_{[1:k]})^{T}\). As can be seen, the original faces can be closely reconstructed when only k=100 principal components are used, instead of all 1024 dimensions.



Using SVD, PCA and FFT for dimension reduction and compression

First, the following two different implementations of PCA will be used to reduce the dimensions of a 453×378 image and to reconstruct the image in the reduced dimension:

1. Implemented with the SVD (numerically stable, as done by R prcomp)
2. Implemented with the eigen-decomposition of the covariance matrix computed from the z-score-normalized image matrix (numerically less stable, as done by R princomp)
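The two implementations can be compared on synthetic data; this sketch (assuming NumPy, with a random matrix in place of the image) checks that both recover the same principal directions up to sign:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 20))       # stand-in for the image matrix
Xc = X - X.mean(axis=0)

# (1) PCA via SVD of the centered data (numerically stable, like prcomp)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
pcs_svd = Vt.T

# (2) PCA via eigen-decomposition of the covariance matrix (like princomp)
C = Xc.T @ Xc / (X.shape[0] - 1)
evals, evecs = np.linalg.eigh(C)
order = np.argsort(evals)[::-1]      # eigh returns ascending order; reverse it
pcs_eig = evecs[:, order]

# The eigenvalues match the squared singular values (scaled by n-1) ...
assert np.allclose(evals[order][:5], S[:5] ** 2 / (X.shape[0] - 1))
# ... and the principal directions agree up to sign
same = [np.allclose(pcs_svd[:, i], pcs_eig[:, i]) or
        np.allclose(pcs_svd[:, i], -pcs_eig[:, i]) for i in range(5)]
print(same)
```

The instability of route (2) shows up when the matrix is ill-conditioned: forming the covariance squares the condition number, while the SVD works on the data directly.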

In both cases above, only the first k orthonormal principal components are used to reconstruct the image, as shown in the following animation (the first being the SVD implementation of PCA).

As can be seen from the above, the SVD implementation of PCA is much more robust and less susceptible to numerical errors; only the first few principal components suffice to obtain a very close approximation of the image.

Next, the FFT will be used to transform the image to the frequency domain, and then only the first k orthonormal Fourier basis vectors will be used to reconstruct the image, as shown in the following animation.
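A rough sketch of this DFT truncation, assuming NumPy's fft2; here the "first k" coefficients are taken to be a low-frequency k×k block around the DC component (a random matrix again stands in for the image):

```python
import numpy as np

def fft_truncate(img, k):
    """Reconstruct img keeping only a k x k block of low-frequency DFT coefficients."""
    F = np.fft.fftshift(np.fft.fft2(img))   # move low frequencies to the center
    H, W = F.shape
    mask = np.zeros_like(F)
    r0, c0 = H // 2 - k // 2, W // 2 - k // 2
    mask[r0:r0 + k, c0:c0 + k] = 1          # keep only the central k x k block
    return np.fft.ifft2(np.fft.ifftshift(F * mask)).real

rng = np.random.default_rng(2)
img = rng.random((32, 32))
errs = [np.linalg.norm(img - fft_truncate(img, k)) for k in (4, 16, 32)]
print(errs)  # error shrinks as k grows; k=32 keeps every coefficient
```

Because the DFT basis is orthogonal, zeroing fewer coefficients can only decrease the reconstruction error, and keeping all of them recovers the image exactly.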

As can be seen from the above, PCA works better than the DFT in terms of the quality of the image reconstructed in the reduced dimensions.

The following figures show how the error between the original and the approximated image (measured with the Frobenius norm / MSE) decreases as more components are retained.
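The two error measures track each other exactly, since the squared Frobenius norm is just the MSE scaled by the number of entries; a tiny check makes this explicit:

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.random((10, 10))   # stand-in for the original image
B = rng.random((10, 10))   # stand-in for its approximation

fro = np.linalg.norm(A - B, 'fro')
mse = np.mean((A - B) ** 2)

# ||A - B||_F^2 = (number of entries) * MSE
print(np.isclose(fro ** 2, A.size * mse))  # True
```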