This reminds me a bit of eigenfaces in facial recognition datasets, where you get dark blurs that highlight the most prominent features of a human face -- the eyes, nose, and mouth. The results are usually terrifying images of unsettling faces with deep, dark, sunken eyes and creepy smiles.

The reconstructions of the eigensheep above at lower temperature settings reveal more interesting patterns. For the first eigensheep, at $\tau=0.1$, we mostly see round, black heads with two long, rounded legs and some kind of tail (or possibly a cape?). This axis appears to capture variation in the structure of the sheep's head and legs, and it makes sense that this accounts for most of the variance in the sketches. The variation in the grid of reconstructed sketches above, read from left to right, is consistent with this conjecture.

Looking at the second eigensheep, again concentrating on the samples generated at lower temperatures, we see roughly scribbled circles representing the body, with three or four loosely attached legs and a tiny round head. Unlike the first principal axis, this one appears to capture variation primarily in the structure of the sheep's body. Again, the transitions from top to bottom in the grid above are consistent with this.
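The reconstructions above can be sketched in a few lines: take the principal axes of the latent codes and walk along one of them, decoding each point into a drawing. The code below is a minimal illustration with NumPy, using random vectors as stand-ins for the real 128-dimensional latent codes, and the decoder call is a hypothetical API shown only as a comment.

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in for the real 128-dimensional latent codes of the sheep sketches
# (hypothetical data; the actual codes come from the trained encoder).
Z = rng.normal(size=(500, 128))

# PCA via SVD on the mean-centred codes.
mu = Z.mean(axis=0)
_, _, Vt = np.linalg.svd(Z - mu, full_matrices=False)
v1 = Vt[0]  # first principal axis: the "first eigensheep" direction

# Walk along the first principal axis; each point is a latent code that
# the decoder could render at a chosen temperature tau.
coeffs = np.linspace(-3.0, 3.0, 7)
walk = mu + coeffs[:, None] * v1  # shape (7, 128)
# sketches = [decoder.decode(z, temperature=0.1) for z in walk]  # hypothetical API
```

Replacing `v1` with `Vt[1]` would produce the top-to-bottom walk along the second eigensheep instead.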

The t-distributed Stochastic Neighbor Embedding (t-SNE) is a popular non-linear dimensionality reduction technique that is commonly used to visualize high-dimensional data. It is one of several embedding methods that seek to embed data points in a lower-dimensional space in such a way as to preserve the pairwise distances between points in the original, higher-dimensional space. t-SNE does so in a non-linear fashion, adapting to the underlying data by performing different transformations on different regions.

In particular, t-SNE is commonly used in deep learning to inspect and visualize what a deep neural network has learned. For example, in an image classification problem, we can view a convolutional neural net as a series of transformations that gradually turns an image into a representation in which the classes can be more easily separated by a linear classifier. Therefore, we can use the output of the final layer preceding the classifier as the 'code'. The code representations of the images can then be embedded and visualized in two dimensions using t-SNE.
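The idea of taking the final layer before the classifier as the 'code' can be made concrete with a toy network. This is a sketch only: the weights below are random, standing in for a trained classifier, just to show which activations would be handed to t-SNE.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a trained classifier: two hidden layers followed by a
# linear classifier head (weights are random, for illustration only).
W1 = rng.normal(size=(64, 32))
W2 = rng.normal(size=(32, 16))
W_head = rng.normal(size=(16, 10))

def forward(x):
    h1 = np.maximum(0, x @ W1)      # hidden layer 1 (ReLU)
    code = np.maximum(0, h1 @ W2)   # final layer before the classifier: the 'code'
    logits = code @ W_head          # linear classifier head
    return code, logits

X = rng.normal(size=(100, 64))      # a batch of flattened "images"
codes, logits = forward(X)
# `codes` (100 x 16) is the representation we would embed with t-SNE.
```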

Here, we embed the 128-dimensional latent code of each vector drawing in two dimensions and visualize the drawings in this subspace.
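With scikit-learn, this embedding step is a few lines. The snippet below uses random vectors as a stand-in for the real 128-dimensional latent codes (which would come from the trained encoder); the `TSNE` usage itself is the standard scikit-learn API.

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
# Stand-in for the 128-dimensional latent codes of the vector drawings.
codes = rng.normal(size=(1000, 128))

# perplexity balances local vs. global structure; values of 5-50 are typical.
emb = TSNE(n_components=2, perplexity=30,
           init="pca", random_state=0).fit_transform(codes)
# `emb` has shape (1000, 2): one 2-D point per drawing, ready to scatter-plot.
```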