Welcome to Latent Space

I’ve written before about BigGAN, an image-generating neural net that Google trained recently. It generates its best images for each of the 1,000 different categories in the standard ImageNet dataset, from goldfish to planetarium to toilet tissue. And the images it produces are both beautifully textured and deeply weird. Some of the categories - scabbard, rocking chair, stopwatch - are delightfully aesthetic.

[scabbard, rocking chair, stopwatch]

Google has made the trained BigGAN model available to the research/art community, which is nice, since people have estimated that today it would take around $60k in cloud computing time to train one’s own.

But there’s more lurking in the BigGAN model besides the 1,000 ImageNet categories. The model thinks of each category as a big set of numbers that describes exactly how to smoosh and stretch and color random noise. Following one set of numbers will transform noise into a flower, while following another set will turn that same noise into a dog instead. But a set of numbers can also be read as a position in space: latitude and longitude, for example, or x,y,z coordinates. In math terms, we call the set of numbers a vector. And in machine learning, all the positions in space (granted, an approximately 100-dimensional space) that a model’s vectors can point to are together called vector space, or latent space.

So one set of numbers - the flower vector - points you to some location in vector space, and another set of numbers - the dog vector - points you to a different location.

[daisy, saluki dog]

But here is where it gets fun. The vectors are just numbers, which means you could, in theory, average them. What happens when you average together “saluki dog” and “daisy”? There’s no ImageNet category there, so what’s lurking in that spot in vector space, halfway between the two? Delightfully, dogflowers.
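To make the averaging idea concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the vectors are random stand-ins rather than BigGAN's real category vectors, the size of 128 is a guess at "approximately 100-dimensional", and the helper names are hypothetical. It only shows that averaging two vectors, coordinate by coordinate, lands you at the point exactly halfway between them.

```python
import random

DIM = 128  # assumed vector size; the post only says "approximately 100"

def random_vector(seed, dim=DIM):
    """Stand-in for a learned category vector (made-up numbers, not BigGAN's)."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]

def average(a, b):
    """Midpoint between two positions in vector space, coordinate by coordinate."""
    return [(x + y) / 2 for x, y in zip(a, b)]

def distance(a, b):
    """Straight-line (Euclidean) distance between two positions."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

daisy = random_vector(seed=1)   # hypothetical "daisy" vector
saluki = random_vector(seed=2)  # hypothetical "saluki dog" vector
dogflower = average(daisy, saluki)

# The midpoint is equally far from both parents.
print(distance(dogflower, daisy), distance(dogflower, saluki))
```

In a real model, the averaged vector would then be fed to the generator in place of a single category's vector, which is where the dogflowers come from.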

This, it turns out, is so cool. Joel Simon has put together an app called ganbreeder.app that lets you mix and match categories.

So, this is what you get when you travel to the point in vector space midway between bedlington terrier and geyser, with a little dingo thrown in.
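A mix like "midway between two categories, with a little of a third thrown in" can be sketched as a weighted average. This is an illustration under stated assumptions, not a description of how ganbreeder.app works internally: the weights (0.45, 0.45, 0.10) are invented, the vectors are random stand-ins, and `blend` is a hypothetical helper.

```python
import random

DIM = 128  # assumed vector size; the post only says "approximately 100"

def random_vector(seed, dim=DIM):
    """Stand-in for a learned category vector (made-up numbers, not BigGAN's)."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]

def blend(weighted_vectors):
    """Weighted average of (weight, vector) pairs; weights are rescaled to sum to 1."""
    total = sum(weight for weight, _ in weighted_vectors)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    dim = len(weighted_vectors[0][1])
    return [
        sum(weight * vec[i] for weight, vec in weighted_vectors) / total
        for i in range(dim)
    ]

terrier = random_vector(seed=10)  # hypothetical "bedlington terrier" vector
geyser = random_vector(seed=11)   # hypothetical "geyser" vector
dingo = random_vector(seed=12)    # hypothetical "dingo" vector

# Mostly terrier and geyser, with a little dingo (weights are invented).
mix = blend([(0.45, terrier), (0.45, geyser), (0.10, dingo)])
print(len(mix))
```

Giving every category the same weight reduces this to the plain average; shrinking one weight toward zero moves the result away from that category.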

And this spot in latent space is somewhere between Pembroke Welsh corgi and espresso.

This aesthetic delight is bookshop + radio telescope, with a teensy bit of Boston bull. (It turns out that since the ImageNet dataset is full of dogs, vector space is too.)

Want to make something adorably small? Add a bit of thimble. (This is the bit of latent space midway between thimble and zucchini.)

Want to make it really ornate and fancy? Throw in some church organ, or perhaps some saxophone. This, for the record, is conch + organ + sax + scabbard + book jacket.

This spot around electric locomotive + greenhouse + prison + vault + rocking chair + shoji is very beautiful.

I’m also fond of trilobite + carpenter’s kit + french horn + ladle + streetcar.

As for the bit of latent space midway between bathtub + butcher shop, the less said, the better.

Go explore ganbreeder.app, which is free and so so fascinating!

And check out a few more of my favorite spots in latent space here in the bonus material!