I.e. ‘human’ - ‘god’ = ‘animal’ means that the bot has randomly picked the words ‘human’ and ‘god’, and randomly decided to perform a subtraction. It subtracts the vector for ‘god’ from the vector for ‘human’, then finds and tweets the closest word to that point, in this case ‘animal’ (actually it tweets the top five closest words; here I just hand-picked some of my favourite results).
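The lookup the bot performs can be sketched in a few lines. To be clear, this is a toy illustration and not the bot’s actual code: the hand-made three-dimensional vectors and the `nearest()` helper below are my own stand-ins for a real word2vec model, whose vectors have hundreds of dimensions and a vocabulary of many thousands of words.

```python
import numpy as np

# Toy hand-made embeddings standing in for a trained word2vec model.
# Real models have hundreds of dimensions; three is enough to illustrate.
vectors = {
    "king":  np.array([1.0, 1.0, 0.0]),
    "man":   np.array([1.0, 0.0, 0.0]),
    "woman": np.array([1.0, 0.0, 1.0]),
    "queen": np.array([1.0, 1.0, 1.0]),
    "apple": np.array([0.0, 0.5, 0.2]),
}

def nearest(target, exclude=(), k=5):
    """Rank vocabulary words by cosine similarity to `target`,
    skipping the words that appeared in the query itself."""
    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    scored = [(w, cos(v, target)) for w, v in vectors.items()
              if w not in exclude]
    scored.sort(key=lambda pair: -pair[1])
    return [w for w, _ in scored[:k]]

# The classic analogy: king - man + woman should land nearest to queen.
result = vectors["king"] - vectors["man"] + vectors["woman"]
print(nearest(result, exclude={"king", "man", "woman"}, k=1))  # → ['queen']
```

The bot does the same thing with randomly chosen words and operations, and tweets the top five nearest words instead of just one.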

Above you can see some fully genuine, untampered results. But I should point out that there are hundreds (if not thousands?) of results, and I cherry-picked a few of my favourites. (I haven’t actually looked through them all thoroughly; there might be much more interesting ones.)

Initially I was curating and trying to impose rules on what words the bot should pick from, so that the results would be more ‘sensible’ and interesting. But in doing so, I realised I was actually limiting the ability of the bot to find more ‘creative’ (and arguably more interesting, or unexpected) results. So I removed any constraints that I had imposed, and let the bot explore the space a lot more freely. It now produces results which are often more nonsensical, and sometimes a lot harder to make sense of.

And in fact this is what this project ended up being about.

It’s not about what the model tells us, but what we look for and see in the outcome.

Tons of examples can be found on Twitter. Below are a few I’ve selected. Some of the first few examples are probably quite easy to interpret.

human - god = animal

This is an interesting one. It could be interpreted as: “if we don’t have / believe in god, we will descend to the level of primitive animals” or alternatively: “what sets humans apart from other animals is that we were created in the image of god”. Or maybe: “humans are just animals that have invented religions and beliefs in god” etc.

There are probably many other ways of interpreting this, and I’d be curious to hear some other ideas. But the truth is, I don’t think it means any of those things. Because there is no one behind this, saying it, to give it any meaning. It’s just noise, shaped by a filter, and then we project whatever we want onto it. It’s just a starting point for us to shape into what we desire, consciously or unconsciously.

Some might disagree and say that the model has learnt from the massive corpus of text that it was trained on, and that this artifact produced by the model carries the meanings embedded in the corpus. This is of course true to some degree, and can be verified with the examples given earlier, such as king-man+woman=queen, or walking-walked+swam=swimming. Surely it’s not a coincidence that the model is returning such meaningful results in those cases?

It does seem like the model has learnt something. But when we start to push the boundaries of the model, it’s best if we resist the temptation to jump to conclusions as to what the model has learnt vs what may be ‘semi-random’ results, with our brain completing the rest of the picture. I’m not suggesting that there is a cut-off point at which the model stops making sense and starts generating random results. It’s more of a spectrum. The further we stray from what the model is ‘comfortable’ with (i.e. has seen in abundance during training, has learnt and is able to generalise), the more significance noise carries in the output (i.e. lower signal-to-noise ratio), and potentially the more fertile the output becomes for our biased interpretations.

I will expand on this in more detail a bit later. But first some more examples.

nature - god = dynamics

I particularly like this one. I interpret it as “without the need for a god, nature is just the laws of physics”.

twitter + bot = memes

I couldn’t believe this one when I saw it. It almost needs no explanation. “bots on twitter become memes”. Too good to be true.

sex - love = intercourse, masturbation, prostitution, rape

This is a powerful one. I interpret it as “Sex without love is just intercourse”, or “prostitution is sex without love”, or “rape involves sex and hate (as the opposite of love)”. These results are very interesting. But again, it should not be assumed that the model is learning this particular interpretation from the training data. In all likelihood, all of these words lie somewhere in the vicinity of ‘sex’ and/or ‘love’, since they are all related words. And yes, perhaps these words do lie in a particular direction of ‘love’ or ‘sex’. But there is a difference between a bunch of words being laid out in space, and the sentence “sex without love is intercourse or prostitution…”. The latter is my interpretation of the spatial layout.

authorities - philosophy = police, governments

I have to push my creativity to be able to make sense of this one. I ask myself “If we think of philosophy as the act of thinking, of being logical and critical; then perhaps this sentence says that police and governments are authorities that don’t think, and are not logical?”. Or in other words “what kinds of authorities lack critical thinking? Police and governments”.

beard - justified - space + doctrine = theology, preacher

This one pushes the limits of my creativity even further. But I can still find meaning if I try hard. E.g. Let’s assume that a beard traditionally and stereotypically signifies wisdom. Imagine a beard that is not justified — i.e. it pretends to signify wisdom, but actually it doesn’t. In fact, this particular beard also replaces space (which I liberally assume to represent the ‘universe’, ‘knowledge’, ‘science’) with doctrine. Where might we find such a beard, pretending to be wise, but replacing science with doctrine? In theology of course, e.g. a preacher.

Of course this is me trying quite hard to fit a square peg into a round hole, trying to make sense of this ‘semi-random’ sentence which the model has spat out. I wouldn’t be surprised if somebody was able to interpret this sentence to mean the exact opposite of how I chose to interpret it.