Several weeks ago, artist and coder Janelle Shane tried to train a neural network to name paint colors. The results were...not good. "Stanky Bean" was a kind of dull pink, and "Stoner Blue" was gray. Then there were the three shades of brown known as "Dope," "Burble Simp," and "Turdly."

These results were so bad that they turned the corner into delightful hilarity, and Shane's blog post about them went viral. Almost immediately, AI coders started offering tips on how she could tweak the algorithm to get better results. So Shane dutifully went back to the virtual drawing board, adjusted the AI's creativity levels, and gave it some new datasets. The results were...well, you decide.

First, Shane realized that part of the initial problem was that she'd cranked up the neural net's "temperature" variable, which meant that it was picking less likely (or "more creative") possibilities as it generated paint names letter by letter. So she turned the temperature down and found that the names were still pretty silly, but they at least matched the colors most of the time. Plus, the colors themselves seemed more varied.
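For readers curious about what a temperature knob does, here is a minimal Python sketch. It is not Shane's code: the function names `temperature_probs` and `sample_char` and the example scores are made up for illustration, and it assumes the network hands back one raw score per candidate letter. Dividing the scores by the temperature before turning them into probabilities makes the top choice more dominant when the temperature is low and flattens the odds when it is high.

```python
import math
import random

def temperature_probs(scores, temperature):
    """Turn raw per-letter scores into probabilities, scaled by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [s / temperature for s in scores]
    top = max(scaled)  # subtract the max so exp() cannot overflow
    weights = [math.exp(s - top) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]

def sample_char(alphabet, scores, temperature, rng=None):
    """Pick one letter at random according to the temperature-scaled odds."""
    rng = rng or random.Random()
    probs = temperature_probs(scores, temperature)
    return rng.choices(alphabet, weights=probs, k=1)[0]

if __name__ == "__main__":
    scores = [2.0, 1.0, 0.0]  # hypothetical scores for "a", "b", "c"
    for t in (0.5, 1.0, 2.0):
        print(t, [round(p, 3) for p in temperature_probs(scores, t)])
```

Running it shows the most likely letter taking a bigger share of the probability as the temperature drops, which is the "less creative" behavior Shane was after.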

Next, Shane incorporated some suggestions from experts about how to tweak her dataset. Mostly, what people wanted to know was whether she'd get better results if she changed the way she represented colors. For her original dataset, she used RGB colors, which assign numeric values to a color based on its blend of red, green, and blue. The AI would randomly generate a value within the RGB space and slap a name on it.

The problem is, according to Shane, that RGB doesn't do a good job representing color the way human eyes perceive it. Plus, the AI kept coming up with muddy, boring colors that seemed like variations on gray and brown.

She tried two other color schemes: HSV and LAB. On her blog, Shane writes:

In HSV representation, a single color is represented by three numbers like in RGB, but this time they stand for Hue, Saturation, and Value. You can think of the Hue number as representing the color, Saturation as representing how intense (vs gray) the color is, and Value as representing the brightness. Other than the way of representing the color, everything else about the dataset and the neural network are the same...In [the LAB] color space, the first number stands for lightness, the second number stands for the amount of green vs red, and the third number stands for the amount of blue vs. yellow.
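To make those three representations concrete, here is a rough Python sketch that converts one RGB triple into HSV and LAB. It is an illustration only, not the conversion Shane used: the helper names `rgb_to_hsv255` and `rgb_to_lab` are invented, and the LAB math assumes standard sRGB input and a D65 white point, which her post does not specify.

```python
import colorsys

def rgb_to_hsv255(r, g, b):
    """Convert 0-255 RGB to (hue, saturation, value), each in the range 0-1."""
    return colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)

def _linearize(channel):
    """Undo the sRGB gamma curve for one 0-1 channel."""
    if channel <= 0.04045:
        return channel / 12.92
    return ((channel + 0.055) / 1.055) ** 2.4

def _lab_f(t):
    """Piecewise cube-root function from the CIELAB definition."""
    delta = 6.0 / 29.0
    if t > delta ** 3:
        return t ** (1.0 / 3.0)
    return t / (3 * delta ** 2) + 4.0 / 29.0

def rgb_to_lab(r, g, b):
    """Convert 0-255 sRGB to CIELAB (L, a, b), assuming a D65 white point."""
    rl, gl, bl = (_linearize(c / 255.0) for c in (r, g, b))
    # Linear sRGB to CIE XYZ
    x = 0.4124564 * rl + 0.3575761 * gl + 0.1804375 * bl
    y = 0.2126729 * rl + 0.7151522 * gl + 0.0721750 * bl
    z = 0.0193339 * rl + 0.1191920 * gl + 0.9503041 * bl
    # Normalize by the D65 reference white
    fx, fy, fz = _lab_f(x / 0.95047), _lab_f(y / 1.0), _lab_f(z / 1.08883)
    lightness = 116 * fy - 16
    a = 500 * (fx - fy)   # negative is greener, positive is redder
    b_axis = 200 * (fy - fz)  # negative is bluer, positive is yellower
    return lightness, a, b_axis

if __name__ == "__main__":
    color = (200, 120, 140)  # an arbitrary example color
    print("HSV:", rgb_to_hsv255(*color))
    print("LAB:", rgb_to_lab(*color))
```

The same paint color ends up as three quite different sets of numbers, which is why swapping the representation could plausibly change what the neural network learns.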

Sadly, HSV and LAB colors didn't really produce results that made more sense than the RGB ones. Shane wound up deciding that RGB was her best option. "Maybe it’s more resistant to disruption when the temperature setting introduces randomness," Shane mused. The color names, however, were still "pretty bad."

But then one person sent Shane a cleaned-up dataset, which had no capital letters (so there wouldn't be two versions of each letter). It combined paint names from Behr and Benjamin Moore, as well as ones from a user-generated list created by readers of XKCD. Shane called the results "surprisingly good," and noted that the lesson here is that better data generally produces better results.
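The kind of cleanup described here can be sketched in a few lines of Python. This is a guess at the general idea, not the actual script behind that dataset: the function name `merge_color_lists`, the input format of (name, RGB) pairs, and the sample entries are all placeholders invented for illustration.

```python
def merge_color_lists(*sources):
    """Merge several lists of (name, rgb) pairs into one lowercase list.

    Names are lowercased and extra whitespace is collapsed, so the network
    sees only one version of each letter. If the same name shows up more
    than once, the first occurrence wins.
    """
    merged = {}
    for source in sources:
        for name, rgb in source:
            key = " ".join(name.lower().split())
            if key and key not in merged:
                merged[key] = tuple(rgb)
    return sorted(merged.items())

if __name__ == "__main__":
    # Placeholder entries, not real paint names from any of the lists
    list_a = [("Sample Blue", (70, 110, 180)), ("Example  Gray", (128, 128, 128))]
    list_b = [("sample blue", (60, 100, 170)), ("Demo Green", (40, 160, 90))]
    for name, rgb in merge_color_lists(list_a, list_b):
        print(name, rgb)
```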

Shane also included a "hall of fame" from her experiments with all the color schemes, which truly represent the promise of machine learning in our modern world:



Janelle Shane


