One-hot encoding

We can view the character-to-integer mapping by inspecting the word_index property of our Keras Tokenizer instance t.
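
For reference, here is a minimal sketch of how such a tokenizer might be set up; the `names` list of lowercase color names is an assumption about how the dataset is loaded:

```python
from tensorflow.keras.preprocessing.text import Tokenizer

# Fit a character-level tokenizer on the color names
# (`names` is a hypothetical list of lowercase color-name strings)
t = Tokenizer(char_level=True)
t.fit_on_texts(names)
print(t.word_index)
```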

{' ': 4, 'a': 2, 'b': 18, 'c': 11, 'd': 13, 'e': 1, 'f': 22, 'g': 14, 'h': 16, 'i': 5, 'j': 26, 'k': 21, 'l': 7, 'm': 17, 'n': 6, 'o': 8, 'p': 15, 'q': 25, 'r': 3, 's': 10, 't': 9, 'u': 12, 'v': 23, 'w': 20, 'x': 27, 'y': 19, 'z': 24}

The integer values have no natural ordinal relationship to one another, so our model cannot harness any benefit from them directly. What's worse, the model would initially assume such an ordering among the characters (e.g. "a" is 2 and "e" is 1, but that should not signify any relationship), which can lead to unwanted results. We will use one-hot encoding to represent the input sequences instead.

Each integer will be represented by a boolean array in which exactly one element has the value 1. The maximum integer value in the character dictionary determines the length of that array.

In our case, the maximum integer value is 27 (for 'x'), so the length of a one-hot boolean array will be 28 (values start at 0, with 0 reserved for padding).

For example, instead of using the integer value 2 to represent the character 'a', we're going to use the one-hot array [0, 0, 1, 0, …, 0].

One-hot encoding is also available out of the box in Keras.

One-hot encoding
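
A minimal sketch of that step, reusing the tokenizer t from above and assuming a max sequence length of 25:

```python
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.utils import to_categorical

maxlen = 25

# Integer-encode each name, pad to a fixed length, then one-hot encode
sequences = t.texts_to_sequences(names)
padded = pad_sequences(sequences, maxlen=maxlen)
one_hot_names = to_categorical(padded, num_classes=28)
print(one_hot_names.shape)  # (14157, 25, 28)
```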

The resulting one_hot_names has the shape (14157, 25, 28), which corresponds to (# of training samples, max sequence length, # of unique tokens).

Data normalization

Remember that we're predicting three color channel values, each ranging from 0 to 255. There is no golden rule for data normalization; it is purely practical, because a model can take forever to converge if the training values are spread out too much. A common normalization technique is to scale values to [-1, 1]. In our model, we're using a ReLU activation function in the last layer. Since ReLU outputs non-negative numbers, we'll normalize the values to [0, 1].

Data normalization
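
Here is a sketch of the scaling step; the `data` DataFrame with `red`, `green`, and `blue` columns is an assumption about how the dataset is stored:

```python
import numpy as np

# Scale each RGB channel from [0, 255] down to [0, 1]
def norm(value):
    return value / 255.0

# Stack the three normalized channels into an (n_samples, 3) target array
normalized_values = np.column_stack(
    [norm(data["red"]), norm(data["green"]), norm(data["blue"])]
)
```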

Build the model

To build our model, we're going to use two types of neural networks: a feed-forward neural network and a recurrent neural network. The feed-forward neural network is by far the most common type: information enters at the input units and flows in one direction through the hidden layers until it reaches the output units.

In recurrent neural networks, information can flow around in cycles, which lets these networks remember information over long stretches of a sequence. Recurrent networks are a very natural way to model sequential data. In our model, we're using one of the most powerful recurrent architectures: long short-term memory (LSTM).

The easiest way to build a deep learning model in Keras is with its Sequential API: we stack the neural network layers by calling model.add(), like connecting LEGO bricks.

Build the model
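
Here is one plausible way to stack the layers. The exact layer sizes and optimizer are assumptions for illustration, not necessarily what the original model used:

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

model = Sequential()
# Recurrent layers read the one-hot character sequence
model.add(LSTM(256, return_sequences=True, input_shape=(maxlen, 28)))
model.add(LSTM(128))
# Feed-forward layers map the sequence summary to three channel values
model.add(Dense(128, activation="relu"))
model.add(Dense(3, activation="relu"))  # R, G, B, each in [0, 1]
model.compile(optimizer="adam", loss="mse", metrics=["acc"])
```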

Training the model could not be easier: we simply call the model.fit() function. Notice that we're reserving 10% of the samples for validation. If the model achieves high accuracy on the training set but much lower accuracy on the validation set, it is likely overfitting. You can find more on dealing with overfitting in my other blog post: Two Simple Recipes for Over Fitted Model.

Train the model
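
A sketch of the training call; the number of epochs and the batch size here are assumptions:

```python
# Reserve 10% of the samples for validation via validation_split
history = model.fit(
    one_hot_names,
    normalized_values,
    epochs=40,
    batch_size=32,
    validation_split=0.1,
)
```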

Generate RGB colors

Let's define some functions to generate and display the predicted color.

For a color-name input, we need to transform it into the same one-hot representation. To achieve this, we tokenize the characters into integers with the same tokenizer we used to process the training data, pad the sequence to the max length of 25, then apply one-hot encoding to the integer sequence.

And for the output RGB values, we need to scale them back to the 0–255 range so we can display them correctly.

Predict color
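
A sketch of those helpers, reusing the tokenizer t, maxlen, and the trained model from above; the swatch-plotting detail is an assumption about how the colors are displayed:

```python
import matplotlib.pyplot as plt

def scale(n):
    # Map a predicted channel value from [0, 1] back to [0, 255]
    return int(n * 255)

def predict(name):
    name = name.lower()
    # Same preprocessing as the training data: tokenize, pad, one-hot encode
    tokenized = t.texts_to_sequences([name])
    padded = pad_sequences(tokenized, maxlen=maxlen)
    one_hot = to_categorical(padded, num_classes=28)
    pred = model.predict(one_hot)[0]
    r, g, b = scale(pred[0]), scale(pred[1]), scale(pred[2])
    print(name, "-> R,G,B:", r, g, b)
    # Display a small swatch of the predicted color
    plt.imshow([[(r / 255.0, g / 255.0, b / 255.0)]])
    plt.axis("off")
    plt.show()
```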

Let’s give the predict() function a try.

Generate new colors
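
For example (these particular names are just illustrations):

```python
predict("forest")
predict("ocean blue")
predict("keras red")
```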

Generated colors

"keras red" looks a bit darker than the red we're familiar with, but that's what the model proposed.

Conclusion and further reading

In this post, we talked about how to build a Keras model that can take any color name and come up with an RGB color value. More specifically, we looked at how to apply one-hot encoding to a character-level language model and how to build a network that combines feed-forward and recurrent (LSTM) layers.

Here's a diagram summarizing what we built in this post, starting from the bottom and showing every step of the data flow.

Build summary

If you’re new to deep learning or the Keras library, there are some great resources that are easy and fun to read or experiment with.

TensorFlow playground: an interactive visualization of neural networks that runs in your browser.

Coursera deep learning course: learn the foundations of deep learning and pick up lots of practical advice.

Keras get started guide: the official guide for the user-friendly, modular Python deep learning library.

Also, check out the source code for this post in my GitHub repo.