Let’s Make One!

For those of you who would prefer to jump straight to the code: here it is.

The rest of this example has the following steps:

Setup

First, make sure you have signed up for Skafos and created a login. On the dashboard, create a new standard project. If you have just created a new account, you will be presented with a screen like the one below. Select the “standard project workflow” near the top:

Figure 2. Create a new standard project on Skafos to get started.

Enter a name and description for the project and then navigate to the project’s JupyterLab instance (JLab). This is basically a cloud-based IDE running on Skafos, provisioned just for you:

To get the code that goes with this example, fork this GitHub repo to your own account, and then clone it into your JLab instance using the provided terminal window:

$ git clone https://github.com/<your-account-name>/WordLanguageModel.git

Once you have the code in your JLab, enter the directory and install the python dependencies in the terminal window:

$ cd WordLanguageModel
$ pip install -r requirements.txt

Now open up the “word_language_model.ipynb” notebook and import the Python libraries you just installed:

Figure 3. Import python dependencies.
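The screenshot's exact import list comes from the repo, but for a word-level language model in Keras you'd typically see something along these lines (illustrative, not verbatim from the notebook):

```python
# Typical imports for a notebook like this (illustrative; the exact list
# comes from the repo's requirements.txt and the notebook itself)
import json
import string

import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dense
from tensorflow.keras.utils import to_categorical
```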

Gather Training Data

The training data we need to build a language model is just any well-formed text (sentences that are, for the most part, grammatically correct). The more data you can get your hands on, and the more you like the tone and diction of the text, the better the model will meet your standards.

For this example, and for the sake of speed, we will use an existing dataset of Yelp business reviews. This is the same dataset used to train and deploy the Text Classifier model with Skafos. I chose this dataset for the language modeling example because it is familiar, easily accessible, and quite frankly… just funny! Don’t feel like you need to stay with that data beyond this example. In fact, I encourage you to go find your own text data to build a language model on.

Figure 4. Load text training data.
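If you want to follow along without the full Yelp download, any list of review strings will do. Here is a toy stand-in (these sample reviews are made up for illustration, not taken from the dataset):

```python
# A few sample reviews standing in for the full Yelp dataset (illustrative)
reviews = [
    "The tacos here are unbelievable. Five stars!",
    "Service was slow, but the coffee made up for it.",
    "Would not come back. The soup was cold and bland.",
]

# Concatenate everything into one long string of training text
text = " ".join(reviews)
print(len(text.split()), "words of training text")
```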

Pre-process Text

Now define a couple of helper functions to process the text data we just downloaded:

Figure 5. Helper functions to process the text.

Figure 6. Parse and clean the text.
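The repo's helper functions are shown in the screenshots above; a minimal sketch of the kind of cleaning involved (lowercasing, stripping punctuation, tokenizing) might look like this (the function name is mine, not the repo's):

```python
import string

def clean_text(raw):
    """Lowercase, strip punctuation, and split into word tokens."""
    raw = raw.lower()
    # Drop punctuation characters
    raw = raw.translate(str.maketrans("", "", string.punctuation))
    # Split on whitespace to tokenize
    return raw.split()

tokens = clean_text("The tacos here are UNBELIEVABLE... five stars!")
print(tokens)  # ['the', 'tacos', 'here', 'are', 'unbelievable', 'five', 'stars']
```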

Now that we have a long list of our cleaned tokens (words), we will organize them into sequences of at most 11 words. Why 11? Well, I want to use an “input length” of 10 for the model, and I then want the model to predict the likely next word: 10 + 1 = 11. I also made sure to handle cases where a sequence splits a sentence into pieces:

Figure 7. Split text into sequences.
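A sketch of the windowing logic, omitting the sentence-boundary handling mentioned above (`make_sequences` is an illustrative name, not the repo's):

```python
def make_sequences(tokens, length=11):
    """Slide a window over the token list, producing overlapping sequences.
    Each sequence holds 10 input words plus 1 target word (10 + 1 = 11)."""
    return [tokens[i - length:i] for i in range(length, len(tokens) + 1)]

tokens = [f"w{i}" for i in range(15)]   # 15 dummy tokens
sequences = make_sequences(tokens)
print(len(sequences))   # 5 sequences
print(sequences[0])     # ['w0', 'w1', ..., 'w10']
```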

Lastly, we need to convert the processed text, now organized into sequences, to numeric form. We do this by mapping each word to a unique integer index. Fortunately, Keras has a nice tool that makes this easy (the Tokenizer class).

Figure 8. Convert sequences of text to sequences of integers.
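Conceptually, the Tokenizer builds a word-to-integer dictionary and then replaces each word with its index. A minimal pure-Python sketch of that idea (Keras additionally orders indices by word frequency, which this toy version skips):

```python
sequences = [
    ["the", "soup", "was", "cold"],
    ["the", "coffee", "was", "hot"],
]

# Assign indices starting at 1 (Keras reserves 0 for padding)
word_index = {}
for seq in sequences:
    for word in seq:
        if word not in word_index:
            word_index[word] = len(word_index) + 1

# Replace each word with its integer index
encoded = [[word_index[w] for w in seq] for seq in sequences]
print(word_index)  # {'the': 1, 'soup': 2, 'was': 3, 'cold': 4, 'coffee': 5, 'hot': 6}
print(encoded)     # [[1, 2, 3, 4], [1, 5, 3, 6]]
```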

One last thing! Before building the model, we need to split the data into “X” and “y”. What is this? In order for an ML model to learn patterns, we explicitly define some input data (X) that maps to an output (y). In this case, our initial input of 10 words is our X, and the last word of the sequence is our y. We use array slicing to achieve this:

Figure 9. Split into X and y for model training.
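With NumPy, the slicing is one line each. Here is the idea on dummy data (in the real notebook, y is additionally one-hot encoded before training):

```python
import numpy as np

# Each row is an 11-word sequence encoded as integers (dummy data here)
sequences = np.arange(33).reshape(3, 11)

X = sequences[:, :-1]   # first 10 words of each sequence -> model input
y = sequences[:, -1]    # last word of each sequence -> prediction target

print(X.shape, y.shape)  # (3, 10) (3,)
```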

Train the Model

First things first, we declare a base neural network class and build the architecture layer by layer. The first layer in the network is an embedding layer that takes a sequence of 10 integers (each mapping to a corresponding word) and extracts a 32-dimensional vector for each to use as input to the LSTM layer (the recurrent layer of the network). Finally, two Dense layers output the probability of each word in the vocabulary given the inputs. I included links to the Keras documentation where you can learn more about each specific layer.

Figure 10. Construct and train a RNN model.
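A sketch of this architecture in Keras, using illustrative values for the vocabulary size (the real value comes from your tokenizer, and layer widths may differ from the repo's notebook):

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dense

vocab_size = 5000   # illustrative; the real value comes from your tokenizer
seq_length = 10     # 10 input words, predicting the 11th

model = Sequential([
    # Map each of the 10 integer inputs to a 32-dimensional vector
    Embedding(vocab_size, 32),
    # Recurrent layer that reads the sequence of embedding vectors
    LSTM(100),
    # Two Dense layers producing a probability for every word in the vocabulary
    Dense(100, activation="relu"),
    Dense(vocab_size, activation="softmax"),
])
model.build(input_shape=(None, seq_length))
model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"])
print(model.output_shape)  # (None, 5000)
```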

With any neural network, picking the right hyperparameters is key (and also more art than science, if you ask me). Most of those choices depend on your training data, time frame, and required performance threshold.

The most common parameters to tweak include:

Number of training epochs

Number of samples included in a batch (a bunch of small batches make up an epoch)

The loss function, optimizer, and evaluation metric

Number of hidden layers (LSTM, Dense, etc… use more or fewer)

Number of units within each layer
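As a concrete sketch of where those knobs live, here is a minimal compile-and-fit call on tiny random stand-in data (all sizes and values here are illustrative, not the repo's):

```python
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dense

vocab_size, seq_length = 50, 10  # tiny illustrative values

model = Sequential([Embedding(vocab_size, 16), LSTM(32), Dense(vocab_size, activation="softmax")])
model.build(input_shape=(None, seq_length))
# Knobs: loss function, optimizer, and evaluation metric
model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"])

# Random stand-in data so the call runs quickly (real training uses X and y from above)
X = np.random.randint(1, vocab_size, size=(64, seq_length))
y = np.eye(vocab_size)[np.random.randint(0, vocab_size, size=64)]

# Knobs: number of epochs and batch size
history = model.fit(X, y, epochs=2, batch_size=16, verbose=0)
print(len(history.history["loss"]))  # one loss value per epoch
```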

But honestly, don’t even worry about that part until you get something working end-to-end. So many data scientists and machine learning engineers get lost in this step: we iterate for weeks on end, following a rigorous scientific method, only to run into integration challenges after burning precious weeks building the perfect model. Don’t do that.

Get your model hooked up to your app first, and then iterate on it. Because Skafos can “push” model updates to your app, you can iterate to your heart’s desire.

Test the Language Model

Now that the model has been trained, here is a function you can use (in Python) to generate new text based on some input. This is a good thing to do in the JLab before deploying to your app, to make sure it’s doing what you’d expect.

Figure 11. Code to test the language model by generating new sequences of text given some seed text.
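The generation loop itself is simple: repeatedly feed the last 10 words to the model, take the most likely next word, and append it. Here is a sketch with a pluggable `predict_next` stand-in (the real version encodes the seed text with the tokenizer, pads it, and takes the argmax of `model.predict`):

```python
def generate_text(seed_words, predict_next, n_words=5):
    """Repeatedly ask the model for the likely next word and append it.
    `predict_next` stands in for the real model: it takes the last 10 words
    and returns the predicted next word (already mapped back from its index)."""
    words = list(seed_words)
    for _ in range(n_words):
        words.append(predict_next(words[-10:]))
    return " ".join(words)

# Dummy predictor that always says "tacos" -- the real one wraps model.predict
# plus the tokenizer's index -> word mapping
result = generate_text(["the", "best"], lambda context: "tacos", n_words=3)
print(result)  # the best tacos tacos tacos
```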

Give it a whirl! Try out other seed text to see where the model takes it. The longer you train, the more likely the output is to make sense. However, the longer you train, the more likely the model is to overfit to the particular linguistic style of the text you trained on. The results can be quite humorous.

Figure 12. Try out the language model!

Save the Model

To push this model to your iOS app, you need to do the following:

Convert the model to Core ML format: This makes the model class accessible in Xcode.

Figure 13. Export model to Core ML format.

If you haven’t already done so, configure your iOS app settings with Skafos on the project page of the dashboard. Enter the required IDs and keys, then head over to the integration guide.

Figure 14. Project page — model delivery settings for integrating with iOS app.

During step 1 of the integration guide, instead of downloading an initial model from the drop-down list, click here to download a pre-trained model along with a word-index dictionary JSON file (zipped). Add those to your Xcode project resources.

From here on out, the Skafos framework will handle automated model updates to your devices. You can trigger model updates over-the-air using the Skafos SDK save and deliver methods.

This way, you can retrain or update your model in the JLab environment on Skafos, and have your changes propagate to your applications.

Warning

This model is a bit more advanced, so it is a bit trickier to deploy and use on devices (I trust you can still do it)! Remember the pre-processing work we did, converting text to integers and whatnot? Well, in order to use a language model like this in your app, you also need to include those mappings so that the results of the model can be translated back into human-readable text. Each time you save and deliver a model, include the word-index dictionary .json file as well, so your app can decode predictions. Skafos can deliver these as a zipped bundle!
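Exporting that mapping is a one-liner using the tokenizer's `word_index` attribute. For illustration, with a dummy dictionary standing in for the real one:

```python
import json

# word_index as produced by the Keras Tokenizer (dummy values here)
word_index = {"the": 1, "soup": 2, "was": 3, "cold": 4}

# Ship this file alongside the Core ML model so the app can decode predictions
with open("word_index.json", "w") as f:
    json.dump(word_index, f)

# The app inverts the mapping to turn predicted indices back into words
index_word = {i: w for w, i in word_index.items()}
print(index_word[2])  # soup
```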

Coming Soon

Stay tuned for an upcoming blog post that will show you how to build a sample iOS application using the model we just saved.