In this tutorial, we will make a pre-trained deep learning model named Word2Vec available to other services by building a REST API from the ground up.

By Arun Kirubarajan, Alibaba Cloud Tech Share Author. Tech Share is Alibaba Cloud's incentive program to encourage the sharing of technical knowledge and best practices within the cloud community.

In production, computationally expensive programs are often too slow to develop and run on a local machine. Training a machine learning model on a laptop can take hours or even days for a single run. It is therefore common practice to use cloud resources with more compute hardware to both train and run machine learning models. This approach hides the complex computation behind a service that clients call over HTTP only when they need a result. In this tutorial, we will make a pre-trained deep learning model named Word2Vec available to other services by building a REST API from the ground up with Alibaba Cloud Elastic Compute Service (ECS).

Prerequisite Knowledge

A Unix-based machine such as an Alibaba Cloud Elastic Compute Service (ECS) instance, preferably with more compute power.

An understanding of Python and pip commands.

Knowledge of how to use the Linux operating system to create, navigate, and edit folders and files.

An Introduction to Word Vectors

Word vectors have recently been shaking up the deep learning world thanks to their flexibility and ease of training, and word embeddings have revolutionized the field of natural language processing (NLP).

At their core, word embeddings are vectors that each correspond to a single word, positioned so that the geometry of the vectors captures the meaning of the words. A well-known demonstration is that vector offsets encode relationships: the difference king - queen is approximately equal to boy - girl. Word vectors are used to build everything from recommendation engines to chatbots that work with natural English text.
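The offset relationship can be illustrated with a minimal sketch. The three-dimensional vectors below are invented for illustration and are an assumption, not real Word2Vec output; real embeddings are learned, and the relationship holds only approximately.

```python
# Toy 3-dimensional "embeddings" invented for illustration only.
# Real Word2Vec vectors are learned and their dimensions carry no labels.
# Hypothetical axes: (royalty, masculinity, youth)
toy_vectors = {
    "king":  (1.0,  1.0, 0.0),
    "queen": (1.0, -1.0, 0.0),
    "boy":   (0.0,  1.0, 1.0),
    "girl":  (0.0, -1.0, 1.0),
}


def subtract(a, b):
    """Return the element-wise difference a - b of two equal-length vectors."""
    return tuple(x - y for x, y in zip(a, b))


royal_offset = subtract(toy_vectors["king"], toy_vectors["queen"])
child_offset = subtract(toy_vectors["boy"], toy_vectors["girl"])

print(royal_offset)  # (0.0, 2.0, 0.0)
print(child_offset)  # (0.0, 2.0, 0.0)
```

In this toy setup both differences point along the same axis, which is the sense in which the two word pairs share a relationship.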

Another point worth considering is how we obtain word embeddings, since no two sets of word embeddings are the same. Word embeddings aren't random; they're generated by training a neural network. A powerful and widely used implementation is Google's Word2Vec, which is trained by predicting the words that appear near other words in a text corpus. For example, given the word "cat", the network learns to predict the words that tend to appear around it, and words used in similar contexts, such as "kitten" and "feline", end up with similar vectors. This intuition of words appearing "near" each other allows us to place them in vector space.
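To make the idea of "nearby" words concrete, here is a rough sketch of how (center word, context word) training pairs could be extracted from a sentence. The function name context_pairs and the window parameter are hypothetical, and this is only an illustration of the general idea, not Google's actual training code.

```python
def context_pairs(tokens, window=2):
    """Return (center, context) pairs for every word within `window` positions."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # a word is not its own context
                pairs.append((center, tokens[j]))
    return pairs


sentence = "the cat sat on the mat".split()
for center, context in context_pairs(sentence, window=1):
    print(center, "->", context)
```

A network trained to predict the context word from the center word ends up assigning similar vectors to words that share contexts.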

However, it is common practice to use models pre-trained by large organizations such as Google in order to prototype quickly and simplify deployment. In this tutorial, we will download and use Google's pre-trained Word2Vec word embeddings, packaged in the Magnitude format. We can do this by running the following command in our working directory:

wget http://magnitude.plasticity.ai/word2vec/GoogleNews-vectors-negative300.magnitude

Setting Up Python Environment

Setting up a Python environment is a crucial step in developing a machine learning application, yet it is often overlooked. Best practice for managing Python dependencies is to use a virtual environment together with an explicit requirements.txt file. This makes managing libraries easier for both deployment and development across multiple machines and environments.

First, we install virtualenv, a Python module that isolates project environments so that libraries from different projects don't interfere with one another.

pip3 install virtualenv

Next, we create a virtual environment named venv. It is important to specify a Python version and use it consistently; Python 3 is recommended for best support. The venv folder will hold the Python modules we install, which we later record in requirements.txt.

virtualenv -p python3 venv

Although we've created a virtual environment, we haven't activated it yet. Whenever we want to use the project and its dependencies, we must activate the environment with the source command. The file we run source on is named activate and is located in a folder named bin.

source venv/bin/activate
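One way to confirm that activation worked is to ask the interpreter itself. The sketch below uses only the standard library; the helper name in_virtualenv is hypothetical.

```python
import sys


def in_virtualenv():
    """Return True when the interpreter is running inside a virtual environment."""
    # Inside an environment, sys.prefix points at the environment while
    # sys.base_prefix keeps pointing at the system installation.
    # Older virtualenv releases set sys.real_prefix instead.
    base_prefix = getattr(sys, "base_prefix", sys.prefix)
    return hasattr(sys, "real_prefix") or sys.prefix != base_prefix


print("virtual environment active:", in_virtualenv())
print("interpreter prefix:", sys.prefix)
```

Run with the environment activated, the reported prefix should be the venv folder.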

Once we are finished with our project, or we want to switch virtual environments, we can use the deactivate command to exit the virtual environment.

deactivate

Installing the Magnitude Package

The word embedding model we downloaded is in the .magnitude format. A .magnitude file is backed by SQLite, so vectors can be looked up efficiently from disk without loading the whole model into memory, which makes the format well suited to production servers. Since we need to be able to read the .magnitude format, we'll install the pymagnitude package. We'll also install flask to later serve the deep learning predictions made by the model.

pip3 install pymagnitude flask

We'll also record both packages in our dependency file with the following command. This creates a file named requirements.txt listing the installed Python libraries and their versions, so we can reinstall them at a later time.

pip3 freeze > requirements.txt

Making Model Predictions

To begin, we'll create a file to handle opening and querying the word embeddings.

touch model.py

Next, we'll add the following lines to model.py to import Magnitude and load the downloaded vectors.

from pymagnitude import Magnitude
vectors = Magnitude('GoogleNews-vectors-negative300.magnitude')

We can play around with the pymagnitude package and the deep learning model by using the query method, which takes a word as its argument and returns that word's vector.

cat_vector = vectors.query('cat')
print(cat_vector)

However, for the core of our API, we will define a function that returns the similarity in meaning between two words. This is the backbone of many deep learning solutions such as recommendation engines (e.g. showing content with similar words).

Before writing it, we can experiment with Magnitude's built-in similarity and most_similar methods.

print(vectors.similarity("cat", "dog"))
print(vectors.most_similar("cat", topn=100))

We implement the similarity calculator as follows. This function will be called by the Flask API in the next section. Note that it returns a cosine similarity, a real value between -1 and 1; for related words the value is typically positive, and values closer to 1 indicate closer meanings.

def similarity(word1, word2):
    return vectors.similarity(word1, word2)
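For intuition about what such a score measures, here is a minimal sketch that assumes, as is conventional for word vectors, that similarity means cosine similarity. The two-dimensional vectors are made up for illustration, and the toy most_similar function only mimics the idea of ranking by score; it is not the library's implementation.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between vectors a and b, in the range [-1, 1]."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical 2-D vectors, made up for illustration.
toy_vectors = {
    "cat": (1.0, 0.2),
    "dog": (0.9, 0.4),
    "car": (-0.3, 1.0),
}


def most_similar(word, topn=2):
    """Rank every other toy word by cosine similarity to `word`."""
    scores = [
        (other, cosine_similarity(toy_vectors[word], vec))
        for other, vec in toy_vectors.items()
        if other != word
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:topn]


print(most_similar("cat"))
```

Vectors pointing in the same direction score 1, opposite directions score -1, and unrelated directions score close to 0.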

Wrapping The Model in a REST API

We will use a server framework named Flask to serve our model. Although other web server frameworks such as Django exist, we will use Flask for its minimal overhead, easy integration, and support within the deep learning community.

We will create our server in a file named service.py with the following contents. We import Flask and request to handle our server capabilities, and we import the similarity function from the module we wrote earlier.

from flask import Flask, request
from model import similarity

app = Flask(__name__)

@app.route("/", methods=['GET'])
def welcome():
    return "Welcome to our Machine Learning REST API!"

@app.route("/similarity", methods=['GET'])
def similarity_route():
    # Read both words from the query string, e.g. /similarity?word1=cat&word2=dog
    word1 = request.args.get("word1")
    word2 = request.args.get("word2")
    return str(similarity(word1, word2))

if __name__ == "__main__":
    # Listen on all interfaces on port 8000, the port published by docker run later on
    app.run(host='0.0.0.0', port=8000, debug=True)

Our server is rather bare bones, but it can easily be extended by creating more routes using the @app.route decorator.
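One natural extension is to check the query parameters before they reach the model. The following is a sketch under stated assumptions: the helper name validate_similarity_args and the error message format are hypothetical, and the helper is written as a plain function so it can be tried without a running server.

```python
def validate_similarity_args(args):
    """Check query parameters before they reach the model.

    `args` is any mapping of parameter names to values, such as Flask's request.args.
    Returns (words, None) on success or (None, error_message) on failure.
    """
    missing = [name for name in ("word1", "word2") if not args.get(name)]
    if missing:
        return None, "missing query parameter(s): " + ", ".join(missing)
    return (args["word1"], args["word2"]), None


# Inside a route, the helper might be used like this (not executed here):
#     words, error = validate_similarity_args(request.args)
#     if error:
#         return error, 400
#     return str(similarity(*words))

print(validate_similarity_args({"word1": "cat"}))
print(validate_similarity_args({"word1": "cat", "word2": "dog"}))
```

Returning a 400 status for bad input lets clients tell their own mistakes apart from server failures.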

Dockerizing the Application

Docker is a useful tool for containerizing applications. A container packages an application together with all the dependencies it needs to operate. In addition to making development and testing easier, this is especially convenient for deployment, where we often use multiple machines. Docker containers are lightweight because they virtualize at the operating system layer, rather than at the hardware layer as heavier approaches such as virtual machines do.

To begin the containerization process, we will create a Dockerfile. A Dockerfile is the entry point for the entire Docker process. We use this file to define dependencies, add files, set environment variables, and run our application.

touch Dockerfile

Next, we declare a base image and a working directory inside the container. Any Debian-based official Python image should work; python:3 is assumed here. Then, we copy in requirements.txt and install our Python dependencies for the server.

FROM python:3
WORKDIR /
ADD requirements.txt /
RUN pip install -r requirements.txt

Next, we will install wget so that we can download the word embeddings into the image. The file keeps its original name, which matches the path that model.py opens, so no renaming is needed.

RUN apt-get update && apt-get install -y wget
RUN wget http://magnitude.plasticity.ai/word2vec/GoogleNews-vectors-negative300.magnitude

Finally, we copy our application code into the image and start our server by adding the final lines to our Dockerfile. The CMD instruction runs our Flask server.

ADD model.py service.py /
CMD [ "python", "./service.py" ]

Running the Dockerized Application

Now, we can package our model into its own standardized container by using Docker to create an image: a standardized template from which we can instantly create an instance of our model.

We will first run the docker build command, specifying a -t flag to name our image and a . to tell Docker to use our current directory, which contains the Dockerfile, as the build context.

docker build -t model .

Finally, we'll run our image using the docker run command, specifying a -p flag to publish port 8000 of the container (the port our Flask server is running on) on port 8000 of the host (the port we want to use on localhost).

docker run -p 8000:8000 model

Making API Calls

Our server will now be available at localhost:8000. We can query our API at localhost:8000/similarity?word1=cat&word2=dog and view the response either in our browser or through another HTTP client.

Another option for testing our API is the command line. While we can use a browser (e.g. Chrome or Safari) to test GET routes, a browser cannot easily send POST requests. An alternative is the curl tool, which comes bundled with most Unix-like operating systems.

We use curl to specify both the word1 and word2 arguments and to view the response in the command line.

curl -X GET 'http://localhost:8000/similarity?word1=dog&word2=cat'

In our terminal, we should see the response: the similarity score for the two words, printed as a number.
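Other services will usually call the API from code rather than from curl. Below is a minimal client sketch using only Python's standard library; the function names are hypothetical, and it assumes the server is reachable at http://localhost:8000 and returns the score as plain text.

```python
import urllib.parse
import urllib.request


def build_similarity_url(word1, word2, base_url="http://localhost:8000"):
    """Build the /similarity URL with both words safely percent-encoded."""
    query = urllib.parse.urlencode({"word1": word1, "word2": word2})
    return f"{base_url}/similarity?{query}"


def fetch_similarity(word1, word2, base_url="http://localhost:8000"):
    """Send the GET request and parse the plain-text score as a float."""
    url = build_similarity_url(word1, word2, base_url)
    with urllib.request.urlopen(url) as response:
        return float(response.read().decode("utf-8"))


print(build_similarity_url("dog", "cat"))
# With the container running: print(fetch_similarity("dog", "cat"))
```

Encoding the parameters with urlencode keeps the request valid even when a word contains spaces or other special characters.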

Deployment