This article is an extension of my previous article, “Using Python Trained Machine Learning Models from Phoenix (Elixir) Applications”. There I showed how to use Python-trained machine learning models from Elixir applications using ErlPort. However, the model parameters passed from the Phoenix (Elixir) app to the Python prediction code formed a simple list, a data type covered by ErlPort's built-in type mappings.

However, for complex data we need some mechanism to serialize/de-serialize the data. In this article, I will extend my previous project to do this serialization/de-serialization using Protocol Buffers.

What Are Protocol Buffers?

Invented by Google — “Protocol buffers are Google’s language-neutral, platform-neutral, extensible mechanism for serializing structured data — think XML, but smaller, faster, and simpler. You define how you want your data to be structured once, then you can use special generated source code to easily write and read your structured data to and from a variety of data streams and using a variety of languages.” (src: https://developers.google.com/protocol-buffers/)

Google provides extensive documentation for protocol buffers on their website; in particular, the developer's guide and code examples with tutorials in the supported languages are available here: https://developers.google.com/protocol-buffers/docs/overview

How Does it Work?

The way protocol buffers work is quite simple. We write a .proto file that defines the message format. This .proto file is compiled by the protocol buffer compiler (protoc), which generates code in the desired languages.

.proto → protoc compiler → generated language code

In this article, we will pass data from Elixir code to Python, so we need to generate code for both Elixir and Python using the protoc compiler. Python support is already built into the compiler. For the Elixir part, we will use protobuf-elixir (https://github.com/tony612/protobuf-elixir), a pure Elixir implementation of Protocol Buffers.

Install Protobuf

First we need to install the protocol buffer compiler. On a Mac, we can do this using Homebrew (https://formulae.brew.sh/formula/protobuf) with the command below:

brew install protobuf

Phoenix (Elixir) Part

Modify mix.exs and add dependencies:

defp deps do
  [
    {:phoenix, "~> 1.4.0"},
    {:phoenix_pubsub, "~> 1.1"},
    {:phoenix_html, "~> 2.11"},
    {:phoenix_live_reload, "~> 1.2", only: :dev},
    {:gettext, "~> 0.11"},
    {:jason, "~> 1.0"},
    {:plug_cowboy, "~> 2.0"},
    {:erlport, "~> 0.10.0"},
    {:protobuf, "~> 0.5.3"},
    # Only for files generated from Google's protos.
    # Can be ignored if you don't use Google's protos.
    {:google_protos, "~> 0.1"}
  ]
end

We need to add the new entries ({:protobuf, "~> 0.5.3"} and {:google_protos, "~> 0.1"}) to the mix.exs file and update the dependencies using the command below:

mix deps.get

Install protoc plugin:

We need to install the protoc plugin for Elixir, protoc-gen-elixir, using the command below:

mix escript.install hex protobuf

NOTE: protoc-gen-elixir needs to be in PATH.
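By default, mix escript.install places escripts under ~/.mix/escripts, so (assuming that default location) you can make protoc-gen-elixir visible to protoc like this:

```shell
# Add the default Mix escripts directory to PATH
# (assumes the default install location used by `mix escript.install`)
export PATH="$HOME/.mix/escripts:$PATH"
```

Add the line to your shell profile to make it permanent.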

Define message (.proto file):

We placed our .proto files and the generated code under the lib/phoenix_ml/protobuf folder. Our message definition file, iris.proto, looks like this:

syntax = "proto3";

package PhoenixMl;

message IrisParams {
  float sepal_length = 1;
  float sepal_width = 2;
  float petal_length = 3;
  float petal_width = 4;
}

We will use the proto3 syntax, which is detailed here: https://developers.google.com/protocol-buffers/docs/proto3

Our message format is simple. It contains four float fields for our model parameters: sepal_length, sepal_width, petal_length, and petal_width.
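To build some intuition for what the serialized bytes look like, here is a stdlib-only sketch (not the generated code or the official library) of how proto3 encodes one of these float fields on the wire: a one-byte key, (field_number << 3) | wire_type, where wire type 5 means a fixed 32-bit value, followed by the value as a 4-byte little-endian IEEE-754 float. (Note that real proto3 serializers skip fields that hold their default value of 0.0.)

```python
import struct

def encode_float_field(field_number: int, value: float) -> bytes:
    """Encode one proto3 float field: key byte, then 4-byte little-endian float."""
    key = (field_number << 3) | 5  # wire type 5 = 32-bit fixed
    return bytes([key]) + struct.pack("<f", value)

# sepal_length is field 1, so its key byte is (1 << 3) | 5 = 0x0D
print(encode_float_field(1, 5.1).hex())  # 0d3333a340
```

The generated Elixir and Python code produce exactly this kind of byte stream, which is why both sides can read each other's output.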

Compile .proto file and generate Elixir code:

We will use the command below to generate Elixir code from the above .proto file:

protoc --elixir_out=. iris.proto (run from inside the lib/phoenix_ml/protobuf folder)

This will generate iris.pb.ex file.

Update source code:

Next, we will update the page_controller.ex file with the code below (edited for brevity):

alias PhoenixMl.IrisParams, as: Iris

...

with {sepal_length, _} <- Float.parse(sepal_length),
     {sepal_width, _} <- Float.parse(sepal_width),
     {petal_length, _} <- Float.parse(petal_length),
     {petal_width, _} <- Float.parse(petal_width) do
  iris_params = %Iris{
    sepal_length: sepal_length,
    sepal_width: sepal_width,
    petal_length: petal_length,
    petal_width: petal_width
  }

  class = ML.predict([Iris.encode(iris_params)])

The necessary code changes are as follows. Here we:

- aliased the PhoenixMl.IrisParams module as Iris
- defined an %Iris{} struct with the parameters
- called Iris.encode to generate the serialized data

Python Part

Compile .proto file and generate Python code:

We will use the command below to generate Python code from the above .proto file:

protoc --python_out=. iris.proto (run from inside the lib/phoenix_ml/protobuf folder)

This will generate iris_pb2.py file.

Update source code:

Next, we will update the classifier.py file with the code below (edited for brevity):

import os
import sys

from sklearn.externals import joblib

sys.path.insert(0, 'lib/phoenix_ml/protobuf')

import iris_pb2


def load_model():
    path = os.path.abspath('lib/phoenix_ml/model/classifier.pkl')
    return joblib.load(path)


def predict_model(args):
    iris_params = iris_pb2.IrisParams()
    iris_params.ParseFromString(args)
    model_params = [[iris_params.sepal_length, iris_params.sepal_width,
                     iris_params.petal_length, iris_params.petal_width]]

    iris_classifier = load_model()
    return iris_classifier.predict(model_params)[0]

The necessary code changes are as follows. Here we:

- imported the generated iris_pb2 module
- de-serialized the args received from the Elixir app with ParseFromString(args) and built a 2-D array model_params
- made the prediction with model_params
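For intuition about what ParseFromString does for this particular message, here is a stdlib-only sketch of a decoder (this is an illustration, not the generated code: the real parser also handles arbitrary field order, varint-encoded keys, other wire types, and unknown fields). Each field in our message arrives as a key byte, (field_number << 3) | 5, followed by a 4-byte little-endian float:

```python
import struct

def parse_iris_params(data: bytes) -> dict:
    """Minimal decoder for our four-float IrisParams message."""
    names = {1: "sepal_length", 2: "sepal_width", 3: "petal_length", 4: "petal_width"}
    # proto3 omits fields at their default value, so start every field at 0.0
    result = {name: 0.0 for name in names.values()}
    i = 0
    while i < len(data):
        field_number = data[i] >> 3          # low 3 bits are the wire type (5)
        (value,) = struct.unpack_from("<f", data, i + 1)
        result[names[field_number]] = value
        i += 5                                # 1 key byte + 4 value bytes
    return result

# Bytes as Iris.encode would produce for sepal_length=5.1, sepal_width=3.5
payload = b"\x0d" + struct.pack("<f", 5.1) + b"\x15" + struct.pack("<f", 3.5)
print(parse_iris_params(payload))
```

Note that the decoded sepal_length comes back as the nearest 32-bit float to 5.1, and the two omitted fields stay at their proto3 default of 0.0.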

At this stage, our changes are done, and we should be able to make predictions as before, as shown in the screenshot below:

Source Code

Source code for the earlier project can be found here: https://github.com/imeraj/Phoenix_Playground/tree/master/1.4/phoenix_ml (master branch)

Updated source code using protocol buffers can be found here: https://github.com/imeraj/Phoenix_Playground/tree/protobuf_file/1.4/phoenix_ml (protobuf_file branch)

References

1. https://developers.google.com/protocol-buffers/
2. https://developers.google.com/protocol-buffers/docs/overview
3. https://github.com/tony612/protobuf-elixir

I hope this article helps readers understand how Protocol Buffers can be used to serialize and de-serialize data passed between Elixir and Python to build useful applications.

For more elaborate and in-depth technical posts in the future, please follow me here or on Twitter.