In this post I’ll cover data preparation, build a small graph, run it, and visualize it with TensorBoard.

Background

In my previous article, I showed how to prepare a project on macOS where you can use TensorFlow and compile in Xcode.

In this article we will start coding, create a small graph, run it and visualize it.

Data preparation is probably half the work in a Machine Learning project, so we will see how to code it using the TensorFlow C++ APIs.

When you search for C++ implementations of DNNs, you usually find ways to load a pre-trained model (created, trained and saved to disk in Python) into C++ code for runtime optimization.

As stated in my first article, my goal is to start from scratch, and do everything in C++.

So eventually my project will have, all in C++:

1. Data preparation

2. DNN model creation

3. Training

4. Validation

When I started learning about DNNs, I found this 4-article series: Applied Deep Learning on Medium by Arden Dertat. In my mind this is one of the best tutorials for Deep Learning.

What I will try to do, is a C++ version of the model developed in part 4 of this series.

It is a scaled-down version of AlexNet (not the first CNN, LeNet came earlier, but the one that popularized deep CNNs), yet powerful enough to distinguish between a dog and a cat.

The model in Arden’s article is developed in Python with Keras; I will try to do the same in C++.

But before we get into the model, let’s review a few basic things.

Google examples

I found a couple of C++ examples in the TensorFlow code that you can clone from GitHub:

1. Label Image example — which shows how to load an image, load a pre-trained graph and run the image through the graph for classification.

2. Example Trainer — a simple Graph that is trained in multi-thread and multi-step fashion for better utilization of compute power.

If you followed my instructions from the previous post, you should already have the code.

In any case, I will use the first example code as my base for this article.

Basic concepts

If you are already familiar with TensorFlow C++ API, you can move on to the next section.

Graph

A graph is made of Operations (the nodes) and Tensors (the edges). You build a graph by adding nodes; the edges are the inputs and outputs of the nodes. Usually you will also have an input to the graph at the beginning and an output from it at the end.

The important thing to understand is that building a graph is building a model. This means the operations will not run immediately; they compose the graph step by step, and only when you execute the run command does it actually run.

Think of it like coding in C++: you first write the code, then compile and run. When you run, you can change the input and get different results. In our case, though, the model itself can also change, as when you train and the weights are updated.

Sub-graphs are parts of the main graph which you can evaluate instead of the whole graph.

Running a graph simply means giving the engine the nodes you would like to evaluate (in most cases the last node you added). Since the graph defines the dependencies between the nodes, the engine walks the graph and evaluates the dependencies until it can evaluate the nodes you asked for, and returns them as a result. If it does not need to run the whole graph to produce the result, it runs only the relevant sub-graph.

The engine also allows for parallelism and distributed execution.

Scope

A scope is an object that provides context. It holds the graph(s) and also the physical resources like the CPU.

When using scopes, you start by creating the root scope and create child scopes from it.

Scopes have names that are constructed from the names you give them and the operations within each scope.

Because a scope also holds status, the names help you understand what’s going on in each step of constructing the graph.

Don’t confuse TensorFlow Scope with the C++ code scope, though the concepts are similar.

Session

A session is your connection to the TensorFlow engine. You create a client-session to run a graph and get the results.

As written above, when running you give the session the inputs to the graph, the nodes to evaluate, and optionally some options. You also give it a container for the results.

Operations and tensors

Operations are the nodes and tensors are the edges in the graph.

Nodes can also be operation-like objects (Constants and Variables), and edges can be Tensor-like objects (scalars, strings, etc.)

Let’s look at an example from the C++ guide (with some modifications):

Scope root = Scope::NewRootScope();
// 2x2 matrix
auto a = Const(root, { {1, 2}, {2, 4} });
// 2x2 matrix
auto b = Const(root, { {2, 2}, {1, 1} });
// a x b
auto m = MatMul(root, a, b);
ClientSession session(root);
std::vector<Tensor> outputs;
session.Run({m}, &outputs);
// outputs[0] == [ [4, 4], [8, 8] ]
cout << outputs[0].DebugString(4);

We create two 2x2 matrices and multiply them (matrix multiplication); the result is extracted from the output vector and sent to the output stream.

Let’s start some work!

The code for this article is provided in this repository

If you followed the instructions in the previous post, you should already have a project that can compile in XCode linked with the TensorFlow libraries.

Simply use the main.cpp file to create the example below.

Like I wrote, I want to base my code on the Label Image example mentioned above.

The main function I want to take is the one that reads an image, does some manipulation to it and returns a tensor you can feed to another graph.

This work is being done by a graph that will run in the TensorFlow engine.

I’m taking the function header as is from the example — the input parameters to the function are: file name, image size, normalization values and output reference. I added another bool flag for TensorBoard; we will see why later.

First, we create a new scope:

auto root = Scope::NewRootScope();

Now let’s create a variable that will be fed a value when we run the graph:

auto file_name_var = Placeholder(root.WithOpName("input"), DT_STRING);

Note a few things:

1. We use auto since TensorFlow types are very confusing and hard to guess (an Output can also be used as an Input…). You will get used to it.

2. We create a child scope from root by giving it the name “input”, which simply tags this node with that name.

3. The variable is not typed as a string; it is a Tensor with string elements (a string is considered a primitive type).

Next, we want to create an operation that reads an image file. The framework provides many operations you can use. The API documentation lists (most of) them. It is important to look for the specific arguments each operation accepts and their types.

auto file_reader = ReadFile(root.WithOpName("file_reader"), file_name_var);

Here note:

1. A new child scope is created with a new name.

2. We take the node we got from the previous operation and pass it into this operation. That is the (edge) tensor flowing between the two nodes. This also defines the dependency of the second node on the first.

3. The returned output is another node that we will use in the next operation.

Now let’s decode the file using another operation:

const int wanted_channels = 3;
auto image_reader = DecodeJpeg(root.WithOpName("jpeg_reader"), file_reader,
                               DecodeJpeg::Channels(wanted_channels));

1. There are a few decoders; the original example shows how to handle other image types.

2. If you want to see what an operation accepts as input, go to the documentation or look at the implementation file.

3. The operation also accepts attributes as a third input parameter (a struct). In this case we pass the number of channels (colors) in the image. Again, the documentation may help here.

The output of the previous operation is a Tensor of uint8. We want to cast the elements to float so we can do some math on them. This is typical of TensorFlow.

auto float_caster = Cast(root.WithOpName("float_caster"), image_reader, DT_FLOAT);

Next, we want to add a fourth dimension to the existing three (height, width, channel). The fourth one is for the batch. This is also typical of TensorFlow for image processing: 4-dimensional tensors of batch, height, width, channel (called NHWC).

auto dims_expander = ExpandDims(root.WithOpName("dim"), float_caster, 0);

Note that 0 here means the dim is inserted at the beginning.

Now let’s resize the image to the size specified in the function’s input. This is done because images fed to a deep network are all expected to be the same size.

auto resized = ResizeBilinear(root.WithOpName("size"), dims_expander,
                              Const(root, {input_height, input_width}));

ResizeBilinear accepts as its size a one-dimensional tensor with two elements: height and width. In our case we simply give it the output of a Const operation that creates that tensor for us.

Next is normalization of the float elements. Networks typically perform better when the element values are between 0 and 1.

auto d = Div(root.WithOpName("normalized"), Sub(root, resized, {input_mean}), {input_std});

In this line we again nest two operations in one: first the subtraction is done and then the division. Both operations are element-wise, and the second argument can be a scalar, hence the curly braces.

Ready to run?

The graph is ready. Now we want to run it.

For that we need:

1. A session object

2. Input values to feed to the variables

3. The nodes we want to evaluate

4. Prepare a container for the output

ClientSession session(root);
TF_CHECK_OK(session.Run({{file_name_var, file_name}}, {d}, out_tensors));

The Run method has many overloads; the basic one takes a vector of nodes to evaluate and a pointer to a vector of output tensors.

We use an overload with an additional parameter for the input values.

Let’s review all of the requirements listed above:

1. We create a new ClientSession object based on the scope. The scope implicitly gives the session object a reference to the graph it created. Note that there could be more than one graph, but in our case we use only the default one.

2. The input is provided as a map where the key is a node and the value is what to feed it. In our case we have only one input (the file name), so we provide one pair: the variable node returned from Placeholder and a string value. This is wrapped in another pair of curly braces because it is a list (a map).

3. The second parameter to the run function is a vector of nodes we want to evaluate. In our case there is only one: the last node we added to the graph.

4. out_tensors is a pointer to a vector<Tensor> provided to the function from main. Each element in the returned vector corresponds to a node in the evaluation list. We expect to get only one element.

We use the TF_CHECK_OK macro to check the result of the call to Run and log errors in case it fails.

As we expected, the result in out_tensors[0] is a 4-dimensional tensor with shape [1,299,299,3] (the size passed to the function was 299 by 299).

That’s a lot of work only for loading one image! And the graph we created is not trivial (for beginners). I heard there’s a way to visualize graphs.

TensorBoard

- So, you want to use TensorBoard?
- Yes.
- No problem, simply run your model once and then call tf.summary.FileWriter…
- That’s Python, right?
- Yes…
- I want to use C++.
- Hmm, in that case you have to do some work: you need to create an Event object and then serialize the graph…
- No.

Fortunately, there is a hidden(?) file in the TensorFlow clone that does the same thing the Python method does. You will need to add the summary_file_writer.cc file to the project (from /tensorflow/core/summary/) and include summary_file_writer.h in your main.cpp file.

TensorBoard is a visualization utility for many things, among them the graph structure.

How TensorBoard works

First you need to create a folder. I created a folder “graphs” under my project folder. This is where you dump the graph summary data that TensorBoard knows how to read.

TensorBoard is a background service that you start in a terminal; you then open a browser at the port specified in the terminal output.

Let’s start with the code part:

GraphDef graph;
TF_RETURN_IF_ERROR(root.ToGraphDef(&graph));
SummaryWriterInterface* w;
TF_CHECK_OK(CreateSummaryFileWriter(1, 0,
                                    "/Users/bennyfriedman/Code/TF2example/TF2example/graphs",
                                    ".img-graph", Env::Default(), &w));
TF_CHECK_OK(w->WriteGraph(0, make_unique<GraphDef>(graph)));

If you remember, we said that the scope object holds the graph. We extract the graph object from the scope using this utility method.

Next we create a SummaryFileWriter, passing it the folder to write the files to. The first two parameters control how frequently it writes; in our case we want it to write immediately, so the max queue size is 1 and the flush interval is 0 milliseconds. The fourth parameter is a filename suffix, in case you want to write more than one graph, so you know which is which.

Next, use the writer to write the file, giving it the graph object.

After running this once, you should get a file in the folder with a name like events.out.tfevents.1556982975.mb-friedman.local.img-graph.

You don’t want to do this every time you run your graph, which is why I passed a flag into the function to control it.

Now let’s fire up TensorBoard:

If you have never installed it, open a terminal first and run:

pip install tensorboard

Then run this:

tensorboard --logdir ~/Code/TF2example/TF2example/graphs/

where the folder provided after the --logdir flag is the one you created and used in the code above.

TensorBoard will respond with something like:

TensorBoard 1.14.0a20190301 at http://mb-friedman.local:6006 (Press CTRL+C to quit)

Now you can open a browser and copy the URL to the address bar.

You will see something like this:

TensorBoard Graph visualization

Nice! This helps us debug more complex graphs and see what will actually run, who is feeding whom, and which constants are fed where.

Back to the code: What do we do with the outputs?

Eventually we will extract many images and create a DNN to train on them, but for now let’s make sure we did not damage the image.

To check that, let’s create another graph that takes the tensor and turns it back into an image.

auto root = Scope::NewRootScope();
// Invert the normalization: multiply by std first, then add the mean back.
auto un_normalized = Add(root.WithOpName("un_normalized"),
                         Multiply(root, in_tensors[0], {input_std}), {input_mean});
auto shaped = Reshape(root.WithOpName("reshape"), un_normalized,
                      Const(root, {input_height, input_width, 3}));
auto casted = Cast(root.WithOpName("cast"), shaped, DT_UINT8);
auto image = EncodeJpeg(root.WithOpName("EncodeJpeg"), casted);
vector<Tensor> out_tensors;
ClientSession session(root);
TF_CHECK_OK(session.Run({image}, &out_tensors));
ofstream fs(file_name, ios::binary);
fs << out_tensors[0].scalar<string>()();

Check the image saved to disk. It should be a resized version of the original image.

Putting it all together

This is the code in main:

string image = "/Users/bennyfriedman/Code/TF2example/TF2example/data/grace_hopper.jpg";
int32 input_width = 299;
int32 input_height = 299;
float input_mean = 0;
float input_std = 255;
vector<Tensor> resized_tensors;
Status read_tensor_status = ReadTensorFromImageFile(image, input_height, input_width,
                                                    input_mean, input_std,
                                                    &resized_tensors, true);
// Check the status before touching resized_tensors: it is empty on failure.
if (!read_tensor_status.ok()) {
    LOG(ERROR) << read_tensor_status;
    return -1;
}
cout << resized_tensors[0].shape().DebugString();
Status write_tensor_status = WriteTensorToImageFile(
    "/Users/bennyfriedman/Code/TF2example/TF2example/data/output.jpg",
    input_height, input_width, input_mean, input_std, resized_tensors);

Troubleshooting

Using Status

You have already seen that you need to check the Status object returned from some methods, either by using the TF_CHECK_OK macro or by using the returned Status object directly like this:

Status st = …
if (!st.ok())
    LOG(ERROR) << st;

It will write a meaningful message to the output.

Scope status

Sometimes when you run the graph, you get a weird error like a Segmentation Fault or Access Violation, and you have no idea what went wrong.

This might be due to a problem with the graph you created. A typical problem is providing the wrong data type or the wrong shape to an operation.

When creating graph elements, you need to check the scope object like this:

if(!root.ok()) LOG(FATAL) << root.status().ToString();

Here too, the output will give you a meaningful message telling you what’s wrong.

Debug string

Some TensorFlow objects support a DebugString() method that turns the object’s content into a human-readable string. Use it on Tensor and TensorShape as needed.

Summary

In this part we saw how a graph is created, run and visualized in TensorBoard.

In the next part I’ll start with creating a DNN model and training it.