In this post, we will accomplish something less common: building and installing TensorFlow with CPU-only support on an Ubuntu server, desktop, or laptop. We are targeting machines with older CPUs, for example those without Advanced Vector Extensions (AVX) support. This kind of setup is a sensible choice when we are not using TensorFlow to train a new AI model but only to obtain predictions (inference) from an already trained model. Compared with model training, model inference is less computationally intensive. Hence, instead of performing the computation with GPU acceleration, the task can be handled by the CPU alone.

tl;dr The WHL file from the TensorFlow CPU build is available for download from this GitHub repository.

Since we will build TensorFlow with CPU support only, the physical server does not need additional graphics card(s) mounted in its PCI slot(s). This differs from building TensorFlow with GPU support, for which we need at least one discrete (non built-in) graphics card that supports CUDA. Running TensorFlow on the CPU is therefore an economical approach to deep learning. Then how about the performance? Benchmark results have shown that GPUs outperform CPUs on deep learning tasks, especially model training. However, this does not mean that TensorFlow on CPU is not a feasible option. With proper CPU optimization, TensorFlow can exhibit performance comparable to its GPU counterpart. When cost is the more serious issue, say we can only do model training and inference in the cloud, leaning towards TensorFlow CPU can be a decision that also makes more sense from a financial standpoint.

Optimizing the TensorFlow CPU Build

We optimize the TensorFlow CPU build by turning on all the computation optimizations the CPU provides. We are interested in the flags information exposed through /proc/cpuinfo, which we obtain with this command:

$ more /proc/cpuinfo | grep flags

A sample output from the command invocation is shown below:

... flags : fpu de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pse36 clflush mmx fxsr sse sse2 syscall nx lm rep_good nopl pni cx16 hypervisor lahf_lm ...

Looking at the output, we may be inundated with cryptic text that looks meaningless. In fact, each flag has a precise meaning and corresponds to a CPU feature. A visit to the Linux kernel source helps unravel the meaning of each flag. For a more human-readable explanation of the flags, we can refer to this post on StackExchange.

We will then customize the TensorFlow source build to take advantage of the CPU features that contribute to speedier execution of TensorFlow code. The relevant CPU features are listed below.

No | Flag   | CPU Feature                                                       | Additional Info
1  | ssse3  | Supplemental Streaming SIMD Extensions 3 (SSSE-3) instruction set | Link
2  | sse4_1 | Streaming SIMD Extensions 4.1 (SSE-4.1) instruction set           | Link
3  | sse4_2 | Streaming SIMD Extensions 4.2 (SSE-4.2) instruction set           | Link
4  | fma    | Fused multiply-add (FMA) instruction set                          | Link
5  | cx16   | CMPXCHG16B instruction (double-width compare-and-swap)            | Link
6  | popcnt | Population count instruction (count number of bits set to 1)      | Link
7  | avx    | Advanced Vector Extensions (AVX) instruction set                  | Link
8  | avx2   | Advanced Vector Extensions 2 (AVX2) instruction set               | Link

From the previous sample of /proc/cpuinfo output, we can see that the CPU supports neither AVX nor AVX2. It also lacks SSSE-3, SSE-4.1, SSE-4.2, FMA, and POPCNT; of the features listed above, only CX16 is available. Apparently, there is not much performance optimization that can be done for this build. However, on a machine with a more modern CPU, more of these features will be available, which means more opportunity to optimize TensorFlow's performance.
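To quickly see which of the eight features from the table a given machine supports, the lookup can be scripted. This is just a convenience sketch around /proc/cpuinfo; it assumes an x86 Linux machine where the flags line is present:

```shell
# Read the first "flags" line from /proc/cpuinfo and report which of the
# optimization-relevant features it contains.
FLAGS_LINE=$(grep -m1 '^flags' /proc/cpuinfo | cut -d ':' -f 2)
for flag in ssse3 sse4_1 sse4_2 fma cx16 popcnt avx avx2; do
  case " $FLAGS_LINE " in
    *" $flag "*) echo "$flag: supported" ;;
    *)           echo "$flag: not supported" ;;
  esac
done
```

On the sample CPU above, only cx16 would be reported as supported.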

Populating System Information

Prior to building from source, we first need to gather the current system information. The build process described in this post was tested on Ubuntu 16.04 LTS with Python 2.7. Your mileage may vary if you perform the build on a machine with a different Ubuntu or Python version. Let's proceed with obtaining the necessary system information as follows.

– Ubuntu version

Command:

$ lsb_release -a | grep "Release" | awk '{print $2}'

– Python version

Command:

$ python --version 2>&1 | awk '{print $2}'

– GCC version

Command:

$ gcc --version | grep "gcc" | awk '{print $4}'

– TensorFlow optimization flags

The optimization flags will be supplied when configuring the TensorFlow source build. The following command is used to populate the optimization flags:

$ grep flags -m1 /proc/cpuinfo | cut -d ":" -f 2 | tr '[:upper:]' '[:lower:]' | {
      read FLAGS
      OPT="-march=native"
      for flag in $FLAGS; do
        case "$flag" in
          "sse4_1" | "sse4_2" | "ssse3" | "fma" | "cx16" | "popcnt" | "avx" | "avx2")
            OPT+=" -m$flag";;
        esac
      done
      MODOPT=${OPT//_/.}
      echo "$MODOPT"
  }

After invoking all the commands above, we put the gathered system information into the following table, filling the output of each command invocation into the "Current" column.

Item                   | Expected             | Current
Ubuntu version         | 16.04                | ...
Python version         | >= 2.7.12            | ...
GCC version            | >= 5.4.0             | ...
TF optimization flags* | -march=native -mcx16 | ...

* For the TF optimization flags, the value in the "Expected" column is only an example taken from the sample CPU, not a required value.
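For convenience, the version lookups above can be collected in one pass. This is only a sketch: sysinfo is a hypothetical helper name, and the awk field positions assume the same Ubuntu output format as the individual commands above:

```shell
# Print the values for the "Current" column in one pass.
sysinfo() {
  echo "Ubuntu version : $(lsb_release -a 2>/dev/null | grep 'Release' | awk '{print $2}')"
  echo "Python version : $(python --version 2>&1 | awk '{print $2}')"
  echo "GCC version    : $(gcc --version 2>/dev/null | grep 'gcc' | awk '{print $4}')"
}
sysinfo
```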

Pre-Installation

We will install TensorFlow in an isolated environment. To do so, we first create a Python virtual environment using virtualenv. Additionally, we will install Bazel, which will be used to build the TensorFlow source code. The steps are explained as follows.

Step 1: Set locale to UTF-8

$ export LC_ALL="en_US.UTF-8"
$ export LC_CTYPE="en_US.UTF-8"
$ sudo dpkg-reconfigure locales

Step 2: Install pip and virtualenv for Python 2 and TensorFlow

$ sudo apt-get -y install python-pip python-dev python-virtualenv python-numpy python-wheel

Step 3: Create virtualenv environment for Python 2 (Virtualenv location: ~/virtualenv/tensorflow)

$ mkdir -p ~/virtualenv/tensorflow
$ virtualenv --system-site-packages ~/virtualenv/tensorflow

Step 4: Activate the virtualenv environment

$ source ~/virtualenv/tensorflow/bin/activate

Verify the prompt is changed to:

(tensorflow) $

Step 5: (virtualenv) Ensure pip >= 8.1 is installed and upgrade to the latest version

– Get currently installed pip version

(tensorflow) $ pip --version | awk '{print $2}'

– Upgrade pip if necessary

(tensorflow) $ pip install --upgrade pip
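The "if necessary" check can be scripted. Below is a small sketch where version_ge is a hypothetical helper that compares dotted version strings with sort -V, demonstrated here on a sample version string; inside the virtualenv you would feed it the output of pip --version instead:

```shell
# Succeeds (exit 0) when version $1 is greater than or equal to $2.
version_ge() {
  [ "$(printf '%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# In the virtualenv: PIPVER=$(pip --version | awk '{print $2}')
PIPVER="9.0.1"
if version_ge "$PIPVER" "8.1"; then
  echo "pip is recent enough"
else
  echo "pip needs an upgrade: pip install --upgrade pip"
fi
```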

Step 6: (virtualenv) Deactivate the virtualenv

(tensorflow) $ deactivate

Step 7: Install bazel to build TensorFlow

– Install Java JDK 8 (Open JDK) if there is no JDK installed

$ sudo apt-get install openjdk-8-jdk

– Add bazel private repository into source repository list

$ echo "deb [arch=amd64] http://storage.googleapis.com/bazel-apt stable jdk1.8" | sudo tee /etc/apt/sources.list.d/bazel.list
$ curl https://bazel.build/bazel-release.pub.gpg | sudo apt-key add -

– Install the latest version of Bazel

$ sudo apt-get update && sudo apt-get -y install bazel

The Installation

We are now ready to build and install TensorFlow. For the installation steps, we will proceed as follows.

Step 1: Create directory for the source

$ mkdir -p ~/installers/tensorflow/tf-cpu

Step 2: Download the latest stable release of TensorFlow (release 1.10.0 at the time of writing) into the source directory

$ cd ~/installers/tensorflow/tf-cpu
$ wget https://github.com/tensorflow/tensorflow/archive/v1.10.0.zip

Step 3: Unzip the source archive

$ unzip v1.10.0.zip

Step 4: Go to the extracted TensorFlow source directory

$ cd tensorflow-1.10.0

Step 5: Activate the virtualenv

$ source ~/virtualenv/tensorflow/bin/activate

Step 6: (virtualenv) Install additional Python modules required to build TensorFlow (enum34 and mock)

(tensorflow) $ pip install --upgrade enum34 mock

Step 7: (virtualenv) Configure the build file. We will configure TensorFlow without CUDA and with CPU optimization.

(tensorflow) $ ./configure

Sample configuration for reference:

Please specify the location of python. [Default is /home/MYUSER/virtualenv/tensorflow/bin/python]: /home/MYUSER/virtualenv/tensorflow/bin/python
Please input the desired Python library path to use. Default is [/home/MYUSER/virtualenv/tensorflow/lib/python2.7/site-packages]
/home/MYUSER/virtualenv/tensorflow/lib/python2.7/site-packages
Do you wish to build TensorFlow with jemalloc as malloc support? [Y/n]: Y
Do you wish to build TensorFlow with Google Cloud Platform support? [Y/n]: Y
Do you wish to build TensorFlow with Hadoop File System support? [Y/n]: Y
Do you wish to build TensorFlow with Amazon AWS Platform support? [Y/n]: Y
Do you wish to build TensorFlow with Apache Kafka Platform support? [Y/n]: Y
Do you wish to build TensorFlow with XLA JIT support? [y/N]: N
Do you wish to build TensorFlow with GDR support? [y/N]: N
Do you wish to build TensorFlow with VERBS support? [y/N]: N
Do you wish to build TensorFlow with OpenCL SYCL support? [y/N]: N
Do you wish to build TensorFlow with CUDA support? [y/N]: N
Do you wish to download a fresh release of clang? (Experimental) [y/N]: N
Do you wish to build TensorFlow with MPI support? [y/N]: N
Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native]: -march=native -mcx16
Would you like to interactively configure ./WORKSPACE for Android builds? [y/N]: N
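For repeated or scripted builds, the interactive prompts can be pre-answered through environment variables that ./configure reads. A sketch, with the caveat that the variable names below are taken from TensorFlow 1.10's configure script and may differ in other releases:

```shell
# Pre-answer the configure prompts; any remaining questions fall back to
# their defaults when fed empty lines via `yes ""`.
export PYTHON_BIN_PATH="$HOME/virtualenv/tensorflow/bin/python"
export TF_NEED_CUDA=0
export CC_OPT_FLAGS="-march=native -mcx16"
# Then, inside the source directory:
#   yes "" | ./configure
```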

Step 8: (virtualenv) Build TensorFlow source

– For GCC >= 5.x

(tensorflow) $ bazel build --config=opt --cxxopt="-D_GLIBCXX_USE_CXX11_ABI=0" //tensorflow/tools/pip_package:build_pip_package

– For GCC 4.x:

(tensorflow) $ bazel build --config=opt //tensorflow/tools/pip_package:build_pip_package

Step 9: (virtualenv) Create the .whl file from the bazel build

(tensorflow) $ mkdir tensorflow-pkg
(tensorflow) $ ./bazel-bin/tensorflow/tools/pip_package/build_pip_package tensorflow-pkg

– Install the .whl file

(tensorflow) $ cd tensorflow-pkg && ls -al

After knowing the .whl file name:

(tensorflow) $ pip install tensorflow-1.10.0-cp27-cp27mu-linux_x86_64.whl

Step 10: (virtualenv) Verify the installation

– Check the installed TensorFlow version in the virtualenv

(tensorflow) $ python -c 'import tensorflow as tf; print(tf.__version__)'

– Run TensorFlow Hello World

(tensorflow) $ python -c 'import tensorflow as tf; hello = tf.constant("Hello, TensorFlow!"); sess = tf.Session(); print(sess.run(hello));'

Output of the last command:

Hello, TensorFlow!

Concluding Remark

We have successfully installed the latest TensorFlow with CPU-only support. If you are interested in running TensorFlow without a CUDA GPU, you can start building from source as described in this post. I have also created a GitHub repository that hosts the WHL file created from the build. Feel free to check it out.

Future work will include a performance benchmark comparing TensorFlow on CPU and GPU. If this is something you would like to see in a future post, please say so in the comments section.