How to Build and Install The Latest TensorFlow without CUDA GPU and with Optimized CPU Performance on Ubuntu

In this post, we are about to accomplish something less common: building and installing TensorFlow with CPU support only on an Ubuntu server, desktop, or laptop. We are targeting machines with older CPUs, for example those without Advanced Vector Extensions (AVX) support. This kind of setup can be a sensible choice when we are not using TensorFlow to train a new AI model but only to obtain predictions (inference) served by a trained model. Compared with model training, model inference is less computationally intensive. Hence, instead of performing the computation with GPU acceleration, the task can simply be handled by the CPU.

tl;dr The WHL file from the TensorFlow CPU build is available for download from this GitHub repository.

Since we will build TensorFlow with CPU support only, the physical server does not need to be equipped with additional graphics card(s) mounted on the PCI slot(s). This differs from the case when we build TensorFlow with GPU support, where we need at least one external (non built-in) graphics card that supports CUDA. Naturally, running TensorFlow on the CPU is an economical approach to deep learning. Then how about the performance? Benchmark results have shown that GPUs outperform CPUs on deep learning tasks, especially model training. However, this does not mean that TensorFlow on CPU is not a feasible option. With proper CPU optimization, TensorFlow can exhibit improved performance that narrows the gap with its GPU counterpart. When cost is a serious issue, say we can only do model training and inference in the cloud, leaning towards TensorFlow CPU can be a decision that also makes sense from a financial standpoint.

Optimizing the TensorFlow CPU Build

We optimize the TensorFlow CPU build by turning on all the computation optimization opportunities provided by the CPU. We are interested in the flags information exposed through /proc/cpuinfo, which we obtain with this command:

$ more /proc/cpuinfo | grep flags

A sample output from the command invocation is shown below:

...
flags   : fpu de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pse36 clflush mmx fxsr sse sse2 syscall nx lm rep_good nopl pni cx16 hypervisor lahf_lm
...

Looking at the output, we may be inundated with cryptic text that looks meaningless. There is, however, a precise meaning behind each flag. A visit to the Linux kernel source helps unravel what each flag stands for, since each corresponds to a CPU feature. For a more human-readable explanation of the flags, we can refer to this article on StackExchange.
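
To make the flag list actionable, we can test for a single feature directly. Below is a small sketch that checks whether this machine's CPU advertises AVX; swap in any other flag name to probe a different feature:

```shell
# Check a single CPU feature flag before deciding on build options.
# Prints whether this CPU advertises AVX.
if grep -m1 '^flags' /proc/cpuinfo | grep -qw avx; then
  echo "avx: supported"
else
  echo "avx: not supported"
fi
```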

We will then customize the TensorFlow source build to take advantage of the CPU features that contribute to speedier execution of TensorFlow code. The list of relevant CPU features is provided as follows.

No | Flag   | CPU Feature
1  | ssse3  | Supplemental Streaming SIMD Extensions 3 (SSSE3) instruction set
2  | sse4_1 | Streaming SIMD Extensions 4.1 (SSE4.1) instruction set
3  | sse4_2 | Streaming SIMD Extensions 4.2 (SSE4.2) instruction set
4  | fma    | Fused multiply-add (FMA) instruction set
5  | cx16   | CMPXCHG16B instruction (double-width compare-and-swap)
6  | popcnt | Population count instruction (count number of bits set to 1)
7  | avx    | Advanced Vector Extensions (AVX)
8  | avx2   | Advanced Vector Extensions 2 (AVX2)

From the previous sample of /proc/cpuinfo output, we can see that the CPU does not support AVX and AVX2. The CPU also does not support SSSE3, SSE4.1, SSE4.2, FMA, and POPCNT; of the listed features, only CX16 is available. Apparently, there is not much performance optimization that can be done for this build. However, on a different machine with a more modern CPU, more of these features will be available relative to the sample CPU. This means we have more opportunity to optimize TensorFlow performance.
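
To see at a glance which of the eight features a given machine supports, they can be probed in a loop; a minimal sketch:

```shell
# Report, for each optimization-relevant feature, whether this CPU has it.
CPUFLAGS=$(grep -m1 '^flags' /proc/cpuinfo | cut -d ':' -f 2)
for f in ssse3 sse4_1 sse4_2 fma cx16 popcnt avx avx2; do
  if echo "$CPUFLAGS" | grep -qw "$f"; then
    echo "$f: yes"
  else
    echo "$f: no"
  fi
done
```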

Populating System Information

Prior to building the source, we first need to gather the current system information. The build process described in this post was tested on Ubuntu 16.04 LTS with Python 2.7. Your mileage may vary if you perform the build on a machine with a different Ubuntu or Python version. Let’s proceed with obtaining the necessary system information as follows.

– Ubuntu version
Command:

$ lsb_release -a | grep "Release" | awk '{print $2}'

– Python version
Command:

$ python --version 2>&1 | awk '{print $2}'

– GCC version
Command:

$ gcc --version | grep "gcc" | awk '{print $4}'

– TensorFlow optimization flags

The optimization flags will be supplied when configuring the TensorFlow source build. The following command is used to populate the optimization flags:

$ grep flags -m1 /proc/cpuinfo | cut -d ":" -f 2 | tr '[:upper:]' '[:lower:]' | { read FLAGS; OPT="-march=native"; for flag in $FLAGS; do case "$flag" in "sse4_1" | "sse4_2" | "ssse3" | "fma" | "cx16" | "popcnt" | "avx" | "avx2") OPT+=" -m$flag";; esac; done; MODOPT=${OPT//_/\.}; echo "$MODOPT"; }
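
Since the one-liner above is dense, here is the same logic written out step by step (behavior unchanged, using POSIX-compatible string building):

```shell
# Build the optimization flag string from /proc/cpuinfo, keeping only
# the features that the TensorFlow build can take advantage of.
FLAGS=$(grep -m1 '^flags' /proc/cpuinfo | cut -d ':' -f 2 | tr '[:upper:]' '[:lower:]')
OPT="-march=native"
for flag in $FLAGS; do
  case "$flag" in
    sse4_1|sse4_2|ssse3|fma|cx16|popcnt|avx|avx2) OPT="$OPT -m$flag";;
  esac
done
# GCC spells these options -msse4.1 / -msse4.2, so map underscores to dots.
echo "$OPT" | tr '_' '.'
```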

After invoking all the commands above, we record the gathered system information in the following table. Put the output of each command invocation in the “Current” column.

Item                   | Expected             | Current
Ubuntu version         | 16.04                | ...
Python version         | >= 2.7.12            | ...
GCC version            | >= 5.4.0             | ...
TF optimization flags* | -march=native -mcx16 | ...

* For the TF optimization flags row, the value in the “Expected” column is only an example taken from the sample CPU, not a required value.
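
The “Current” column can also be filled in one pass. The sketch below prints “unknown” for any tool that is not installed; note it uses gcc -dumpversion as a shorter alternative to parsing gcc --version:

```shell
# Gather the "Current" column values in one pass.
have() { command -v "$1" >/dev/null 2>&1; }
if have lsb_release; then UBUNTU=$(lsb_release -rs); else UBUNTU=unknown; fi
if have python; then PYVER=$(python --version 2>&1 | awk '{print $2}'); else PYVER=unknown; fi
if have gcc; then GCCVER=$(gcc -dumpversion); else GCCVER=unknown; fi
echo "Ubuntu version : $UBUNTU"
echo "Python version : $PYVER"
echo "GCC version    : $GCCVER"
```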

Pre-Installation

We will install TensorFlow in an isolated environment. To do so, we first create a Python virtual environment using virtualenv. Additionally, we will install Bazel, which will be used to build the TensorFlow source code. The steps are explained as follows.

Step 1: Set locale to UTF-8

$ export LC_ALL="en_US.UTF-8"
$ export LC_CTYPE="en_US.UTF-8"
$ sudo dpkg-reconfigure locales 

Step 2: Install pip and virtualenv for Python 2 and TensorFlow

$ sudo apt-get -y install python-pip python-dev python-virtualenv python-numpy python-wheel

Step 3: Create virtualenv environment for Python 2 (Virtualenv location: ~/virtualenv/tensorflow)

$ mkdir -p ~/virtualenv/tensorflow
$ virtualenv --system-site-packages ~/virtualenv/tensorflow

Step 4: Activate the virtualenv environment

$ source ~/virtualenv/tensorflow/bin/activate

Verify the prompt is changed to:

(tensorflow) $

Step 5: (virtualenv) Ensure pip >= 8.1 is installed and upgrade to the latest version
– Get currently installed pip version

(tensorflow) $ pip --version | awk '{print $2}'

– Upgrade pip if necessary

(tensorflow) $ pip install --upgrade pip
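
The “is pip at least 8.1” check can be scripted instead of eyeballed. Below is a minimal sketch using sort -V for version-string comparison, shown here with a hypothetical installed version; substitute the output of pip --version on your machine:

```shell
# version_ge MIN ACTUAL: succeeds when ACTUAL >= MIN, using GNU sort -V
# for correct version-string ordering.
version_ge() { [ "$(printf '%s\n' "$1" "$2" | sort -V | head -n1)" = "$1" ]; }

# Hypothetical installed pip version; substitute
# "$(pip --version | awk '{print $2}')" for the second argument.
if version_ge 8.1 9.0.3; then
  echo "pip is recent enough"
else
  echo "pip needs an upgrade"
fi
```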

Step 6: (virtualenv) Deactivate the virtualenv

(tensorflow) $ deactivate

Step 7: Install bazel to build TensorFlow

– Install Java JDK 8 (Open JDK) if there is no JDK installed

$ sudo apt-get install openjdk-8-jdk

– Add bazel private repository into source repository list

$ echo "deb [arch=amd64] http://storage.googleapis.com/bazel-apt stable jdk1.8" | sudo tee /etc/apt/sources.list.d/bazel.list
$ curl https://bazel.build/bazel-release.pub.gpg | sudo apt-key add -

– Install the latest version of bazel

$ sudo apt-get update && sudo apt-get -y install bazel

The Installation

We are now ready to build and install TensorFlow. For the installation steps, we will proceed as follows.

Step 1: Create directory for the source

$ mkdir -p ~/installers/tensorflow/tf-cpu

Step 2: Download the latest stable release of TensorFlow (release 1.10.0 at the time this post is written) into the source directory

$ cd ~/installers/tensorflow/tf-cpu
$ wget https://github.com/tensorflow/tensorflow/archive/v1.10.0.zip

Step 3: Unzip the installer

$ unzip v1.10.0.zip

Step 4: Go to the extracted TensorFlow source directory

$ cd tensorflow-1.10.0

Step 5: Activate the virtualenv

$ source ~/virtualenv/tensorflow/bin/activate

Step 6: (virtualenv) Install additional Python modules required to build TensorFlow (enum and mock)

(tensorflow) $ pip install --upgrade enum34 mock

Step 7: (virtualenv) Configure the build file. We will configure TensorFlow without CUDA and with CPU optimization.

(tensorflow) $ ./configure

Sample configuration for reference:

Please specify the location of python. [Default is /home/MYUSER/virtualenv/tensorflow/bin/python]: /home/MYUSER/virtualenv/tensorflow/bin/python
Please input the desired Python library path to use.  Default is [/home/MYUSER/virtualenv/tensorflow/lib/python2.7/site-packages]
/home/MYUSER/virtualenv/tensorflow/lib/python2.7/site-packages
Do you wish to build TensorFlow with jemalloc as malloc support? [Y/n]: Y
Do you wish to build TensorFlow with Google Cloud Platform support? [Y/n]: Y
Do you wish to build TensorFlow with Hadoop File System support? [Y/n]: Y
Do you wish to build TensorFlow with Amazon AWS Platform support? [Y/n]: Y
Do you wish to build TensorFlow with Apache Kafka Platform support? [Y/n]: Y
Do you wish to build TensorFlow with XLA JIT support? [y/N]: N
Do you wish to build TensorFlow with GDR support? [y/N]: N
Do you wish to build TensorFlow with VERBS support? [y/N]: N
Do you wish to build TensorFlow with OpenCL SYCL support? [y/N]: N
Do you wish to build TensorFlow with CUDA support? [y/N]: N
Do you wish to download a fresh release of clang? (Experimental) [y/N]: N
Do you wish to build TensorFlow with MPI support? [y/N]: N
Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native]: -march=native -mcx16
Would you like to interactively configure ./WORKSPACE for Android builds? [y/N]: N
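
As a side note, the configure script of this TensorFlow generation can also be driven non-interactively through environment variables. The variable names below are read by configure.py; treat this as a sketch and verify them against your source tree, and note that any setting you leave unset will still be prompted for:

```shell
# Sketch of a partially non-interactive configure run; execute from the
# tensorflow-1.10.0 source directory inside the activated virtualenv.
export PYTHON_BIN_PATH="$HOME/virtualenv/tensorflow/bin/python"
export TF_NEED_CUDA=0
export CC_OPT_FLAGS="-march=native -mcx16"
./configure
```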

Step 8: (virtualenv) Build TensorFlow source

– For GCC >= 5.x

(tensorflow) $ bazel build --config=opt --cxxopt="-D_GLIBCXX_USE_CXX11_ABI=0" //tensorflow/tools/pip_package:build_pip_package

– For GCC 4.x:

(tensorflow) $ bazel build --config=opt //tensorflow/tools/pip_package:build_pip_package

Step 9: (virtualenv) Create the .whl file from the bazel build

(tensorflow) $ mkdir tensorflow-pkg
(tensorflow) $ ./bazel-bin/tensorflow/tools/pip_package/build_pip_package tensorflow-pkg

– Install the .whl file

(tensorflow) $ cd tensorflow-pkg && ls -al

After knowing the .whl file name:

(tensorflow) $ pip install tensorflow-1.10.0-cp27-cp27mu-linux_x86_64.whl

Step 10: (virtualenv) Verify the installation

– Check the installed TensorFlow version in the virtualenv

(tensorflow) $ python -c 'import tensorflow as tf; print(tf.__version__)'

– Run TensorFlow Hello World

(tensorflow) $ python -c 'import tensorflow as tf; hello = tf.constant("Hello, TensorFlow!"); sess = tf.Session(); print(sess.run(hello));'

Output of the last command:

Hello, TensorFlow!

Concluding Remark

We have successfully installed the latest TensorFlow with CPU support only. If you are interested in running TensorFlow without a CUDA GPU, you can start building from source as described in this post. I have also created a GitHub repository that hosts the WHL file created from the build. You can also check it out.

Future work will include a performance benchmark comparing TensorFlow on CPU and GPU. If this is something you want to see in a future post, please write in the comment section.

20 thoughts on “How to Build and Install The Latest TensorFlow without CUDA GPU and with Optimized CPU Performance on Ubuntu”

  1. Pingback: How to Resolve Error “Illegal instruction (core dumped)” when Running “import tensorflow” in a Python Program | Amikelive | Technology Blog

  2. Jonti

    Thanks so much for the post. My computer has some instruction extensions but not AVX so the instructions given by the main website failed with the dreaded “illegal instruction”. Using an old v1.5 version of tensorflow didn’t work and instead I got all sorts of problems with running tensorflow examples. I can’t believe how difficult it is to compile tensorflow and the problems with python version etc. However copying and pasting your post it compiled and installed without any issues. It took a few hours to compile and lots of warnings as it was doing it but no fatal errors. I’m on kubuntu 18.04 LTS.

    18.04
    2.7.15rc1
    7.3.0
    -march=native -mssse3 -mcx16 -msse4.1 -msse4.2 -mpopcnt

    Cheers
    Jonti

    1. Ninh Huong

      Hi Mikael,
      Thanks for your guide for installing tensorflow. I have already installed it successfully. I thought so because I could use tensorflow 1.10 in python. However, I can’t find the libtensorflow file for the C++ API. I don’t know what went wrong with my installation process.

      Thanks !

    2. Mikael Fernandus Simalango Post author

      Any additional information about the error? At what stage did it occur and how can one reproduce the error? Additional error logs / messages will also help.

    3. tokap

      Hello!
      – I’m running Linux Mint 20.3, my Ubuntu version is 20.3
      – my Python version is 3.8.10
      – my GCC version is 9.4.0
      – I have -march=native and -mcx16

      Can I still follow this guide to build Tensorflow?
      or can I install the WHL file from your Github repository?

    4. Mikael Fernandus Simalango Post author

      You can give it a try. There has been a lot of changes on Tensorflow side so I am not sure if it still works on your machine. If it fails, better to reach out to Tensorflow community. The whl file is for older Python and Ubuntu version. I don’t think it will work on your machine.

  3. Chris English

    Mikael,

    this was a great starting point that eventually got to tensorflow-2.0a.0 installed on an old cpu and laptop.

    I shared your wonderful grep on stackoverflow (https://stackoverflow.com/questions/53723217/is-there-a-version-of-tensorflow-not-compiled-for-avx-instructions/55165620#55165620) .

    I would suggest that it appears that native in ‘-march=native’ is a kind of placeholder for a more definitive ‘what is the family name’ of one’s cpu that can be found via

    gcc -march=native -Q --help=target | grep march

    that I found at (https://stackoverflow.com/questions/53723217/is-there-a-version-of-tensorflow-not-compiled-for-avx-instructions/55165620#55165620). replacing the =native with the actual quoted family name and extension instructions, both positive i.e. -mssse4.2, and negative -mno-avx, finally got me there, but none would have been possible absent your post. thank you,
    ubuntu 16.04 LTS

    Chris

    1. Mikael Fernandus Simalango Post author

      Hi Chris, glad to know that the post provides some pointers to the resolve. Thanks for providing the link to the original article.

  4. Mario Uganda

    Thank you, this was really helpful to me!

    My CPU is the Intel N5000, which is quite recent actually but since it is a budget component, it does not support AVX instructions.

    One note: I followed the procedure without setting up a virtual environment because I wanted it global, and I had no problems. Python 3.6 (selected the 3.6m as path) and TensorFlow 2.0.0b1

    Another note: The build may fail telling you that cannot find reference to header file “Python.h”. Simply install with pip3 all the packages that both this guide and the official build guide tell you to install.

    1. Mikael Fernandus Simalango Post author

      Thanks for the additional note. The article was written with Python 2.7 installed on the system. For more recent version of TensorFlow and Python 3.x, some adjustment may be needed as you pointed out.

  5. Dananjaya Sakalasuriya

    $ bazel build --config=opt //tensorflow/tools/pip_package:build_pip_package
    WARNING: The following rc files are no longer being read, please transfer their contents or import their path into one of the standard rc files:
    /home/mys/installer/tensorflow/tf-cpu/tensorflow-1.10.0/tools/bazel.rc
    INFO: Writing tracer profile to '/home/mys/.cache/bazel/_bazel_mys/4e6398390da2ef9d809df3d09dbd281f/command.profile.gz'
    ERROR: /home/mys/installer/tensorflow/tf-cpu/tensorflow-1.10.0/WORKSPACE:3:1: name 'http_archive' is not defined
    ERROR: Error evaluating WORKSPACE file
    ERROR: error loading package '': Encountered error while reading extension file 'closure/defs.bzl': no such package '@io_bazel_rules_closure//closure': error loading package 'external': Could not load //external package
    ERROR: error loading package '': Encountered error while reading extension file 'closure/defs.bzl': no such package '@io_bazel_rules_closure//closure': error loading package 'external': Could not load //external package
    INFO: Elapsed time: 0.055s
    INFO: 0 processes.
    FAILED: Build did NOT complete successfully (0 packages loaded)

    I got this error when I ran bazel build.

    1. Mikael Fernandus Simalango Post author

      As displayed in the error message:

      ERROR: error loading package '': Encountered error while reading extension file 'closure/defs.bzl': no such package '@io_bazel_rules_closure//closure': error loading package 'external': Could not load //external package

      this indicates a version incompatibility between bazel and TensorFlow. Check the version of TensorFlow you want to build and the minimum bazel version required.

  6. Julian Darley

    thanks for this article. unfortunately, even after many hours of effort, i could not get past this error (using ubuntu 20.4 and trying to install tensorflow 2.4.1):

    ERROR: no such package '@io_bazel_rules_go//go': Traceback (most recent call last):
    File "/home/jd/.cache/bazel/_bazel_jd/0ec3a626668b7dbd02b7cf058958e7d9/external/bazel_tools/tools/build_defs/repo/git.bzl", line 177
    _clone_or_update(ctx)
    File "/home/jd/.cache/bazel/_bazel_jd/0ec3a626668b7dbd02b7cf058958e7d9/external/bazel_tools/tools/build_defs/repo/git.bzl", line 36, in _clone_or_update
    git_repo(ctx, directory)
    File "/home/jd/.cache/bazel/_bazel_jd/0ec3a626668b7dbd02b7cf058958e7d9/external/bazel_tools/tools/build_defs/repo/git_worker.bzl", line 91, in git_repo
    _update(ctx, git_repo)
    File "/home/jd/.cache/bazel/_bazel_jd/0ec3a626668b7dbd02b7cf058958e7d9/external/bazel_tools/tools/build_defs/repo/git_worker.bzl", line 101, in _update
    init(ctx, git_repo)
    File "/home/jd/.cache/bazel/_bazel_jd/0ec3a626668b7dbd02b7cf058958e7d9/external/bazel_tools/tools/build_defs/repo/git_worker.bzl", line 115, in init
    _error(ctx.name, cl, st.stderr)
    File "/home/jd/.cache/bazel/_bazel_jd/0ec3a626668b7dbd02b7cf058958e7d9/external/bazel_tools/tools/build_defs/repo/git_worker.bzl", line 181, in _error
    fail()
    error running 'git init /home/jd/.cache/bazel/_bazel_jd/0ec3a626668b7dbd02b7cf058958e7d9/external/io_bazel_rules_go' while working with @io_bazel_rules_go:
    src/main/tools/process-wrapper-legacy.cc:58: "execvp(git, ...)": No such file or directory
    INFO: Elapsed time: 8.336s
    INFO: 0 processes.

    i have searched high and low for some explanation but nothing seems to help. i am trying to install tensorflow 2.4.1 in a machine that has no cuda, an ancient gpu and no avx. one of the many issues i face is that ubuntu keeps wanting to default to 3.8 which afaik tensorflow doesn’t like.

    unfortunately i have more old boxes like this one that i would really like to install tensorflow on (so that i can run deepspeech ultimately).

    any help gratefully received.

    julian

    1. Mikael Fernandus Simalango Post author

      I remember I had a similar issue when using a newer version of Bazel. You may need to specify the Bazel version used to compile the sources.
