How to Build and Install The Latest TensorFlow without CUDA GPU and with Optimized CPU Performance on Ubuntu

In this post, we are about to accomplish something less common: building and installing TensorFlow with CPU support only on an Ubuntu server, desktop, or laptop. We are targeting machines with older CPUs, for example those without Advanced Vector Extensions (AVX) support. This kind of setup can be a sensible choice when we are not using TensorFlow to train a new AI model but only to obtain predictions (inference) served by a trained model. Compared with model training, model inference is less computationally intensive. Hence, instead of performing the computation with GPU acceleration, the task can simply be handled by the CPU.

tl;dr The WHL file from the TensorFlow CPU build is available for download from this GitHub repository.

Since we will build TensorFlow with CPU support only, the physical server does not need to be equipped with additional graphics card(s) mounted on the PCI slot(s). This differs from the case when we build TensorFlow with GPU support, where we need at least one external (non built-in) graphics card that supports CUDA. Naturally, running TensorFlow on the CPU is an economical approach to deep learning. Then how about the performance? Benchmark results have shown that GPUs outperform CPUs on deep learning tasks, especially model training. However, this does not mean that TensorFlow on CPU is not a feasible option. With proper CPU optimization, TensorFlow can exhibit improved performance that narrows the gap with its GPU counterpart. When cost is a serious issue, say we can only do model training and inference in the cloud, leaning towards TensorFlow CPU can be a decision that also makes sense from a financial standpoint.

Optimizing the TensorFlow CPU Build

We optimize the TensorFlow CPU build by turning on all the computation optimization opportunities provided by the CPU. We are interested in the flags information exposed through /proc/cpuinfo, which we obtain with this command:

$ more /proc/cpuinfo | grep flags

A sample output from the command invocation is shown below:

...
flags   : fpu de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pse36 clflush mmx fxsr sse sse2 syscall nx lm rep_good nopl pni cx16 hypervisor lahf_lm
...

Looking at the output, we may be inundated with cryptic text that looks meaningless. There is, however, a precise meaning behind each flag. A visit to the Linux kernel source helps unravel what each flag stands for, since each corresponds to a CPU feature. For a more human-readable explanation of the flags, we can refer to this article on StackExchange.
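
To make the flag list actionable, we can test for a single feature directly. Below is a small sketch that checks whether this machine's CPU advertises AVX; swap in any other flag name to probe a different feature:

```shell
# Check a single CPU feature flag before deciding on build options.
# Prints whether this CPU advertises AVX.
if grep -m1 '^flags' /proc/cpuinfo | grep -qw avx; then
  echo "avx: supported"
else
  echo "avx: not supported"
fi
```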

We will then customize the TensorFlow source build to take advantage of the CPU features that contribute to speedier execution of TensorFlow code. The list of relevant CPU features is provided as follows.

No | Flag   | CPU Feature
1  | ssse3  | Supplemental Streaming SIMD Extensions 3 (SSSE3) instruction set
2  | sse4_1 | Streaming SIMD Extensions 4.1 (SSE4.1) instruction set
3  | sse4_2 | Streaming SIMD Extensions 4.2 (SSE4.2) instruction set
4  | fma    | Fused multiply-add (FMA) instruction set
5  | cx16   | CMPXCHG16B instruction (double-width compare-and-swap)
6  | popcnt | Population count instruction (count number of bits set to 1)
7  | avx    | Advanced Vector Extensions (AVX)
8  | avx2   | Advanced Vector Extensions 2 (AVX2)

From the previous sample of /proc/cpuinfo output, we can see that the CPU does not support AVX and AVX2. The CPU also does not support SSSE3, SSE4.1, SSE4.2, FMA, and POPCNT; of the listed features, only CX16 is available. Apparently, there is not much performance optimization that can be done for this build. However, on a different machine with a more modern CPU, more of these features will be available relative to the sample CPU. This means we have more opportunity to optimize TensorFlow performance.
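
To see at a glance which of the eight features a given machine supports, they can be probed in a loop; a minimal sketch:

```shell
# Report, for each optimization-relevant feature, whether this CPU has it.
CPUFLAGS=$(grep -m1 '^flags' /proc/cpuinfo | cut -d ':' -f 2)
for f in ssse3 sse4_1 sse4_2 fma cx16 popcnt avx avx2; do
  if echo "$CPUFLAGS" | grep -qw "$f"; then
    echo "$f: yes"
  else
    echo "$f: no"
  fi
done
```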

Populating System Information

Prior to building the source, we first need to gather the current system information. The build process described in this post was tested on Ubuntu 16.04 LTS with Python 2.7. Your mileage may vary if you perform the build on a machine with a different Ubuntu or Python version. Let’s proceed with obtaining the necessary system information as follows.

– Ubuntu version
Command:

$ lsb_release -a | grep "Release" | awk '{print $2}'

– Python version
Command:

$ python --version 2>&1 | awk '{print $2}'

– GCC version
Command:

$ gcc --version | grep "gcc" | awk '{print $4}'

– TensorFlow optimization flags

The optimization flags will be supplied when configuring the TensorFlow source build. The following command is used to populate the optimization flags:

$ grep flags -m1 /proc/cpuinfo | cut -d ":" -f 2 | tr '[:upper:]' '[:lower:]' | { read FLAGS; OPT="-march=native"; for flag in $FLAGS; do case "$flag" in "sse4_1" | "sse4_2" | "ssse3" | "fma" | "cx16" | "popcnt" | "avx" | "avx2") OPT+=" -m$flag";; esac; done; MODOPT=${OPT//_/\.}; echo "$MODOPT"; }
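
Since the one-liner above is dense, here is the same logic written out step by step (behavior unchanged, using POSIX-compatible string building):

```shell
# Build the optimization flag string from /proc/cpuinfo, keeping only
# the features that the TensorFlow build can take advantage of.
FLAGS=$(grep -m1 '^flags' /proc/cpuinfo | cut -d ':' -f 2 | tr '[:upper:]' '[:lower:]')
OPT="-march=native"
for flag in $FLAGS; do
  case "$flag" in
    sse4_1|sse4_2|ssse3|fma|cx16|popcnt|avx|avx2) OPT="$OPT -m$flag";;
  esac
done
# GCC spells these options -msse4.1 / -msse4.2, so map underscores to dots.
echo "$OPT" | tr '_' '.'
```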

After invoking all the commands above, we record the gathered system information in the following table. Put the output of each command invocation in the “Current” column.

Item                   | Expected             | Current
Ubuntu version         | 16.04                | ...
Python version         | >= 2.7.12            | ...
GCC version            | >= 5.4.0             | ...
TF optimization flags* | -march=native -mcx16 | ...

* For the TF optimization flags row, the value in the “Expected” column is only an example taken from the sample CPU, not a required value.
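
The “Current” column can also be filled in one pass. The sketch below prints “unknown” for any tool that is not installed; note it uses gcc -dumpversion as a shorter alternative to parsing gcc --version:

```shell
# Gather the "Current" column values in one pass.
have() { command -v "$1" >/dev/null 2>&1; }
if have lsb_release; then UBUNTU=$(lsb_release -rs); else UBUNTU=unknown; fi
if have python; then PYVER=$(python --version 2>&1 | awk '{print $2}'); else PYVER=unknown; fi
if have gcc; then GCCVER=$(gcc -dumpversion); else GCCVER=unknown; fi
echo "Ubuntu version : $UBUNTU"
echo "Python version : $PYVER"
echo "GCC version    : $GCCVER"
```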

Pre-Installation

We will install TensorFlow in an isolated environment. To do so, we first create a Python virtual environment using virtualenv. Additionally, we will install Bazel, which will be used to build the TensorFlow source code. The steps are explained as follows.

Step 1: Set locale to UTF-8

$ export LC_ALL="en_US.UTF-8"
$ export LC_CTYPE="en_US.UTF-8"
$ sudo dpkg-reconfigure locales 

Step 2: Install pip and virtualenv for Python 2 and TensorFlow

$ sudo apt-get -y install python-pip python-dev python-virtualenv python-numpy python-wheel

Step 3: Create virtualenv environment for Python 2 (Virtualenv location: ~/virtualenv/tensorflow)

$ mkdir -p ~/virtualenv/tensorflow
$ virtualenv --system-site-packages ~/virtualenv/tensorflow

Step 4: Activate the virtualenv environment

$ source ~/virtualenv/tensorflow/bin/activate

Verify the prompt is changed to:

(tensorflow) $

Step 5: (virtualenv) Ensure pip >= 8.1 is installed and upgrade to the latest version
– Get currently installed pip version

(tensorflow) $ pip --version | awk '{print $2}'

– Upgrade pip if necessary

(tensorflow) $ pip install --upgrade pip
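
The “is pip at least 8.1” check can be scripted instead of eyeballed. Below is a minimal sketch using sort -V for version-string comparison, shown here with a hypothetical installed version; substitute the output of pip --version on your machine:

```shell
# version_ge MIN ACTUAL: succeeds when ACTUAL >= MIN, using GNU sort -V
# for correct version-string ordering.
version_ge() { [ "$(printf '%s\n' "$1" "$2" | sort -V | head -n1)" = "$1" ]; }

# Hypothetical installed pip version; substitute
# "$(pip --version | awk '{print $2}')" for the second argument.
if version_ge 8.1 9.0.3; then
  echo "pip is recent enough"
else
  echo "pip needs an upgrade"
fi
```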

Step 6: (virtualenv) Deactivate the virtualenv

(tensorflow) $ deactivate

Step 7: Install bazel to build TensorFlow

– Install Java JDK 8 (Open JDK) if there is no JDK installed

$ sudo apt-get install openjdk-8-jdk

– Add bazel private repository into source repository list

$ echo "deb [arch=amd64] http://storage.googleapis.com/bazel-apt stable jdk1.8" | sudo tee /etc/apt/sources.list.d/bazel.list
$ curl https://bazel.build/bazel-release.pub.gpg | sudo apt-key add -

– Install the latest version of bazel

$ sudo apt-get update && sudo apt-get -y install bazel

The Installation

We are now ready to build and install TensorFlow. For the installation steps, we will proceed as follows.

Step 1: Create directory for the source

$ mkdir -p ~/installers/tensorflow/tf-cpu

Step 2: Download the latest stable release of TensorFlow (release 1.10.0 at the time this post is written) into the source directory

$ cd ~/installers/tensorflow/tf-cpu
$ wget https://github.com/tensorflow/tensorflow/archive/v1.10.0.zip

Step 3: Unzip the installer

$ unzip v1.10.0.zip

Step 4: Go to the extracted TensorFlow source directory

$ cd tensorflow-1.10.0

Step 5: Activate the virtualenv

$ source ~/virtualenv/tensorflow/bin/activate

Step 6: (virtualenv) Install additional Python modules required to build TensorFlow (enum and mock)

(tensorflow) $ pip install --upgrade enum34 mock

Step 7: (virtualenv) Configure the build file. We will configure TensorFlow without CUDA and with CPU optimization.

(tensorflow) $ ./configure

Sample configuration for reference:

Please specify the location of python. [Default is /home/MYUSER/virtualenv/tensorflow/bin/python]: /home/MYUSER/virtualenv/tensorflow/bin/python
Please input the desired Python library path to use.  Default is [/home/MYUSER/virtualenv/tensorflow/lib/python2.7/site-packages]
/home/MYUSER/virtualenv/tensorflow/lib/python2.7/site-packages
Do you wish to build TensorFlow with jemalloc as malloc support? [Y/n]: Y
Do you wish to build TensorFlow with Google Cloud Platform support? [Y/n]: Y
Do you wish to build TensorFlow with Hadoop File System support? [Y/n]: Y
Do you wish to build TensorFlow with Amazon AWS Platform support? [Y/n]: Y
Do you wish to build TensorFlow with Apache Kafka Platform support? [Y/n]: Y
Do you wish to build TensorFlow with XLA JIT support? [y/N]: N
Do you wish to build TensorFlow with GDR support? [y/N]: N
Do you wish to build TensorFlow with VERBS support? [y/N]: N
Do you wish to build TensorFlow with OpenCL SYCL support? [y/N]: N
Do you wish to build TensorFlow with CUDA support? [y/N]: N
Do you wish to download a fresh release of clang? (Experimental) [y/N]: N
Do you wish to build TensorFlow with MPI support? [y/N]: N
Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native]: -march=native -mcx16
Would you like to interactively configure ./WORKSPACE for Android builds? [y/N]: N
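
As a side note, the configure script of this TensorFlow generation can also be driven non-interactively through environment variables. The variable names below are read by configure.py; treat this as a sketch and verify them against your source tree, and note that any setting you leave unset will still be prompted for:

```shell
# Sketch of a partially non-interactive configure run; execute from the
# tensorflow-1.10.0 source directory inside the activated virtualenv.
export PYTHON_BIN_PATH="$HOME/virtualenv/tensorflow/bin/python"
export TF_NEED_CUDA=0
export CC_OPT_FLAGS="-march=native -mcx16"
./configure
```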

Step 8: (virtualenv) Build TensorFlow source

– For GCC >= 5.x

(tensorflow) $ bazel build --config=opt --cxxopt="-D_GLIBCXX_USE_CXX11_ABI=0" //tensorflow/tools/pip_package:build_pip_package

– For GCC 4.x:

(tensorflow) $ bazel build --config=opt //tensorflow/tools/pip_package:build_pip_package

Step 9: (virtualenv) Create the .whl file from the bazel build

(tensorflow) $ mkdir tensorflow-pkg
(tensorflow) $ ./bazel-bin/tensorflow/tools/pip_package/build_pip_package tensorflow-pkg

– Install the .whl file

(tensorflow) $ cd tensorflow-pkg && ls -al

After knowing the .whl file name:

(tensorflow) $ pip install tensorflow-1.10.0-cp27-cp27mu-linux_x86_64.whl

Step 10: (virtualenv) Verify the installation

– Check the installed TensorFlow version in the virtualenv

(tensorflow) $ python -c 'import tensorflow as tf; print(tf.__version__)'

– Run TensorFlow Hello World

(tensorflow) $ python -c 'import tensorflow as tf; hello = tf.constant("Hello, TensorFlow!"); sess = tf.Session(); print(sess.run(hello));'

Output of the last command:

Hello, TensorFlow!

Concluding Remark

We have successfully installed the latest TensorFlow with CPU support only. If you are interested in running TensorFlow without a CUDA GPU, you can start building from source as described in this post. I have also created a GitHub repository that hosts the WHL file created from the build. You can also check it out.

Future work will include a performance benchmark comparing TensorFlow on CPU and GPU. If this is something you want to see in a future post, please write in the comment section.

20 thoughts on “How to Build and Install The Latest TensorFlow without CUDA GPU and with Optimized CPU Performance on Ubuntu”

  1. Pingback: How to Resolve Error “Illegal instruction (core dumped)” when Running “import tensorflow” in a Python Program | Amikelive | Technology Blog

  2. Jonti

    Thanks so much for the post. My computer has some instruction extensions but not AVX so the instructions given by the main website failed with the dreaded “illegal instruction”. Using an old v1.5 version of tensorflow didn’t work and instead I got all sorts of problems with running tensorflow examples. I can’t believe how difficult it is to compile tensorflow and the problems with python version etc. However copying and pasting your post it compiled and installed without any issues. It took a few hours to compile and lots of warnings as it was doing it but no fatal errors. I’m on kubuntu 18.04 LTS.

    18.04
    2.7.15rc1
    7.3.0
    -march=native -mssse3 -mcx16 -msse4.1 -msse4.2 -mpopcnt

    Cheers
    Jonti

    1. Ninh Huong

      Hi Mikael,
      Thanks for your guide for installing tensorflow. I have already installed it successfully. I thought so because I could use tensorflow 1.10 in python. However, I can’t find the libtensorflow file for the C++ API. I don’t know what went wrong with my installation process.

      Thanks !

    2. Mikael Fernandus Simalango Post author

      Any additional information about the error? At what stage did it occur and how can one reproduce the error? Additional error logs / messages will also help.

    3. tokap

      Hello!
      – I’m running Linux Mint 20.3, my Ubuntu version is 20.3
      – my Python version is 3.8.10
      – my GCC version is 9.4.0
      – I have -march=native and -mcx16

      Can I still follow this guide to build Tensorflow?
      or can I install the WHL file from your Github repository?

    4. Mikael Fernandus Simalango Post author

      You can give it a try. There has been a lot of changes on Tensorflow side so I am not sure if it still works on your machine. If it fails, better to reach out to Tensorflow community. The whl file is for older Python and Ubuntu version. I don’t think it will work on your machine.

  3. Chris English

    Mikael,

    this was a great starting point that eventually got to tensorflow-2.0a.0 installed on an old cpu and laptop.

    I shared your wonderful grep on stackoverflow (https://stackoverflow.com/questions/53723217/is-there-a-version-of-tensorflow-not-compiled-for-avx-instructions/55165620#55165620) .

    I would suggest that it appears that native in ‘-march=native’ is a kind of placeholder for a more definitive ‘what is the family name’ of one’s cpu that can be found via

    gcc -march=native -Q --help=target | grep march

    that I found at (https://stackoverflow.com/questions/53723217/is-there-a-version-of-tensorflow-not-compiled-for-avx-instructions/55165620#55165620). replacing the =native with the actual quoted family name and extension instructions, both positive i.e. -mssse4.2, and negative -mno-avx, finally got me there, but none would have been possible absent your post. thank you,
    ubuntu 16.04 LTS

    Chris

    1. Mikael Fernandus Simalango Post author

      Hi Chris, glad to know that the post provides some pointers to the resolve. Thanks for providing the link to the original article.

  4. Mario Uganda

    Thank you, this was really helpful to me!

    My CPU is the Intel N5000, which is quite recent actually but since it is a budget component, it does not support AVX instructions.

    One note: I followed the procedure without setting up a virtual environment because I wanted it global, and I had no problems. Python 3.6 (selected the 3.6m as path) and TensorFlow 2.0.0b1

    Another note: The build may fail telling you that cannot find reference to header file “Python.h”. Simply install with pip3 all the packages that both this guide and the official build guide tell you to install.

    1. Mikael Fernandus Simalango Post author

      Thanks for the additional note. The article was written with Python 2.7 installed on the system. For more recent version of TensorFlow and Python 3.x, some adjustment may be needed as you pointed out.

  5. Dananjaya Sakalasuriya

    $ bazel build --config=opt //tensorflow/tools/pip_package:build_pip_package
    WARNING: The following rc files are no longer being read, please transfer their contents or import their path into one of the standard rc files:
    /home/mys/installer/tensorflow/tf-cpu/tensorflow-1.10.0/tools/bazel.rc
    INFO: Writing tracer profile to '/home/mys/.cache/bazel/_bazel_mys/4e6398390da2ef9d809df3d09dbd281f/command.profile.gz'
    ERROR: /home/mys/installer/tensorflow/tf-cpu/tensorflow-1.10.0/WORKSPACE:3:1: name 'http_archive' is not defined
    ERROR: Error evaluating WORKSPACE file
    ERROR: error loading package '': Encountered error while reading extension file 'closure/defs.bzl': no such package '@io_bazel_rules_closure//closure': error loading package 'external': Could not load //external package
    ERROR: error loading package '': Encountered error while reading extension file 'closure/defs.bzl': no such package '@io_bazel_rules_closure//closure': error loading package 'external': Could not load //external package
    INFO: Elapsed time: 0.055s
    INFO: 0 processes.
    FAILED: Build did NOT complete successfully (0 packages loaded)

    I got this error when I ran bazel build.

    1. Mikael Fernandus Simalango Post author

      As displayed in the error message:

      ERROR: error loading package '': Encountered error while reading extension file 'closure/defs.bzl': no such package '@io_bazel_rules_closure//closure': error loading package 'external': Could not load //external package

      this indicates a version incompatibility between bazel and TensorFlow. Check the version of TensorFlow you want to build and the minimum bazel version required.

  6. Julian Darley

    thanks for this article. unfortunately, even after many hours of effort, i could not get past this error (using ubuntu 20.4 and trying to install tensorflow 2.4.1):

    ERROR: no such package '@io_bazel_rules_go//go': Traceback (most recent call last):
    File "/home/jd/.cache/bazel/_bazel_jd/0ec3a626668b7dbd02b7cf058958e7d9/external/bazel_tools/tools/build_defs/repo/git.bzl", line 177
    _clone_or_update(ctx)
    File "/home/jd/.cache/bazel/_bazel_jd/0ec3a626668b7dbd02b7cf058958e7d9/external/bazel_tools/tools/build_defs/repo/git.bzl", line 36, in _clone_or_update
    git_repo(ctx, directory)
    File "/home/jd/.cache/bazel/_bazel_jd/0ec3a626668b7dbd02b7cf058958e7d9/external/bazel_tools/tools/build_defs/repo/git_worker.bzl", line 91, in git_repo
    _update(ctx, git_repo)
    File "/home/jd/.cache/bazel/_bazel_jd/0ec3a626668b7dbd02b7cf058958e7d9/external/bazel_tools/tools/build_defs/repo/git_worker.bzl", line 101, in _update
    init(ctx, git_repo)
    File "/home/jd/.cache/bazel/_bazel_jd/0ec3a626668b7dbd02b7cf058958e7d9/external/bazel_tools/tools/build_defs/repo/git_worker.bzl", line 115, in init
    _error(ctx.name, cl, st.stderr)
    File "/home/jd/.cache/bazel/_bazel_jd/0ec3a626668b7dbd02b7cf058958e7d9/external/bazel_tools/tools/build_defs/repo/git_worker.bzl", line 181, in _error
    fail()
    error running 'git init /home/jd/.cache/bazel/_bazel_jd/0ec3a626668b7dbd02b7cf058958e7d9/external/io_bazel_rules_go' while working with @io_bazel_rules_go:
    src/main/tools/process-wrapper-legacy.cc:58: "execvp(git, ...)": No such file or directory
    INFO: Elapsed time: 8.336s
    INFO: 0 processes.

    i have searched high and low for some explanation but nothing seems to help. i am trying to install tensorflow 2.4.1 in a machine that has no cuda, an ancient gpu and no avx. one of the many issues i face is that ubuntu keeps wanting to default to 3.8 which afaik tensorflow doesn’t like.

    unfortunately i have more old boxes like this one that i would really like to install tensorflow on (so that i can run deepspeech ultimately).

    any help gratefully received.

    julian

    1. Mikael Fernandus Simalango Post author

      I remember I had a similar issue when using a newer version of Bazel. You may need to specify the Bazel version used to compile the sources.
