Guide: Installing TensorFlow 1.8 with GPU Support against CUDA 9.1 and cuDNN 7.1 on Ubuntu 16.04

One interesting aspect of the deep learning ecosystem is the plentiful choice of frameworks. Of course, there is another side to that equation: more options mean more confusion, especially when choosing the most appropriate framework for the entire gamut of problems. At the end of the day, instead of settling on one, we may need to work with multiple deep learning frameworks, picking one depending on the nature of the problem to solve.

TensorFlow is one of the most popular deep learning frameworks (the de facto most popular in terms of GitHub stars). It comes with excellent documentation, including documentation for installation. The official documentation page provides elaborate installation guides for multiple OS platforms. So why this post?

The latest version of TensorFlow with GPU support (version 1.8 at the time this post is published) is built against CUDA 9.0. However, NVIDIA has released CUDA 9.1, and newer releases are likely in the near future. Because TensorFlow lags behind the CUDA GA version, the publicly released TensorFlow bundle does not immediately work on a system that has only the latest CUDA version installed. The remedy is to install from source, which can be non-trivial, especially for those unfamiliar with the source build mechanism.

The final system setup after completing the installation steps explained in this post will be as follows.

OS: Ubuntu 16.04
NVIDIA driver version: 390.48
CUDA version: 9.1
cuDNN version: 7.1.3
NCCL version: 2.1.15
Python version: 2.7.12
Python install method: virtualenv
TensorFlow version: 1.8.0

Note that these components will receive updates in the future, which implies version upgrades. This post is expected to remain valid even after such upgrades. Should it become invalid, the content will be updated or a new post will be written; comments and feedback on the existing content will help make that happen.

Python 3 users may wonder whether the steps can also be replicated for Python 3. The answer is yes, though not verbatim. It is recommended to proceed with the installation in a separate virtualenv dedicated to Python 3 and to run the Python 3 equivalents of the commands.
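As a sketch of what that looks like, a dedicated Python 3 environment can be created with the standard library's venv module instead of the virtualenv tool used below (the environment path here is only an illustration):

```shell
# Create a dedicated Python 3 environment using the stdlib venv module
# (~/virtualenv/tensorflow-py3 is an illustrative path).
python3 -m venv "$HOME/virtualenv/tensorflow-py3"

# Activate it, confirm the interpreter, then deactivate.
. "$HOME/virtualenv/tensorflow-py3/bin/activate"
python -V
deactivate
```

Inside that environment, the pip steps below would use the Python 3 counterparts of the packages (e.g. python3-numpy, python3-dev) and a Python 3 build of the TensorFlow wheel.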

Pre-Installation

Prior to installation, we will perform some system checks for the prerequisites.

Pre 0: Update apt and upgrade installed packages

$ sudo apt-get -y update && sudo apt-get -y upgrade

Pre 1: Check Python version

$ python -V

Output (expected): Python 2.7.12 (or another 2.7.x version)

Pre 2: Check GCC version

$ gcc --version

Output (expected): 5.4.0 (or another 5.x version)

Pre 3: Check NVIDIA driver install status

$ nvidia-smi

Output (expected): GPU information


bash: nvidia-smi: command not found

Meaning: Graphics driver has not been installed
Resolve: Install NVIDIA graphics driver. Refer to this article for the installation steps.

Pre 4: Check CUDA install status

$ nvcc --version

Output (expected): “… Cuda compilation tools, release 9.1 …”


bash: nvcc: command not found

Meaning: CUDA toolkit has not been installed
Resolve: Install CUDA toolkit. Refer to this article for the installation steps.

Pre 5: Check cuDNN install status

$ locate libcudnn

Output (expected): cuDNN library files under /usr/lib/x86_64-linux-gnu/


Empty result returned

Meaning: cuDNN has not been installed.
Resolve: Install cuDNN. Refer to this article for the installation steps

Pre 6: Check NCCL install status

$ locate nccl

Output (expected): NCCL library files under /usr/local/nccl-2.1/lib/


Empty result returned

Meaning: NCCL has not been installed.
Resolve: Install NCCL. Refer to this article for the installation steps

Please note that even though NCCL is optional for a TensorFlow installation, we treat it as a prerequisite in order to harness multi-GPU parallelism in future use.

Pre 7: Check CUDA profiling tools install status

$ dpkg --get-selections | grep cuda-command-line-tools

Output (expected): “cuda-command-line-tools-9-1 install”


Empty result returned or status is not “install”

Meaning: CUDA profiling tools library has not been installed.
Resolve: Install the library

Steps to install CUDA profiling tools
7.1 Install CUDA profiling tools using apt

$ sudo apt-get install cuda-command-line-tools-9-1

7.2 Add the profiling tools library to LD_LIBRARY_PATH

$ vi ~/.profile
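The line to append to ~/.profile is an export of the CUPTI library directory. The path below assumes the default CUDA 9.1 install prefix from the earlier steps; adjust it if your toolkit lives elsewhere:

```shell
# Add the CUDA profiling tools (CUPTI) libraries to the dynamic linker path.
# /usr/local/cuda-9.1/extras/CUPTI/lib64 assumes the default CUDA 9.1 prefix.
export LD_LIBRARY_PATH="/usr/local/cuda-9.1/extras/CUPTI/lib64${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
```

Reload the profile with `source ~/.profile` (or log out and back in) for the change to take effect.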

Installation Steps

After confirming that all checks in the pre-installation phase pass, we proceed with the TensorFlow installation. We will create the virtualenv at ~/virtualenv/tensorflow.

Step 1: Set locale to UTF-8

$ export LC_ALL="en_US.UTF-8"
$ export LC_CTYPE="en_US.UTF-8"
$ sudo dpkg-reconfigure locales

Step 2: Install pip and virtualenv for Python 2

$ sudo apt-get install python-pip python-dev python-virtualenv

Step 3: Create virtualenv environment for Python 2 (Virtualenv location: ~/virtualenv/tensorflow)

$ mkdir -p ~/virtualenv/tensorflow
$ virtualenv --system-site-packages ~/virtualenv/tensorflow

Step 4: Activate the virtualenv environment

$ source ~/virtualenv/tensorflow/bin/activate

Verify the prompt is changed to:

(tensorflow) $

Step 5: (virtualenv) Ensure pip >= 8.1 is installed

(tensorflow) $ easy_install -U pip

Step 6: (virtualenv) Deactivate the virtualenv

(tensorflow) $ deactivate

Step 7: Install bazel to build TensorFlow from source

7.1 Install Java JDK 8 (Open JDK)

$ sudo apt-get install openjdk-8-jdk

7.2 Add bazel private repository into source repository list

$ echo "deb [arch=amd64] http://storage.googleapis.com/bazel-apt stable jdk1.8" | sudo tee /etc/apt/sources.list.d/bazel.list
$ curl https://bazel.build/bazel-release.pub.gpg | sudo apt-key add -

7.3 Install the latest version of bazel

$ sudo apt-get update && sudo apt-get install bazel

Step 8: Install TensorFlow Python dependencies

$ sudo apt-get install python-numpy python-dev python-pip python-wheel

Step 9: Build TensorFlow from source

9.1 Download the latest stable release of TensorFlow (release 1.8.0). We set the target directory to ~/installers/tensorflow

$ mkdir -p ~/installers/tensorflow && cd ~/installers/tensorflow
$ wget https://github.com/tensorflow/tensorflow/archive/v1.8.0.zip

9.2 Unzip the installer

$ unzip v1.8.0.zip

9.3 Go to the extracted TensorFlow source directory

$ cd tensorflow-1.8.0

9.4 Configure the build file

$ ./configure

Sample configuration (note that, aside from the CUDA-related answers, your configuration may vary):

Please input the desired Python library path to use.  Default is [/usr/local/lib/python2.7/dist-packages] 
Do you wish to build TensorFlow with jemalloc as malloc support? [Y/n]: Y
Do you wish to build TensorFlow with Google Cloud Platform support? [Y/n]: Y
Do you wish to build TensorFlow with Hadoop File System support? [Y/n]: Y
Do you wish to build TensorFlow with Amazon S3 File System support? [Y/n]: Y
Do you wish to build TensorFlow with Apache Kafka Platform support? [Y/n]: Y
Do you wish to build TensorFlow with XLA JIT support? [y/N]: N
Do you wish to build TensorFlow with GDR support? [y/N]: N
Do you wish to build TensorFlow with VERBS support? [y/N]: N
Do you wish to build TensorFlow with OpenCL SYCL support? [y/N]: N
Do you wish to build TensorFlow with CUDA support? [y/N]: Y
Please specify the CUDA SDK version you want to use, e.g. 7.0. [Leave empty to default to CUDA 9.0]: 9.1
Please specify the location where CUDA 9.1 toolkit is installed. Refer to README.md for more details. [Default is /usr/local/cuda]: /usr/local/cuda-9.1
Please specify the cuDNN version you want to use. [Leave empty to default to cuDNN 7.0]: 7.1.3
Please specify the location where cuDNN 7 library is installed. Refer to README.md for more details. [Default is /usr/local/cuda-9.1]: /usr/lib/x86_64-linux-gnu
Do you wish to build TensorFlow with TensorRT support? [y/N]: N
Please specify the NCCL version you want to use. [Leave empty to default to NCCL 1.3]: 2.1.15
Please specify the location where NCCL 2 library is installed. Refer to README.md for more details. [Default is /usr/local/cuda-9.1]: /usr/local/nccl-2.1
Please specify a list of comma-separated Cuda compute capabilities you want to build with.
You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus
Please note that each additional compute capability significantly increases your build time and binary size. [Default is: 3.7,3.7]: 3.7
Do you want to use clang as CUDA compiler? [y/N]: N
Please specify which gcc should be used by nvcc as the host compiler. [Default is /usr/bin/gcc]: /usr/bin/gcc
Do you wish to build TensorFlow with MPI support? [y/N]: N
Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native]: -march=native
Would you like to interactively configure ./WORKSPACE for Android builds? [y/N]: N

9.5 Build TensorFlow with GPU support
For GCC version 5 and newer, build with the -D_GLIBCXX_USE_CXX11_ABI=0 flag so the binary stays compatible with the older C++ ABI:

$ bazel build --config=opt --config=cuda --cxxopt="-D_GLIBCXX_USE_CXX11_ABI=0" //tensorflow/tools/pip_package:build_pip_package

For GCC version 4 and older:

$ bazel build --config=opt --config=cuda  //tensorflow/tools/pip_package:build_pip_package


ImportError: No module named enum

Meaning: The enum module is not available until Python 3.4.
Resolve: Manually install enum module

$ pip install --upgrade enum34
ImportError: No module named mock

Meaning: The mock module is not installed
Resolve: Install mock module

$ pip install mock
no such package '@nasm': All mirrors are down: [java.lang.RuntimeException: Could not generate DH keypair, End user tried to act as a CA]

Meaning: One or all of the nasm package mirrors are down. This is a known issue.
Resolve: Add a new mirror for nasm

$ vi tensorflow/workspace.bzl

Locate the "nasm" entry and add a working mirror (the Bazel mirror below is one known-good option) to the urls list:

  tf_http_archive(
      name = "nasm",
      urls = [
          "https://mirror.bazel.build/www.nasm.us/pub/nasm/releasebuilds/2.12.02/nasm-2.12.02.tar.bz2",
          "http://www.nasm.us/pub/nasm/releasebuilds/2.12.02/nasm-2.12.02.tar.bz2",
      ],
      sha256 = "00b0891c678c065446ca59bcee64719d0096d54d6886e6e472aeee2e170ae324",
      strip_prefix = "nasm-2.12.02",
      build_file = clean_dep("//third_party:nasm.BUILD"),
  )

9.6 Create the .whl file from the bazel build (We create a directory named tensorflow-pkg)

$ mkdir tensorflow-pkg
$ ./bazel-bin/tensorflow/tools/pip_package/build_pip_package tensorflow-pkg

9.7 Activate the virtualenv

$ source ~/virtualenv/tensorflow/bin/activate

9.8 (virtualenv) Install the .whl file
– Obtain the .whl file name

(tensorflow) $ cd tensorflow-pkg && ls -al

– Install the .whl file via pip (example)

(tensorflow) $ pip install tensorflow-1.8.0-cp27-cp27mu-linux_x86_64.whl

9.9 (virtualenv) Verify the installation

(tensorflow) $ python
>>> import tensorflow as tf
>>> hello = tf.string_join(['Hello', 'TensorFlow!'], ' ')
>>> sess = tf.Session()
>>> print(sess.run(hello))

Output after the last line:

Hello TensorFlow!

The installation is now complete. We can now use TensorFlow in the system. Don’t forget that since TensorFlow is running in a virtualenv, we need to make sure that the virtualenv is activated when running a TensorFlow program.
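As a further sanity check that the GPU build is actually in use, you can (with the virtualenv activated) ask TensorFlow to enumerate the devices it can see; an entry with device_type: "GPU" confirms that the CUDA build works:

```shell
# List devices visible to TensorFlow; a GPU entry confirms GPU support.
python -c "from tensorflow.python.client import device_lib; print(device_lib.list_local_devices())"
```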

Closing Remark

Making TensorFlow work with CUDA 9.1 should not be too daunting. If you have problems with your TensorFlow and CUDA 9.1 installation, or tips and tricks for the installation, feel free to write in the comment section.
