How to Properly Install NVIDIA Graphics Driver on Ubuntu 16.04

In recent posts, we have gone through the installation of deep learning frameworks such as Caffe2 and their dependencies, such as CUDA and cuDNN. In this post, we will go a few steps back to the most basic prerequisite for setting up a GPU-powered deep learning system: display driver installation. We will focus specifically on NVIDIA display drivers because NVIDIA GPUs are so pervasive and robust as deep learning infrastructure.

Key Terminologies

Before proceeding to the installation, let’s discuss some key terms related to the use of NVIDIA GPUs as the computing infrastructure of a deep learning system.

GPU: Graphics Processing Unit. A processor, in the form of a chip on the graphics card, traditionally intended to perform rapid computation for image / graphics rendering and display. A graphics card can contain one or more GPUs, and one GPU can consist of hundreds or thousands of cores.

CUDA: A parallel programming model and computing platform developed by NVIDIA for performing computation on its GPUs. CUDA was designed to speed up computation by running it in parallel across the hundreds or thousands of GPU cores.

CUDA-enabled GPUs: NVIDIA GPUs that support the CUDA programming model and platform.

CUDA compute capability: A version number that identifies the general specifications and available features of a CUDA-enabled GPU, especially its parallel computing features. The full list of features available in each compute capability is given in the compute capabilities section of NVIDIA's CUDA programming guide.

Note on CUDA compute capability and deep learning:
It is important to note that if you plan to use an NVIDIA GPU for deep learning, you need to make sure that its compute capability is at least 3.0 (Kepler architecture).
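
As a minimal sketch of that check, the shell function below compares a compute capability value against the 3.0 minimum. The function name `cc_meets_minimum` is hypothetical, and the value you pass in has to be looked up for your own card; 3.7 is the compute capability of the Tesla K80 used later in this post.

```shell
#!/bin/sh
# Hypothetical helper: succeed when compute capability $1 is at least $2
# (default 3.0, the minimum quoted in this post).
cc_meets_minimum() {
    awk -v have="$1" -v need="${2:-3.0}" 'BEGIN {
        split(have, h, "."); split(need, n, ".")
        # Compare the major number first, then the minor number, as integers.
        if (h[1] + 0 != n[1] + 0) exit !(h[1] + 0 > n[1] + 0)
        exit !(h[2] + 0 >= n[2] + 0)
    }'
}

if cc_meets_minimum 3.7; then
    echo "compute capability 3.7 is sufficient"
fi
```

The major and minor parts are compared separately as integers, so a value such as 10.0 is not mistaken for something smaller than 3.0.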

Installation Steps

After ensuring that you have the right graphics card and that it is properly seated in a PCI / PCI-e slot, we’ll now proceed with the driver installation. After booting the machine, let’s open a terminal session for command-line installation.

Step 0: Perform a quick test to check the driver installation status
If you are not installing the driver on a freshly created Ubuntu VM or a newly installed Ubuntu OS, first perform a quick check of the driver status. Invoke this command from the terminal:


$ nvidia-smi

If you see this error:

bash: nvidia-smi: command not found

it means that the driver is not installed yet, so we can safely proceed with the remaining installation steps. Otherwise, check the driver version that nvidia-smi reports and upgrade it if necessary.
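
The same check can be scripted without relying on the error message. The sketch below uses the shell built-in `command -v` to test whether a tool is on the PATH; the function name `driver_tool_status` is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: report whether a command is available on PATH.
driver_tool_status() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: not found"
    fi
}

driver_tool_status nvidia-smi
```

A "not found" result corresponds to the "command not found" error shown above.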

Step 1: Check that the graphics card is connected to the PCI bus
$ lspci | grep -i nvidia

Output:

84:00.0 3D controller: NVIDIA Corporation GK210GL [Tesla K80] (rev a1)
85:00.0 3D controller: NVIDIA Corporation GK210GL [Tesla K80] (rev a1)

Explanation: The output shows that the graphics card model is Tesla K80. Two entries appear because a single K80 board carries two GK210 GPUs, and each one is listed as its own PCI device.
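
If the machine has several cards, it can help to summarize the lspci output. The following is a rough sketch, assuming the model name is the text inside square brackets as in the sample above; `nvidia_models` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: read `lspci` output on stdin and print each distinct
# NVIDIA model name (the text inside square brackets) with a device count.
nvidia_models() {
    grep -i 'nvidia' | sed -n 's/.*\[\(.*\)\].*/\1/p' | sort | uniq -c | sed 's/^ *//'
}

# Sample input copied from the output above; on a real machine use:
#   lspci | nvidia_models
nvidia_models <<'EOF'
84:00.0 3D controller: NVIDIA Corporation GK210GL [Tesla K80] (rev a1)
85:00.0 3D controller: NVIDIA Corporation GK210GL [Tesla K80] (rev a1)
EOF
```

For the sample input this prints "2 Tesla K80", one count per PCI device.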

Step 2: Check the recommended driver version on the NVIDIA website

On the NVIDIA driver download page, select the graphics card, the OS, and the CUDA toolkit version.
For a Tesla K80 on Ubuntu 16.04 with CUDA toolkit 9.1, the recommended driver version at the time of writing was 390.46.

Step 3: Check the NVIDIA driver packages known to apt


$ sudo apt-cache search nvidia | grep -E "nvidia-[0-9]{3}"

Note: If the output contains entries named nvidia-xxx (xxx = a number), which are the package names of NVIDIA drivers, those driver packages are listed in apt's package index and can be installed from the configured repositories.

Step 4: Check the NVIDIA driver that is currently installed

$ dpkg -l | grep -E "nvidia-[0-9]{3}"

Sample output:

ii  nvidia-375    384.111-0ubuntu0.16.04.1    amd64    Transitional package for nvidia-384
ii  nvidia-384    384.111-0ubuntu0.16.04.1    amd64    NVIDIA binary driver - version 384.111

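Note:
From the output above, the entries “ii nvidia-375 …” and “ii nvidia-384 …” correspond to installed NVIDIA driver packages. We need to remove them from the system before installing a newer driver. The pattern is quoted so that the shell passes it to apt-get unchanged instead of expanding it against file names in the current directory.


$ sudo apt-get purge 'nvidia-*'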
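
To confirm that the purge removed everything, the Step 4 check can be repeated. The sketch below filters `dpkg -l` output for installed ("ii") packages named nvidia-NNN; the helper name `installed_nvidia_drivers` is hypothetical, and the sample input is the pre-purge output shown above.

```shell
#!/bin/sh
# Hypothetical helper: read `dpkg -l` output on stdin and print the names of
# installed ("ii") packages called nvidia-NNN.
installed_nvidia_drivers() {
    awk '$1 == "ii" && $2 ~ /^nvidia-[0-9][0-9][0-9]$/ { print $2 }'
}

# Sample input copied from Step 4; on a real machine use:
#   dpkg -l | installed_nvidia_drivers
installed_nvidia_drivers <<'EOF'
ii  nvidia-375    384.111-0ubuntu0.16.04.1    amd64    Transitional package for nvidia-384
ii  nvidia-384    384.111-0ubuntu0.16.04.1    amd64    NVIDIA binary driver - version 384.111
EOF
```

After a successful purge, the same pipeline run against live `dpkg -l` output should print nothing.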

Step 5: Add the graphics-drivers PPA (a community repository of newer NVIDIA driver packages)

$ sudo add-apt-repository ppa:graphics-drivers/ppa

Troubleshooting:
– If this error is shown when executing the command: “sudo: add-apt-repository: command not found”
Resolve: install the software-properties-common package

$ sudo apt-get install software-properties-common

Step 6: Update apt index

$ sudo apt-get update

Step 7: Verify once again which NVIDIA driver packages are available to install

$ sudo apt-cache search nvidia | grep -E "nvidia-[0-9]{3}"

Sample output:

nvidia-304-dev - NVIDIA binary Xorg driver development files
nvidia-331 - Transitional package for nvidia-331
nvidia-331-dev - Transitional package for nvidia-340-dev
nvidia-331-updates - Transitional package for nvidia-340
nvidia-331-updates-dev - Transitional package for nvidia-340-dev
nvidia-331-updates-uvm - Transitional package for nvidia-340
nvidia-331-uvm - Transitional package for nvidia-340
nvidia-340-dev - NVIDIA binary Xorg driver development files
nvidia-340-updates - Transitional package for nvidia-340
nvidia-340-updates-dev - Transitional package for nvidia-340-dev
nvidia-340-updates-uvm - Transitional package for nvidia-340-updates
nvidia-340-uvm - Transitional package for nvidia-340
nvidia-346 - Transitional package for nvidia-346
nvidia-346-dev - Transitional package for nvidia-352-dev
nvidia-346-updates - Transitional package for nvidia-346-updates
nvidia-346-updates-dev - Transitional package for nvidia-352-updates-dev
nvidia-352 - Transitional package for nvidia-361
nvidia-352-dev - Transitional package for nvidia-361-dev
nvidia-352-updates - Transitional package for nvidia-361
nvidia-352-updates-dev - Transitional package for nvidia-361-dev
nvidia-361-updates - Transitional package for nvidia-361
nvidia-361-updates-dev - Transitional package for nvidia-361-dev
nvidia-304-updates - Transitional package for nvidia-304
nvidia-304-updates-dev - Transitional package for nvidia-304-dev
nvidia-361 - Transitional package for nvidia-367
nvidia-361-dev - Transitional package for nvidia-367-dev
nvidia-367 - Transitional package for nvidia-375
nvidia-367-dev - Transitional package for nvidia-375-dev
nvidia-375 - Transitional package for nvidia-384
nvidia-375-dev - Transitional package for nvidia-384-dev
nvidia-384-dev - NVIDIA binary Xorg driver development files
nvidia-304 - NVIDIA legacy binary driver - version 304.137
nvidia-340 - NVIDIA binary driver - version 340.106
nvidia-384 - NVIDIA binary driver - version 384.130
nvidia-387-dev - Transitional package for nvidia-390-dev
nvidia-387 - Transitional package for nvidia-390
nvidia-390-dev - NVIDIA binary Xorg driver development files
nvidia-390 - NVIDIA binary driver - version 390.48
nvidia-396-dev - NVIDIA binary Xorg driver development files
nvidia-396 - NVIDIA binary driver - version 396.18
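
Most entries in this list are transitional or -dev packages. As a rough sketch, the filter below keeps only the lines that describe an actual driver package and prints the package name with its version; `real_driver_packages` is a hypothetical helper, and it assumes the description format shown in the sample output.

```shell
#!/bin/sh
# Hypothetical helper: read `apt-cache search nvidia` output on stdin and
# print "package version" for actual driver packages, skipping transitional
# and -dev entries.
real_driver_packages() {
    awk '$1 ~ /^nvidia-[0-9][0-9][0-9]$/ && /binary driver - version/ { print $1, $NF }'
}

# A few lines copied from the sample output; on a real machine use:
#   apt-cache search nvidia | real_driver_packages
real_driver_packages <<'EOF'
nvidia-375 - Transitional package for nvidia-384
nvidia-384-dev - NVIDIA binary Xorg driver development files
nvidia-384 - NVIDIA binary driver - version 384.130
nvidia-387 - Transitional package for nvidia-390
nvidia-390 - NVIDIA binary driver - version 390.48
nvidia-396 - NVIDIA binary driver - version 396.18
EOF
```

For these sample lines, only nvidia-384, nvidia-390 and nvidia-396 remain, each followed by its version.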

Step 8: Install the latest / recommended NVIDIA driver. Here we install nvidia-390, the package from the recommended 390 branch (the listing above shows it at version 390.48).

$ sudo apt-get install nvidia-390

Step 9: Reboot the system

$ sudo reboot

Troubleshooting: if the driver does not load after the reboot, try disabling Secure Boot in the UEFI firmware settings, since Secure Boot can block the unsigned NVIDIA kernel module.
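
One way to see whether Secure Boot is active is the `mokutil --sb-state` command. The sketch below interprets its output, assuming it contains either "SecureBoot enabled" or "SecureBoot disabled"; the helper name `secure_boot_hint` and the messages it prints are made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: interpret the text printed by `mokutil --sb-state`.
# Assumes the text contains "SecureBoot enabled" or "SecureBoot disabled".
secure_boot_hint() {
    case "$1" in
        *"SecureBoot enabled"*)  echo "Secure Boot is on: the NVIDIA kernel module may be blocked" ;;
        *"SecureBoot disabled"*) echo "Secure Boot is off" ;;
        *)                       echo "Secure Boot state unknown" ;;
    esac
}

# mokutil may be missing or unsupported (for example on legacy BIOS systems),
# in which case the captured text is empty and the state is reported as unknown.
state=$(mokutil --sb-state 2>/dev/null || true)
secure_boot_hint "$state"
```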

Step 10: Check that the driver has been installed by invoking the nvidia-smi command.

$ nvidia-smi

Sample output:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 390.48                 Driver Version: 390.48                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla K80           Off  | 00000000:84:00.0 Off |                    0 |
| N/A   33C    P0    57W / 149W |      0MiB / 11441MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Tesla K80           Off  | 00000000:85:00.0 Off |                    0 |
| N/A   26C    P0    70W / 149W |      0MiB / 11441MiB |     92%      Default |
+-------------------------------+----------------------+----------------------+
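
For scripted checks, the driver version can be pulled out of the banner line. This is a minimal sketch that assumes the "Driver Version:" label appears as in the sample output above; `driver_version` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: read `nvidia-smi` output on stdin and print the value
# that follows "Driver Version:" in the banner line.
driver_version() {
    sed -n 's/.*Driver Version: *\([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# Banner line copied from the sample output above; on a real machine use:
#   nvidia-smi | driver_version
driver_version <<'EOF'
| NVIDIA-SMI 390.48                 Driver Version: 390.48                    |
EOF
```

For the sample banner this prints 390.48. nvidia-smi can also report the value directly with `nvidia-smi --query-gpu=driver_version --format=csv,noheader`, which avoids parsing the table layout.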

Concluding remark

Installing the NVIDIA graphics driver should not be too daunting. How was your experience? Did you run into a specific issue that you want to share with others?
