Last update: June 1st, 2022
The last post about CUDA installation guide was for CUDA 9.2. We went through several types of CUDA installation methods, including the multiple-version CUDA installs. While the guide is still valid for CUDA 9.2, NVIDIA keeps releasing newer versions of CUDA. As a concrete example, when this article was first written in December 2018, the latest CUDA version was CUDA 10, taking the spotlight from CUDA 9.2. If we are about to upgrade to CUDA 10, how can we achieve so? Can we simply upgrade the CUDA toolkit without upgrading the display driver?
Handling CUDA Version Upgrade
CUDA version upgrade itself can be a misleading term because since CUDA 8.0, multiple versions of CUDA can be installed on the same machine. But let’s have a simple scenario where we already have CUDA 9.1 installed and only want to upgrade to CUDA 10. NVIDIA states that each version of CUDA toolkit requires certain minimum NVIDIA display version that should be satisfied. This means that when upgrading to newer version of CUDA toolkit, we need to make sure that the currently installed display driver version is newer/bigger than the minimum compatible display driver version. In other words, standard CUDA upgrade involves two upgrade processes: CUDA (toolkit) upgrade and driver upgrade. The following picture visualizes the standard upgrade process from CUDA 9.1 to CUDA 10: the toolkit is upgraded from 9.1 to 10 and the driver is upgraded from 390 to 410.
In the previous post, we’ve proceeded with CUDA 9.1 installation on Ubuntu 16.04 LTS. As with other software that evolves, NVIDIA released CUDA 9.2 back in May. It is also safe to assume that CUDA 9.2 will not be final version. Newer version will may come soon or later and here we are left with the bogging question: “How can we upgrade safely without clobbering the currently working system?” Moreover, we may also wonder if there is a mechanism to rollback the change and live with current setup while recognizing that it’s not yet the time to upgrade.
This post will cover three scenarios of CUDA 9.2 installation: 1) fresh installation, 2) install to upgrade by removing old version, 3) install to upgrade and keep multiple versions. Continue reading
NVIDIA Collective Communications Library (NCCL) is a library developed to provide parallel computation primitives on multi-GPU and multi-node environment. The idea is to enable GPUs to collectively work to complete certain computing task. This is especially helpful when the computation is complex. With multiple GPUs working together, the task will be completed in less time, rendering a more performing system. People with background or experience in distributed system, such as Hadoop, may immediately relate this concept with similar model applied in the traditional distributed system. Hadoop, for example, supports MapReduce programming model that splits a compute job into chunks that are spread into the slave nodes and collected back by the master to produce the final output. Continue reading
In the recent posts, we have been going through the installation of deep learning framework like Caffe2 and its dependencies, such as CUDA or cuDNN. In this post, we will go few steps back to the very basic prerequisite of setting up a GPU-powered deep learning system: display driver installation. We will specifically focus on NVIDIA display driver installation due to the pervasiveness and robustness of NVIDIA GPUs as deep learning infrastructure.
Before proceeding to the installation, let’s discuss some key terminologies related with the use of NVIDIA GPUs as the computing infrastructure in a deep learning system.
GPU: Graphical / Graphics Processing Unit. A unit of computation, in a form of a small chip on the graphics card, traditionally intended to perform rapid computation for image / graphics rendering and display purpose. A graphics card can contain one or more GPUs while one GPU can be built of hundreds or thousands of cores.
CUDA: A parallel programming model and the implementation as a computing platform developed by NVIDIA to perform computation on the GPUs. CUDA was designed to speed up computation by harnessing the power of the parallel computation utilizing hundreds or thousands of the GPU cores.
CUDA-enabled GPUs: NVIDIA GPUs that support CUDA programming model and implementation
CUDA compute capability: A number that refers to the general specifications and available features especially in terms of parallel computing methods of a CUDA-enabled GPU. The full list of the available features in each compute capability can be seen here.
Note on CUDA compute capability and deep learning:
It is important to note that if you plan to use an NVIDIA GPU for deep learning purpose, you need to make sure that the compute capability of the GPU is at least 3.0 (Kepler architecture). Continue reading
When performing deep learning tasks especially on a single physical machine, there can be a moment where we need to execute tasks in parallel. Suppose that we are evaluating different models. We may need a task to calculate the precision and recall of a certain model while at the same time we are in need for training another model. We can proceed with the sequential operation, doing the tasks one by one. But life will be much easier if the tasks can be done in parallel. A possible route to achieving this is by creating several containers and perform distinct task in each container.
NVIDIA provides a utility called nvidia-docker. The utility enables creation of Docker containers that leverage CUDA GPU computing when being run. Under the hood, nvidia-docker will add a new Docker runtime called nvidia during the installation. By specifying this runtime when invoking a command in a (new) Docker container, the command execution will be accelerated with the GPUs. Continue reading