In this post, we are about to accomplish something less common: building and installing TensorFlow with CPU support-only on Ubuntu server / desktop / laptop. We are targeting machines with older CPU, as for example those without Advanced Vector Extensions (AVX) support. This kind of setup can be a choice when we are not using TensorFlow to build a new AI model but instead only for obtaining the prediction (inference) served by a trained AI model. Compared with model training, the model inference is less computational intensive. Hence, instead of performing the computation using GPU acceleration, the task can be simply handled by CPU.
tl;dr The WHL file from TensorFlow CPU build is available for download from this Github repository.
Since we will build TensorFlow with CPU support only, the physical server will not need to be equipped with additional graphics card(s) to be mounted on the PCI slot(s). This is different with the case when we build TensorFlow with GPU support. For such case, we need to have at least one external (non built-in) graphics card that supports CUDA. Naturally, running TensorFlow with CPU pertains to be an economical approach to deep learning. Then how about the performance? Some benchmark results have shown that GPU performs better than CPU when performing deep learning tasks, especially for model training. However, this does not mean that TensorFlow CPU cannot be a feasible option. With proper CPU optimization, TensorFlow can exhibit improved performance that is comparable to its GPU counterpart. When cost is a more serious issue, let’s say we can only do the model training and inference in the cloud, leaning towards TensorFlow CPU can be a decision that also makes more sense from financial standpoint. Continue reading
In the previous posts, we’ve walked through the installations and configurations for various components and libraries required for doing deep learning / artificial intelligence on a Ubuntu 16.04 box. The next step is to be productive, crunching codes and solving problems by applying various algorithms. At this stage, visits to StackOverflow, Github or other similar sites become more frequent. And here is when the problem may arise. Not all codes or snippets copied and pasted from such online references can immediately work. One of the reasons is that the code was indeed written for same software, library, or tool but at different version.
Interestingly, software components for machine learning present different way to obtain the versions. These variations can sometimes result in additional time spent to query “ubuntu get xyz version” on the search engine. This is okay for one component, but when the system becomes complex enough (for example machine learning meets big data for ETL), this can turn into a productivity killer due to unjustifiable time taken for navigating the search engine.
Why not build a list for that?
This post summarizes the shell commands used for obtaining the versions of machine learning-related software and libraries. Commands are embodied in categories that reflect the logical / functional unit the software component belongs to. Continue reading
When developing a deep-learning system, especially during the modeling stage, a lot of trials and errors can be involved in evolving the codebase. The easy remedy to reduce errors will be by using a robust IDE that provides productivity-boosting features such as code completion, method definition, codestyle suggestion, advanced debugging, user-friendly UI, and so forth.
In this article, we will go into more details about Jupyter Notebook installation and configuration on Ubuntu 16.04. However, it’s important to note that the configuration depends on some pre-requisites. This article is the continuation of the previous article about TensorFlow installation. Please make sure you have read the article to understand the pre-requisites, otherwise some steps explained in this article may not work. Continue reading
What is interesting in the deep learning ecosystem is the plentiful choices of deep learning frameworks. On the other side, of course there is another equation; more options equate to more confusion, especially in choosing the most appropriate framework for the entire gamut of the problems. At the end of the day, instead of using one, we may need to stick with multiple deep learning frameworks with each usage depending on the nature of the problem to solve.
TensorFlow is one of the popular (de facto most popular in terms of Github stars) deep learning frameworks. TensorFlow comes with excellent documentation. This also includes the documentation for installation. If you go to the official documentation page for installation, you will be provided with elaborate installation guide for multiple OS platforms. Then why this post?
The latest version of TensorFlow with GPU support (version 1.8 at the time this post is published) is built against CUDA 9.0. However, NVIDIA has released CUDA 9.1 and there is possibility of newer version release in the near future. Given that TensorFlow is lagging behind the CUDA GA version, the publicly released TensorFlow bundle cannot immediately work on the system having only the latest CUDA version installed. A remedy for this is by installing from source, which can be non-trivial especially for those who are not so familiar with the source build mechanism.
The final system setup after completing the installation steps explained in the posts will be as follows.
|NVIDIA driver version||390.48
|Python install method||virtualenv
Note that the components will be updated in the future. This implies version upgrade for the components. It is expected that this post will still be valid even after version upgrade. Under the circumstances where this post becomes invalid, the content will be updated or another post will be written. Yet, this would be realized with sufficient comments or feedback regarding existing content. Continue reading
NVIDIA Collective Communications Library (NCCL) is a library developed to provide parallel computation primitives on multi-GPU and multi-node environment. The idea is to enable GPUs to collectively work to complete certain computing task. This is especially helpful when the computation is complex. With multiple GPUs working together, the task will be completed in less time, rendering a more performing system. People with background or experience in distributed system, such as Hadoop, may immediately relate this concept with similar model applied in the traditional distributed system. Hadoop, for example, supports MapReduce programming model that splits a compute job into chunks that are spread into the slave nodes and collected back by the master to produce the final output. Continue reading