Category Archives: Computer Science

All about computer science

How to Build and Install The Latest TensorFlow without CUDA GPU and with Optimized CPU Performance on Ubuntu

In this post, we will accomplish something less common: building and installing TensorFlow with CPU-only support on an Ubuntu server, desktop, or laptop. We are targeting machines with older CPUs, for example those without Advanced Vector Extensions (AVX) support. This kind of setup can be a sensible choice when we are not using TensorFlow to train a new AI model but only to obtain predictions (inference) from an already trained model. Compared with model training, inference is less computationally intensive. Hence, instead of relying on GPU acceleration, the task can simply be handled by the CPU.
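Before building, it can help to confirm whether the target CPU actually lacks AVX. The following is a minimal Python sketch, assuming an x86 Linux machine where /proc/cpuinfo exposes a "flags" line; the function names are hypothetical and are not taken from the build instructions in the post.

```python
def parse_cpu_flags(cpuinfo_text):
    """Return the set of CPU feature flags found in /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # x86 Linux lists features on a line that starts with "flags".
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def has_avx(cpuinfo_text):
    """True when the exact 'avx' flag is present (not just 'avx2' etc.)."""
    return "avx" in parse_cpu_flags(cpuinfo_text)


def read_cpuinfo(path="/proc/cpuinfo"):
    """Read the cpuinfo file; return an empty string if it is unavailable."""
    try:
        with open(path) as handle:
            return handle.read()
    except OSError:
        return ""


if __name__ == "__main__":
    print("AVX supported:", has_avx(read_cpuinfo()))
```

If the check reports that AVX is missing, a CPU build configured without AVX instructions, as described in this post, is the kind of setup to consider.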

tl;dr The WHL file from the TensorFlow CPU build is available for download from this GitHub repository.

Since we will build TensorFlow with CPU support only, the physical server does not need additional graphics card(s) mounted in its PCI slot(s). This differs from building TensorFlow with GPU support, which requires at least one discrete (not built-in) graphics card that supports CUDA. Naturally, running TensorFlow on the CPU tends to be an economical approach to deep learning. Then how about the performance? Some benchmark results have shown that a GPU performs better than a CPU on deep learning tasks, especially model training. However, this does not mean that TensorFlow on the CPU cannot be a feasible option. With proper CPU optimization, TensorFlow can exhibit improved performance that is comparable to its GPU counterpart. When cost is a more serious issue, for example when model training and inference can only be done in the cloud, leaning towards TensorFlow on the CPU can also make more sense from a financial standpoint.

Command Cheatsheet: Checking Versions of Installed Software / Libraries / Tools for Deep Learning on Ubuntu 16.04

In the previous posts, we’ve walked through the installation and configuration of various components and libraries required for doing deep learning / artificial intelligence on an Ubuntu 16.04 box. The next step is to be productive: writing code and solving problems by applying various algorithms. At this stage, visits to StackOverflow, GitHub, or other similar sites become more frequent. And this is where a problem may arise. Not all code or snippets copied and pasted from such online references work immediately. One of the reasons is that the code was indeed written for the same software, library, or tool, but for a different version.

Interestingly, software components for machine learning offer different ways to obtain their versions. These variations can sometimes result in additional time spent querying “ubuntu get xyz version” on a search engine. This is okay for one component, but when the system becomes complex enough (for example, machine learning meets big data for ETL), this can turn into a productivity killer because of the time spent searching.

Why not build a list for that?

This post summarizes the shell commands used for obtaining the versions of machine learning-related software and libraries. Commands are grouped into categories that reflect the logical / functional unit the software component belongs to.
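For Python packages specifically, the same idea can also be sketched from inside Python rather than the shell. This is a minimal illustration, assuming Python 3.8 or newer (for importlib.metadata); the package names passed in are only examples and the helper name is hypothetical. It does not cover system tools or drivers, for which shell commands are still needed.

```python
import platform
from importlib import metadata


def installed_versions(package_names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in package_names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not installed in this environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    print("python", platform.python_version())
    # Example package names only; adjust to the libraries actually in use.
    for name, version in installed_versions(["tensorflow", "numpy", "scipy"]).items():
        print(name, version or "not installed")
```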

What Object Categories / Labels Are In COCO Dataset?

One important element of deep learning, and machine learning at large, is the dataset. A good dataset contributes to a model with good precision and recall. In the realm of object detection in images or motion pictures, there are some household names commonly used and referenced by researchers and practitioners. The list includes Pascal, ImageNet, SUN, and COCO. In this post, we will briefly discuss the COCO dataset, especially its distinctive features and labeled objects.

tl;dr The COCO dataset labels from the original paper and the released versions in 2014 and 2017 can be viewed and downloaded from this repository.
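To get a feel for how the labels are stored, here is a minimal Python sketch that groups category names by supercategory. It assumes the COCO annotation JSON layout, in which a top-level "categories" list holds entries with "id", "name", and "supercategory"; the three-entry sample below is a tiny illustrative excerpt, not the full label list, and the function name is hypothetical.

```python
import json

# Tiny illustrative excerpt in the COCO annotation layout (not the full list).
SAMPLE_ANNOTATIONS = """
{
  "images": [],
  "annotations": [],
  "categories": [
    {"id": 1, "name": "person", "supercategory": "person"},
    {"id": 2, "name": "bicycle", "supercategory": "vehicle"},
    {"id": 3, "name": "car", "supercategory": "vehicle"}
  ]
}
"""


def labels_by_supercategory(annotation_json):
    """Group category names under their supercategory, ordered by category id."""
    data = json.loads(annotation_json)
    grouped = {}
    for category in sorted(data["categories"], key=lambda c: c["id"]):
        grouped.setdefault(category["supercategory"], []).append(category["name"])
    return grouped


if __name__ == "__main__":
    for supercategory, names in labels_by_supercategory(SAMPLE_ANNOTATIONS).items():
        print(supercategory, "->", ", ".join(names))
```

With a real annotation file, the file contents would be passed in place of the sample string.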

Xpath Basics: Introduction to XPath with an Example Java Project

XPath is a W3C recommendation used to search and find parts of an XML document through a path expression. The elements or attributes that match the path expression will be returned for further processing by the invoking command, module, actor, or component.
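As a quick illustration of a path expression returning matching elements, here is a minimal sketch in Python rather than Java. It relies on the standard library's xml.etree.ElementTree, which supports only a limited subset of XPath; the XML sample and function names are made up for this example and are not the sample project from the slides.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML document used only for this illustration.
SAMPLE_XML = """<library>
  <book id="b1" lang="en"><title>XML Basics</title></book>
  <book id="b2" lang="id"><title>XPath in Practice</title></book>
</library>"""


def book_titles(xml_text):
    """Return the text of every title that is a child of a book under the root."""
    root = ET.fromstring(xml_text)
    return [title.text for title in root.findall("./book/title")]


def book_ids_by_language(xml_text, lang):
    """Return the id attribute of each book whose lang attribute equals lang."""
    root = ET.fromstring(xml_text)
    # The [@attr='value'] predicate selects elements by attribute value.
    return [book.get("id") for book in root.findall(f".//book[@lang='{lang}']")]


if __name__ == "__main__":
    print(book_titles(SAMPLE_XML))
    print(book_ids_by_language(SAMPLE_XML, "en"))
```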

In this post, I will explain the basic concepts of XPath via presentation slides. The presentation starts by revisiting some key XML concepts. It then elaborates on the basic concepts of XPath, concisely describing the key features that are worth knowing and practically useful, especially when searching inside XML files.

The final part of the presentation consists of a sample project, accompanied by some screenshots, for readers to experiment with. In an upcoming post, I will show how the sample project can be converted into a Maven project for more convenient use and distribution.

You can download the slides from the following URL:

Xpath Basics (965 downloads)

Why Events Are a Bad Idea

With the ever-increasing need for more responsive and better-performing applications, modern applications mostly adopt a concurrent computing model. In this model, a task is split into multiple parts that are passed to a number of processing workers, which each work on a part and then coordinate to build the whole solution to the problem. Another case is when a stack of tasks or problems is forwarded to a number of processing workers running an identical processing routine so that the stack can be emptied faster. An application that applies the concurrent computing model is called a concurrent application.

Two popular approaches have been widely used to address computation in a concurrent application: the thread-based approach and the event-based approach. Nowadays, the event-based approach is more likely to be found in the implementation of a concurrent application. Nonetheless, this does not mean that the thread-based model never gains traction.
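To make the two approaches concrete, here is a minimal Python sketch that processes the same batch of tasks first with a pool of threads and then with an event loop. It is only an illustration under simple assumptions (a short simulated wait stands in for I/O, and all function names are hypothetical); it is not the implementation or benchmark from the paper.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor


def handle_blocking(n):
    """Thread-based worker: blocks its own thread while it waits."""
    time.sleep(0.01)  # simulated blocking I/O
    return n * n


def run_with_threads(items, workers=4):
    """Hand each task to a pool of worker threads; results keep input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(handle_blocking, items))


async def handle_async(n):
    """Event-based worker: yields control to the event loop while it waits."""
    await asyncio.sleep(0.01)  # simulated non-blocking I/O
    return n * n


async def _gather_all(items):
    # gather returns results in the order the awaitables were passed in.
    return await asyncio.gather(*(handle_async(n) for n in items))


def run_with_events(items):
    """Run all tasks on a single thread driven by an event loop."""
    return asyncio.run(_gather_all(items))


if __name__ == "__main__":
    print(run_with_threads([1, 2, 3]))
    print(run_with_events([1, 2, 3]))
```

Both functions produce the same results; the difference lies in how waiting is handled, which is the design trade-off the two approaches represent.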

This article summarizes the paper with the same title, which offers thought-provoking arguments for the merits of the thread-based approach over the event-based approach in developing highly concurrent applications. What’s interesting about the paper is that it not only provides conceptual and theoretical arguments but also shows some empirical results in defense of its provocative statements.

You can download the presentation from the link below:

Why Events Are a Bad Idea (1854 downloads)