This article gives a brief introduction to Content Delivery Networks (CDNs). It explains what constitutes a CDN and how a CDN can speed up content retrieval.
A Content Delivery Network (CDN), sometimes also called a Content Distribution Network, can be defined as a dedicated collection of servers located on the internet that attempts to offload work from origin servers by delivering content on their behalf. To make this definition easier to understand, a picture representing an example CDN is provided.
Now let’s assume there is a company X whose headquarters is located in North America. This company operates an online business and serves clients worldwide. An online company with a big user base must also be capable of serving massive traffic. One indicator of that capability is the load time of the web application. Users around the globe should experience a fast, or at least acceptable, load time. Users don’t want to wait, which is natural and makes perfect sense. If the web application offers good service and quality content, and users can retrieve that content and use the other services quickly, users are likely to stay loyal to the web application or at least keep coming back as frequent visitors. Otherwise, the company may face a user attrition problem.
Here comes the question: what strategies can be implemented to improve the content load speed of the web application? There are various strategies, and one of them is using a CDN.
Centralized (No-CDN) Content Load Strategy
Let’s motivate the use of a CDN by looking at a centralized content load strategy. In the picture, you can see that the headquarters is equipped with several types of infrastructure: servers, storage, and network. If user A, who is also located in North America, opens and uses the web application, he will probably find content retrieval fast enough, because he is located near the server that hosts the application. However, user B, who resides thousands of miles from the company headquarters and accesses the content directly from the headquarters server, may feel that the application loads slowly. This is easily explained as follows. Imagine the world as a Cartesian plane with the headquarters at the origin, (0,0). The distance from user A to (0,0) is less than the distance from user B to (0,0). If we assume that packets travel through the network at the same speed, the classical formula velocity = distance/time, rearranged as time = distance/velocity, shows that a packet takes more time to cover a longer distance. In our case, user B experiences a slower page load because it takes more time to receive all the bytes from the server.
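The formula above can be turned into a small calculation. The following is only a minimal sketch: the signal speed (about 200,000 km/s, roughly two thirds of the speed of light, a common approximation for optical fiber) and the two distances are assumptions chosen for illustration, not figures from this article.

```python
# Assumed signal speed in optical fiber (about two thirds of the speed of light).
SIGNAL_SPEED_KM_PER_S = 200_000.0


def propagation_delay_ms(distance_km: float,
                         speed_km_per_s: float = SIGNAL_SPEED_KM_PER_S) -> float:
    """One-way travel time in milliseconds, from time = distance / velocity."""
    return distance_km / speed_km_per_s * 1000.0


# Hypothetical distances from the headquarters server.
user_a_km = 500.0      # user A, on the same continent
user_b_km = 15_000.0   # user B, on another continent

print(f"user A: {propagation_delay_ms(user_a_km):.1f} ms")  # user A: 2.5 ms
print(f"user B: {propagation_delay_ms(user_b_km):.1f} ms")  # user B: 75.0 ms
```

A page usually needs many round trips between browser and server, so a per-packet difference of this size adds up quickly.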
However, the earlier assumption of equal network speed does not always hold. A backbone network usually has far more capacity than the access network or LAN through which a user connects. Hence a user C who is physically farther from the server may be able to access the application faster than user A, who is physically closer, because user C reaches the application through a high-speed network. Given this situation, it is wise to keep in mind that the term “distance” used above refers to “network distance”. Sometimes “network distance” is equivalent to “physical distance”, but sometimes it is not.
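To illustrate the difference between physical and network distance, here is a rough sketch that estimates download time from a round-trip time and a link bandwidth. The model (one round trip plus the time to push the bytes through the link) is a deliberate simplification, and all numbers are hypothetical.

```python
def transfer_time_s(size_mb: float, rtt_ms: float, bandwidth_mbps: float) -> float:
    """Rough estimate: one round trip plus the time to send size_mb megabytes."""
    return rtt_ms / 1000.0 + (size_mb * 8.0) / bandwidth_mbps


page_size_mb = 5.0  # hypothetical page weight

# User A: physically close (low round-trip time) but on a slow link.
time_a = transfer_time_s(page_size_mb, rtt_ms=10.0, bandwidth_mbps=2.0)

# User C: physically far (high round-trip time) but on a fast link.
time_c = transfer_time_s(page_size_mb, rtt_ms=150.0, bandwidth_mbps=100.0)

print(f"user A: {time_a:.2f} s")  # user A: 20.01 s
print(f"user C: {time_c:.2f} s")  # user C: 0.55 s
```

Under these assumed numbers, user C is much "closer" to the server in network terms even though he is farther away physically.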
Distributed Content Load Strategy with CDN
Since we have seen that network distance affects content load speed, a logical way to improve the speed is to reduce the network distance. In the picture, you can see a CDN with several geographically distributed replicas. A replica is also known as a gateway. It implements caching and content replication and serves as the contact point of the CDN.
Let’s discuss a content request from user B, who resides in Australia. Under the centralized strategy, user B had to retrieve the content from the headquarters server. Now that we use a CDN, part or even all of the content can be retrieved from the closest replica. Referring to the picture, the closest replica is the one located in Australia. Because the network distance is shorter, the content is retrieved faster.
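The idea of sending user B to the closest replica can be sketched as picking the server with the lowest measured latency. The server names and latency values below are hypothetical, and this is not how any particular CDN provider routes requests. It only illustrates that "closest" is decided by network distance.

```python
from typing import Dict


def closest_replica(latency_ms: Dict[str, float]) -> str:
    """Return the name of the server with the lowest measured latency."""
    return min(latency_ms, key=latency_ms.get)


# Hypothetical round-trip times measured from user B in Australia.
latencies_for_user_b = {
    "origin-north-america": 180.0,
    "replica-europe": 250.0,
    "replica-australia": 15.0,
}

print(closest_replica(latencies_for_user_b))  # replica-australia
```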
You may be curious whether we have to cache and replicate everything from the central server. I cannot give a detailed answer here, since a complete answer would amount to a new CDN business. A CDN implements several algorithms for its caching and content replication tasks. A replica may replicate only popular content while leaving “inactive” content to be loaded from the central server. A replica may also replicate only part of the content while leaving the rest to be served by other replicas. Each CDN provider implements its own optimization techniques.
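As one possible illustration of a replica that keeps only popular content, the sketch below uses a least-recently-used (LRU) cache and falls back to the central server on a miss. LRU is an assumption made here for the example, since the article does not say which algorithms real CDNs use. The Replica class, the origin function and the URLs are all hypothetical.

```python
from collections import OrderedDict
from typing import Callable


class Replica:
    """A toy replica that caches at most `capacity` items using LRU eviction."""

    def __init__(self, capacity: int, fetch_from_origin: Callable[[str], bytes]) -> None:
        self.capacity = capacity
        self.fetch_from_origin = fetch_from_origin
        self.cache: "OrderedDict[str, bytes]" = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, url: str) -> bytes:
        if url in self.cache:
            # Cache hit: mark the item as most recently used.
            self.cache.move_to_end(url)
            self.hits += 1
            return self.cache[url]
        # Cache miss: load the content from the central server and store it.
        self.misses += 1
        content = self.fetch_from_origin(url)
        self.cache[url] = content
        if len(self.cache) > self.capacity:
            # Evict the least recently used item.
            self.cache.popitem(last=False)
        return content


def origin(url: str) -> bytes:
    """Stand-in for the central server (hypothetical)."""
    return f"content of {url}".encode()


replica = Replica(capacity=2, fetch_from_origin=origin)
for url in ["/home", "/logo.png", "/home", "/archive/2009", "/home"]:
    replica.get(url)

print(replica.hits, replica.misses)  # 2 3
```

In this run the frequently requested "/home" stays in the replica, while "/logo.png" is evicted to make room for "/archive/2009" and would be fetched from the central server again on its next request.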
Examples of CDN
Big companies like Google and Amazon have many geographically distributed servers. Those servers naturally form a CDN. Does this mean that we have to be a big enterprise to use a CDN? The answer is no. Some companies offer CDN services to customers. A web application Y that uses such a service relies on the CDN provider’s infrastructure to serve content requests from its user base. The CDN provider charges the owner of the web application according to an agreed contract. With this approach, the web application’s users experience faster content loading, while its owner does not have to spend a huge amount of money on infrastructure.
If your web application caters to millions of users and you care about its speed, you may want to consider using a CDN service.
Further reading for avid readers of technical papers:
 B. Krishnamurthy, C. Wills, and Y. Zhang. On the Use and Performance of Content Distribution Networks. In Proc. of 1st ACM SIGCOMM Workshop on Internet Measurement, 2001.
 G. Pierre and M. van Steen. Globule: A Collaborative Content Delivery Network. IEEE Communications Magazine 44(8), August 2006.