What Is A CDN & Where Does It Shine?
According to Wikipedia, a Content Delivery Network (CDN) is a “geographically distributed network of servers and their data centers.” Although this is a simple definition of a complex system, suffice it to say that a CDN is a network located inside another network. We will elaborate on this concept later.
Typically, CDNs are associated with acceleration, availability, scalability, and security of web applications.
Without a CDN, an end user who enters a website address in his browser establishes a connection similar to the one shown in the following figure.
Behind the scenes, the website name is resolved to an IP address using DNS. The user queries a local DNS resolver, or LDNS (such as the DNS server provided by the ISP or a public DNS resolver), to perform the resolution. If the LDNS cannot resolve the name itself, it recursively asks upstream DNS servers. Ultimately, the request may reach the authoritative DNS server where the zone is hosted. That server resolves the name, and the resulting address is returned to the user.
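The lookup chain described above can be sketched as a toy model. This is a simplified illustration, not how a real resolver is implemented: the `Resolver` class, the hostname, and the address (taken from a documentation-only IP range) are all assumptions made for the example.

```python
class Resolver:
    """Toy DNS server: answers from its own records or asks its upstream."""

    def __init__(self, name, records=None, upstream=None):
        self.name = name
        self.records = dict(records or {})  # hostname -> IP address
        self.upstream = upstream            # next resolver to ask, if any

    def resolve(self, hostname):
        # Answer directly if this server already knows the name.
        if hostname in self.records:
            return self.records[hostname]
        # Otherwise pass the question upstream (None means "not found").
        if self.upstream is None:
            return None
        answer = self.upstream.resolve(hostname)
        if answer is not None:
            self.records[hostname] = answer  # remember it for next time
        return answer


# Hypothetical zone hosted on the authoritative server.
authoritative = Resolver("authoritative", {"www.example.com": "203.0.113.10"})
ldns = Resolver("ldns", upstream=authoritative)

first_answer = ldns.resolve("www.example.com")   # goes upstream
second_answer = ldns.resolve("www.example.com")  # answered from the LDNS cache
```

A real LDNS also honors record lifetimes (TTLs) and walks the root and top-level-domain servers on the way; both are left out here to keep the chain visible.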
Once the name is resolved, the user’s browser connects directly to the origin (a web or application server) and downloads the content. Each subsequent request is served by the origin directly, and the static assets are cached locally on the user’s machine. If another user, from the same location or a different one, tries to access the same site, that user goes through the same sequence. Every time, requests hit the origin and the origin replies with content.
Every step along the way adds latency. If the origin is located far from the user, response times will suffer from significant latency, delivering a poor user experience.
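To make the cost of distance concrete, here is a rough back-of-the-envelope sketch. The round-trip count and the round-trip times are illustrative assumptions, not measurements, and the function name is made up for this example.

```python
def time_to_first_byte_ms(rtt_ms: float, dns_ms: float = 0.0) -> float:
    """Rough wait before the first response byte arrives.

    Assumes, for illustration only, one round trip each for the TCP
    handshake, a TLS 1.3 handshake, and the HTTP request/response.
    """
    round_trips = 3  # assumed simplification: TCP + TLS 1.3 + HTTP
    return dns_ms + round_trips * rtt_ms


far_origin_ms = time_to_first_byte_ms(rtt_ms=150)  # hypothetical distant origin
near_edge_ms = time_to_first_byte_ms(rtt_ms=20)    # hypothetical nearby server
```

Under these assumptions the distant origin costs 450 ms before any content arrives, against 60 ms for the nearby server, and that gap repeats for every new connection.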
Acceleration, or “latency reduction”, addresses this issue by storing content close to end users and by optimizing how their requests are routed. Let’s explore this further using our CDN model.
The user’s DNS requests are received by the LDNS, which forwards them to one of the CDN’s DNS servers. These servers are part of the Global Server Load Balancer (“GSLB”) infrastructure. The GSLB continuously measures network conditions across the Internet and keeps track of all available resources and their performance.
With this knowledge, the GSLB resolves the DNS request to the address of the best-performing edge (usually one close to the user). An “edge” is a set of servers that caches and delivers the content. After DNS resolution is completed, the user makes the HTTPS request to the edge. When the edge receives the request, another type of machinery kicks in. With the help of the GSLB, the edge servers forward the request to the origin along the optimal route.
The edge then fetches the requested data, delivers it to the end user who requested it, and stores a copy locally. Subsequent requests for the same content are served from that local copy without having to query the origin server again.
Content stored on the edge can be delivered even if the origin becomes unavailable for any reason.
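The flow described above (pick the best edge, fetch from the origin on the first request, serve later requests locally, keep serving if the origin goes down) can be sketched in a few lines. This is a minimal model under stated assumptions, not the way any particular CDN is implemented: the class names, edge names, and latency figures are hypothetical, and cache expiry is ignored.

```python
class OriginUnavailable(Exception):
    """Raised by the toy origin when it is down."""


class Origin:
    def __init__(self):
        self.available = True
        self.requests = 0  # how many requests actually reached the origin

    def fetch(self, path):
        if not self.available:
            raise OriginUnavailable(path)
        self.requests += 1
        return f"content of {path}".encode()


class Edge:
    def __init__(self, name, fetch_from_origin):
        self.name = name
        self.fetch_from_origin = fetch_from_origin  # callable: path -> bytes
        self.cache = {}                             # path -> cached body

    def handle(self, path):
        # Serve from the local copy when there is one.
        if path in self.cache:
            return self.cache[path], "HIT"
        # Otherwise fetch from the origin, store a copy, then deliver it.
        body = self.fetch_from_origin(path)
        self.cache[path] = body
        return body, "MISS"


def pick_edge(latency_ms_by_edge):
    """GSLB-style choice: the edge with the lowest measured latency."""
    return min(latency_ms_by_edge, key=latency_ms_by_edge.get)


origin = Origin()
edges = {name: Edge(name, origin.fetch) for name in ("frankfurt", "singapore")}
measured_ms = {"frankfurt": 18.0, "singapore": 170.0}  # hypothetical measurements

edge = edges[pick_edge(measured_ms)]
first = edge.handle("/logo.png")   # reaches the origin
second = edge.handle("/logo.png")  # served from the edge
origin.available = False
third = edge.handle("/logo.png")   # still served although the origin is down
```

Only the first of the three requests reaches the origin. Content that was never cached would still fail while the origin is down, which is one reason prefetching is useful.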
A cool thing to note is that it is possible to prefetch content to the edge in advance. This ensures that the data you are going to deliver is stored in all CDN data centers. In CDN parlance, these data centers are called Points of Presence (or “POPs”).
For example, assume that you run an ad campaign and advertise your service or product among millions of potential customers who will rush to your site after reading the post. This scenario is especially true if you deal with influencers who have good audience engagement rates. In this use case, the load will be distributed between CDN edge servers, and everyone will get the response. Because only a small fraction of requests will reach the origin, you can be assured that your servers will not experience massive traffic spikes, 502 errors, and overloaded upstream network channels.
Such traffic spikes resemble DDoS attacks. When a massive number of requests “attacks” your application, you may have no good way to deal with it, and your customers suffer.
Fortunately, a CDN also acts as a DDoS mitigation platform. The GSLB and the edge servers can handle enormous numbers of requests and spread the load across the entire capacity of the network.
Security is not only about DDoS mitigation and load distribution during spikes. CDNs can provide certificate management, including automatic certificate generation and renewal. To protect your application from application-layer (L7) attacks, a web application firewall can be enabled.
How Else Can A CDN Be Helpful?
The CDN is not limited to the benefits explained above. A modern CDN platform delivers many more advantages to your business and engineering teams.
It can be used to manage access from different regions on the planet. While you allow access for some regions, you can deny access to others.
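As a minimal sketch of region-based access control, the check below allows or denies a request by country code. It assumes the edge has already mapped the client IP address to a two-letter country code; the allow-list and function names are hypothetical.

```python
ALLOWED_COUNTRIES = {"DE", "FR", "US"}  # hypothetical allow-list


def is_allowed(country_code, allowed=ALLOWED_COUNTRIES):
    """Return True if requests from this country may be served."""
    return country_code.upper() in allowed


def status_for(country_code):
    """HTTP status the edge would answer with: 200 if allowed, 403 if denied."""
    return 200 if is_allowed(country_code) else 403
```

With this list, `status_for("de")` yields 200 and `status_for("BR")` yields 403. A deny-list works the same way with the condition inverted.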
You can easily offload application logic to the edge, close to your customers. You can process and transform request and response headers and bodies, route requests between different origins based on request attributes, or delegate authentication tasks to the edge.
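Two of these tasks, header transformation and attribute-based routing, might look like the following sketch. It is written in Python purely for illustration and does not reflect how any CDN exposes edge logic; the origin URLs, header names, and path rule are all assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Request:
    path: str
    headers: dict = field(default_factory=dict)


# Hypothetical origins; a real configuration would list your own servers.
ORIGINS = {
    "api": "https://api.origin.example",
    "static": "https://static.origin.example",
}


def choose_origin(req: Request) -> str:
    """Route by a request attribute: API calls go to the API origin."""
    if req.path.startswith("/api/"):
        return ORIGINS["api"]
    return ORIGINS["static"]


def transform(req: Request) -> Request:
    """Return a copy of the request with edited headers."""
    headers = dict(req.headers)
    headers.pop("X-Debug", None)             # strip a header clients should not set
    headers["X-Forwarded-Proto"] = "https"   # tell the origin the client used HTTPS
    return Request(req.path, headers)


incoming = Request("/api/orders", {"X-Debug": "1", "Accept": "application/json"})
outgoing = transform(incoming)
target = choose_origin(outgoing)
```

Here `target` is the API origin, and `outgoing` carries the added `X-Forwarded-Proto` header without `X-Debug`; the original request object is left unchanged.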
Large amounts of traffic require an infrastructure for log collection and processing for further analysis. CDNs collect the logs and provide an interface to conveniently analyze the data generated by the visitors.
It is only natural that something becomes easy to use when you are already familiar with it. For that reason, CDN360 edges are NGINX based. This means you can perform tasks using standard NGINX directives.
Our engineering team spent thousands of hours extending NGINX.
Static & Dynamic Acceleration
Dynamic acceleration applies to content that cannot be cached on the edge due to its dynamic nature. Imagine a WebSocket application that listens for events from a server, or an API endpoint whose response differs depending on credentials, location, or other parameters. It is hard to use the edge cache for such traffic the way it is used for static content. In some cases, tighter integration between the app and the CDN may help; in others, something other than caching is needed. For dynamic acceleration, the CDN’s optimized network infrastructure and advanced request/response routing algorithms are used.
Billing Model, or “What Do I Pay For?”
Conventionally, with a CDN you pay for the traffic consumed by your end users and for the number of requests. HTTPS requests require more computing resources than HTTP requests, which puts more load on the CDN provider’s equipment. For this reason, you may pay extra for HTTPS requests, while HTTP requests carry no additional charge.
As computation moves to the edge, CPU time becomes a billable resource. Requests may go through different processing pipelines and, as a result, require different amounts of CPU time. Billing by request count then becomes impractical; it is more practical to bill by the amount of traffic plus the CPU time used.
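The difference is easy to see with a small calculation. The sketch below assumes a bill made of a traffic component plus a CPU-time component; the prices are invented for illustration and do not reflect any provider’s actual rates.

```python
def usage_cost(traffic_gb: float, cpu_seconds: float,
               price_per_gb: float, price_per_cpu_second: float) -> float:
    """Cost = traffic component + CPU-time component."""
    return traffic_gb * price_per_gb + cpu_seconds * price_per_cpu_second


# Hypothetical prices, chosen only to keep the arithmetic readable.
PRICE_PER_GB = 0.05
PRICE_PER_CPU_SECOND = 0.0001

# Two workloads with the same traffic (and, say, the same request count),
# but very different amounts of processing at the edge.
light = usage_cost(100, 1_000, PRICE_PER_GB, PRICE_PER_CPU_SECOND)
heavy = usage_cost(100, 50_000, PRICE_PER_GB, PRICE_PER_CPU_SECOND)
```

A per-request price would charge both workloads the same amount, whereas the CPU-time term lets the bill follow the resources each one actually used.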
Who Uses A CDN?
CDNs are used by businesses of various sizes to optimize their network presence, improve availability, and provide a superior user experience for customers. CDNs are particularly popular in the following industries:
- Digital Publishing
- Online Video & Audio
- Online Gaming
- Online Education
- Public Sector
- Financial Services