How Does a CDN Work? Content Delivery Networks from Edge to Origin

The first time I really understood how a CDN works was in 2008, during a product launch that went viral. Our single-origin architecture (a few web servers behind a load balancer in us-east-1) buckled under the load. Page load times for users in Tokyo were 4+ seconds. Users in Sydney were timing out entirely. We threw the site behind Akamai, and within an hour, global response times dropped to under 200ms.

That experience permanently rewired how I think about content delivery. A CDN isn’t just “a cache in front of your website.” It’s a globally distributed system that fundamentally changes how the internet delivers content, and understanding its internals makes you a better architect whether you’re serving static files or building real-time APIs.

What a CDN Actually Does

At its core, a CDN does one thing: it serves content from a location physically close to the user instead of from the origin server, which might be thousands of miles away.

The physics are simple. Light in a fiber optic cable travels at roughly 200,000 km/s (about two-thirds the speed of light in a vacuum). A round trip from New York to Sydney is about 32,000 km of fiber, giving you a minimum round-trip time of ~160ms just from the speed of light, before any processing, routing, or congestion delays. In practice, you’ll see 200-300ms.

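Every new HTTPS connection costs one round trip for the TCP handshake and at least one more for the TLS handshake (one with TLS 1.3, two with TLS 1.2) before the first request can even be sent, and each request then needs at least one round trip of its own. Connection reuse and HTTP/2 multiplexing soften this, but a page that makes 50 requests can still spend seconds just on network round trips.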
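To make the arithmetic concrete, here is a minimal sketch. The function names are mine, and the round-trip counts are an assumption (1 for TCP, 1 for TLS 1.3, 1 for the request itself); real connections vary.

```python
FIBER_SPEED_KM_PER_S = 200_000  # roughly two-thirds of c, as stated above


def min_rtt_ms(one_way_km: float) -> float:
    """Lower bound on round-trip time over fiber, ignoring routing and queuing."""
    return 2 * one_way_km / FIBER_SPEED_KM_PER_S * 1000


def first_response_ms(rtt_ms: float, tls_round_trips: int = 1) -> float:
    """Time until the first response arrives on a brand-new connection.

    Assumes 1 RTT for TCP, `tls_round_trips` for TLS (1 for TLS 1.3,
    2 for TLS 1.2), and 1 RTT for the HTTP request and response.
    """
    return rtt_ms * (1 + tls_round_trips + 1)


if __name__ == "__main__":
    far = min_rtt_ms(16_000)  # New York to Sydney, about 16,000 km each way
    near = min_rtt_ms(500)    # hypothetical edge PoP 500 km from the user
    print(f"RTT floor to origin: {far:.0f} ms, first response: {first_response_ms(far):.0f} ms")
    print(f"RTT floor to edge:   {near:.0f} ms, first response: {first_response_ms(near):.0f} ms")
```

Even at the theoretical floor, the far origin costs hundreds of milliseconds before the first byte arrives, while the nearby edge costs a small fraction of that.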

A CDN eliminates most of those long-distance round trips by caching content at edge servers (also called Points of Presence or PoPs) distributed globally. Major CDN providers have hundreds of PoPs:

CDN Provider	Approximate PoPs	Notable Coverage
Cloudflare	300+	Strong in developing markets
Akamai	4,000+	Largest network, enterprise-focused
AWS CloudFront	400+	Tightly integrated with AWS
Fastly	80+	Developer-friendly, real-time purge
Google Cloud CDN	150+	Leverages Google’s backbone

How Requests Get Routed to the Nearest Edge

When a user types www.example.com into their browser, they need to end up at the nearest CDN edge server, not the origin. This routing happens through DNS resolution.

DNS-Based Routing

Most CDNs use DNS-based routing. When you configure your domain to use a CDN, you typically CNAME your domain to the CDN’s domain:

www.example.com → CNAME → d1234.cloudfront.net

When a user’s DNS resolver queries d1234.cloudfront.net, the CDN’s authoritative DNS server responds with the IP address of the nearest edge PoP. “Nearest” is determined by a combination of:

GeoDNS: The CDN looks at the source IP of the DNS resolver and maps it to a geographic region
Latency measurements: Some CDNs actively measure latency from their PoPs to major DNS resolvers
Health checks: Unhealthy PoPs are removed from DNS responses
Capacity: Overloaded PoPs may be de-prioritized
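The selection logic above can be sketched roughly as follows. This is an illustration, not any CDN’s actual algorithm: the PoP class, the 0.9 load threshold, and the use of great-circle distance as a stand-in for latency are all assumptions.

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt


@dataclass
class PoP:
    name: str
    lat: float
    lon: float
    healthy: bool = True
    load: float = 0.0  # fraction of capacity in use, 0.0 to 1.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in kilometres."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371 * asin(sqrt(a))


def pick_pop(pops, resolver_lat, resolver_lon, max_load=0.9):
    """Return the closest healthy PoP, preferring ones below max_load."""
    healthy = [p for p in pops if p.healthy]
    if not healthy:
        raise RuntimeError("no healthy PoPs")
    # Fall back to overloaded PoPs only if every healthy PoP is overloaded.
    candidates = [p for p in healthy if p.load < max_load] or healthy
    return min(candidates, key=lambda p: haversine_km(p.lat, p.lon, resolver_lat, resolver_lon))
```

Note that the input is the resolver’s location, not the user’s, which is exactly why users behind a distant public resolver can end up at a suboptimal PoP.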

Anycast Routing

Some CDNs (notably Cloudflare) use BGP Anycast instead of or in addition to DNS routing. With Anycast, the same IP address is announced from every PoP worldwide. The internet’s BGP routing automatically directs each user to the “nearest” PoP in terms of BGP path length.

Anycast has a key advantage: the routing decision is made at Layer 3 by the network itself, so it doesn’t depend on where the user’s DNS resolver sits or on waiting for DNS TTLs to expire, and failover is automatic. If a PoP goes down, its BGP route is withdrawn and traffic shifts to the next-nearest PoP.

The downside is that “nearest by BGP” isn’t always “lowest latency.” BGP routing is based on AS path length and policy, not actual measured latency. A PoP three AS hops away might be faster than one that’s two hops away if the intermediate links are congested.
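A toy example of that mismatch, with made-up numbers. Real BGP best-path selection considers local preference and other attributes before AS path length; this sketch models path length only.

```python
def bgp_choice(routes):
    """Pick the route with the shortest AS path (ties go to the first listed)."""
    return min(routes, key=lambda r: r["as_path_len"])


def lowest_latency(routes):
    """Pick the route a latency-aware system would choose."""
    return min(routes, key=lambda r: r["latency_ms"])


# Hypothetical routes to two PoPs announcing the same Anycast prefix
ROUTES = [
    {"pop": "pop-a", "as_path_len": 2, "latency_ms": 80},  # short path, congested link
    {"pop": "pop-b", "as_path_len": 3, "latency_ms": 35},  # longer path, clean links
]

if __name__ == "__main__":
    print("BGP sends traffic to:", bgp_choice(ROUTES)["pop"])
    print("Lowest latency is at:", lowest_latency(ROUTES)["pop"])
```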

CDN request routing showing DNS resolution directing users to their nearest edge PoP

Cache Architecture: Edge, Mid-Tier, and Origin

A CDN’s cache architecture typically has multiple tiers:

Tier 1: Edge Cache (PoP)

The edge cache is the first line of defense. It’s the server closest to the user. When a request arrives at the edge:

Cache HIT: The content is in the edge cache and hasn’t expired. Serve it immediately. This is the fast path, with latency typically 5-50ms.
Cache MISS: The content isn’t cached at this edge. The edge needs to fetch it from somewhere.

Tier 2: Mid-Tier Cache (Shield/Regional)

Instead of every edge PoP going directly to the origin on a cache miss, most CDNs route misses through a mid-tier cache (Cloudflare calls this “Tiered Cache,” CloudFront calls it “Origin Shield,” Akamai calls it “Tiered Distribution”). The CDN edge is architecturally a reverse proxy deployed at global scale: it intercepts requests on behalf of your origin servers, serves cached content, and only forwards uncached requests upstream.

The mid-tier cache serves as a shared cache for multiple edge PoPs in a region. If the content is cached at the mid-tier, the edge gets it from there instead of hitting the origin.

This dramatically reduces origin load. Without mid-tier caching, a viral piece of content that’s accessed from 300 PoPs simultaneously would cause 300 concurrent requests to the origin. With mid-tier caching (say, 10 regional mid-tier caches), at most 10 requests hit the origin.
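The 300-versus-10 arithmetic can be checked with a small simulation. This is a sketch with hypothetical classes, and it issues requests one after another; for truly simultaneous misses, the mid-tier also has to collapse concurrent requests for the same object into a single fetch.

```python
class Origin:
    def __init__(self):
        self.requests = 0

    def get(self, key):
        self.requests += 1
        return f"body of {key}"


class Cache:
    """A cache tier that falls through to `parent` on a miss."""

    def __init__(self, parent):
        self.parent = parent
        self.store = {}

    def get(self, key):
        if key not in self.store:
            self.store[key] = self.parent.get(key)
        return self.store[key]


def origin_requests(num_edges, num_mid_tiers):
    """Count origin fetches when every edge requests the same object once."""
    origin = Origin()
    if num_mid_tiers:
        mids = [Cache(origin) for _ in range(num_mid_tiers)]
        # Assign edges to regional mid-tiers round-robin.
        edges = [Cache(mids[i % num_mid_tiers]) for i in range(num_edges)]
    else:
        edges = [Cache(origin) for _ in range(num_edges)]
    for edge in edges:
        edge.get("/viral.jpg")
    return origin.requests


if __name__ == "__main__":
    print("no mid-tier:  ", origin_requests(300, 0))
    print("10 mid-tiers: ", origin_requests(300, 10))
```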

Tier 3: Origin Server

The origin is your actual web server, the source of truth for all content. In a well-configured CDN setup, the origin should only be hit for:

First requests for new/uncached content
Cache revalidation requests (conditional GET with If-Modified-Since or If-None-Match)
Truly dynamic content that can’t be cached

User → [Edge PoP (NYC)] → Cache HIT → Response (5ms)

User → [Edge PoP (NYC)] → Cache MISS
     → [Mid-Tier (US-East)] → Cache HIT → Response (20ms)

User → [Edge PoP (NYC)] → Cache MISS
     → [Mid-Tier (US-East)] → Cache MISS
     → [Origin (us-east-1)] → Response (100ms)

Three-tier CDN cache hierarchy showing edge, mid-tier, and origin with cache hit/miss flows

Cache Control: How the CDN Knows What to Cache

The CDN respects HTTP cache headers from your origin. The main ones:

Cache-Control Header

Cache-Control: public, max-age=86400, s-maxage=604800

public: Any cache (including CDN) can store this
max-age=86400: Browsers should cache for 24 hours
s-maxage=604800: Shared caches (CDN) should cache for 7 days

The s-maxage directive is your best friend. It lets you set different TTLs for the CDN and the browser. You might want the CDN to cache aggressively (long s-maxage) while keeping browser cache short (short max-age) so users get updated content when you purge the CDN.
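Here is a simplified sketch of how a cache might turn that header into a TTL. It handles only the directives discussed here (no-store, private, max-age, s-maxage); real caches also honor no-cache, Expires, heuristic freshness, and more.

```python
def parse_cache_control(header: str) -> dict:
    """Split a Cache-Control value into {directive: value-or-True}."""
    directives = {}
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, value = part.partition("=")
        directives[name.strip().lower()] = value.strip().strip('"') if value else True
    return directives


def freshness_lifetime(header: str, shared: bool):
    """Seconds a response stays fresh, or None if this cache must not store it.

    `shared=True` models a CDN; `shared=False` models a browser cache.
    """
    d = parse_cache_control(header)
    if "no-store" in d or (shared and "private" in d):
        return None
    if shared and "s-maxage" in d:
        return int(d["s-maxage"])  # s-maxage overrides max-age for shared caches
    if "max-age" in d:
        return int(d["max-age"])
    return 0  # no explicit lifetime: treated as immediately stale in this sketch
```

For the header above, `freshness_lifetime(..., shared=True)` gives the CDN its 7 days while `shared=False` gives the browser its 24 hours.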

Common Cache-Control Patterns

Content Type	Recommended Cache-Control	Why
Static assets (JS/CSS with hash)	`public, max-age=31536000, immutable`	Content-addressable, never changes
Images	`public, max-age=86400, s-maxage=2592000`	CDN caches 30 days, browser 1 day
HTML pages	`public, s-maxage=60, max-age=0`	CDN caches 1 min, browser always revalidates
API responses	`private, no-store` or `public, s-maxage=5`	Usually not cached, or very short TTL
User-specific content	`private, no-store`	Never cache on shared caches

The ETag/Last-Modified Dance

Even after a cached object expires, the CDN doesn’t necessarily re-download it. It sends a conditional request:

GET /image.png HTTP/1.1
Host: www.example.com
If-None-Match: "abc123"
If-Modified-Since: Mon, 15 Jan 2024 10:00:00 GMT

If the origin responds with 304 Not Modified, the CDN knows its cached copy is still valid and extends its cache lifetime. This saves bandwidth and origin CPU.
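On the origin side, the decision looks roughly like this. It’s a minimal sketch with a hypothetical `revalidate` function; a real server would rely on its framework’s conditional-request handling.

```python
from email.utils import parsedate_to_datetime


def _strip_weak(tag: str) -> str:
    """Drop a leading W/ so weak and strong validators compare equal."""
    return tag[2:] if tag.startswith("W/") else tag


def revalidate(request_headers: dict, etag: str, last_modified: str) -> int:
    """Return 304 if the client's cached copy is still valid, else 200.

    If-None-Match takes precedence over If-Modified-Since when both are sent.
    """
    if_none_match = request_headers.get("If-None-Match")
    if if_none_match is not None:
        candidates = [t.strip() for t in if_none_match.split(",")]
        if "*" in candidates or _strip_weak(etag) in map(_strip_weak, candidates):
            return 304
        return 200

    if_modified_since = request_headers.get("If-Modified-Since")
    if if_modified_since is not None:
        # Unchanged if the resource is no newer than the date the cache holds.
        if parsedate_to_datetime(last_modified) <= parsedate_to_datetime(if_modified_since):
            return 304
    return 200
```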

Vary Header: The Cache Key Modifier

The Vary header tells the CDN that the response depends on certain request headers. The most common use:

Vary: Accept-Encoding

This tells the CDN to maintain separate cached copies for different encodings (gzip vs Brotli vs uncompressed). Without this, a CDN might serve a Brotli-compressed response to a client that only supports gzip.

Gotcha: Vary: * means “this response is unique to every request,” effectively disabling caching. I’ve seen applications accidentally set Vary: * and wonder why their CDN hit rate was 0%. Similarly, Vary: Cookie means a separate cached copy for every unique cookie combination, which usually means no caching at all.
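Conceptually, Vary extends the cache key. The sketch below uses a hypothetical `cache_key` function and raw header values; real CDNs typically normalize headers like Accept-Encoding first so that minor differences don’t fragment the cache.

```python
def cache_key(method, url, request_headers, vary):
    """Build a cache key from the URL plus the request headers named in Vary.

    Returns None for `Vary: *`, meaning the stored response cannot be reused.
    """
    names = [n.strip().lower() for n in vary.split(",") if n.strip()]
    if "*" in names:
        return None
    headers = {k.lower(): v for k, v in request_headers.items()}
    varied = tuple((n, headers.get(n, "")) for n in sorted(names))
    return (method, url, varied)
```

With `Vary: Cookie`, every distinct Cookie header produces a distinct key, which is why the hit rate collapses.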

Cache Invalidation: The Hard Problem

Phil Karlton famously said there are only two hard things in computer science: cache invalidation and naming things. CDN cache invalidation is a perfect illustration of why.

Purge/Invalidation

Every CDN provides an API to purge cached content:

# CloudFront invalidation
aws cloudfront create-invalidation \
  --distribution-id E1234567890 \
  --paths "/index.html" "/css/*"

# Cloudflare purge (the body is JSON, so set the Content-Type explicitly)
curl -X POST "https://api.cloudflare.com/client/v4/zones/ZONE_ID/purge_cache" \
  -H "Authorization: Bearer TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"files":["https://www.example.com/index.html"]}'

But purging has costs:

Propagation delay: A purge at the API doesn’t instantly clear every edge PoP. Akamai quotes roughly 5 seconds for Fast Purge; CloudFront invalidations can take minutes.
Thundering herd: Purging popular content from all caches simultaneously causes a stampede of requests to the origin. The smarter CDNs handle this with request coalescing, where only one cache miss request goes to the origin, and all other waiting requests share the response.
CloudFront charges for invalidations: The first 1,000 paths/month are free; after that, it’s $0.005 per path.
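Request coalescing, mentioned under the thundering herd point, can be sketched with a lock and a per-key event. This is a single-process illustration with hypothetical names and no TTLs or eviction, not how any particular CDN implements it.

```python
import threading


class CoalescingCache:
    """Cache in which concurrent misses for one key share a single origin fetch."""

    def __init__(self, fetch):
        self._fetch = fetch    # callable(key) -> value; stands in for the origin
        self._values = {}
        self._inflight = {}    # key -> Event that is set when the fetch finishes
        self._lock = threading.Lock()

    def get(self, key):
        with self._lock:
            if key in self._values:
                return self._values[key]
            event = self._inflight.get(key)
            leader = event is None
            if leader:
                event = self._inflight[key] = threading.Event()

        if leader:
            # Only the first requester goes to the origin.
            try:
                value = self._fetch(key)
                with self._lock:
                    self._values[key] = value
            finally:
                with self._lock:
                    del self._inflight[key]
                event.set()
            return value

        # Everyone else waits for the leader and reuses its result.
        event.wait()
        with self._lock:
            if key in self._values:
                return self._values[key]
        return self.get(key)  # the leader's fetch failed; try again
```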

The Better Approach: Cache Busting

Instead of purging, use content-addressed URLs (cache busting):

<!-- Instead of purging /app.css, change the URL -->
<link rel="stylesheet" href="/app.a1b2c3d4.css">

<!-- Or use a query parameter (less reliable but simpler) -->
<link rel="stylesheet" href="/app.css?v=1705312800">

When you deploy new content, the URL changes, so the old cached version is irrelevant. The browser requests the new URL, which isn’t in any cache yet. This is why build tools like Webpack, Vite, and esbuild add content hashes to filenames.
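Under the hood, the idea is just hashing the file contents into the name. A minimal sketch follows; the function name, the choice of SHA-256, and the 8-character digest are my assumptions, and each build tool has its own scheme.

```python
import hashlib
from pathlib import PurePosixPath


def hashed_name(path: str, content: bytes, digest_len: int = 8) -> str:
    """Insert a short content hash before the file extension."""
    digest = hashlib.sha256(content).hexdigest()[:digest_len]
    p = PurePosixPath(path)
    return str(p.with_name(f"{p.stem}.{digest}{p.suffix}"))


if __name__ == "__main__":
    print(hashed_name("/app.css", b"body { margin: 0 }"))
    print(hashed_name("/app.css", b"body { margin: 1px }"))  # different content, different URL
```

Because the name depends only on the bytes, an unchanged file keeps its URL across deploys and stays cached.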

For HTML pages (which can’t be cache-busted because the URL is fixed), use short s-maxage values and accept that there will be a brief staleness window.

CDN for Dynamic Content: Beyond Static Files

A common misconception is that CDNs are only for static files. Modern CDNs can accelerate dynamic content too:

Connection Optimization

Even for uncacheable requests, a CDN provides value through connection optimization:

TLS termination at the edge: The user completes the TLS handshake with the nearby edge PoP (fast), not the distant origin (slow)
Persistent connections to origin: The CDN maintains a pool of warm TCP/TLS connections to your origin, eliminating connection setup time for each request
HTTP/2 multiplexing: The CDN speaks HTTP/2 to the user even if your origin only supports HTTP/1.1
Optimized backbone routing: CDNs route traffic between edge and origin over their private backbones, which are faster and more reliable than the public internet

I’ve measured this: a dynamic API request that takes 300ms from the user directly to the origin drops to 180ms through a CDN, even with 0% cache hit rate. The connection optimization alone is worth 30-40% latency reduction.
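A back-of-the-envelope model shows where that saving can come from. The numbers below are illustrative inputs chosen to land on the same totals, not a breakdown of my measurement, and the model assumes a new TLS 1.3 connection from the user plus an already-warm connection from edge to origin.

```python
def direct_ms(rtt_user_origin, server_ms, setup_round_trips=2):
    """New connection straight to the origin: handshakes plus one request."""
    return rtt_user_origin * (setup_round_trips + 1) + server_ms


def via_cdn_ms(rtt_user_edge, rtt_edge_origin, server_ms, setup_round_trips=2):
    """Handshakes terminate at the edge; the edge reuses a warm origin connection."""
    return rtt_user_edge * (setup_round_trips + 1) + rtt_edge_origin + server_ms


if __name__ == "__main__":
    # Assumed: 90 ms user-to-origin RTT, 20 ms user-to-edge RTT,
    # 90 ms edge-to-origin RTT, 30 ms of origin processing.
    print("direct: ", direct_ms(90, 30), "ms")
    print("via CDN:", via_cdn_ms(20, 90, 30), "ms")
```

The handshake round trips move to the short user-to-edge leg, and only one round trip crosses the long leg. Any gain from backbone routing would come on top of that.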

Edge Computing

The latest evolution is running actual application logic at the edge:

Cloudflare Workers: V8 isolates at every PoP, JavaScript/Wasm
CloudFront Functions / Lambda@Edge: AWS’s edge compute
Fastly Compute@Edge: Wasm-based edge compute
Deno Deploy: Edge-first serverless platform

This lets you do things like A/B testing, authentication, geolocation-based routing, and even database queries at the edge, all without ever hitting the origin for certain requests.

CDN dynamic content acceleration showing TLS termination at edge and optimized backbone routing to origin

CDN and TLS: The SNI Connection

CDNs are one of the biggest users of SNI (Server Name Indication). A single CDN edge server might serve HTTPS for tens of thousands of different domains. Without SNI, each domain would need a dedicated IP address at every PoP, which would require millions of IPv4 addresses.

When your browser connects to a CDN edge, the TLS ClientHello contains the SNI hostname. The CDN uses this to:

Select the correct TLS certificate for your domain
Look up the CDN configuration for your domain (cache settings, origin address, etc.)
Apply any domain-specific rules (WAF, redirects, headers)
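In spirit, the lookup is a map from hostname to per-site settings. The sketch below is hypothetical (the `lookup_site` function and the config fields are mine) and supports exact names plus single-level wildcards, the way wildcard certificates match.

```python
def lookup_site(sni_hostname, sites):
    """Find config for an SNI hostname: exact match first, then one-level wildcard."""
    host = sni_hostname.lower().rstrip(".")
    if host in sites:
        return sites[host]
    _, _, parent = host.partition(".")
    return sites.get(f"*.{parent}") if parent else None


# Hypothetical per-domain configuration held at the edge
SITES = {
    "www.example.com": {"cert": "example-com.pem", "origin": "origin.example.com"},
    "*.example.net": {"cert": "wildcard-example-net.pem", "origin": "origin.example.net"},
}
```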

This is also why CDNs are at the forefront of Encrypted Client Hello (ECH) deployment. They have both the technical capability and the business incentive to protect user privacy.

Measuring CDN Performance

Key Metrics

Metric	What It Means	Good Value
Cache Hit Ratio	% of requests served from cache	> 90% for static sites, 50-80% for mixed
TTFB (Time to First Byte)	Time from request to first byte of response	< 100ms for cached, < 500ms for origin
Bandwidth Offload	% of bandwidth served from edge	> 95% for media-heavy sites
Origin Requests/sec	How often the CDN hits your origin	Should be a fraction of total traffic
P95/P99 Latency	Tail latency at high percentiles	2-5x the median is acceptable
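If your CDN exports request logs, most of these metrics fall out of a few lines of code. The sketch below assumes a simplified, hypothetical log record and uses the nearest-rank method for percentiles.

```python
from dataclasses import dataclass
from math import ceil


@dataclass
class LogEntry:
    cache_hit: bool
    bytes_sent: int
    latency_ms: float


def percentile(values, pct):
    """Nearest-rank percentile (pct in 0-100) of a non-empty list."""
    ordered = sorted(values)
    rank = max(1, ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def summarize(entries):
    """Compute hit ratio, bandwidth offload, origin requests, and latency percentiles."""
    total_bytes = sum(e.bytes_sent for e in entries)
    hit_bytes = sum(e.bytes_sent for e in entries if e.cache_hit)
    hits = sum(1 for e in entries if e.cache_hit)
    latencies = [e.latency_ms for e in entries]
    return {
        "cache_hit_ratio": hits / len(entries),
        "bandwidth_offload": hit_bytes / total_bytes,
        "origin_requests": len(entries) - hits,
        "p50_ms": percentile(latencies, 50),
        "p95_ms": percentile(latencies, 95),
    }
```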

Debugging Cache Behavior

Most CDNs add response headers that tell you what happened:

$ curl -I https://www.example.com/image.png

X-Cache: Hit from cloudfront           # CloudFront: cache hit
CF-Cache-Status: HIT                    # Cloudflare: cache hit
Age: 3600                               # Object has been cached for 1 hour
X-Served-By: cache-iad-kcgs7200089     # Fastly: which PoP served this

If you see MISS consistently, check your cache headers. If you see BYPASS, the CDN is intentionally not caching (usually due to Set-Cookie or Cache-Control: private in the response).
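For scripted checks, the same headers can be classified in code. This is a simplified sketch that covers only the header formats shown above; other CDNs and multi-tier responses use different formats.

```python
def cache_status(headers: dict) -> str:
    """Return the Cloudflare status as-is, or HIT/MISS derived from X-Cache.

    Falls back to UNKNOWN when neither header is recognized.
    """
    h = {k.lower(): v for k, v in headers.items()}
    cf = h.get("cf-cache-status")
    if cf:
        return cf.strip().upper()  # e.g. HIT, MISS, BYPASS
    x_cache = h.get("x-cache", "").strip().lower()
    if x_cache.startswith(("hit", "refreshhit")):
        return "HIT"
    if x_cache.startswith("miss"):
        return "MISS"
    return "UNKNOWN"
```

You would feed it the parsed response headers from whatever HTTP client you already use.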

When NOT to Use a CDN

CDNs aren’t always the answer:

Single-region users: If all your users are in one city and your server is in that city, a CDN adds complexity without much latency benefit
Highly personalized content: If every response is unique to the user (after authentication), cache hit rates will be near zero. You still get connection optimization, but evaluate if the cost is justified.
WebSocket-heavy applications: CDNs can proxy WebSockets, but they’re not optimized for long-lived connections. You might be better with direct connections.
Very small traffic: CDN benefits scale with traffic. If you’re getting 100 requests/day, the CDN overhead (configuration, debugging, additional DNS hop) may not be worth it.
Development environments: Don’t CDN your dev/staging unless you’re specifically testing CDN behavior. Caching makes debugging painful.

Wrapping Up

A CDN is one of the most impactful infrastructure decisions you can make. Adding a CDN in front of a poorly optimized website can cut load times by 50-80% for global users. But a CDN is not magic. It requires proper cache headers, thoughtful content strategy, and monitoring to work well.

The key concepts to remember:

CDNs work by serving content from edge servers near the user, reducing latency from physics-imposed round-trip times
DNS-based routing and Anycast direct users to the nearest PoP
Multi-tier cache hierarchies (edge → mid-tier → origin) protect your origin from traffic storms
Cache-Control headers are how you communicate caching policy to the CDN
Content-addressed URLs (cache busting) are better than purging
CDNs help dynamic content too, through connection optimization and edge computing
Always monitor your cache hit ratio, which is the single most important CDN metric

If you’re serving users globally and you’re not using a CDN, you’re making your users wait for photons to travel around the world for no good reason. Fix that.
