CDN Caching Strategies

You already know a CDN keeps copies of your files close to users so pages load fast. But here’s the catch:

  • A CDN is only useful if it caches the right things.
  • And it’s only useful if those copies stay fresh instead of going stale.
  • Cache the wrong stuff, or hold onto an old copy too long, and the CDN actually hurts you.

So in this lesson we go one level deeper than the intro. We’ll look at how a CDN decides what to keep, how long to keep it, and how you tell it “hey, this changed, drop the old copy.” That’s what CDN caching strategies are all about.

🎯 The Two Hard Questions

Every caching decision on a CDN comes down to two questions, and once you see them clearly the rest is easy.

  • What should I cache? Some content is safe to copy everywhere, like your logo. Some content is dangerous to copy, like a user’s private account page. Getting this wrong leaks data or wastes the cache.
  • When should I refresh it? A cached copy can go out of date the moment you change the original. So you need a plan for how long a copy lives and how you replace it.

So keep these two in your head the whole way down. Everything else (TTL, purging, cache busting, push vs pull) is just a tool for answering one of them.

⏳ TTL and Cache-Control

Let’s start with freshness, because that’s where most people get bitten. The main tool here is something called TTL.

  • TTL stands for Time To Live. It’s just how long an edge server is allowed to keep a copy before it has to check the origin again.
  • A short TTL means fresher content, but more trips back to the origin. A long TTL means faster serving, but the copy might drift out of date.

So how does the edge know the TTL for a file? The origin tells it, using a special HTTP header.

  • A header is just a little piece of extra info attached to a response, like a label on a package.
  • The header that controls caching is called Cache-Control. When the origin sends a file, it attaches this header to say how long the file may be cached.

Here’s what that header looks like coming back from the origin.

Cache-Control: max-age=86400

Let’s read that one line:

  • max-age is the TTL in seconds. So 86400 seconds is one full day.
  • This tells every cache that holds the file, edge servers and browsers alike, “you can keep this copy for a day before you bother me again.”

There are a few other values on Cache-Control worth knowing, because you’ll use them all the time:

  • no-store means “don’t cache this at all, ever.” You use this for private, sensitive responses.
  • no-cache is a bit of a trap name. It doesn’t mean “don’t cache.” It means “you can keep a copy, but check with me before serving it.”
  • private means “only the user’s own browser may cache this, not the shared CDN.” Good for personalized pages.
  • public means “anyone, including the CDN, may cache this.” Good for shared static files.
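
To make those directives concrete, here’s a minimal Python sketch of how a shared cache might read a Cache-Control value. The function names (parse_cache_control, shared_cache_may_store, needs_revalidation, ttl_seconds) are made up for illustration, and a real CDN handles many more directives and edge cases than this.

```python
from typing import Dict, Optional


def parse_cache_control(header: str) -> Dict[str, Optional[str]]:
    """Split a Cache-Control value into a {directive: value} mapping."""
    directives: Dict[str, Optional[str]] = {}
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, value = part.partition("=")
        # Directives without "=", like "public", map to None.
        directives[name.strip().lower()] = value.strip() or None
    return directives


def shared_cache_may_store(directives: Dict[str, Optional[str]]) -> bool:
    """A shared cache (the CDN) must skip no-store and private responses."""
    return "no-store" not in directives and "private" not in directives


def needs_revalidation(directives: Dict[str, Optional[str]]) -> bool:
    """no-cache: a copy may be kept, but check with the origin before serving."""
    return "no-cache" in directives


def ttl_seconds(directives: Dict[str, Optional[str]]) -> int:
    """max-age in seconds. Assumption: a missing max-age is treated as 0 here;
    real caches may fall back to other headers or heuristics."""
    value = directives.get("max-age")
    return int(value) if value is not None else 0
```

For example, parsing `public, max-age=86400` gives a response the CDN may store for 86400 seconds, while `private, no-store` gives one the CDN must not store at all.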

So when a request hits the edge, the edge runs a simple check. Is my copy still fresh, meaning still inside its TTL? If yes, serve it. If no, go ask the origin. Here’s that decision as a picture.

Request hits edge → Copy in cache?

  • No → Fetch from origin → Cache with TTL, then serve
  • Yes → Still fresh?
      • Yes (within TTL) → Serve cached copy
      • No (TTL expired) → Fetch from origin → Cache with TTL, then serve
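
The same decision can be sketched in a few lines of Python. This is only an illustration under some assumptions: EdgeCache is a made-up class, fetch_from_origin is a hypothetical callback that returns the body plus a TTL in seconds, and the clock is injectable so the logic can be tried without waiting.

```python
import time
from typing import Callable, Dict, Tuple


class EdgeCache:
    """Toy edge server: serve fresh copies, refetch missing or expired ones."""

    def __init__(
        self,
        fetch_from_origin: Callable[[str], Tuple[bytes, int]],
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch_from_origin  # returns (body, ttl_seconds)
        self._clock = clock
        self._store: Dict[str, Tuple[bytes, float]] = {}  # url -> (body, expires_at)

    def get(self, url: str) -> Tuple[bytes, str]:
        now = self._clock()
        entry = self._store.get(url)
        # Copy in cache and still inside its TTL? Serve it.
        if entry is not None and now < entry[1]:
            return entry[0], "HIT"
        # Missing or expired: fetch from origin, cache with TTL, then serve.
        body, ttl = self._fetch(url)
        if ttl > 0:
            self._store[url] = (body, now + ttl)
        return body, "MISS"
```

With a 60 second TTL, the first request is a MISS, repeat requests inside the minute are HITs, and the first request after the minute is a MISS again.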

The origin is the boss of caching

Notice that the origin decides the rules by sending Cache-Control. By default, the CDN just obeys what the header says (many CDNs also let you override it with their own rules, but that’s the exception). So if your files are caching wrong, the fix usually isn’t in the CDN dashboard. It’s in the headers your origin server sends.

🔄 Cache Invalidation and Purging

TTL handles the normal case where a copy just slowly expires. But sometimes you change a file right now and can’t wait for the TTL to run out. That’s where invalidation comes in.

  • Cache invalidation means telling the CDN to drop a cached copy before its TTL is up, because the content changed.
  • The action of forcing the edge to throw away a copy is called a purge. People say “purge it” and “invalidate it” to mean the same thing.

So the flow is simple:

  • You deploy a change to the origin, say you fixed a wrong price on a page.
  • You purge that file on the CDN.
  • The edge drops its old copy. The next request becomes a miss, so the edge fetches the fresh version and starts serving that.
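
Here’s that flow as a small Python sketch. PurgeableEdge is a hypothetical class, TTLs are left out to keep the focus on the purge, and real CDNs expose purging through their own APIs or dashboards rather than a method like this.

```python
from typing import Callable, Dict


class PurgeableEdge:
    """Toy edge that keeps copies until they are explicitly purged."""

    def __init__(self, fetch_from_origin: Callable[[str], str]) -> None:
        self._fetch = fetch_from_origin
        self._copies: Dict[str, str] = {}
        self.origin_fetches = 0  # how many times we had to go back to the origin

    def get(self, url: str) -> str:
        if url not in self._copies:  # miss: fetch and keep a copy
            self.origin_fetches += 1
            self._copies[url] = self._fetch(url)
        return self._copies[url]

    def purge(self, url: str) -> bool:
        """Drop the cached copy. Returns True if there was one to drop."""
        return self._copies.pop(url, None) is not None
```

If the origin changes a price and you don’t purge, get keeps returning the old copy. After purge, the next get is a miss, so it fetches the corrected page and origin_fetches goes up by one.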

Now here’s the trade-off, and it’s the part interviewers like to poke at:

  • Purging works, but it’s not free. Right after a purge, lots of users hit the edge and find nothing there, so a flood of misses goes back to the origin all at once.
  • If you purge constantly, you’re basically defeating the cache. The origin gets overloaded, which is the exact thing the CDN was supposed to prevent.

So purging is a tool for the occasional “oops, this really changed” moment, not your everyday update plan. For everyday updates there’s a cleaner trick, which is next.

🏷️ Cache Busting

Purging is you chasing the CDN saying “forget the old copy.” Cache busting flips it around so you never have to chase at all.

  • Cache busting means changing the file’s URL or name whenever its contents change.
  • A new name looks like a brand new file to the CDN, so the edge has never seen it and fetches the fresh one automatically. No purge needed.

So instead of editing app.js in place and hoping the cache lets go, you ship a new name:

<!-- Old version -->
<script src="/app.v1.js"></script>
<!-- After a change, the name changes too -->
<script src="/app.v2.js"></script>

Let’s see why this is so clean:

  • The HTML page points to app.v2.js now. That’s a name the CDN has never cached, so the edge grabs the new file straight away.
  • The old app.v1.js can keep a super long TTL, because it will never change again. A file that never changes is the perfect thing to cache forever.
  • In real build tools the version is usually a hash of the file’s contents, like app.3f9a2c.js, so the name changes automatically every time the file does.
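
As a rough idea of what a build tool does here, this Python sketch derives the filename from a hash of the contents. The helper name busted_name, the choice of SHA-256, and the 8 character digest length are all assumptions for illustration; each real bundler has its own scheme.

```python
import hashlib
from pathlib import PurePosixPath


def busted_name(filename: str, content: bytes, digest_chars: int = 8) -> str:
    """Return the filename with a short content hash before the extension."""
    digest = hashlib.sha256(content).hexdigest()[:digest_chars]
    path = PurePosixPath(filename)
    # "app.js" becomes "app.<digest>.js"; any directory part is preserved.
    return str(path.with_name(f"{path.stem}.{digest}{path.suffix}"))
```

The same contents always produce the same name, so an unchanged file keeps its cached copy. Change one byte and the name changes, which the CDN sees as a brand new file.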

So cache busting lets you have the best of both worlds. You cache static files aggressively with a long TTL, and updates show up instantly, because an update is just a new filename.

⬇️⬆️ Push vs Pull CDNs

So far we’ve assumed the CDN gets content from the origin only when someone asks for it. That’s actually just one of two models, and knowing both is a common interview point.

  • A pull CDN fetches content from your origin on the first request, then caches it. It’s lazy. It only grabs a file the moment someone actually wants it.
  • A push CDN is the opposite. You upload your content to the CDN ahead of time, before any user asks for it. You push it up there yourself.

Here’s the easy way to feel the difference:

  • With pull, the first user to request a file pays a small price, because it’s a miss and the edge has to go fetch it. After that everyone gets a fast hit. You barely manage anything.
  • With push, you do the work upfront, so even the very first user gets a hit. But now you are responsible for putting files there and updating them.

                     | Pull CDN                                     | Push CDN
How content arrives  | Edge fetches from origin on first request    | You upload it to the CDN ahead of time
First request        | A miss, so it’s a little slow once           | Already there, so it’s fast
Who manages it       | Mostly automatic, easy setup                 | You manage uploads and updates
Best for             | Lots of small files, sites that change often | Large files like big videos or downloads

So which do you pick? Pull is the default for most websites because it’s simple and self-managing. Push makes sense when you have a few big files, like a giant video or installer, where you don’t want any user stuck waiting on that first slow miss.
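
To feel the difference in code, here’s a small Python sketch of both models. PullCDN and PushCDN are hypothetical classes, and TTLs are left out so the only thing on show is how content gets to the edge.

```python
from typing import Callable, Dict, Tuple


class PullCDN:
    """Lazy: fetches from the origin the first time a path is requested."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]) -> None:
        self._fetch = fetch_from_origin
        self._edge: Dict[str, bytes] = {}

    def get(self, path: str) -> Tuple[bytes, str]:
        if path in self._edge:
            return self._edge[path], "HIT"
        body = self._fetch(path)  # the first user pays for this trip
        self._edge[path] = body
        return body, "MISS"


class PushCDN:
    """Eager: only serves what was uploaded ahead of time."""

    def __init__(self) -> None:
        self._edge: Dict[str, bytes] = {}

    def upload(self, path: str, body: bytes) -> None:
        self._edge[path] = body  # you own uploads and updates

    def get(self, path: str) -> Tuple[bytes, str]:
        # Raises KeyError if the file was never pushed.
        return self._edge[path], "HIT"
```

With PullCDN the first get is a MISS and later ones are HITs. With PushCDN even the first get is a HIT, but only because upload was called beforehand.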

🧩 Static vs Dynamic Content

Now back to the other hard question: what should you even cache? This comes down to whether content is static or dynamic.

  • Static content is the same for everyone and doesn’t change often. Images, CSS, JavaScript, fonts, videos. The same logo goes to every single user.
  • Dynamic content is different per user or changes constantly. A personal account page, a shopping cart, a live score. Everyone needs a different answer.

So the rule of thumb splits cleanly:

  • Cache static content aggressively. Long TTL, cache busting for updates. There’s no risk, because the file is identical for everyone.
  • Be careful with dynamic content. Either use a very short TTL, or don’t cache it at all with no-store.

But here’s a nice middle ground people forget about:

  • Some dynamic content is the same for everyone but still changes a bit, like a news homepage that updates every minute. You can cache that for a short TTL, say 30 or 60 seconds. Even one minute of caching takes a huge load off the origin during a spike.
  • Truly personalized content, the stuff unique to one logged-in user, should stay off the shared cache. Mark it private or no-store.
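
To put a rough number on that “huge load off the origin” claim, here’s a back-of-the-envelope Python sketch. The function name and the example figures (5000 requests per second, 200 edge servers) are hypothetical, and it assumes each edge refetches a URL at most once per TTL, which means the edge has to collapse simultaneous misses into one origin fetch.

```python
def origin_requests_per_minute(
    requests_per_second: float, ttl_seconds: float, edge_count: int
) -> float:
    """Rough upper bound on origin fetches per minute for one popular URL."""
    uncached = requests_per_second * 60
    if ttl_seconds <= 0:
        return uncached  # no caching: every request reaches the origin
    # Each edge refreshes its copy about once per TTL.
    refreshes_per_edge = 60 / ttl_seconds
    return min(uncached, edge_count * refreshes_per_edge)


# Hypothetical spike: 5000 requests/second spread across 200 edge servers.
no_cache = origin_requests_per_minute(5000, 0, 200)       # 300000
one_minute = origin_requests_per_minute(5000, 60, 200)    # 200.0
```

Under those assumptions, a 60 second TTL turns 300000 origin requests per minute into about 200, one per edge.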

Never cache one user's private page on the shared CDN

If the edge accidentally caches Alex’s account page and then serves that same copy to another user, you’ve just leaked Alex’s private data to a stranger. Always mark personalized responses as private or no-store so the shared CDN never holds them.

⚡ Good Defaults

If you remember nothing else, remember this pair of defaults. They cover the vast majority of real sites.

  • For static assets (images, CSS, JS, fonts): use a long TTL, like a year, plus cache busting via versioned filenames. The file gets cached everywhere forever, and updates ship as new names. Fast and always correct.
  • For dynamic or personalized content: use a short TTL or no caching at all. Mark per-user responses private or no-store so the shared cache never touches them.

So a typical pair of headers looks like this. A long-lived versioned asset says keep me forever:

Cache-Control: public, max-age=31536000

And a private user response says don’t cache me at all:

Cache-Control: private, no-store

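That’s 31536000 seconds, which is a year (365 days), for the safe static file, and a flat “don’t store this anywhere” for the private one. Here no-store already covers every cache, the user’s browser included, and private is an extra guard for shared caches. Start from these two defaults and you’ll rarely go wrong.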
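
If you wanted to apply these defaults on the origin, the rule could look something like this Python sketch. The function name, the list of static suffixes, and the 60 second fallback for shared dynamic pages are assumptions for illustration, not a standard.

```python
# Assumed set of static asset extensions; adjust for your own site.
STATIC_SUFFIXES = (".css", ".js", ".png", ".jpg", ".svg", ".woff2", ".mp4")


def default_cache_control(path: str, personalized: bool = False) -> str:
    """Pick a Cache-Control value from the two defaults above."""
    if personalized:
        # Per-user responses stay off every cache.
        return "private, no-store"
    if path.lower().endswith(STATIC_SUFFIXES):
        # Versioned static assets: cache for a year, update via new filenames.
        return "public, max-age=31536000"
    # Shared but changing pages: a short TTL (60 seconds is an assumption).
    return "public, max-age=60"
```

So `/static/app.3f9a2c.js` gets the year-long header, an account page called with personalized=True gets `private, no-store`, and a shared news page gets a one minute TTL.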

⚠️ Common Mistakes and Misconceptions

A few caching habits cause real outages and data leaks. Let’s clear them up.

  • “Just cache everything forever.” Sounds fast, but the moment you update anything, every edge (and every browser holding a copy) keeps serving the old one. You can purge the CDN, but you can’t purge a user’s browser. Long TTLs are only safe with cache busting.
  • “Never cache anything, to be safe.” Now every request goes all the way to the origin, so the CDN does nothing and your site is slow and your origin is overloaded. You’ve thrown away the whole point.
  • “My deploy is live, so users see the new version.” Not unless you purged or used a new filename. The edge happily serves the old cached copy until its TTL expires. Forgetting to purge after a deploy is the classic “why am I still seeing the old site” bug.
  • “It’s fine to cache the logged-in homepage.” No. If it’s personalized, the CDN might serve one user’s page to another. Personalized pages must be private or no-store.

🛠️ Design Challenge

Try this one yourself to lock the ideas in.

Alex runs a news site that gets sudden traffic spikes when a big story breaks. Design the caching strategy:

  • The site logo, CSS, and JavaScript bundles? Static, so cache them a year with versioned filenames.
  • The article pages, which are the same for every reader but get edited as news develops? Shared but changing, so a short TTL like 60 seconds works, and a purge when a major correction goes out.
  • A logged-in reader’s saved-articles list and account settings? Personalized, so mark them private or no-store and keep them off the shared cache.

Then ask yourself: when Alex pushes a corrected article, how do they make readers stop seeing the wrong version fast? Walk through the short TTL, a purge, and why they wouldn’t just purge on every tiny edit. That reasoning is exactly what a system design interview wants to hear.

🧩 What You’ve Learned

You can now explain how a CDN decides what to cache and how to keep it fresh. Here’s what you’ve picked up.

  • ✅ TTL is how long the edge keeps a copy, and the origin sets it with the Cache-Control header (max-age, no-store, no-cache, private, public).
  • ✅ Cache invalidation, or purging, drops a cached copy on demand, but purging too often hammers the origin with misses.
  • ✅ Cache busting changes a file’s name when it changes, so updates ship instantly with no purge needed.
  • ✅ A pull CDN fetches from the origin on first request; a push CDN holds content you upload ahead of time.
  • ✅ Cache static content aggressively with long TTLs and cache busting, and treat dynamic or personalized content with short TTLs or no caching.
  • ✅ Good defaults: long TTL plus cache busting for static assets, and short or no caching for dynamic and private responses.

Check Your Knowledge

Test what you learned. Answer each question yourself, then read the explanation.

  1. What does TTL control on a CDN, and who sets it?

     Why: TTL is how long the edge keeps a copy before re-checking, and the origin sets it by sending the Cache-Control header.

  2. What is the main difference between purging and cache busting?

     Why: Purging forces the edge to drop a copy right away, while cache busting changes the file’s name so the CDN fetches it as a new file with no purge needed.

  3. Why is purging too often a bad idea?

     Why: Right after a purge, many requests become misses and hit the origin together, so purging constantly overloads the origin and defeats the cache.

  4. How should you handle a logged-in user’s private account page on a shared CDN?

     Why: Personalized pages must be marked private or no-store, or the shared CDN might serve one user’s page to another user.

🚀 What’s Next?

You’ve now got the deeper caching picture, so let’s connect it back to the bigger story.

  • CDN Explained is the intro this lesson builds on, covering edge servers, hits and misses, and why CDNs exist.
  • Introduction to Caching zooms out to the caching idea that powers CDNs, browsers, databases, and a whole lot more.

Get those two down and you’ll have a solid grip on the caching and CDN fundamentals every system design interview leans on.
