Capacity Estimation in System Design Interviews

Here’s a question that trips up almost everyone:

  • You’ve sketched your design on the board, and the interviewer asks, “So, how many servers do you need? One? A thousand?”
  • And you just guess. You say “a few” and hope they move on.
  • The thing is, you can’t pick the right pieces for a system if you have no feel for how big it is.

So how do you actually know? You do a little quick math. Not exact, not with a calculator, just rough numbers that tell you whether you’re building for a small crowd or a giant one. That skill is called capacity estimation, and that’s exactly what we’ll learn here.

🎯 Why Estimate at All

Let’s be clear about what this math is for. It’s not so you can write a precise number on the board and look smart:

  • Rough numbers guide your real design choices. They tell you whether you even need a cache, more database copies, or a network of servers spread around the world.
  • A system for a thousand users and a system for a billion users look completely different. The estimate tells you which one you’re building.
  • It also shows the interviewer that you don’t just memorize designs, you reason about scale like a real engineer.

So this is the step where vague turns into concrete. Once you know roughly how many requests hit per second and how much data piles up, the rest of your design almost picks itself.

Exactness is not the goal

Nobody expects perfect numbers here. If the real answer is 9,400 and you say 10,000, that’s totally fine. You just need the right ballpark, because the ballpark is what changes your design.

🧮 Back-of-the-Envelope Math

The fancy name for this is back-of-the-envelope estimation. It just means quick, rough calculations you could do on the back of an envelope, no calculator needed:

  • You round everything to easy numbers, like 10 million instead of 9.7 million.
  • You use powers of ten so the multiplication stays simple.
  • You say the number out loud, then move on. Precision is not the point.

Think of it like guessing how much paint you need for a room. You don’t measure every wall to the millimeter. You eyeball it, round up, and buy roughly the right amount. Same spirit here.
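If you like seeing a habit as code, here is a minimal sketch of the rounding step in Python. The helper name `nearest_power_of_ten` is made up for illustration, and snapping to the nearest power of ten is just one reasonable reading of “easy numbers”.

```python
import math

def nearest_power_of_ten(value: float) -> int:
    """Snap a positive number to the closest power of ten (on a log scale)."""
    if value <= 0:
        raise ValueError("value must be positive")
    # log10 gives the exponent; rounding it picks the nearest power of ten.
    return 10 ** round(math.log10(value))

users = nearest_power_of_ten(9_700_000)    # 9.7 million becomes 10 million
seconds = nearest_power_of_ten(86_400)     # seconds in a day becomes 100,000
```

This is deliberately blunt. For a number like 50 million you would usually keep the leading digit rather than jump to the next power of ten.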

📊 What to Estimate

There are a handful of numbers worth working out, and they build on each other. Here’s what you’re after:

  • Daily active users. This is how many people actually use the system each day, often shortened to DAU. It sets the overall size of everything else.
  • Requests per second. This is how many requests the system handles each second, usually called QPS, which is short for queries per second. It tells you how hard your servers get hit.
  • Storage growth. This is how much new data piles up over time. It tells you how big your database needs to get.
  • Bandwidth. This is how much data flows in and out per second. It tells you how fat your network pipes need to be.

The flow from one to the next is pretty natural. You start with users, turn that into requests, then squeeze QPS, storage, and bandwidth out of it. Here’s the path.

Daily active users → Requests per day → QPS (requests/sec), Storage growth/day, Bandwidth (data/sec)
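That path can be sketched as one small Python function. Treat it as a rough sketch under stated assumptions: the function name, the parameter names, and the idea that bandwidth is roughly QPS times an average response size are all choices made for this illustration, and the 1 KB response size in the usage line is a made-up input.

```python
SECONDS_PER_DAY = 100_000  # 86,400 rounded up to keep the math easy

def estimate_capacity(dau, requests_per_user, write_percent,
                      bytes_per_write, bytes_per_response):
    """Walk the path: users -> requests/day -> QPS, storage, bandwidth."""
    requests_per_day = dau * requests_per_user
    qps = requests_per_day // SECONDS_PER_DAY
    writes_per_day = requests_per_day * write_percent // 100
    return {
        "requests_per_day": requests_per_day,
        "qps": qps,
        "storage_bytes_per_day": writes_per_day * bytes_per_write,
        # Assumption: outbound bandwidth is about QPS x average response size.
        "bandwidth_bytes_per_sec": qps * bytes_per_response,
    }

# 10 million users, 10 requests each, 10% writes, 1 KB per write,
# and an assumed 1 KB average response.
estimate = estimate_capacity(10_000_000, 10, 10, 1_000, 1_000)
```

Every output is derived from the same two inputs, users and requests per user, which is why getting those roughly right matters most.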

⌨️ A Worked Example

Let’s actually run the numbers for a simple social app, so you see how easy it is. Say we’re told it has 10 million daily active users, and each user makes about 10 requests a day.

First, turn users into requests per day. We just multiply.

10,000,000 users × 10 requests = 100,000,000 requests/day

Now we want QPS, the requests per second. A day has 86,400 seconds, but we round that up to 100,000 to keep the math easy. So we divide requests per day by that. Rounding up makes the answer roughly 15% low, which is still well inside the ballpark.

100,000,000 requests/day ÷ 100,000 seconds ≈ 1,000 QPS

So this system handles roughly a thousand requests every second on average, and peaks will run higher. That’s already more than we’d want to trust to a single server, so we’ll put a load balancer in front and spread traffic across several.
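If you’re curious how much the 100,000 shortcut costs you, this small Python check compares it with the exact 86,400. The variable names are just for illustration.

```python
requests_per_day = 10_000_000 * 10          # 10M users x 10 requests each

qps_rounded = requests_per_day / 100_000    # the easy mental-math version
qps_exact = requests_per_day / 86_400       # the precise version

# How far below the exact value the shortcut lands, as a fraction.
shortfall = (qps_exact - qps_rounded) / qps_exact

print(f"rounded: {qps_rounded:.0f} QPS, exact: {qps_exact:.0f} QPS")
print(f"shortcut is about {shortfall:.0%} low")
```

Both answers sit in the same ballpark of about a thousand requests per second, so the design decision doesn’t change.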

Now let’s estimate storage. Say each request that writes data stores about 1 KB, and say 10% of requests are writes. That’s 10 million writes a day.

10,000,000 writes/day × 1 KB ≈ 10 GB/day

So we’re piling up about 10 GB every single day. Over a year that’s about 3.65 TB, call it 4 TB, which tells us storage will grow fast and we should plan for it. Notice how we rounded at every step and never touched a calculator. That rough number is all you need to make the next decision.
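Here is the same storage math as a short Python sketch, using the decimal units from this lesson (1 KB = 1,000 bytes). The constant and variable names are only for illustration.

```python
KB = 1_000
GB = 1_000_000_000
TB = 1_000_000_000_000

requests_per_day = 100_000_000
writes_per_day = requests_per_day * 10 // 100   # 10% of requests are writes

bytes_per_day = writes_per_day * 1 * KB         # each write stores about 1 KB
bytes_per_year = bytes_per_day * 365

print(bytes_per_day / GB, "GB per day")
print(bytes_per_year / TB, "TB per year")
```

Multiplying by 365 is the only step that isn’t a power of ten. In your head you can round it to 400 and say “about 4 TB per year”.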

Always say the units out loud

Numbers without units are meaningless. Is it 10 GB per day or per second? Per user or total? Say the full thing, “10 GB per day”, every time. It keeps you honest and keeps the interviewer following along.
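One way to build the habit into code is a formatter that refuses to print a size without its time period. This is a small sketch, and `human_bytes` is a hypothetical helper name, not a standard library function.

```python
def human_bytes(num_bytes, per):
    """Format a byte count with a decimal unit and a required time period."""
    units = ["B", "KB", "MB", "GB", "TB", "PB"]
    value = float(num_bytes)
    for unit in units:
        # Stop at the first unit where the number is small enough to read.
        if value < 1000 or unit == units[-1]:
            return f"{value:g} {unit} per {per}"
        value /= 1000

daily = human_bytes(10_000_000_000, "day")
yearly = human_bytes(3_650_000_000_000, "year")
```

Because `per` has no default, you can’t call the function without saying whether you mean per day, per second, or per year.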

🔢 Handy Numbers to Remember

You’ll do this math faster if a few numbers live in your head. Keep this little cheat sheet handy.

Thing | Rough value | Why it helps
Seconds in a day | ≈ 86,400, round to 100,000 | Turns requests/day into QPS
Thousand | 1,000 (10³, “K”) | Small counts, like QPS
Million | 1,000,000 (10⁶, “M”) | Daily users, requests
Billion | 1,000,000,000 (10⁹, “B”) | Huge-scale user counts
KB / MB / GB / TB | each ≈ 1,000× the last | Storage and bandwidth sizes
Read:write ratio | often 10:1 or higher | Most apps read far more than write

The one to burn into memory is 86,400 seconds in a day, rounded to 100,000. Almost every QPS estimate starts there.

🎯 Read-Heavy vs Write-Heavy

Here’s a number that quietly shapes your whole design: the read:write ratio. A read is when the system fetches data to show you, and a write is when it stores new data:

  • Most real systems read far more than they write. On a social feed, way more people scroll than post.
  • So you estimate reads and writes separately, not just one total. A 10:1 read-to-write ratio means ten reads for every write, like ten views for each new post.
  • This ratio drives your big choices. Heavy on reads pushes you toward caching and extra database copies. Heavy on writes pushes you toward handling a flood of incoming data.

So don’t just say “1,000 QPS” and stop. Split it. Say “about 900 reads per second and 100 writes per second.” That split is what tells you where the pressure really is.
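The split itself is one line of arithmetic. Here is a minimal Python sketch, with `split_qps` as a made-up helper name.

```python
def split_qps(total_qps, reads, writes):
    """Split a total QPS figure using a reads:writes ratio."""
    parts = reads + writes
    read_qps = total_qps * reads // parts
    write_qps = total_qps - read_qps   # whatever is left over is writes
    return read_qps, write_qps

nine_to_one = split_qps(1_000, 9, 1)   # 10% writes is a 9:1 ratio
ten_to_one = split_qps(1_000, 10, 1)   # a 10:1 ratio is 1 write in 11 requests
```

Notice the small difference: “10% of requests are writes” gives 900 reads and 100 writes, while a true 10:1 ratio gives about 909 reads and 91 writes. For a rough estimate either is fine, as long as you say which one you mean.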

Don't forget to split reads and writes

A system can look calm on total QPS but be brutal on writes, or the other way around. Treating reads and writes as one lump hides the real bottleneck. Always estimate the two separately.

🧩 Turning Estimates Into Decisions

This is the payoff. Your numbers aren’t just for show, each one points at a concrete design move. Here’s how to read them:

  • High QPS? You don’t want to bet on a single server when thousands of requests arrive every second. So you reach for a load balancer to spread traffic across many servers, plus a cache to keep popular data in fast memory.
  • Big storage growth? When the data won’t fit on one machine, you split it across many. That splitting is called sharding, where each machine holds just a chunk of the data.
  • Lots of reads? You add read replicas, which are extra copies of the database that share the reading load. And you put a CDN, a network of servers spread worldwide, close to users so they get data fast.
  • Lots of writes? You lean on queues, which line up the writes and handle them in the background, so the app stays fast under the flood.

See the pattern? Every estimate maps to a tool. That’s why this math matters. The numbers justify your design instead of you just guessing.
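The mapping above can be sketched as a simple rule table in Python. Be careful with this one: every threshold below is an invented placeholder, not a real hardware limit, and the function name is hypothetical. In an interview you would state your own limits and explain them.

```python
def suggest_tools(read_qps, write_qps, storage_tb_per_year,
                  max_qps_per_server=500,   # assumed, not a real limit
                  max_tb_per_machine=10,    # assumed, not a real limit
                  heavy_read_qps=1_000,     # assumed cutoff for "lots of reads"
                  heavy_write_qps=1_000):   # assumed cutoff for "lots of writes"
    """Map rough estimates to the design moves they point at."""
    suggestions = []
    if read_qps + write_qps > max_qps_per_server:
        suggestions.append("load balancer + cache")
    if storage_tb_per_year > max_tb_per_machine:
        suggestions.append("sharding")
    if read_qps > heavy_read_qps:
        suggestions.append("read replicas + CDN")
    if write_qps > heavy_write_qps:
        suggestions.append("write queue")
    return suggestions

small_app = suggest_tools(900, 100, 3.65)
big_app = suggest_tools(10_000, 2_000, 50)
```

The useful part isn’t the exact cutoffs. It’s that each suggestion is tied to a number you estimated, so you can defend it.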

⚠️ Common Mistakes and Misconceptions

A few traps catch people every time. Let’s clear them out:

  • “I need exact numbers.” No. Chasing precision wastes your time and misses the point. Round hard, get the ballpark, move on.
  • “I’ll skip estimation and just design.” Then you’re guessing whether you need a cache or sharding. The estimate is what tells you, so skipping it means flying blind.
  • “Requests per day is the same as per second.” It’s not, and forgetting to convert is the classic blunder. Always divide by the seconds in a day to get QPS.
  • “One total QPS is enough.” It hides the read:write split. A write-heavy system needs a very different design from a read-heavy one, so separate them.
  • “Bigger storage numbers are better.” Inflating numbers to sound impressive just makes your design wrong. Be honest with your rounding.

🛠️ Practice Challenge

Time to run the math yourself. Take this prompt: “Estimate the scale for a photo-sharing app like a simple Instagram.”

Work through it on paper, step by step:

  • Users. Assume 50 million daily active users. Write that down.
  • Requests. Say each user views about 20 photos a day and posts 1. Turn that into reads per day and writes per day.
  • QPS. Divide each by 100,000 seconds to get read QPS and write QPS separately.
  • Storage. Say each photo is about 1 MB. Multiply by the number of new photos a day to get storage growth per day.
  • Decisions. Given your numbers, what would you add? A cache? Read replicas? A CDN? Sharding? Name each one and say why.

Do this out loud, like someone’s listening. The more you rehearse the math on small prompts, the calmer you’ll be when the real “how big is it?” question lands.
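Once you’ve tried it on paper, you can check your arithmetic with a short Python sketch like this one. It uses only the numbers given in the prompt, and the variable names are just for illustration.

```python
SECONDS_PER_DAY = 100_000        # 86,400 rounded up
MB = 1_000_000
TB = 1_000_000_000_000

dau = 50_000_000
reads_per_day = dau * 20         # each user views about 20 photos
writes_per_day = dau * 1         # each user posts 1 photo

read_qps = reads_per_day // SECONDS_PER_DAY
write_qps = writes_per_day // SECONDS_PER_DAY

# Each new photo is about 1 MB.
storage_tb_per_day = writes_per_day * 1 * MB // TB

print(read_qps, "read QPS,", write_qps, "write QPS")
print(storage_tb_per_day, "TB of new photos per day")
```

The decisions step is still yours to argue. The code only gives you the numbers to argue from.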

🧩 What You’ve Learned

You can now size up a system with a few lines of quick math. Here’s what you’ve picked up.

  • ✅ Capacity estimation is rough math that guides real design choices, not an exact-number exercise.
  • ✅ You estimate daily users, QPS, storage growth, and bandwidth, each building on the last.
  • ✅ QPS is requests per day divided by the seconds in a day, about 86,400, rounded to 100,000.
  • ✅ Round to easy numbers and use powers of ten so the math stays simple.
  • ✅ The read:write ratio matters, so always estimate reads and writes separately.
  • ✅ High QPS points to load balancing and caching, big storage to sharding, heavy reads to replicas and a CDN.

Check Your Knowledge

Test what you learned. Answer each question yourself, then compare with the explanation under it.

  1. What is the main goal of back-of-the-envelope estimation?

    Why: The point is a rough ballpark for users, QPS, and storage that drives real design decisions, not exact figures.

  2. How do you turn requests per day into QPS?

    Why: You divide total requests per day by the seconds in a day, around 86,400 and usually rounded to 100,000.

  3. Why should you estimate reads and writes separately?

    Why: A read-heavy system wants caching and replicas, while a write-heavy one wants queues, so the split reveals where the real load is.

  4. Big storage growth most directly points you toward which design move?

    Why: When data will not fit on one machine, you split it across many through sharding, with each machine holding a chunk.

🚀 What’s Next?

You’ve got the math. Now pair it with the other interview skills that lean on these numbers.

Run these estimates on a few small prompts, and “how many servers do you need?” will stop feeling like a trap and start feeling like simple arithmetic.
