Solving a Complete System Design Interview Question

You’ve learned the steps one by one. Now let’s put them all together on a real problem:

  • The interviewer leans back and says, “Design a URL shortener for me.”
  • The clock starts. You’ve got about forty-five minutes and a blank whiteboard.
  • This time we won’t talk about the process in the abstract. We’ll actually solve it, start to finish, like you’re sitting in the room.

Follow along as if you’re the candidate, Alex, working through the problem out loud. We’ll narrate Alex’s thinking at every step, so you can see not just the answer, but how a good answer gets built.

🎯 How to Read This

Quick note before we dive in, so you know what you’re looking at:

  • We’ll walk every single step of the interview template on this one problem, in order.
  • At each step you’ll see what Alex says out loud, and why Alex says it.
  • The goal isn’t to memorize this design. It’s to feel the rhythm of the process so you can repeat it on any prompt.

If you want the bird’s-eye view of the steps first, read A System Design Interview Template. This lesson is that template in action. Okay, let’s go.

1️⃣ Clarify the Requirements

The first thing Alex does is the thing most people skip: hold off on drawing a single box. Instead, Alex asks questions, because “design a URL shortener” could mean a lot of things, and the job is to shrink it down to something buildable.

Here’s Alex talking to the interviewer:

  • “Just to be clear, the core job is: take a long URL and give back a short one, and when someone clicks the short one, send them to the long one. Right?”
  • “Do we need custom aliases, like letting a user pick short.ly/my-event? Or analytics on clicks? Or link expiry?”
  • “Roughly how many users are we building for? A small team, or internet scale?”

The interviewer says: core shorten-and-redirect, build it for internet scale, and treat custom aliases, analytics, and expiry as nice-to-haves if there’s time.

So Alex pins the scope into two buckets. First the functional requirements, which are the things the system must actually do:

  • Take a long URL and return a short URL.
  • When someone hits the short URL, redirect them to the original long URL.

Then the non-functional requirements, which are about how well it does those things:

  • Redirects must be fast, because a person is waiting on a click. A slow redirect feels broken.
  • The service must be highly available, meaning it stays up almost all the time. A dead link is useless.
  • It must scale to billions of links and clicks without falling over.

Then Alex says the scope out loud to lock it in: “Okay, so we’re building fast, always-on shorten and redirect at large scale, and I’ll leave aliases, analytics, and expiry for the end if we have time.” That one sentence keeps both of you pointed at the same target.

Don't assume, ask

If Alex had just started drawing without asking, Alex might have spent twenty minutes designing analytics nobody wanted. One clarifying question up front saves you from building the wrong thing. When in doubt, ask.

2️⃣ Estimate the Scale

Now Alex gets a rough feel for how big this is. This quick, rough math is called back-of-the-envelope estimation, which just means numbers you could scribble on the back of an envelope, no calculator needed. The point isn’t to be exact, it’s to learn the shape of the problem.

Here’s how Alex reasons about it out loud:

  • “A link gets created once. That’s a write, meaning we store something new.”
  • “But after that, the same link might get clicked thousands of times. Each click is a read, meaning we just look something up.”
  • “So reads massively outnumber writes. A rough ratio people use is around 100 reads for every 1 write.”

That one fact decides almost everything, so Alex says it plainly: this is a read-heavy system, which means most traffic is just looking up links, not creating them. So we should pour our effort into making reads blazing fast, even if creating a link is a touch slower.

Then a couple of round numbers, just for the shape:

  • Say we get a few hundred new links per second at peak. That’s the write load, and it’s pretty light.
  • At a 100-to-1 ratio, that’s tens of thousands of redirects per second. That’s the read load, and it’s heavy. This is the number we design around.
  • Over years, billions of links pile up. Each row is tiny (a short code plus a URL), so storage is large but not scary, and it grows steadily.

Alex rounds everything, says it out loud, and moves on. The estimate is a compass, not a destination.
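Alex’s arithmetic fits in a few lines. This is only a rough sketch: the 300 writes per second, 500 bytes per row, and five-year horizon are illustrative assumptions, and only the 100-to-1 ratio comes from the reasoning above.

```python
# Back-of-the-envelope numbers for the shortener.
# Every input except the read/write ratio is an illustrative assumption.
WRITES_PER_SEC = 300     # assumed: "a few hundred new links per second"
READS_PER_WRITE = 100    # the rough 100:1 ratio from the estimate
BYTES_PER_ROW = 500      # assumed: short code + long URL + overhead
YEARS = 5                # assumed planning horizon

SECONDS_PER_YEAR = 365 * 24 * 60 * 60

reads_per_sec = WRITES_PER_SEC * READS_PER_WRITE
# Treats the peak write rate as sustained, so this is an upper bound.
total_links = WRITES_PER_SEC * SECONDS_PER_YEAR * YEARS
total_storage_tb = total_links * BYTES_PER_ROW / 1e12

print(f"Redirects per second: {reads_per_sec:,}")
print(f"Links after {YEARS} years: {total_links / 1e9:.1f} billion")
print(f"Storage after {YEARS} years: {total_storage_tb:.1f} TB")
```

With these inputs the read load lands at 30,000 redirects per second, and storage reaches tens of terabytes over five years. That matches the shape Alex described: heavy reads, light writes, large but manageable storage.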

Reads vs writes, in one line

A read fetches existing data, like following a short link. A write stores new data, like creating one. Knowing which one dominates drives the whole design, and here it’s reads by a mile.

3️⃣ Define the API

Before drawing any servers, Alex pins down what the service actually offers. The cleanest way is to list the API endpoints. An API, short for Application Programming Interface, is just the agreed set of requests our service accepts.

Alex says, “We really only need two endpoints here,” and writes them down.

POST /shorten
Body: { "long_url": "https://example.com/some/very/long/path?x=1" }
Response: { "short_url": "https://short.ly/abc123" }

GET /abc123
Response: 301 Redirect -> Location: https://example.com/some/very/long/path?x=1

Then Alex reads them back, one at a time:

  • POST /shorten is the create call, the write path. You hand it a long URL, and it hands back a short one. We use POST because we’re making something new on the server.
  • GET /abc123 is the redirect call, the read path, the one almost all the traffic hits. The browser asks for the short code, and the server answers with a redirect to the real URL.
  • That 301 is an HTTP status code meaning “moved permanently”. We’ll come back to whether 301 or 302 is the right call, because there’s a real trade-off hiding there.

See how just listing two endpoints already shows the shape? One light write path, one heavy read path. And Alex has barely drawn anything yet.
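To make the two endpoints concrete, here is a minimal in-memory sketch of their behavior. The function names, the dict standing in for the database, the 201 and 400 status codes, and the placeholder code format are all assumptions for illustration. Only the request and response shapes come from the endpoint listing above.

```python
from itertools import count

BASE_URL = "https://short.ly"   # domain from the example above
_links: dict = {}               # code -> long_url (stands in for the database)
_next_id = count(1)             # placeholder for real code generation


def handle_shorten(body: dict) -> tuple:
    """POST /shorten: store the long URL and return the short one."""
    long_url = body.get("long_url", "")
    if not long_url.startswith(("http://", "https://")):
        return 400, {"error": "long_url must be an http(s) URL"}  # assumed validation
    code = f"c{next(_next_id)}"  # placeholder; the deep dive covers real codes
    _links[code] = long_url
    return 201, {"short_url": f"{BASE_URL}/{code}"}


def handle_redirect(code: str) -> tuple:
    """GET /<code>: answer with a redirect, or 404 for an unknown code."""
    long_url = _links.get(code)
    if long_url is None:
        return 404, {}
    return 301, {"Location": long_url}


status, payload = handle_shorten(
    {"long_url": "https://example.com/some/very/long/path?x=1"}
)
short_code = payload["short_url"].rsplit("/", 1)[-1]
print(status, payload)
print(handle_redirect(short_code))
```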

4️⃣ High-Level Design

Now Alex draws the boxes. This is the high-level design, a simple picture of the main pieces and how a request flows through them. Alex narrates while drawing, so the interviewer follows along.

Here are the building blocks Alex names:

  • Client. The user’s browser or phone, where every request starts.
  • Load balancer. A traffic cop that spreads incoming requests across many servers so no single one gets crushed.
  • App servers. The machines that run our logic, like creating a code or looking one up.
  • Cache. A small, super-fast store that keeps the busiest links handy, so we skip the slow database when we can.
  • Database. The organized store that holds every code and its long_url.

[Diagram: Client (browser) -> Load balancer -> App servers -> Cache (hot links) -> Database]

Then Alex walks both flows out loud, because narrating the path is half the interview:

  • Creating a link (write): the client sends POST /shorten, the load balancer picks an app server, the server makes a short code, saves the code and long_url in the database, and returns the short URL.
  • Following a link (read): the browser hits GET /abc123, an app server checks the cache first, and only goes to the database if the cache misses. Then it replies with the redirect.

Notice the cache is sitting right in the read path, ready to make redirects fast. That’s no accident, it’s because we already know this is read-heavy.

5️⃣ Deep Dive: Generating the Short Code

Now the interviewer leans in: “How do you actually make that abc123 code? And how do you make sure two links never get the same one?” When two links accidentally share a code, that clash is called a collision, and we have to avoid it. This is the heart of the problem, so Alex slows down here.

Alex lays out the two common approaches.

Approach 1: a random string.

  • Generate a random 7-character string, like xK9p2mQ.
  • It’s simple, and the codes don’t reveal how many links exist.
  • The catch is collisions. Two random tries could land on the same string, so before saving you have to check whether that code already exists, and retry if it does. That extra check costs a little time on every write.
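Here is roughly what Approach 1 looks like in code. It is a sketch under assumptions: a plain dict stands in for the database, and `create_link` and `max_attempts` are hypothetical names, not part of any real library.

```python
import secrets
import string

# 62 symbols: 0-9, a-z, A-Z
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase
CODE_LENGTH = 7


def random_code() -> str:
    """Build a random 7-character code from the 62-symbol alphabet."""
    return "".join(secrets.choice(ALPHABET) for _ in range(CODE_LENGTH))


def create_link(store: dict, long_url: str, max_attempts: int = 5) -> str:
    """Pick a random code, retrying if that code is already taken."""
    for _ in range(max_attempts):
        code = random_code()
        if code not in store:      # the extra existence check on every write
            store[code] = long_url
            return code
    raise RuntimeError("no free code found; consider longer codes")
```

In a real database the check and the save would need to be atomic, for example by relying on a unique constraint and retrying when the insert is rejected. Otherwise two servers could both pass the check with the same code.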

Approach 2: an auto-increment ID, then base62.

  • The database hands every new link a number that goes up by one each time: 1, 2, 3, and so on. That’s an auto-increment ID.
  • A raw number like 1000000 is long and ugly, so we convert it into a short code using base62.
  • Base62 is just a way of writing a number using 62 symbols instead of the usual 10. The symbols are 0-9, a-z, and A-Z, which is 10 + 26 + 26 = 62, and that’s where the name comes from.
  • Because each character carries more, a big number turns tiny. Six characters cover about 56 billion IDs, and seven cover about 3.5 trillion.

Alex picks the second one and says why: “I’ll go with auto-increment plus base62. Every ID is unique by design, since the counter never repeats, so there are no collisions and no need to check first. That keeps writes clean and fast.” The reason is the part the interviewer is listening for.
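The base62 conversion itself is short. This sketch assumes the symbol order listed above (0-9, then a-z, then A-Z); other orderings work just as well as long as encoding and decoding agree. The function names are illustrative.

```python
import string

# Symbol order follows the text: 0-9, then a-z, then A-Z.
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase
BASE = len(ALPHABET)  # 62


def encode_base62(n: int) -> str:
    """Turn a non-negative integer ID into a base62 string."""
    if n < 0:
        raise ValueError("IDs must be non-negative")
    if n == 0:
        return ALPHABET[0]
    chars = []
    while n > 0:
        n, remainder = divmod(n, BASE)
        chars.append(ALPHABET[remainder])
    return "".join(reversed(chars))   # most significant symbol first


def decode_base62(code: str) -> int:
    """Turn a base62 string back into the integer ID."""
    n = 0
    for ch in code:
        n = n * BASE + ALPHABET.index(ch)
    return n


print(encode_base62(1_000_000))   # 4c92
print(decode_base62("4c92"))      # 1000000
```

The seven-digit ID 1000000 shrinks to the four-character code 4c92, and decoding gets the original ID back, so the server can go from code to database row without a separate lookup table.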

Why base62 keeps codes tiny

A 7-character base62 code can stand for over 3.5 trillion different links. Writing those same IDs as plain numbers would need many more characters. Base62 keeps the link short while still giving you a giant supply of unique codes.
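You can check that number with a couple of lines of arithmetic. The variable names are just for illustration.

```python
BASE = 62

# How many distinct codes fit in 5, 6, and 7 characters.
for length in range(5, 8):
    print(f"{length} characters: {BASE ** length:,} codes")

codes_7 = BASE ** 7
# Decimal digits needed to write the largest ID a 7-character code can hold.
digits_needed = len(str(codes_7 - 1))
print(f"Largest 7-character ID needs {digits_needed} decimal digits")
```

Seven base62 characters give 3,521,614,606,208 codes, while the same IDs written as plain decimal numbers need up to 13 digits.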

6️⃣ Deep Dive: Making Redirects Fast

Alex circles back to the big fact from the estimate: this is read-heavy. Tons of clicks, far fewer creates. So the next deep dive is making those reads quick.

Here’s Alex’s reasoning:

  • “Clicks are wildly uneven. A small set of links, like one viral tweet, soaks up most of the traffic. So I’ll keep those hot links in a cache.”
  • A cache is a small, super-fast store, usually in memory, that holds the data people ask for most.
  • “A common choice is Redis, an in-memory key-value store. In-memory means it lives in fast RAM instead of slow disk, so a lookup typically takes well under a millisecond on the server, plus a short network hop.”
  • “On a redirect, I check the cache first. A hit skips the database entirely, which is a huge speed-up. On a miss, I read from the database and then save it in the cache for next time.”
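That check-the-cache-first flow is often called cache-aside. Here is a minimal sketch of it, with plain dicts standing in for both Redis and the database so it runs anywhere. The class name and the `db_reads` counter are illustrative assumptions.

```python
class RedirectLookup:
    """Cache-aside read path: check the cache, fall back to the database."""

    def __init__(self, database: dict):
        self.database = database   # stands in for the real database
        self.cache: dict = {}      # stands in for Redis
        self.db_reads = 0          # counts trips down the slow path

    def resolve(self, code: str):
        long_url = self.cache.get(code)
        if long_url is not None:
            return long_url        # hit: the database is never touched
        self.db_reads += 1
        long_url = self.database.get(code)
        if long_url is not None:
            self.cache[code] = long_url   # save it for next time
        return long_url


lookup = RedirectLookup({"abc123": "https://example.com/some/very/long/path?x=1"})
lookup.resolve("abc123")   # miss: reads the database, then fills the cache
lookup.resolve("abc123")   # hit: served straight from the cache
print(lookup.db_reads)     # 1
```

A real cache would also need a size limit and an eviction policy, which this sketch leaves out.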

Then Alex adds one more layer for good measure: “A CDN could cache redirect responses close to users too, so the request doesn’t even travel far.” A CDN, short for Content Delivery Network, is a set of servers spread around the world. The redirect is the thing people wait on, so shaving milliseconds here makes the whole service feel fast.

For the full picture of how this caching trick works, see Introduction to Caching.

Why a small cache goes a long way

Because clicks are so lopsided, even a tiny cache holding just the hottest links can serve a big chunk of all traffic without ever touching the database. That’s exactly why caching pays off so well here.
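To put a rough number on “lopsided”, here is a sketch that assumes link popularity follows a Zipf-like curve, where the link at popularity rank r gets traffic proportional to 1/r. That distribution is an assumption for illustration, not something measured, and real click patterns vary.

```python
def top_k_traffic_share(total_links: int, cached_links: int,
                        exponent: float = 1.0) -> float:
    """Fraction of clicks landing on the most popular `cached_links` links,
    assuming the link at rank r gets traffic proportional to 1 / r**exponent."""
    weight_top = sum(1 / r ** exponent for r in range(1, cached_links + 1))
    weight_rest = sum(
        1 / r ** exponent for r in range(cached_links + 1, total_links + 1)
    )
    return weight_top / (weight_top + weight_rest)


share = top_k_traffic_share(total_links=1_000_000, cached_links=10_000)
print(f"Caching the top 1% of links covers about {share:.0%} of clicks")
```

Under this assumed curve, caching just the hottest 1% of a million links would absorb roughly two thirds of all clicks.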

7️⃣ Bottlenecks and Trade-offs

No design is perfect, and Alex knows the interviewer is waiting to hear the weak spots. A bottleneck is the part that gets overwhelmed first as traffic grows, like a narrow doorway in a crowded hall. So Alex names each one, names a fix, and is honest that every fix has a cost.

Here’s Alex going through them.

The database gets overloaded by reads. Tens of thousands of redirects a second would crush a single database.

  • Fix: cache the hot links up front, and add read replicas. A read replica is an extra copy of the database that only handles reads, so we spread the redirect lookups across copies.
  • Cost: cached data can go a little stale, and keeping replicas in sync adds complexity. For a shortener that’s a fine trade, since a link’s target rarely changes.
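Here is a toy sketch of the replica idea and its cost. Dicts stand in for database copies, and the explicit `replicate` step is an assumption that makes the sync delay visible; real databases typically stream changes to replicas on their own.

```python
from itertools import cycle


class ReplicatedStore:
    """Writes go to the primary; reads rotate across the replicas."""

    def __init__(self, replica_count: int):
        self.primary: dict = {}
        self.replicas = [dict() for _ in range(replica_count)]
        self._next_replica = cycle(self.replicas)

    def write(self, code: str, long_url: str) -> None:
        self.primary[code] = long_url

    def replicate(self) -> None:
        """Copy the primary's data to every replica."""
        for replica in self.replicas:
            replica.update(self.primary)

    def read(self, code: str):
        return next(self._next_replica).get(code)


store = ReplicatedStore(replica_count=2)
store.write("abc123", "https://example.com/some/very/long/path?x=1")
print(store.read("abc123"))   # None: the replica has not caught up yet
store.replicate()
print(store.read("abc123"))   # now the replica returns the long URL
```

The first read comes back empty because the replica is behind the primary. That gap is the staleness cost named above.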

Storage keeps growing forever. Billions of links won’t fit on one machine.

  • Fix: shard the database by code. Sharding means splitting one giant database into smaller pieces, called shards, so codes starting one way live on one shard and others live elsewhere.
  • Cost: queries that span shards get trickier, and you have to pick a sharding scheme carefully. But each shard only handles a slice, so no single machine is the limit.
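One way to pick the shard is sketched below. Note that it swaps the “codes starting one way” prefix scheme for a hash of the whole code, a common variation that tends to spread codes more evenly. The shard count, the function names, and the dicts standing in for databases are all assumptions.

```python
import hashlib

SHARD_COUNT = 4   # assumed; a real system would size this from the estimate

# One dict per shard, standing in for separate database machines.
shards = [dict() for _ in range(SHARD_COUNT)]


def shard_for(code: str, shard_count: int = SHARD_COUNT) -> int:
    """Map a short code to a shard index using a stable hash."""
    digest = hashlib.sha256(code.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def save(code: str, long_url: str) -> None:
    shards[shard_for(code)][code] = long_url


def load(code: str):
    # A redirect only ever touches the one shard that owns the code.
    return shards[shard_for(code)].get(code)


save("abc123", "https://example.com/some/very/long/path?x=1")
print(shard_for("abc123"), load("abc123"))
```

This also shows the “pick a scheme carefully” cost: with simple modulo hashing, changing SHARD_COUNT later sends most codes to a different shard, so the data would have to be moved.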

More traffic than any one server can take. A single app server can’t handle internet scale.

  • Fix: run stateless app servers behind the load balancer. Stateless means a server keeps no memory of past requests, so any server can handle any request, and we just add more when traffic grows.
  • Cost: almost none here, which is exactly why we made the servers stateless in the first place.

301 vs 302. This is the trade-off Alex flagged earlier, and now is the time to settle it.

  • A 301 means “moved permanently”, so the browser may cache it and skip our server entirely on the next click. That’s the fastest option.
  • A 302 (officially “Found”) is a temporary redirect, so by default the browser doesn’t cache it and asks our server every single time, which lets us count clicks for analytics.
  • The cost is a real tug-of-war: 301 is faster but hides clicks from you, while 302 sees every click but adds load. “For pure speed I’d pick 301,” Alex says, “but if analytics were in scope, I’d switch to 302.”
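Alex’s rule of thumb can be sketched as a single function. The `track_clicks` flag and the specific Cache-Control values are assumptions added for illustration; the walkthrough itself only names the two status codes.

```python
def redirect_response(long_url: str, track_clicks: bool) -> tuple:
    """Build the redirect status and headers for a resolved short code."""
    if track_clicks:
        # 302 plus no-store: every click comes back to us, so it can be counted.
        return 302, {"Location": long_url, "Cache-Control": "no-store"}
    # 301: browsers may cache the redirect and skip our server next time.
    # The one-day max-age is an assumed value.
    return 301, {"Location": long_url, "Cache-Control": "public, max-age=86400"}


print(redirect_response("https://example.com/page", track_clicks=False))
print(redirect_response("https://example.com/page", track_clicks=True))
```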

The pattern never changes: name the bottleneck, name the fix, name the cost. Do that a few times and you’ve shown exactly the thinking they came to see.

Say the cost out loud

Anyone can say “add a cache” or “add replicas”. What sets you apart is adding “but then I have to handle stale data”. Naming the downside of your own fix is what makes you sound senior.

8️⃣ Wrap Up

With a couple of minutes left, Alex recaps the whole thing in a few clean lines, the way you’d summarize for a busy interviewer:

  • “We built a service with two endpoints: POST /shorten to create, and GET /{code} to redirect with a 301.”
  • “Codes come from an auto-increment ID encoded in base62, so they’re unique with no collision checks.”
  • “Since it’s read-heavy, redirects go through a Redis cache and read replicas, with the database sharded by code so it scales.”

Then Alex shows awareness of what was left out: “With more time, I’d add custom aliases, which means checking an alias isn’t already taken before saving. I’d add analytics, which would push me toward 302 so I can count every click. And I’d add expiry using a created_at timestamp plus a time-to-live to decide when a link goes dead.” That little list tells the interviewer Alex knows the scope was trimmed on purpose, not by accident.
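Two of those extras are small enough to sketch. The one-year lifetime, the function names, and the dict standing in for the database are assumptions; the walkthrough only says expiry would be decided from a created_at timestamp and that aliases must be checked before saving.

```python
from datetime import datetime, timedelta, timezone

LINK_TTL = timedelta(days=365)   # assumed lifetime; not set in the walkthrough


def is_expired(created_at: datetime, now: datetime,
               ttl: timedelta = LINK_TTL) -> bool:
    """A link is dead once `ttl` has passed since it was created."""
    return now >= created_at + ttl


def claim_alias(store: dict, alias: str, long_url: str) -> bool:
    """Save a custom alias only if nobody has taken it yet."""
    if alias in store:
        return False
    store[alias] = long_url
    return True


created = datetime(2024, 1, 1, tzinfo=timezone.utc)
print(is_expired(created, now=created + timedelta(days=30)))    # False
print(is_expired(created, now=created + timedelta(days=400)))   # True

aliases = {}
print(claim_alias(aliases, "my-event", "https://example.com/a"))   # True
print(claim_alias(aliases, "my-event", "https://example.com/b"))   # False
```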

For the full version of this design with the data model spelled out, see Design a URL Shortener.

🧠 What Made This a Good Answer

Step back and notice what actually earned the points here. It wasn’t a secret blueprint, it was how Alex worked:

  • Structure. Alex moved through clarify, estimate, API, design, deep dive, and trade-offs in order, so the interviewer always knew where things stood. No flailing.
  • Communication. Alex narrated every choice out loud and checked in with the interviewer, turning a monologue into a conversation.
  • Justified trade-offs. Alex never just said “use a cache”. Alex said why, and admitted the cost each time. That honesty about downsides is the senior signal.

Two candidates could draw totally different boxes and both pass, as long as they think and talk like this. The process is the answer.

🛠️ Practice Challenge

Now run the exact same process yourself on a fresh prompt: “Design Pastebin, where a user pastes some text and gets a short link that others can open to read it.”

Walk every step out loud, in the same order:

  • Clarify. What questions would you ask? Can pastes expire? Is there a size limit on the text? Are pastes public or private?
  • Estimate. Is this read-heavy or write-heavy? (Hint: one paste, many reads, just like our shortener.)
  • API. What endpoints do you need? Maybe POST /paste and GET /{id}.
  • High-level design. Draw client, load balancer, app servers, cache, and database. Where does the actual text get stored?
  • Deep dive. Pick one piece, maybe how you generate the paste ID, or where big blobs of text live.
  • Trade-offs. Where would it strain if one paste went viral? How would you keep reads fast?

Do it as if someone’s listening. For a worked version to check yourself against, see Design Pastebin. The more you rehearse on small prompts, the calmer you’ll be when a big one lands for real.

🧩 What You’ve Learned

You just watched the whole process run end to end on one problem. Here’s what you’ve picked up.

  • ✅ How to clarify a vague prompt into clear functional and non-functional requirements, and pin the scope out loud.
  • ✅ How a quick estimate reveals the system is read-heavy, which drives every later choice.
  • ✅ How to turn features into a tiny two-endpoint API, with POST /shorten and a 301 redirect on GET /{code}.
  • ✅ How to draw a high-level design and narrate the read and write flows.
  • ✅ How to deep dive on code generation with auto-increment plus base62, avoiding collisions.
  • ✅ How to make redirects fast with a Redis cache, read replicas, and a CDN.
  • ✅ How to name bottlenecks, fixes, and costs, including the 301 vs 302 trade-off.
  • ✅ Why structure, communication, and justified trade-offs are what actually earn the pass.

Check Your Knowledge

Test what you learned.

  1. Why is a URL shortener considered a read-heavy system?

    Why: A link is created once but can be clicked thousands of times, so reads massively outnumber writes.

  2. Why did Alex choose auto-increment IDs encoded in base62 for the short code?

    Why: The counter never repeats, so every code is unique with no need to check for collisions before saving.

  3. How does the design make redirects fast at scale?

    Why: Hot links are kept in an in-memory cache like Redis and checked first, so most redirects skip the slow database.

  4. What is the trade-off between a 301 and a 302 redirect?

    Why: A 301 lets the browser cache and skip your server for speed, while a 302 forces every click through your server so you can count them.

🚀 What’s Next?

You’ve now seen the process and the process in action. Lock it in by going deeper on the source material and the template.

Run this same walkthrough on a few more prompts, and “design X for me” will stop feeling scary and start feeling like a checklist you can work through calmly.
