Design a URL Shortener (System Design)

You’ve seen this a hundred times, right?

You copy some giant, ugly link, the kind with a hundred characters and a bunch of ?ref=...&utm=... junk at the end.
You paste it into a site like bit.ly, and out pops something tiny like short.ly/abc123.
You share that short one in a tweet or a message, someone clicks it, and they land on the exact same big page.

That little tool is one of the most loved system design interview questions out there. It looks simple on the surface, but it quietly touches almost everything: databases, caching, scaling, and a neat little trick for making short codes. So let’s design one together, step by step.

🎯 What We’re Building

So what exactly is a URL shortener? Let’s name it plainly first.

A URL shortener is a service that takes a long link and gives you a much shorter one that points to the same place.
The unique part at the end of the short link, that little abc123, is called the short code.
When someone clicks the short link, the service quietly sends them to the original long URL. That sending-along is called a redirect.

Now why would anyone want this? A few real reasons:

Long links are ugly and easy to break when you paste them across apps. Short ones stay clean.
Some places limit how much you can type, so a short link saves room.
You can track clicks on a short link, which is handy for marketing.

So our job is to build something that does two things really well: turn a long URL into a short one, and send people from the short one back to the long one. Sounds easy, but the interesting part is doing it fast and at huge scale.

📋 Requirements

Before writing any code or drawing any boxes, a good engineer asks: what must this thing actually do? We split that into two buckets.

A functional requirement is a thing the system must do, a feature you can point at.
A non-functional requirement is about how well it does those things, like how fast or how reliable it is.

Here’s what our shortener must do. These are the functional ones:

Take a long URL and give back a short URL.
When someone visits the short URL, redirect them to the original long URL.
Optionally let a user pick their own custom alias, like short.ly/my-event.
Optionally let a link expire after some time.

And here’s how well it should do them. These are the non-functional ones:

Redirects should be fast, because people are waiting on a click. Slow redirects feel broken.
The service should be highly available, meaning it stays up almost all the time. A link that doesn’t work is useless.
It should scale, so it keeps working even when billions of links and clicks pile on.

Always ask before you design

In a real interview, don’t jump straight to drawing boxes. First ask the interviewer what features matter and roughly how big it needs to be. Nailing the requirements first is half the score.

📊 Rough Scale

Now let’s get a rough feel for the size. This is called a back-of-the-envelope estimate, which just means quick, rough math to guess how big things are. We’re not aiming for exact numbers, just the shape of it.

Think about how the service gets used. Creating a new short link happens once. That’s a write, meaning we store something new.
But that one link might get clicked thousands of times after that. Each click is a read, meaning we just look something up.
So reads (the clicks and redirects) hugely outnumber writes (new links). A rough split people often use is around 100 reads for every 1 write.

That one fact shapes the whole design:

Our system is read-heavy, which means most of the traffic is just looking up links, not creating them.
So we should make reads blazing fast, even if creating a link is a tiny bit slower. That’s a fair trade.

Keep this in your head: reads dominate. We’ll lean on it again and again.
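To make that ratio concrete, here's a quick sketch of the rough math in Python. Only the 100-to-1 split comes from the text above. The 100 million new links per month is a made-up input for illustration, so swap in whatever number the interviewer gives you.

```python
# Back-of-the-envelope math for a read-heavy service.
# NEW_LINKS_PER_MONTH is an assumed input, not a figure from this design.
NEW_LINKS_PER_MONTH = 100_000_000      # assumption for illustration
READS_PER_WRITE = 100                  # the rough split mentioned above
SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # 2,592,000 seconds in a 30-day month


def per_second(events_per_month: int) -> float:
    """Average events per second over a 30-day month."""
    return events_per_month / SECONDS_PER_MONTH


writes_per_sec = per_second(NEW_LINKS_PER_MONTH)
reads_per_sec = writes_per_sec * READS_PER_WRITE

print(f"writes/sec ~ {writes_per_sec:,.0f}")
print(f"reads/sec  ~ {reads_per_sec:,.0f}")
```

Under that assumption you land at roughly 40 new links and roughly 4,000 redirects every second, on average. The exact figures don't matter much. The gap between the two is the point.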

🔌 The API

Let’s pin down how the outside world talks to our service. The way other programs talk to ours is called an API, short for Application Programming Interface. It’s just the agreed set of requests we accept.

We really only need two endpoints. Here’s what they look like.

POST /shorten
Body:     { "long_url": "https://example.com/some/very/long/path?x=1" }
Response: { "short_url": "https://short.ly/abc123" }

GET /abc123
Response: 301 Redirect  ->  Location: https://example.com/some/very/long/path?x=1

Let’s read them one at a time:

POST /shorten is the create call. You hand it a long URL in the body, and it hands back a short one. We use POST because we’re creating something new on the server.
GET /abc123 is the redirect call. Someone’s browser hits the short link, and our server replies with a redirect that points the browser to the real long URL.
That 301 is an HTTP status code meaning “moved permanently”, so the browser jumps straight to the long URL. (You could also use 302, officially named “Found”, which marks the redirect as temporary.)

301 or 302?

A 301 tells browsers the link is permanent, so they may cache it and skip your server next time, which is fast. A 302 is temporary, so the browser keeps asking your server every time, which lets you count clicks. Pick 301 for speed, 302 if you need analytics on every click.
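Here's a minimal sketch of the two endpoints as plain Python functions, with no web framework involved. The function names, the in-memory dict, the 200 status on create, and the 404 for an unknown code are all assumptions made for illustration. The code itself is passed in by hand here, since generating it is a separate step.

```python
import json
from typing import Dict, Tuple

BASE_URL = "https://short.ly"   # host taken from the examples above
links: Dict[str, str] = {}      # code -> long_url, stands in for the database

Response = Tuple[int, Dict[str, str], str]  # (status, headers, body)


def shorten(long_url: str, code: str) -> Response:
    """POST /shorten: store the mapping and return the short URL as JSON."""
    links[code] = long_url
    body = json.dumps({"short_url": f"{BASE_URL}/{code}"})
    return 200, {"Content-Type": "application/json"}, body


def redirect(code: str, permanent: bool = True) -> Response:
    """GET /<code>: reply with a redirect, or 404 if the code is unknown."""
    long_url = links.get(code)
    if long_url is None:
        return 404, {}, "Not found"   # assumed behaviour for unknown codes
    status = 301 if permanent else 302
    return status, {"Location": long_url}, ""
```

The `permanent` flag is the whole 301-versus-302 decision in one place: flip it to `False` when you want every click to reach your server.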

🗄️ Data Model

Now, where do we keep all these links? At its heart, we’re storing a simple pairing: a short code, and the long URL it points to. That’s it. Here’s the tiny table we need.

Column	What it holds
`code`	The short code, like `abc123` (this is the key we look up by)
`long_url`	The original long URL to redirect to
`created_at`	When the link was made (handy for expiry and stats)

Look at what we’re actually doing here:

For a redirect, we get a code and we want its long_url. That’s it. Look up one key, get one value.
That shape, look up by a key and get a value, is exactly what a key-value store is built for. A key-value store is a database that’s super fast at “give me the value for this key”.
So a simple key-value or NoSQL database fits this really well. We don’t need fancy joins or complex queries here, just dead-simple, lightning-fast lookups.
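As a rough sketch, the table above maps to a tiny record type plus a store that only knows "put" and "get by key". The `Link` and `LinkStore` names are made up for this example, and the dict is just a stand-in for a real key-value database.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, Optional


@dataclass(frozen=True)
class Link:
    code: str        # the short code, also the lookup key
    long_url: str    # where the redirect should go
    created_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


class LinkStore:
    """A dict standing in for a key-value database (illustrative only)."""

    def __init__(self) -> None:
        self._rows: Dict[str, Link] = {}

    def put(self, link: Link) -> None:
        self._rows[link.code] = link

    def get(self, code: str) -> Optional[Link]:
        # One key in, one value out. This is the only query a redirect needs.
        return self._rows.get(code)
```

Notice there's no query language anywhere. A redirect is a single `get`, which is why a key-value store is enough.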

🧩 The Core Problem: Generating the Short Code

Here’s the real heart of the design, the part interviewers love poking at: how do we make that short abc123 code? And how do we make sure two different long URLs never get the same code? When two things accidentally get the same code, that clash is called a collision, and we have to avoid it.

There are two main ways people do this. Let’s walk through both.

Approach 1: Random string.

Just generate a random string of, say, 7 characters, like xK9p2mQ.
It’s simple and the codes don’t reveal how many links exist.
The catch is collisions. Two random tries could land on the same string. So before saving, you check if that code already exists, and if it does, you generate another one. That extra check costs a little time on every write.
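Here's what that generate, check, retry loop could look like. This is a sketch under a few assumptions: the store is a plain dict, the retry limit of 5 is arbitrary, and the function names are invented for the example.

```python
import secrets
import string
from typing import Callable, Dict

ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase
CODE_LENGTH = 7


def random_code() -> str:
    """Pick CODE_LENGTH symbols at random from the 62-symbol alphabet."""
    return "".join(secrets.choice(ALPHABET) for _ in range(CODE_LENGTH))


def save_with_random_code(
    store: Dict[str, str],
    long_url: str,
    generate: Callable[[], str] = random_code,
    max_tries: int = 5,  # arbitrary limit, an assumption
) -> str:
    """Try random codes until one is free, then save the mapping."""
    for _ in range(max_tries):
        code = generate()
        # The existence check. In a real database the check and the write
        # would need to happen as one atomic step; a dict hides that detail.
        if code not in store:
            store[code] = long_url
            return code
    raise RuntimeError("could not find a free code")
```

The `if code not in store` line is the extra work every write pays for. With 7 characters a clash is rare, but the check still has to run every time.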

Approach 2: Auto-increment ID plus base62.

The database hands every new link a number that goes up by one each time: 1, 2, 3, and so on. That’s called an auto-increment ID.
But a raw number like 1000000 is long and boring. So we convert it into a shorter code using base62.
Base62 is just a way of writing numbers using 62 symbols instead of the usual 10. The symbols are 0-9, a-z, and A-Z. (That’s 10 + 26 + 26 = 62, which is where the name comes from.)
Because base62 packs more meaning into each character, a big number turns into a tiny code. With the symbols in that order, the number 125 becomes 21 and 1000000 becomes 4c92, and even a number in the billions fits in about 6 or 7 characters.
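A minimal base62 encoder and decoder might look like this. The symbol order used here (digits, then lowercase, then uppercase) is one choice among several, and the function names are just for this sketch.

```python
import string

# 62 symbols: 0-9, then a-z, then A-Z. The order is a choice, not a rule.
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase
BASE = len(ALPHABET)  # 62


def encode(n: int) -> str:
    """Write a non-negative integer using the 62 symbols in ALPHABET."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return ALPHABET[0]
    chars = []
    while n > 0:
        n, remainder = divmod(n, BASE)  # peel off the lowest base62 digit
        chars.append(ALPHABET[remainder])
    return "".join(reversed(chars))     # digits came out lowest-first


def decode(code: str) -> int:
    """Turn a base62 code back into the integer it came from."""
    n = 0
    for ch in code:
        n = n * BASE + ALPHABET.index(ch)
    return n
```

With this symbol order, `encode(1_000_000)` gives `4c92`, and `decode("4c92")` gives the million back. A different ordering of the same 62 symbols would produce a different, equally valid code for the same number.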

The big win of the second approach:

Every ID is unique by design, since the counter never repeats. So you get no collisions at all, with no need to check first.
That makes writes clean and fast.

Why base62 and not just the number?

A 7-character base62 code can represent over 3.5 trillion different links. Written as plain decimal numbers, the largest of those would need 13 digits. Base62 keeps the link short while still giving you a massive supply of unique codes.

So how big can these codes go? Here’s the quick intuition.

Code length	How many links it can cover
6 characters	About 57 billion
7 characters	About 3.5 trillion

So even a 7-character code gives us room for a mind-boggling number of links. We’re not running out anytime soon.
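If you want to check those numbers yourself, the math is just 62 raised to the code length. Here's a small sketch, with a hypothetical helper `min_length` that works backwards from a link count to the shortest code length that covers it.

```python
BASE = 62


def capacity(length: int) -> int:
    """How many distinct codes fit in `length` base62 characters."""
    return BASE ** length


def min_length(n_links: int) -> int:
    """Smallest code length whose capacity is at least n_links."""
    length = 1
    while capacity(length) < n_links:
        length += 1
    return length


print(f"{capacity(6):,}")  # 56,800,235,584
print(f"{capacity(7):,}")  # 3,521,614,606,208
```

So a billion links already needs 6 characters, and you only need a 7th once you pass roughly 56.8 billion.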

🏗️ High-Level Design

Okay, let’s put the pieces together. When you zoom out, the whole system is just a few boxes talking to each other.

Let’s trace what happens for each of our two jobs.

Creating a link (a write):

The client sends POST /shorten with the long URL.
An app server takes the next ID, turns it into a base62 code, and saves the code and long_url in the database.
It sends the short URL back to the client.
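Those three steps can be sketched in a few lines. Treat this as an illustration only: `itertools.count` stands in for the database's auto-increment column, a dict stands in for the database, and the names are invented for the example.

```python
import itertools
import string
from typing import Dict

ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase
BASE_URL = "https://short.ly"

id_counter = itertools.count(1)   # stand-in for an auto-increment ID
database: Dict[str, str] = {}     # stand-in for the real database


def to_base62(n: int) -> str:
    """Write a positive integer using the 62 symbols in ALPHABET."""
    chars = []
    while n > 0:
        n, remainder = divmod(n, 62)
        chars.append(ALPHABET[remainder])
    return "".join(reversed(chars))


def create_short_link(long_url: str) -> str:
    link_id = next(id_counter)      # take the next ID
    code = to_base62(link_id)       # turn it into a base62 code
    database[code] = long_url       # save code -> long_url
    return f"{BASE_URL}/{code}"     # send the short URL back
```

There's no "does this code exist?" check anywhere in `create_short_link`. The counter never hands out the same ID twice, so the code can't clash.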

Following a link (a read):

The client’s browser hits GET /abc123.
The app server first checks the cache, that fast memory store where we keep the busiest links. We’ll cover this next.
If the code is in the cache, great, we grab the long URL instantly. If not, we look it up in the database, then save it in the cache for next time.
The server replies with a redirect to the long URL, and the browser goes there.

That’s the full loop. Notice the cache sitting right in the read path, ready to make those redirects fast.
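The read path above is a pattern often called cache-aside: check the cache, fall back to the database, then fill the cache. A bare-bones sketch, with two dicts standing in for the cache and the database and a made-up `resolve` function, could look like this.

```python
from typing import Dict, Optional

cache: Dict[str, str] = {}   # stand-in for the fast in-memory cache
database: Dict[str, str] = { # stand-in for the real database
    "abc123": "https://example.com/some/very/long/path?x=1",
}


def resolve(code: str) -> Optional[str]:
    """Return the long URL for a code, or None if the code is unknown."""
    long_url = cache.get(code)
    if long_url is not None:
        return long_url              # cache hit: the database is never touched

    long_url = database.get(code)    # cache miss: ask the database
    if long_url is not None:
        cache[code] = long_url       # remember it for the next click
    return long_url
```

The first click on `abc123` pays for a database lookup. Every click after that is served straight from the cache.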

⚡ Making Redirects Fast

Remember our big fact? The system is read-heavy. Tons of people clicking links, far fewer creating them. So we pour our effort into making reads quick. Here’s how.

Most clicks go to a small set of popular links, like one viral tweet’s link. So we keep those hot links in a cache. A cache is a small, super-fast store (usually in memory) that holds the data people ask for most.
A common choice here is Redis, which is a popular in-memory key-value store. “In-memory” means it keeps data in fast RAM instead of slower disk, so lookups take well under a millisecond.
When a redirect comes in, we check the cache first. A hit there skips the database entirely, which is a huge speed-up.

We can push things even closer to the user too:

A CDN (Content Delivery Network), a set of servers spread around the world, can help. It can cache redirect responses close to users so the request doesn’t have to travel far.
This matters because the redirect is the thing people wait on. Shave milliseconds here and the whole service feels fast.

Why caching works so well here

Link clicks are wildly uneven. A tiny fraction of links get the vast majority of the clicks. So even a small cache holding the hottest links can serve a big chunk of all traffic without ever touching the database.
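You can see the effect with a toy experiment. The sketch below builds a small least-recently-used (LRU) cache and replays a made-up workload where 4 out of 5 requests go to 8 hot links and the rest cycle through about 200 cold ones. The workload, the class name, and the sizes are all assumptions chosen to keep the example deterministic.

```python
from collections import OrderedDict
from typing import Optional


class LRUCache:
    """Keeps the most recently used entries, evicting the oldest first."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self._items: "OrderedDict[str, str]" = OrderedDict()

    def get(self, key: str) -> Optional[str]:
        if key not in self._items:
            return None
        self._items.move_to_end(key)         # mark as most recently used
        return self._items[key]

    def put(self, key: str, value: str) -> None:
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict the least recently used


def hit_rate(cache_size: int, requests: int = 10_000) -> float:
    """Replay the made-up skewed workload and return the share of hits."""
    cache = LRUCache(cache_size)
    hits = 0
    for i in range(requests):
        # Every 5th request is a cold link; the rest go to a few hot ones.
        code = f"cold{i % 990}" if i % 5 == 0 else f"hot{i % 10}"
        if cache.get(code) is not None:
            hits += 1
        else:
            cache.put(code, "https://example.com/" + code)
    return hits / requests
```

In this made-up workload, a 20-entry cache holds about a tenth of the 206 links, yet `hit_rate(20)` comes out close to 80%. Real traffic is messier, but the shape is the same.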

📈 Scaling It

Now imagine this thing gets huge, billions of links and clicks. One server and one database won’t cut it anymore. Here’s how we grow it.

Stateless app servers behind a load balancer. “Stateless” means a server keeps no memory of past requests, so any server can handle any request. A load balancer is the traffic cop that spreads incoming requests across all the servers. Because the servers are stateless, we can just add more of them when traffic grows.
Read replicas. A read replica is an extra copy of the database that only handles reads. Since we’re read-heavy, we send all the redirect lookups to replicas and keep the main database free for writes.
Cache aggressively. As we said, Redis up front soaks up most of the read traffic before it ever reaches the database.
Shard the database by code. Sharding means splitting one giant database into smaller pieces, called shards, so no single machine holds everything. We can shard by the code, so codes starting one way live on one shard and others live elsewhere. Each shard handles a slice of the lookups.

Put together, this design can handle enormous load. Reads fly through the cache and replicas, writes stay simple thanks to the ID counter, and we add machines as we grow.
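To make "shard by code" a bit more concrete, here's one way to pick a shard. Note the swap: instead of splitting on how the code starts, this sketch hashes the whole code, a common variation that spreads sequential codes evenly. The function name and the use of SHA-256 are assumptions for illustration.

```python
import hashlib


def shard_for(code: str, num_shards: int) -> int:
    """Map a short code to a shard number in the range 0..num_shards-1."""
    # A stable hash, so every app server picks the same shard for a code.
    # (Python's built-in hash() is randomized per process for strings.)
    digest = hashlib.sha256(code.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards
```

Every lookup for a given code lands on exactly one shard, so a redirect still touches a single machine. One caveat with simple modulo: changing `num_shards` later moves most codes to a different shard.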

🧰 Tech Choices

Part of system design is not just naming pieces, it’s saying why you picked each one. Here are the main technology decisions for this system and the reason behind each.

Decision	Choice	Why
Make the short code	Base62 of a unique id	Short, unique codes that are easy to share.
Store the mapping	Key-value database	Simple, very fast code→URL lookups at huge scale.
Make redirects fast	Cache (Redis)	Hot links are served from memory, not the database.
Count clicks	Async via a queue	Analytics don’t slow down the redirect.
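The last row is worth a quick sketch. The idea is that the redirect path only drops a tiny message on a queue, and a separate worker does the counting later. Here Python's in-process `queue.Queue` and a background thread stand in for a real message queue and a real analytics worker, and all the names are made up.

```python
import queue
import threading
from collections import Counter

click_queue: "queue.Queue[str]" = queue.Queue()  # stand-in for a message queue
click_counts: "Counter[str]" = Counter()         # stand-in for analytics storage


def record_click(code: str) -> None:
    """Called on the redirect path: enqueue the click and return at once."""
    click_queue.put(code)


def count_clicks_forever() -> None:
    """Worker loop: pull clicks off the queue and tally them."""
    while True:
        code = click_queue.get()
        click_counts[code] += 1
        click_queue.task_done()


# Daemon thread so the worker never blocks the program from exiting.
worker = threading.Thread(target=count_clicks_forever, daemon=True)
worker.start()
```

The redirect handler would call `record_click(code)` and move on. If the counting side is slow or briefly down, the redirect doesn't wait for it.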

⚠️ Common Mistakes and Misconceptions

A few things trip people up on this one. Let’s clear them out.

“I need a fancy relational database with lots of tables.” Not really. The lookup is dead simple, one key to one value, so a key-value or NoSQL store fits beautifully. Don’t over-build it.
“Random codes are fine, no need to check anything.” Careful. Random codes can collide, so you must check for an existing code before saving. The auto-increment plus base62 approach sidesteps this entirely.
“Just make the database faster.” The bigger win is caching. Since reads dominate and a few links are super popular, a cache in front of the database does the heavy lifting.
“301 and 302 are the same.” They’re not. 301 is permanent and browsers may cache it, which is fast but hides clicks from you. 302 is temporary and hits your server every time, which lets you count clicks.
“Base62 is some complex encryption.” Nope. It’s just a different way to write a number using 62 symbols. It only shortens the code, it doesn’t hide or secure anything.

🛠️ Design Challenge

Try extending the design yourself. Think each one through using the pieces we’ve already covered.

Custom aliases. Let a user pick their own code, like short.ly/summer-sale. How do you make sure the alias isn’t already taken?

Analytics. Count how many times each link is clicked. Would 301 or 302 help, and where do you store the counts?

Expiry. Let a link stop working after a set time. How would you use the timestamp, and what do you return for a dead link?

🧩 What You’ve Learned

You can now design a URL shortener from scratch and talk through it clearly. Here’s what you picked up.

✅ The core job: shorten a long URL, and redirect a short code back to it.
✅ Functional vs non-functional requirements, and why you gather them first.
✅ The system is read-heavy, so reads get the optimization love.
✅ A two-endpoint API: POST /shorten to create, GET /code to redirect with a 301.
✅ A tiny key-value-style data model of code, long_url, and created_at.
✅ Generating codes with an auto-increment ID plus base62, which avoids collisions.
✅ Making redirects fast with a cache like Redis and a CDN.
✅ Scaling with stateless servers behind a load balancer, read replicas, and sharding by code.

🚀 What’s Next?

This case study leans hard on two ideas that show up in almost every system design. Go deeper on them next.

Introduction to Caching explains how caching keeps reads fast, the exact trick our redirects depend on.
SQL vs NoSQL breaks down when a key-value or NoSQL store beats a relational one, which is the call we made for our data model.

Once you’re comfortable with those, come back and try the design challenge again. You’ll see the whole system click into place.
