Design a News Feed System (System Design)

You do this every single day:

  • You open Instagram, or Facebook, or Twitter, and the very first thing you see is a fresh stream of posts.
  • They’re all from people you follow, and the new stuff is sitting right at the top.
  • It loads in a blink, even though those people might have posted just seconds ago.

That stream is called the news feed, and it’s one of the most loved system design interview questions out there. It looks simple: you just show some posts, right? But inside it hides a really juicy problem: how do you build a personal feed for millions of people, fast, when everyone follows a different set of people? Let’s design it together, step by step.

🎯 What We’re Building

So what exactly is a news feed? Let’s name it plainly first.

  • A news feed is the personalized list of posts a user sees when they open the app.
  • It’s made of posts from the people that user follows, with the newest ones near the top.
  • Two different users almost never see the same feed, because Alex follows different people than Riya does.

So our job is to build something that does two things really well. One, let people post content. Two, show each person a fresh feed of posts from the people they follow. Sounds easy, but the interesting part is doing it fast and for millions of users at once.

📋 Requirements

Before drawing any boxes, a good engineer asks: what must this thing actually do? We split that into two buckets.

  • A functional requirement is a thing the system must do, a feature you can point at.
  • A non-functional requirement is about how well it does those things, like how fast or how reliable it is.

Here’s what our feed must do. These are the functional ones:

  • Let a user post content, like text, a photo, or a video.
  • Let a user follow other users.
  • Show each user a feed of posts from the people they follow, with the newest stuff near the top.

And here’s how well it should do them. These are the non-functional ones:

  • Feeds should load fast, because people open the app and expect posts right away. A slow feed feels broken.
  • It should scale to millions of users posting and scrolling at the same time.
  • It should be eventually fresh. If Riya posts now, it’s totally fine if Alex sees it a few seconds later, not the exact same instant.

Always ask before you design

In a real interview, don’t jump straight to drawing boxes. First ask the interviewer what features matter and roughly how big it needs to be. For a feed, the freshness question is golden: is it okay if a post shows up a few seconds late? The answer changes your whole design.

📊 Rough Scale

Now let’s get a rough feel for the size. This is a back-of-the-envelope estimate, which just means quick, rough math to guess how big things are. We’re not after exact numbers, just the shape of it.

  • Think about how people actually use the app. Posting something happens once in a while. That’s a write, meaning we store something new.
  • But opening the app and scrolling the feed happens many times a day, every day. Each feed view is a read, meaning we just look stuff up.
  • So reads (feed views) hugely outnumber writes (new posts). Most people scroll way more than they post.

That one fact shapes the whole design:

  • Our system is read-heavy, which means most of the traffic is people opening their feed, not posting.
  • So we should make reading the feed blazing fast, even if posting does a little extra work in the background. That’s a fair trade.

Keep this in your head: reads dominate. We’ll lean on it again and again.
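
To make “reads dominate” concrete, here is a quick back-of-the-envelope sketch. Every input in it (daily users, feed opens per day, posts per day) is a made-up assumption picked only to show the math, not a figure from any real app.

```python
# Back-of-the-envelope: how lopsided are reads vs. writes?
# All inputs below are hypothetical assumptions, chosen only to show the math.

SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def per_second(events_per_day: float) -> float:
    """Convert a daily total into an average per-second rate."""
    return events_per_day / SECONDS_PER_DAY


daily_users = 10_000_000     # assumed daily active users
feed_opens_per_user = 10     # assumed feed views per user per day
posts_per_user = 0.5         # assumed: one post every two days on average

reads_per_day = daily_users * feed_opens_per_user   # 100,000,000 feed views
writes_per_day = daily_users * posts_per_user       # 5,000,000 new posts

read_write_ratio = reads_per_day / writes_per_day   # 20 reads for every write

print(f"reads/sec  ~ {per_second(reads_per_day):,.0f}")
print(f"writes/sec ~ {per_second(writes_per_day):,.0f}")
print(f"reads per write ~ {read_write_ratio:.0f}")
```

Swap in whatever numbers the interviewer gives you. The exact values matter less than the shape: reads come out well ahead of writes.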

🧩 The Core Question: How to Build the Feed

Here’s the real heart of the design, the part interviewers love poking at: when Alex opens the app, where does that list of posts actually come from?

There are two main ways to do this, and they’re kind of opposites. The whole interview really circles around these two.

  • One way: do the work when someone posts, so the feed is already sitting ready when they open the app.
  • The other way: do the work when someone opens the app, building their feed fresh on the spot.

These two ideas have names, fan-out on write and fan-out on read. Let’s walk through both, because the smart final answer mixes them.

✍️ Fan-out on Write

Let’s start with the first approach. “Fan-out” just means spreading one thing out to many places. So fan-out on write means: the moment you post, we spread that post out into all your followers’ feeds right away.

Here’s the idea in points:

  • Every user has their own precomputed feed, a ready-made list of post IDs just waiting for them. “Precomputed” means we built it ahead of time, before they even asked.
  • When Riya posts something, we immediately push that post into the feed of every single person who follows Riya.
  • So by the time Alex opens the app, Alex’s feed is already built. We just hand it over. No heavy work at read time.

So when Riya hits “post”, the work fans out to all her followers like this.

Riya posts
  → Look up Riya's followers
      → Push post into Alex's feed
      → Push post into Sam's feed
      → Push post into Maya's feed

So what’s the trade here?

  • Reads are super fast, because the feed is already built. Opening the app is just “grab my ready list”. This is the dream for a read-heavy system.
  • But writes get heavy. If Riya has a thousand followers, one post turns into a thousand little writes, one into each follower’s feed. That’s a lot of work for a single post.

Why fan-out on write loves reads

Remember our big fact: the system is read-heavy. People open the app far more often than they post. Fan-out on write moves the hard work to the rare event (posting) and keeps the common event (reading the feed) cheap. That’s usually a smart trade.
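
The push described above can be sketched in a few lines. This is a minimal illustration, assuming plain in-memory dictionaries stand in for the real follower store and feed store. The names `followers`, `feeds`, `publish`, and `read_feed` are hypothetical.

```python
from collections import defaultdict

# Hypothetical in-memory stand-ins for the real stores.
followers = defaultdict(set)   # author -> set of users who follow them
feeds = defaultdict(list)      # user -> precomputed list of post IDs, newest first


def follow(follower: str, author: str) -> None:
    followers[author].add(follower)


def publish(author: str, post_id: int) -> int:
    """Fan-out on write: push the new post ID into every follower's feed.

    Returns the number of feed writes this single post caused.
    """
    writes = 0
    for user in followers[author]:
        feeds[user].insert(0, post_id)  # newest goes to the front
        writes += 1
    return writes


def read_feed(user: str, limit: int = 20) -> list[int]:
    """Reading is just handing over the ready-made list."""
    return feeds[user][:limit]


for name in ("alex", "sam", "maya"):
    follow(name, "riya")

writes = publish("riya", post_id=101)   # one post, three feed writes
publish("riya", post_id=102)
```

Notice where the loop lives: inside `publish`, not inside `read_feed`. That is the whole trade in one picture. Posting costs one write per follower, and reading is a single list lookup.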

📖 Fan-out on Read

Now the opposite approach. Fan-out on read means we don’t build anything ahead of time. We build the feed fresh the moment the user opens the app.

Here’s the idea in points:

  • We don’t keep a ready-made feed for anyone. We just store each user’s own posts.
  • When Alex opens the app, we look at everyone Alex follows, go pull their recent posts, mix them together, sort by newest, and show them.
  • So all the work happens at read time, right when Alex is waiting.

So what’s the trade here? It’s basically the mirror image of the first one.

  • Writes are light. When Riya posts, we just save it once in Riya’s own posts. We don’t touch anybody else’s feed.
  • But reads get heavier. Every time Alex opens the app, we have to go fetch posts from everyone Alex follows and merge them, while Alex waits.
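
Here is the same idea as a small sketch, assuming each author’s posts are kept newest-first as `(timestamp, post_id)` pairs in an in-memory dictionary. The store names and sample data are hypothetical.

```python
import heapq
from itertools import islice

# Hypothetical in-memory stand-ins for the real stores.
following = {"alex": ["riya", "sam"]}   # user -> authors they follow
# author -> their own posts as (timestamp, post_id), newest first
posts_by_author = {
    "riya": [(50, 103), (20, 101)],
    "sam": [(40, 102), (10, 100)],
}


def read_feed(user: str, limit: int = 20) -> list[int]:
    """Fan-out on read: gather, merge, and sort at the moment of the request."""
    timelines = [
        posts_by_author.get(author, []) for author in following.get(user, [])
    ]
    # Each timeline is already newest-first, so a k-way merge keeps that order.
    merged = heapq.merge(*timelines, reverse=True)
    return [post_id for _, post_id in islice(merged, limit)]


print(read_feed("alex"))  # [103, 102, 101, 100]
```

Saving a post here would be a single append to the author’s own list. All the cost sits in `read_feed`, which has to touch one timeline per followed account every time the app is opened.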

So the two approaches pull in opposite directions, and it helps to see them side by side.

⚖️ Fan-out on Write vs Read

Here’s the clean comparison you can rattle off in an interview.

| Aspect                | Fan-out on Write                  | Fan-out on Read                 |
| --------------------- | --------------------------------- | ------------------------------- |
| When the work happens | When you post                     | When you open the app           |
| Feed read speed       | Very fast (already built)         | Slower (built on the spot)      |
| Cost of posting       | Heavy (push to all followers)     | Light (save once)               |
| Best for              | Normal users with a few followers | Users with millions of followers |

Notice that last row. It’s hinting at the real problem we hit next.

🌟 The Celebrity Problem

Fan-out on write sounds great, so why not just use it for everyone? Here’s where it falls apart.

  • Imagine a celebrity with twenty million followers. The moment they post, fan-out on write tries to push that one post into twenty million feeds.
  • That single post becomes twenty million writes. Now imagine they post a few times an hour. The system buckles.
  • This blow-up from one user having a huge follower count is called the celebrity problem, sometimes the “hotkey” or “fan-out” problem.
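
The blow-up is easy to see with rough arithmetic. The follower counts below come from the examples in this lesson. The sustained write rate is a purely hypothetical assumption, there only to turn the write count into a feel for time.

```python
# Rough write-amplification math for pure fan-out on write.
# Follower counts come from the text; the throughput figure is an assumption.


def fanout_writes(followers: int, posts: int = 1) -> int:
    """Each post is copied into every follower's feed."""
    return followers * posts


def seconds_to_drain(writes: int, writes_per_second: int) -> float:
    """How long the fan-out takes at a given sustained write rate."""
    return writes / writes_per_second


normal = fanout_writes(followers=1_000)           # 1,000 feed writes
celebrity = fanout_writes(followers=20_000_000)   # 20,000,000 feed writes

ASSUMED_WRITES_PER_SECOND = 100_000  # hypothetical cluster-wide throughput

print(seconds_to_drain(normal, ASSUMED_WRITES_PER_SECOND))     # 0.01
print(seconds_to_drain(celebrity, ASSUMED_WRITES_PER_SECOND))  # 200.0
```

Under that assumed rate, a normal user’s post is done in a blink, while a single celebrity post keeps the whole fan-out pipeline busy for minutes. A few such posts an hour, and everyone else’s fan-out is stuck waiting behind them.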

So pure fan-out on write breaks on celebrities. But pure fan-out on read is slow for everyone. What do we do? We mix them. This mix is called the hybrid approach, and it’s the answer interviewers are hoping to hear.

  • For normal users (a few hundred or a few thousand followers), use fan-out on write. Pushing to their followers is cheap, and reads stay fast.
  • For celebrities (millions of followers), don’t fan out on write. Skip pushing their posts into everyone’s feed.
  • When Alex opens the app, we build the feed from two pieces: the precomputed part (from the normal people Alex follows) plus the celebrity posts pulled fresh at read time. We mix those two together and show them.

Alex opens the app
  → Precomputed feed (normal follows)
  → Pull recent celebrity posts live
      → Merge and sort by newest
          → Show Alex the feed

The hybrid answer wins interviews

If you only remember one thing from this lesson, remember this: fan-out on write for normal users, fan-out on read for celebrities, then merge at read time. Saying that line shows you understand both approaches and the trade-off between them.
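
A minimal sketch of that merge-at-read-time step might look like this. It assumes a simple follower-count cutoff decides who counts as a celebrity. The threshold value, the store names, and the sample data are all hypothetical.

```python
import heapq
from itertools import islice

CELEBRITY_THRESHOLD = 1_000_000  # hypothetical cutoff; a real system would tune this

# Hypothetical in-memory stand-ins. Entries are (timestamp, post_id), newest first.
follower_counts = {"riya": 1_200, "sam": 300, "popstar": 20_000_000}
following = {"alex": ["riya", "sam", "popstar"]}
precomputed_feed = {"alex": [(50, 103), (20, 101)]}     # filled by fan-out on write
posts_by_author = {"popstar": [(60, 900), (30, 899)]}   # celebrity posts, never fanned out


def is_celebrity(author: str) -> bool:
    return follower_counts.get(author, 0) >= CELEBRITY_THRESHOLD


def read_feed(user: str, limit: int = 20) -> list[int]:
    """Hybrid: ready-made feed plus celebrity posts pulled live, merged newest-first."""
    sources = [precomputed_feed.get(user, [])]
    for author in following.get(user, []):
        if is_celebrity(author):
            sources.append(posts_by_author.get(author, []))
    merged = heapq.merge(*sources, reverse=True)
    return [post_id for _, post_id in islice(merged, limit)]


print(read_feed("alex"))  # [900, 103, 899, 101]
```

The read does a little extra work, but only for the handful of celebrities a user follows, not for everyone they follow. That keeps reads close to the fan-out-on-write speed while dodging the twenty-million-write post.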

🏗️ High-Level Design

Okay, let’s put the pieces together. When you zoom out, the system is just a few boxes talking to each other.

Write path: Client (app) → Post service → Database (posts, follows)
            Post service → Queue → Fan-out worker → Feed cache (per user)

Read path:  Client (app) → Feed service → Feed cache (per user)

Let’s name what each box does:

  • Post service. This is what you hit when you post. It saves the post in the database, then drops a message on a queue saying “this post needs to be fanned out”.
  • Database. This holds the real, permanent data: every post, and who follows whom. It’s the source of truth, the place we trust if anything else gets lost.
  • Queue. A message queue is a waiting line for tasks. Instead of doing the heavy fan-out right away, we drop a “go fan this out” task on the queue and move on. A worker picks it up in the background.
  • Fan-out worker. This is the background helper that reads tasks off the queue. For a normal user’s post, it looks up the followers and pushes the post into each follower’s feed cache.
  • Feed cache (per user). Each user gets their own ready-made feed kept in a fast store. This is the precomputed feed we talked about.
  • Feed service. This is what you hit when you open the app. It grabs your ready feed from the cache, mixes in fresh celebrity posts, sorts, and hands it back.

Let’s trace both jobs through these boxes.

Posting (a write):

  • The client sends the post to the post service.
  • The post service saves it in the database and drops a fan-out task on the queue.
  • The fan-out worker picks up the task and, for a normal user, pushes the post into each follower’s feed cache.

Reading the feed (a read):

  • The client asks the feed service for the feed.
  • The feed service grabs the ready feed from that user’s feed cache.
  • It pulls in recent posts from any celebrities the user follows, merges them, sorts by newest, and returns the list.

Why a queue sits in the middle

Pushing a post to thousands of followers takes time. We don’t want the person who posted to sit and wait for all that. So we drop the job on a queue and let a worker do it in the background. The poster gets an instant “posted!” while the fan-out happens quietly after. Doing work in the background like this is called being asynchronous.
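
Here is a small sketch of that hand-off, using Python’s standard `queue` and `threading` modules as a stand-in for a real message queue and worker fleet. The function names and the tiny follow graph are hypothetical.

```python
import queue
import threading
from collections import defaultdict

followers = {"riya": ["alex", "sam", "maya"]}   # hypothetical follow graph
feed_cache = defaultdict(list)                  # user -> post IDs, newest first
fanout_queue = queue.Queue()


def create_post(author, post_id):
    """Post service: (pretend to) save the post, enqueue the fan-out, return at once."""
    fanout_queue.put((author, post_id))
    return "posted!"


def fanout_worker():
    """Background worker: take tasks off the queue and push to each follower's feed."""
    while True:
        task = fanout_queue.get()
        if task is None:              # sentinel: time to shut down
            fanout_queue.task_done()
            break
        author, post_id = task
        for user in followers.get(author, []):
            feed_cache[user].insert(0, post_id)
        fanout_queue.task_done()


worker = threading.Thread(target=fanout_worker, daemon=True)
worker.start()

status = create_post("riya", 101)   # returns immediately, before any fan-out
fanout_queue.join()                 # demo only: wait so we can inspect the result
fanout_queue.put(None)              # tell the worker to stop
worker.join()

print(status, dict(feed_cache))
```

The key line is that `create_post` only calls `put` and returns. It never loops over followers itself. The `join` calls are there just so the demo can check the result. A real poster would never wait on them.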

⚡ Making It Fast

Remember our big fact? The system is read-heavy. Tons of people opening their feed, far fewer posting. So we pour our effort into making the feed load quick. Here’s how.

  • Keep each user’s feed in a cache. A cache is a small, super-fast store (usually in memory) that holds data people ask for most. We store each user’s feed there, ready to grab.
  • A common choice is Redis, an in-memory store. “In-memory” means it keeps data in fast RAM instead of slower disk, so fetching a feed’s ID list is typically a sub-millisecond operation on the server, with network time on top.
  • We don’t store whole posts in the feed cache, just the list of post IDs. Then we look up the actual post content separately, often from another cache. This keeps each user’s feed small and cheap.
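
A rough sketch of the “IDs in the feed, content looked up separately” idea is below. It uses a plain dictionary of deques as a stand-in for the cache. The length cap is an extra assumption, not something this design requires, and all names are hypothetical. With Redis, a list per user with `LPUSH` followed by `LTRIM` is one common way to get similar behavior.

```python
from collections import deque

MAX_FEED_LENGTH = 3  # tiny cap for the demo; hypothetical, real caps are far larger

feed_cache = {}   # user -> deque of post IDs, newest first
post_store = {    # post_id -> full post content (imagine a separate cache or database)
    1: {"author": "riya", "text": "hello"},
    2: {"author": "sam", "text": "lunch"},
    3: {"author": "riya", "text": "sunset"},
    4: {"author": "maya", "text": "new job"},
}


def push_to_feed(user, post_id):
    """Add an ID at the front; the deque drops the oldest once the cap is hit."""
    feed = feed_cache.setdefault(user, deque(maxlen=MAX_FEED_LENGTH))
    feed.appendleft(post_id)


def load_feed(user):
    """Read the cheap ID list, then look up each post's content separately."""
    ids = feed_cache.get(user, ())
    return [post_store[i] for i in ids if i in post_store]


for pid in (1, 2, 3, 4):
    push_to_feed("alex", pid)

print(list(feed_cache["alex"]))                   # [4, 3, 2]
print([p["text"] for p in load_feed("alex")])     # ['new job', 'sunset', 'lunch']
```

Because the feed holds only small integers, one popular post shared by a million followers is stored once in `post_store` and referenced a million times, instead of being copied a million times.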

There’s one more piece worth a quick mention: ranking.

  • The simplest feed is just newest-first. Sort by time and you’re done. Great for a first design.
  • But real apps like Instagram don’t show strictly newest-first. They rank posts by what you’re likely to care about, using things like how close you are to that person and how popular the post is. That’s called ranking.
  • For an interview, start with newest-first, then say “we could add a ranking step on top”. That shows you know it exists without over-building.
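
To show what “a ranking step on top” could look like, here is a toy sketch. The signals mirror the ones mentioned above (closeness and popularity, plus freshness), but the weights and the formula are invented for illustration and are not how any real app ranks its feed.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    post_id: int
    age_hours: float    # how old the post is
    closeness: float    # 0..1, hypothetical "how close are you to the author"
    popularity: float   # 0..1, hypothetical normalized engagement


def score(c: Candidate) -> float:
    """Toy relevance score. The weights are invented for illustration."""
    freshness = 1.0 / (1.0 + c.age_hours)
    return 0.5 * c.closeness + 0.3 * c.popularity + 0.2 * freshness


def rank(candidates: list[Candidate]) -> list[int]:
    """Return post IDs ordered from highest to lowest score."""
    return [c.post_id for c in sorted(candidates, key=score, reverse=True)]


candidates = [
    Candidate(post_id=1, age_hours=0.0, closeness=0.1, popularity=0.1),  # brand new, distant
    Candidate(post_id=2, age_hours=5.0, closeness=0.9, popularity=0.4),  # older, close friend
    Candidate(post_id=3, age_hours=1.0, closeness=0.2, popularity=0.9),  # viral, acquaintance
]

print(rank(candidates))  # [2, 3, 1]
```

Newest-first would have returned 1, 3, 2. The ranked order puts the close friend’s older post on top, which is the kind of difference a ranking step is meant to make.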

Why caching works so well here

Feed views vastly outnumber posts, and people re-open the app constantly. So keeping each feed ready in a fast cache means most reads never touch the slower database at all. The cache does the heavy lifting.

📈 Scaling It

Now imagine this thing gets huge, millions of users posting and scrolling. One server and one database won’t cut it. Here’s how we grow it.

  • Shard the database. Sharding means splitting one giant database into smaller pieces, called shards, so no single machine holds everything. We can shard posts and feeds by user ID, so each shard handles a slice of the users.
  • Cache aggressively. As we said, per-user feeds live in a fast store like Redis, soaking up most of the read traffic before it ever reaches the database.
  • Do fan-out asynchronously through queues. The queue and background workers mean a single post never makes the poster wait. And when posting spikes, tasks just pile up in the queue and get worked off steadily, instead of crashing the system.
  • Add more fan-out workers. Since they all just pull tasks off the same queue, we can run as many workers as we need. More traffic, more workers.

Post service → Queue
  → Fan-out worker 1
  → Fan-out worker 2
  → Fan-out worker 3
      → Sharded feed cache

Put together, this design handles enormous load. Reads fly through the per-user cache, writes stay fast because fan-out happens in the background, and we add machines as we grow.
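
Sharding by user ID can be sketched as a tiny routing function. This is a minimal illustration, assuming a fixed shard count and a simple hash-then-modulo rule. The shard count, function names, and in-memory “shards” are hypothetical.

```python
import hashlib

NUM_SHARDS = 4  # hypothetical shard count


def shard_for(user_id: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a user ID to a shard with a stable hash.

    Python's built-in hash() is randomized per process for strings,
    so a digest is used to get the same answer on every machine.
    """
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


shards = [dict() for _ in range(NUM_SHARDS)]  # one dict per pretend shard


def save_feed(user_id: str, post_ids: list[int]) -> None:
    shards[shard_for(user_id)][user_id] = post_ids


def load_feed(user_id: str) -> list[int]:
    return shards[shard_for(user_id)].get(user_id, [])


save_feed("alex", [3, 2, 1])
print(shard_for("alex"), load_feed("alex"))
```

Every server that runs `shard_for` agrees on where Alex’s feed lives, so any feed service instance can find it. One caveat with plain modulo is that changing the shard count moves most users to a different shard. Consistent hashing is a common way to soften that, if the interviewer pushes on it.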

🧰 Tech Choices

Part of system design is not just naming pieces, it’s saying why you picked each one. Here are the main technology decisions for this system and the reason behind each.

| Decision                | Choice                          | Why                                                                      |
| ----------------------- | ------------------------------- | ------------------------------------------------------------------------ |
| Build the feed          | Fan-out on write (precompute)   | Feeds are read far more than written, so precomputing makes reads fast.  |
| Handle celebrities      | Hybrid (pull for huge accounts) | Avoids fanning one post out to tens of millions of feeds.                |
| Store ready feeds       | Redis                           | Serves precomputed feeds in memory, very fast.                           |
| Photos and videos       | Object storage + CDN            | Large media stored cheaply, served from nearby.                          |
| Spread the fan-out work | Message queue                   | Does the heavy fan-out in the background.                                |

⚠️ Common Mistakes and Misconceptions

A few things trip people up on this one. Let’s clear them out.

  • “Just query everyone’s posts live, every single time.” That’s pure fan-out on read, and for a read-heavy app it’s slow and expensive. Every feed open turns into a big merge across everyone you follow. Precompute feeds for normal users instead.
  • “Fan-out on write handles celebrities fine.” It does not. One celebrity post becomes millions of writes and melts the system. That’s the celebrity problem, and the fix is the hybrid approach: fan out on read for celebrities.
  • “Skip the cache, the database is fast enough.” With reads dwarfing writes, hitting the database on every feed open won’t hold up. The per-user feed cache is what makes the whole thing fast.
  • “Do the fan-out right when the user posts, while they wait.” No, that makes posting slow and fragile under load. Drop the task on a queue and let background workers handle it.
  • “The feed has to be perfectly fresh and identical for everyone instantly.” It doesn’t. A feed only needs to be eventually fresh: a few seconds of delay is totally fine, and that slack is exactly what lets us do fan-out in the background.

🛠️ Design Challenge

Try extending the design yourself. Think each one through before settling on an answer.

Ranking by relevance. Instead of newest-first, show the posts a user is most likely to care about. Where would the ranking run, and on what signals?

Inserting ads. Slip sponsored posts into the feed every so often. Would you mix ads into the stored feed, or blend them in at read time?

🧩 What You’ve Learned

You can now design a news feed from scratch and talk through it clearly. Here’s what you picked up.

  • ✅ The core job: let people post, and show each person a fresh feed of posts from those they follow.
  • ✅ The system is read-heavy, so reading the feed gets the optimization love.
  • ✅ Fan-out on write pushes a post into followers’ feeds up front: fast reads, heavy writes.
  • ✅ Fan-out on read builds the feed when the user opens the app: light writes, heavier reads.
  • ✅ The celebrity problem breaks fan-out on write, and the hybrid approach fixes it.
  • ✅ A high-level design with a post service, feed service, per-user feed cache, database, queue, and fan-out worker.
  • ✅ Making it fast with per-user feed caches, and scaling with sharding plus async fan-out through queues.

Check Your Knowledge

Test what you learned. Try answering each question yourself before reading the explanation under it.

  1. What is the main difference between fan-out on write and fan-out on read?

     Why: Fan-out on write does the work up front so reads are fast, while fan-out on read does the work at read time so writes stay light.

  2. What is the celebrity problem with pure fan-out on write?

     Why: Pushing one celebrity post into millions of follower feeds creates a write explosion that overwhelms the system.

  3. What is the hybrid approach that solves the celebrity problem?

     Why: Normal users get precomputed feeds while celebrity posts are pulled fresh and merged in when each follower reads, avoiding the write explosion.

  4. Why is a message queue used for the fan-out work?

     Why: Dropping fan-out on a queue lets the poster get an instant response while background workers spread the post, and it smooths spikes.

🚀 What’s Next?

This case study leans hard on two ideas that show up in almost every system design. Go deeper on them next.

Once you’re comfortable with those, come back and try the design challenge again. You’ll see the whole system click into place.
