Design a Coupon / Flash-Sale System (Groupon)
A flash sale is brutal on a system. A site like Groupon posts “500 deals at 90% off, starting at 12:00 PM.” At exactly noon, a hundred thousand people hit the buy button in the same few seconds, all fighting for 500 spots. Two things must stay true: never sell more than 500, and stay up under the flood. Let’s design a system that survives this.
🎯 What the System Does
A flash-sale system needs to:
- Hold a limited number of deals or coupons (say 500).
- Let users claim one when the sale starts.
- Never oversell: stop at exactly 500, not 501.
- Survive a massive spike of traffic in a short window.
The whole challenge is the combination: limited stock plus a sudden crush of people. Let’s tackle both.
📋 Requirements
Functional (what it must do):
- Start the sale at a set time.
- Let users claim a deal until stock runs out.
- Show “sold out” once it’s gone.
Non-functional (how well it must do it):
- Correctness: never give out more deals than exist. This is the top rule.
- Handle spikes: survive a huge burst of traffic without crashing.
- Fairness and speed: users should get a quick yes or no.
The one rule you cannot break
Selling 501 deals when you have 500 means giving someone a deal you can’t honor. Overselling is the failure that matters most here, so the design centers on counting stock correctly under heavy load.
🛑 The Core Problem: Don’t Oversell
Picture the last deal. Two users click buy at the same instant:
- Both read “1 left.”
- Both think “great, it’s mine,” and both claim it.
- Now you’ve sold 2 of your last 1. Oversold.
This is a race condition: two requests read the same shared value, and each acts on it before the other's write lands. We need the “check stock and reduce it” step to happen as one safe, all-or-nothing action, so two people can’t both grab the last one.
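To make the clash concrete, here is a small, deterministic replay of the two-user scenario above. It is only an illustration: the `stock` dictionary and the user names are made up, and the "same instant" is simulated by doing both reads before either write.

```python
# Deterministic replay of the race: both users read before either writes.
stock = {"left": 1}   # hypothetical in-memory stock record: one deal left
sold = []             # who was handed a deal

# Step 1: both users read the stock at the "same instant".
seen_by_a = stock["left"]
seen_by_b = stock["left"]

# Step 2: each user acts on the value it read, which is now stale.
for user, seen in (("user_a", seen_by_a), ("user_b", seen_by_b)):
    if seen > 0:
        stock["left"] = seen - 1   # write based on the stale read
        sold.append(user)

print(sold)            # both users were given the last deal
print(stock["left"])   # the counter says 0, yet two units went out
```

The counter ends at 0 and looks healthy, which is what makes this bug easy to miss: nothing in the stored number reveals that two deals were handed out for one unit.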
A clean way to do this uses a fast in-memory store like Redis, which can decrease a number safely in one step:
```
left = decrease stock by 1   # done as one safe step
if left >= 0:
    give the user the deal
else:
    increase stock back by 1
    show "sold out"
```

Reading that: the decrease happens as a single safe action, so no two users can be handed the same remaining unit. Only the first 500 see a number that’s zero or above. Everyone after sees “sold out.” No overselling, even with thousands clicking at once.
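The pseudocode can be turned into a runnable sketch. Redis offers `DECR` and `INCR` commands that change a counter atomically and return the new value. So the example runs without a Redis server, the `FakeCounterStore` class below is a hypothetical in-process stand-in that mimics those two operations with a lock. The key name `deal:stock` and the helpers `try_claim` and `run_sale` are also invented for illustration.

```python
import threading


class FakeCounterStore:
    """Stand-in for a Redis client: decr/incr are atomic and return the new value."""

    def __init__(self, initial):
        self._values = dict(initial)
        self._lock = threading.Lock()

    def decr(self, key):
        with self._lock:
            self._values[key] -= 1
            return self._values[key]

    def incr(self, key):
        with self._lock:
            self._values[key] += 1
            return self._values[key]

    def get(self, key):
        with self._lock:
            return self._values[key]


def try_claim(store, key):
    """Claim one unit. Returns True for a win, False for sold out."""
    left = store.decr(key)      # one atomic step: decrease and read the result
    if left >= 0:
        return True             # this request owns one unit
    store.incr(key)             # went below zero: put the unit back
    return False


def run_sale(stock, buyers):
    """Fire `buyers` concurrent claims at `stock` units; return (winners, stock left)."""
    store = FakeCounterStore({"deal:stock": stock})
    results = []
    results_lock = threading.Lock()

    def buyer():
        won = try_claim(store, "deal:stock")
        with results_lock:
            results.append(won)

    threads = [threading.Thread(target=buyer) for _ in range(buyers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results.count(True), store.get("deal:stock")


winners, left = run_sale(stock=500, buyers=2000)
print(winners, left)
```

Because a request only ever fails when the counter was already at zero or below, the number of winners is exactly the stock, however the threads interleave.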
🏗️ High-Level Design
Here’s the shape, built to absorb a spike.
Reading the flow:
- A load balancer spreads the flood across many app servers.
- The servers do the one safe step on the Redis stock counter to claim a spot.
- A winning claim drops an order on a queue, and a separate order service records it in the database calmly, later.
The clever bit: the fast, high-pressure part (counting stock) happens in Redis, which is built for speed. The slower part (saving the order to the database) happens off to the side through a queue, so the database isn’t crushed by the spike.
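A minimal sketch of that fast-path and slow-path split, using only the Python standard library. Everything here is a stand-in chosen for illustration: a lock-protected integer plays the Redis counter, `queue.Queue` plays the message queue, a list plays the orders table, and the class and method names are hypothetical.

```python
import queue
import threading


class FlashSale:
    def __init__(self, stock):
        self._stock_left = stock
        self._stock_lock = threading.Lock()   # stand-in for Redis's atomic counter
        self.order_queue = queue.Queue()      # stand-in for a message queue
        self.orders_table = []                # stand-in for the orders database table

    def handle_buy(self, user_id):
        """Fast path: claim a unit in memory, enqueue the order, answer at once."""
        with self._stock_lock:
            if self._stock_left == 0:
                return "sold out"
            self._stock_left -= 1
        self.order_queue.put({"user_id": user_id})
        return "claimed"

    def run_order_service(self):
        """Slow path: drain the queue and record orders one at a time."""
        while True:
            order = self.order_queue.get()
            if order is None:                 # sentinel: no more orders
                break
            self.orders_table.append(order)


def demo():
    sale = FlashSale(stock=3)
    worker = threading.Thread(target=sale.run_order_service)
    worker.start()
    responses = [sale.handle_buy(f"user-{i}") for i in range(5)]
    sale.order_queue.put(None)                # tell the order service to stop
    worker.join()
    return responses, sale.orders_table


responses, orders = demo()
print(responses)
print(orders)
```

The buyer gets an answer as soon as the counter step finishes. The write to the orders table happens on the worker's own schedule, which is the property that protects the database during the spike.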
🌊 Surviving the Spike
A flash sale’s traffic isn’t steady; it arrives as a wall, all at once. A few moves help:
- Absorb writes with a queue. Confirmed orders go on a queue and get saved at a steady pace, instead of all slamming the database at the same instant.
- Cache the “sold out” state. Once stock hits zero, most people are just getting told “sold out.” Serve that from a cache so it’s cheap and the system isn’t doing real work for late arrivals.
- A waiting room or rate limits. Many real systems put extra users in a virtual waiting line, letting them in steadily so the core never gets overwhelmed all at once.
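The “cache the sold-out state” move can be sketched in a few lines. This is a single-process illustration under simplifying assumptions: the `SaleFrontDoor` class is hypothetical, its `stock_left` field stands in for the shared counter, and `sold_out_cached` stands in for a cached flag. The `counter_calls` field exists only to show how much work late arrivals cause.

```python
class SaleFrontDoor:
    def __init__(self, stock):
        self.stock_left = stock        # stand-in for the shared stock counter
        self.sold_out_cached = False   # stand-in for a cached "sold out" flag
        self.counter_calls = 0         # how many requests reached the counter

    def claim(self):
        # Cheap path: once sold out, answer from the cached flag
        # without touching the counter at all.
        if self.sold_out_cached:
            return False

        # Real work: decrease the counter and check the result.
        self.counter_calls += 1
        self.stock_left -= 1
        if self.stock_left >= 0:
            if self.stock_left == 0:
                self.sold_out_cached = True   # last unit just went
            return True

        # Went below zero: restore the unit and remember the state.
        self.stock_left += 1
        self.sold_out_cached = True
        return False


door = SaleFrontDoor(stock=5)
results = [door.claim() for _ in range(1000)]
print(results.count(True), door.counter_calls)
```

In this run only the first five requests touch the counter; the other 995 are answered from the flag. With several app servers, each one would learn about the sold-out state slightly later, so a few extra requests would still reach the counter, where the atomic decrease turns them away safely.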
Split the fast path from the slow path
The trick across all of this: do the urgent thing (claim a spot) fast in memory, and push the heavier thing (record the order) onto a queue to handle calmly. That split is what keeps a flash sale standing.
📈 Why Redis, Not the Main Database
You might ask: why not just count stock in the main database? Because under a huge spike, the database would be a bottleneck. It’s slower and would buckle under a hundred thousand requests at once. Redis lives in memory, handles a very high request rate, and can do the safe decrease in one step. So we use Redis for the hot counter and the database for the permanent record.
🧰 Tech Choices
System design is not just naming pieces; it’s also saying why you picked each one. Here are the main technology decisions for this system and the reason behind each.
| Decision | Choice | Why |
|---|---|---|
| Count limited stock | Redis (in-memory) | Decreases stock safely in one step at a very high request rate; a database would be the bottleneck. |
| Record confirmed orders | Message queue + database | The queue absorbs the spike; orders are saved at a steady pace. |
| Serve “sold out” | Cache | Cheap to tell the many late arrivals it’s gone. |
| Keep it fair | Waiting room + rate limits | Stops bots and floods from overwhelming the core. |
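For the rate-limit row, one common technique is a token bucket, sketched below. The article does not say which algorithm to use, so treat this as one possible choice. The `TokenBucket` class and its parameters are hypothetical, and time is passed in explicitly as `now` so the behavior is easy to follow.

```python
class TokenBucket:
    """Allow short bursts up to `capacity`, then `rate_per_sec` requests per second."""

    def __init__(self, rate_per_sec, capacity, now=0.0):
        self.rate = rate_per_sec
        self.capacity = capacity
        self.tokens = float(capacity)   # start with a full bucket
        self.updated_at = now

    def allow(self, now):
        # Refill for the time that has passed, but never above capacity.
        elapsed = now - self.updated_at
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated_at = now

        # Spend one token per admitted request.
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


bucket = TokenBucket(rate_per_sec=2, capacity=3)
burst = [bucket.allow(now=0.0) for _ in range(5)]   # five requests at once
later = [bucket.allow(now=1.0) for _ in range(3)]   # three more, one second on
print(burst)
print(later)
```

A limiter like this is usually applied per user or per IP address, so one bot cannot spend everyone’s share of the capacity.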
⚠️ Common Mistakes and Misconceptions
A few things to keep straight:
- “Read the stock, then write the new value.” That two-step approach lets two users grab the last item. The decrease must be one safe, all-or-nothing step.
- “Save every order straight to the database during the rush.” The database can’t take the full spike at once. Use a queue to record orders at a steady pace.
- “Count stock in the main database.” Under a flash sale, that’s a bottleneck. A fast in-memory store like Redis handles the hot counter far better.
🧩 What You’ve Learned
Nice work. Here’s the recap:
- ✅ A flash-sale system gives out limited stock to a sudden flood of buyers.
- ✅ The top rule is never oversell, which means the “check and reduce stock” step must be one safe action.
- ✅ A fast in-memory store like Redis can safely decrease the stock counter in one step, even under huge load.
- ✅ Confirmed orders go on a queue and are saved to the database at a steady pace, so the spike doesn’t crush it.
- ✅ Caching the “sold out” state and using waiting rooms or rate limits help absorb the burst.
Check Your Knowledge
Test what you learned.
1. What is the most important rule for a flash-sale system?
   Answer: Never oversell. Overselling means promising deals you can't honor, so counting stock correctly under load is the central goal.
2. Why must the 'check stock and reduce it' step be one safe action?
   Answer: If reading and reducing stock are separate steps, two users can both grab the last one. Doing it as one all-or-nothing step prevents that.
3. Why save confirmed orders through a queue instead of straight to the database?
   Answer: A queue absorbs the burst and lets the order service write to the database calmly, so the spike doesn't overwhelm it.
4. Why use Redis for the stock counter instead of the main database?
   Answer: Under a flash sale, the database would be a bottleneck. Redis is in-memory and fast, ideal for the hot stock counter.
🛠️ Design Challenge
Try extending the flash-sale design yourself, and think each one through before reading on.
- One deal per user. Stop a single person from grabbing many of the limited deals. How would you enforce that?
- Stopping bots. Bots can grab all the stock in milliseconds, leaving real users with nothing. How do you keep it fair?
- Returning stock when payment fails. A user claims a deal but never pays. How do you give that unit back?
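As a starting point for the first and third challenges, here is one possible shape, not a full answer. The `DealStore` class and its method names are hypothetical, and the sketch is single-threaded. In a real system the “has this user claimed?” check and the stock decrease would need to be atomic together, for example by tracking holders in a Redis set, where `SADD` reports whether the member was newly added.

```python
class DealStore:
    def __init__(self, stock):
        self.stock_left = stock
        self.holders = set()      # users who currently hold a unit

    def claim(self, user_id):
        """One deal per user: a second claim by the same user is refused."""
        if user_id in self.holders:
            return "already claimed"
        if self.stock_left == 0:
            return "sold out"
        self.stock_left -= 1
        self.holders.add(user_id)
        return "claimed"

    def release(self, user_id):
        """Payment failed or timed out: give the unit back exactly once."""
        if user_id not in self.holders:
            return False          # nothing to return, so stock is not inflated
        self.holders.remove(user_id)
        self.stock_left += 1
        return True


store = DealStore(stock=1)
print(store.claim("alice"))    # first claim takes the only unit
print(store.claim("alice"))    # same user again is refused
print(store.claim("bob"))      # no stock left for bob
print(store.release("alice"))  # alice never paid, so the unit comes back
print(store.claim("bob"))      # bob can now claim it
```

Tying the release to the holder set is the important detail: a unit can only be returned by someone who actually holds one, so a duplicate or late “payment failed” event cannot push the stock above what really exists.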
🚀 What’s Next?
You’ve survived a flash sale. Let’s look at related high-pressure designs.
- Design a Ticketing System handles limited seats with the added twist of holding a seat during checkout.
- Rate Limiting is a key tool for taming traffic spikes.
Get these down and you’ll be confident with any “limited stock, huge rush” problem.