Leaky Bucket Algorithm

Imagine you’re running a backend service, and requests come at it in waves. One second it’s quiet, the next second a thousand requests slam in all at once. Here’s what you really want:

No matter how crazy the input looks, you want what reaches your server to be a calm, steady stream.
A nice, even output rate, like water dripping out of a tap at the same speed all the time.
The bursts should get flattened out before they ever hit the thing you’re protecting.

That’s exactly what the leaky bucket algorithm gives you. Let’s see how.

🪣 The Idea

So picture an actual bucket with a small hole at the bottom. You pour water in from the top, and it drips out of the hole at a fixed speed. That picture is the whole algorithm. Here’s how it maps over:

Each incoming request is like a drop of water you pour into the bucket. The bucket is really just a queue, which is a line where requests wait their turn.
The water leaks out of the hole at a fixed, steady rate. So requests leave the queue and get processed at that same steady pace, no faster.
The bucket has a limit on how much it can hold, called its capacity. If requests pour in faster than they leak out and the bucket fills up, the extra ones spill over the top and get dropped.

The key thing to hold in your head: water goes in fast or slow, but it always drips out at one constant speed. That steady drip is the point of the whole thing.

⚙️ How It Works

Let’s break the machine into its moving parts so you can see how a request actually flows through it. There are really just a few things to track:

The bucket (the queue). This is where requests wait. New requests join the back of the line as they arrive.
The leak rate. This is the fixed speed at which requests leave the bucket and get processed, say 5 requests per second. It never speeds up, even if the bucket is full and overflowing.
The capacity. This is the most the bucket can hold. Once the line is full and another request shows up, there’s no room, so that request is dropped right away.

Now follow one request through the system:

A request arrives. The bucket checks if there’s room.
If there’s space, the request joins the queue and waits its turn.
If the bucket is already full, the request gets dropped. It doesn’t wait, it’s just gone.
Meanwhile, on its own steady clock, the bucket leaks out one request at a time at the fixed rate and sends it off to be processed.
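To make those steps concrete, here’s a minimal Python sketch. Treat it as an illustration rather than a production implementation: the class name `LeakyBucket` and the methods `offer` and `leak` are made up for this example, and it assumes something else calls `leak` on a steady timer.

```python
from collections import deque


class LeakyBucket:
    """Queue-based leaky bucket: a bounded queue that is drained by the caller."""

    def __init__(self, capacity: int, leak_rate: float):
        self.capacity = capacity      # most requests the bucket can hold
        self.leak_rate = leak_rate    # requests released per second (used by the caller's timer)
        self.queue = deque()          # waiting requests, oldest first
        self.dropped = 0              # how many requests overflowed

    def offer(self, request) -> bool:
        """Try to add a request. Returns False if the bucket is already full."""
        if len(self.queue) >= self.capacity:
            self.dropped += 1         # no room: the request is dropped right away
            return False
        self.queue.append(request)    # join the back of the line
        return True

    def leak(self):
        """Release the oldest waiting request, or None if the bucket is empty.

        Assumption: the caller invokes this once every 1 / leak_rate seconds.
        """
        if self.queue:
            return self.queue.popleft()
        return None


bucket = LeakyBucket(capacity=3, leak_rate=5)
accepted = [bucket.offer(f"req-{i}") for i in range(5)]
print(accepted)        # [True, True, True, False, False]
print(bucket.leak())   # req-0
print(bucket.dropped)  # 2
```

Five requests arrive at a bucket that holds three, so the last two are dropped, and the first one in is the first one out.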

Here’s that flow as a picture.

Why the leak rate never changes

The leak rate is fixed on purpose. That’s the whole trick. Because the output speed stays constant no matter what the input does, the thing on the other side always sees the same calm, predictable load. A backend that knows it will never get more than 5 requests per second is a backend that’s very easy to keep healthy.
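Here’s one way that fixed pace could look in code. This is a rough sketch under a few assumptions: `drain` and `handle` are hypothetical names, the pause is passed in as a parameter so it can be swapped out, and the loop stops when the queue is empty, whereas a real service would usually keep waiting for new requests.

```python
import time
from collections import deque


def drain(queue, leak_rate, handle, sleep=time.sleep):
    """Release queued requests one at a time, pausing 1 / leak_rate seconds after each."""
    interval = 1.0 / leak_rate
    while queue:
        handle(queue.popleft())
        sleep(interval)  # same pause whether the queue holds 1 item or 1,000


processed = []
drain(deque(["a", "b", "c"]), leak_rate=100, handle=processed.append)
print(processed)  # ['a', 'b', 'c']
```

Notice that the pause depends only on `leak_rate`. Nothing in the loop looks at how long the queue is, which is exactly why a full bucket can’t speed the leak up.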

🌊 Why It Smooths Traffic

So why go to all this trouble? Because of what comes out the other end. Let’s look at the difference:

The input can be wild. A thousand requests arrive in one second, then nothing for the next three seconds. That’s called bursty traffic, where load comes in sudden spikes instead of a smooth flow.
The output is always calm. Thanks to the fixed leak rate, your server never sees more than that steady trickle, no matter how spiky the input was. Once the bucket empties, the output simply goes quiet until more requests arrive.

This evening-out is so useful it has its own name. Reshaping bursty traffic into a smooth, steady stream is called traffic shaping. The leaky bucket is one of the classic ways to do it. So the bucket acts like a shock absorber: it soaks up the spikes and hands out the load at a pace your system can actually handle.
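You can watch the smoothing happen with a tiny simulation. This is a simplified sketch, not a real traffic shaper: time moves in whole ticks, the function name `simulate` and all the numbers are invented for the example, and requests are only counted, not stored.

```python
def simulate(arrivals, capacity, leak_per_tick):
    """Return (requests sent per tick, total dropped) for a list of arrivals per tick."""
    queued = 0
    dropped = 0
    sent_per_tick = []
    for arriving in arrivals:
        # Pour this tick's arrivals in; anything past capacity spills over.
        accepted = min(arriving, capacity - queued)
        dropped += arriving - accepted
        queued += accepted
        # Leak out at most leak_per_tick requests, never more.
        sent = min(leak_per_tick, queued)
        queued -= sent
        sent_per_tick.append(sent)
    return sent_per_tick, dropped


# A burst of 20 requests, then silence for five ticks.
sent, dropped = simulate([20, 0, 0, 0, 0, 0], capacity=15, leak_per_tick=5)
print(sent)     # [5, 5, 5, 0, 0, 0]
print(dropped)  # 5
```

The burst of 20 never reaches the other side as a burst. Fifteen requests fit in the bucket and leave 5 per tick, and the 5 that didn’t fit are dropped.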

🆚 Leaky Bucket vs Token Bucket

People mix these two up all the time, so let’s clear it up. They both limit rate, but they behave very differently. The short version:

Leaky bucket gives you a steady output and refuses to let bursts through. Requests leave at one fixed pace, full stop.
Token bucket is more relaxed. It saves up “tokens” while things are quiet, so when a burst arrives, it can let a chunk of it through all at once.

Here’s the side-by-side.

Aspect	Leaky Bucket	Token Bucket
Output rate	Always steady and fixed	Can spike during a burst
Allows bursts?	No, bursts get flattened	Yes, up to saved-up tokens
What it holds	Waiting requests (a queue)	Spare tokens (permission to send)
Best for	Smoothing traffic into a steady flow	Allowing occasional bursts
Side effect	Can add queuing delay	Load can stay uneven
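For contrast, here’s a bare-bones token bucket in Python. It’s only a sketch to show the difference in behavior: the class name `TokenBucket` and the method `allow` are hypothetical, and the current time is passed in by hand so the example stays predictable.

```python
class TokenBucket:
    """Minimal token bucket: tokens refill over time, each request spends one."""

    def __init__(self, capacity: int, refill_rate: float):
        self.capacity = capacity          # most tokens that can be saved up
        self.refill_rate = refill_rate    # tokens added per second
        self.tokens = float(capacity)     # start with a full allowance
        self.last = 0.0                   # time of the previous call

    def allow(self, now: float) -> bool:
        """Return True and spend a token if one is available at time `now`."""
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


bucket = TokenBucket(capacity=3, refill_rate=1)
burst = [bucket.allow(now=0.0) for _ in range(5)]
print(burst)                  # [True, True, True, False, False]
print(bucket.allow(now=2.0))  # True
```

Three requests get through in the same instant because three tokens were saved up. A leaky bucket would have released those same requests one at a time at its fixed pace.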

If you want to go deeper on the other side of this, check out the Token Bucket Algorithm lesson next.

🌍 Where It’s Used

This isn’t just a textbook idea. You’ll find leaky buckets quietly doing their job in real systems:

Network traffic shaping. Routers and switches use leaky buckets to send packets out at a steady rate, so the network doesn’t get slammed by sudden floods of data.
Evening out request rates to a backend. A service sitting in front of a slower system can use a leaky bucket to make sure that slower system never gets more than it can handle, by handing requests over at a fixed pace.
Smoothing writes to a database. If a lot of writes arrive at once, a leaky bucket can drip them in steadily instead of overwhelming the database in one go.

The pattern shows up anywhere someone downstream needs a predictable, even load and can’t handle sudden spikes.

⚠️ The Trade-off

The leaky bucket is great, but nothing is free. There’s a cost to that lovely steady output, and you should know it before you reach for this:

It adds delay. Because requests sit in the queue and wait their turn to leak out, a request can be held for a while before it’s processed. That waiting time is called queuing delay. If your input is bursty and your leak rate is slow, some requests wait a long time.
It delays or drops bursts that a token bucket would have allowed. The leaky bucket flatly refuses to let a spike through as a spike, even if your system could have handled it: part of the burst waits in the queue, and whatever doesn’t fit is dropped. A token bucket, by saving up tokens, would have let that burst pass. So if your users genuinely need quick bursts now and then, the leaky bucket’s strictness can hurt.
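You can put a rough number on that delay. Assuming requests leave strictly in order at the fixed leak rate, a request that joins with a given number of requests in the queue (counting itself) waits about that number divided by the leak rate. The helper name `worst_case_wait` and the queue length of 50 are invented for this example.

```python
def worst_case_wait(queue_length: int, leak_rate: float) -> float:
    """Seconds until the request at the back of the queue is released."""
    return queue_length / leak_rate


# 50 requests in the queue, leaking at 5 requests per second.
print(worst_case_wait(50, 5))  # 10.0
```

A bigger bucket drops fewer requests, but it also means the unlucky request at the back waits longer, so capacity and acceptable delay have to be chosen together.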

So the choice comes down to what you value: a perfectly smooth, predictable stream, or the flexibility to allow occasional bursts.

⚠️ Common Mistakes and Misconceptions

A few things trip people up with this one. Let’s clear them up:

“Leaky bucket and token bucket are the same thing.” No. They both limit rate, but the leaky bucket gives a steady output and blocks bursts, while the token bucket saves up allowance and lets bursts through. Different behavior, different use cases.
“Leaky bucket allows bursts.” It doesn’t. That’s the token bucket. The whole point of the leaky bucket is to flatten bursts into a steady trickle, so a spike never reaches the other side as a spike.
“There’s no downside, just use it everywhere.” Watch out for the queuing delay. Requests wait in line, so they can be held up. If low latency matters more than smoothness, a leaky bucket might be the wrong tool.
“A full bucket slows down the leak.” Nope. The leak rate is fixed no matter what. A full bucket just means new requests get dropped, but the ones inside still leave at the same steady pace.

🛠️ Design Challenge

Try this on your own to test yourself.

Imagine you’re putting a leaky bucket in front of a payment service that can safely handle 10 requests per second. During a flash sale, traffic spikes to 5,000 requests in one second.

How big should your bucket capacity be? What happens to requests once it’s full?

A user’s request lands at the back of a long queue. How long might they wait, and is that acceptable for a payment?

Would a token bucket have been a better fit here? Why or why not?

🧩 What You’ve Learned

You can now explain how the leaky bucket keeps a system calm under pressure. Here’s what you’ve picked up.

✅ Requests go into a bucket, which is a queue, and leak out at a fixed steady rate.
✅ When the bucket overflows, extra requests are dropped.
✅ It smooths bursty input into a steady output, which is called traffic shaping.
✅ It differs from the token bucket, which saves up tokens and allows bursts.
✅ The trade-off is queuing delay and dropping bursts a token bucket would have allowed.

🚀 What’s Next?

You’ve got the smoothing side of rate limiting down. Now go see the flexible side and the bigger picture.

Token Bucket Algorithm shows how saving up tokens lets you allow bursts, the opposite trade-off from the leaky bucket.
Rate Limiting Explained zooms out to cover why we limit rate at all and the other strategies you can choose from.
