Rate Limiting for Security

Imagine this. You wake up, check your logs, and see something scary:

  • One client just sent thousands of login attempts to your sign-in page, all in a few seconds.
  • It’s not a real person. A real person can’t type that fast, right?
  • It’s a script, trying password after password, hoping one of them works.

That’s an attacker trying to break into someone’s account by sheer brute force. And the funny thing is, you already have the perfect tool to stop them. It’s rate limiting, the same idea you use to manage load, just pointed at attackers instead. Let’s see how.

🎯 Beyond Load

You’ve probably met rate limiting as a way to keep one greedy client from hogging your server. Quick recap so we’re on the same page:

  • Rate limiting caps how many requests a single client can make in a given time window, like “100 per minute”.
  • Requests under the limit pass through. Requests over it get turned away with the 429 Too Many Requests status code.
  • Normally we use this to protect server capacity and keep things fair for everyone.

Here we’re going to use that same cap as a security shield. See, the attacks in this lesson all need lots of requests to work. So if you choke the request rate, you choke the attack right along with it. Same gate, different purpose.
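To make the recap concrete, here is a minimal sketch of a fixed-window limiter in Python. The class name `FixedWindowLimiter`, the helper `status_for`, and the in-memory dictionary are assumptions made for illustration. A real service would usually keep these counters in a shared store so every server sees the same numbers.

```python
import time
from typing import Dict, Optional, Tuple


class FixedWindowLimiter:
    """Allow at most `limit` requests per client in each fixed time window."""

    def __init__(self, limit: int, window_seconds: int) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        # client_id -> (window index, requests counted in that window)
        self._state: Dict[str, Tuple[int, int]] = {}

    def allow(self, client_id: str, now: Optional[float] = None) -> bool:
        now = time.time() if now is None else now
        window = int(now // self.window_seconds)
        seen_window, count = self._state.get(client_id, (window, 0))
        if seen_window != window:
            # A new window has started, so the counter starts over.
            seen_window, count = window, 0
        if count >= self.limit:
            self._state[client_id] = (seen_window, count)
            return False  # the caller should answer with 429
        self._state[client_id] = (seen_window, count + 1)
        return True


def status_for(limiter: FixedWindowLimiter, client_id: str, now: float) -> int:
    """Return the HTTP status a request would get: 200 if allowed, 429 if not."""
    return 200 if limiter.allow(client_id, now) else 429
```

With `FixedWindowLimiter(limit=100, window_seconds=60)` you get the “100 per minute” rule from the recap. Everything in the rest of this lesson is a variation on what you count and which key you count it under.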

New to the basics?

If “time window”, “429”, and “token bucket” sound unfamiliar, read Rate Limiting Explained first. This lesson builds straight on top of it, so a quick warm-up there will make everything here click faster.

🔓 Stopping Brute Force

Let’s start with the most basic attack. A brute force attack is when someone tries many passwords against an account, one after another, until one finally works.

  • The attacker doesn’t know your password, so they just guess. A lot.
  • Common passwords first, then dictionary words, then random combinations.
  • With no limit, a script can fire thousands of guesses a second. Given enough time, a weak password will fall.

Now watch what happens when you add a simple rule, like “only five failed login attempts per minute”:

  • The script tries five passwords, then hits the wall and gets a 429.
  • It now has to wait a full minute for the next five tries.
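  • At five guesses a minute, that’s only 7,200 guesses a day. A list of ten million candidate passwords, which an unthrottled script could exhaust within hours, would now take nearly four years. For anything but the most common passwords, the attack just becomes pointless.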

Here’s that decision in one picture.

Login attempt → Count tries for this account → Under the limit?

  • Yes → Check password
  • No → Block, return 429
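The same decision can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a production login handler: `handle_login`, the `PASSWORDS` dictionary, and the in-memory failure log are hypothetical, and the plain-text password is there only to keep the example short (a real system stores password hashes).

```python
import hmac
from collections import defaultdict, deque

MAX_FAILURES = 5       # "only five failed login attempts per minute"
WINDOW_SECONDS = 60

# Hypothetical demo data. Real systems store salted password hashes, never plain text.
PASSWORDS = {"alex": "correct horse battery staple"}

# username -> timestamps of recent failed attempts (in-memory, for illustration)
_failures = defaultdict(deque)


def handle_login(username: str, password: str, now: float) -> int:
    """Return an HTTP-style status: 200 success, 401 wrong password, 429 blocked."""
    recent = _failures[username]
    # Count tries for this account: forget failures older than the window.
    while recent and now - recent[0] >= WINDOW_SECONDS:
        recent.popleft()
    # Under the limit? If not, block before the password is even checked.
    if len(recent) >= MAX_FAILURES:
        return 429
    expected = PASSWORDS.get(username)
    if expected is not None and hmac.compare_digest(password.encode(), expected.encode()):
        return 200
    recent.append(now)
    return 401
```

Notice that a blocked client gets a 429 even if the sixth guess happens to be correct. That is deliberate: the limit is checked first, so the attacker learns nothing from guesses made while blocked.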

The beautiful part is that a normal user barely notices. Alex might fumble the password once or twice, but Alex isn’t sending five wrong tries a minute. The limit hits the attacker hard and leaves the real person alone.

🔑 Credential Stuffing

Brute force guesses passwords. This next one doesn’t even bother guessing. Credential stuffing is when attackers take username and password pairs leaked from some other website’s breach and try them on yours at scale.

  • People reuse the same password everywhere. So a password leaked from site A often unlocks the same person’s account on site B.
  • Attackers grab huge lists of these leaked pairs, sometimes millions of them.
  • Then they run a bot that tries each pair on your login, hoping the person reused their password.

The thing is, this attack lives or dies on volume. They need to try a massive number of pairs to find the few that still work. So your defenses are the same ones that hurt brute force:

  • Rate limits slow the whole run to a crawl, so the bot can only test a trickle of pairs instead of a flood.
  • An account lockout (temporarily freezing an account after too many failed tries) shuts down guessing against any account that keeps getting hit.
  • Add CAPTCHAs or extra checks after a few failures, and a dumb bot just gives up.
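Here is one way the lockout and CAPTCHA steps could fit together, as a rough Python sketch. The thresholds (`CAPTCHA_AFTER`, `LOCK_AFTER`, `LOCK_SECONDS`) and all names are assumptions chosen for illustration. Tune them to your own traffic.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    ALLOW = "allow"
    REQUIRE_CAPTCHA = "require_captcha"
    LOCKED = "locked"


@dataclass
class AccountState:
    failures: int = 0           # failed tries since the last success or lockout
    locked_until: float = 0.0   # timestamp when the temporary freeze ends


CAPTCHA_AFTER = 3        # assumed: show a CAPTCHA after 3 failures
LOCK_AFTER = 10          # assumed: freeze the account after 10 failures
LOCK_SECONDS = 15 * 60   # assumed: the freeze lasts 15 minutes, then expires


def next_action(state: AccountState, now: float) -> Action:
    """Decide what the next login attempt on this account has to go through."""
    if now < state.locked_until:
        return Action.LOCKED
    if state.failures >= CAPTCHA_AFTER:
        return Action.REQUIRE_CAPTCHA
    return Action.ALLOW


def record_failure(state: AccountState, now: float) -> None:
    state.failures += 1
    if state.failures >= LOCK_AFTER:
        state.locked_until = now + LOCK_SECONDS
        state.failures = 0  # start fresh once the freeze expires


def record_success(state: AccountState) -> None:
    state.failures = 0
```

A real user who fumbles once or twice never leaves `ALLOW`. A bot working through a list hits the CAPTCHA quickly, and the freeze shortly after.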

Don't lean only on the password

Credential stuffing works because real, correct passwords get leaked elsewhere. Rate limiting gives you time and blocks the bots, but pair it with things like multi-factor authentication so a single stolen password isn’t enough to get in.

🕷️ Scraping and Abuse

Not every attacker wants in. Some just want your data. Scraping is when bots mass-download your content or data, page after page, far faster than any human could browse.

  • Think a competitor copying your entire product catalog and prices.
  • Or a bot pulling every public profile, listing, or article you have.
  • A real user reads one page at a time. A scraper grabs hundreds per second.

That speed difference is exactly what gives them away, and exactly what rate limiting catches:

  • Cap how many pages or API calls a client can pull per minute, and the scraper slams into the limit almost immediately.
  • A genuine visitor browsing normally stays well under it and never even sees the cap.
  • You can set tighter limits on the data that’s most valuable or most expensive to serve.

So the same gate that protects your login also protects the data you’d rather not hand out for free.
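As a sketch of the idea above, here is a per-client, per-route cap in Python with tighter limits on the most valuable data. The route names and numbers in `ROUTE_LIMITS` are made up for illustration, and the in-memory store is an assumption.

```python
from collections import defaultdict, deque

WINDOW_SECONDS = 60

# Hypothetical per-minute caps. The routes and numbers are illustrative only.
ROUTE_LIMITS = {
    "/articles": 120,
    "/products": 60,
    "/export/prices": 5,   # the most valuable data gets the tightest cap
}
DEFAULT_LIMIT = 100

# (client_id, route) -> timestamps of requests inside the current window
_hits = defaultdict(deque)


def allow_request(client_id: str, route: str, now: float) -> bool:
    """Sliding-window check: True if this client may fetch this route right now."""
    limit = ROUTE_LIMITS.get(route, DEFAULT_LIMIT)
    hits = _hits[(client_id, route)]
    while hits and now - hits[0] >= WINDOW_SECONDS:
        hits.popleft()
    if len(hits) >= limit:
        return False
    hits.append(now)
    return True
```

A scraper firing 200 requests at `/products` in one second gets 60 through and the rest rejected, while a visitor clicking through a handful of pages never comes close.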

💸 Protecting Expensive Endpoints

Some requests cost you real money or real resources every single time. These deserve extra-tight limits. An endpoint is just a specific address in your API that does one job, like “send an OTP” or “reset my password”.

  • Sending an OTP means firing off an SMS or email, and each one costs money. (An OTP is a one-time password, that short code sent to verify it’s really you.)
  • An attacker can spam your “send OTP” button thousands of times and run up a huge bill, or flood some poor person’s phone with texts.
  • Heavy operations like report generation, image processing, or search can also pile up cost and load fast.

So you guard these the same way, just stricter:

  • Limit “send OTP” to something like a few per phone number per hour. A real user needs one or two, not two hundred.
  • Cap expensive operations per user so nobody can run them in a tight loop.
  • This protects both your wallet and your innocent users who’d otherwise get spammed.
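A minimal Python sketch of the “few per phone number per hour” rule might look like this. `MAX_SENDS_PER_HOUR = 3` is an assumed value, and `send_sms` is a stand-in for whatever SMS provider call you actually use. The key point is that the limit is checked before the paid call happens.

```python
from collections import defaultdict, deque
from typing import Callable

MAX_SENDS_PER_HOUR = 3   # assumed value for "a few per phone number per hour"
HOUR_SECONDS = 3600

# phone number -> timestamps of OTPs sent in the last hour (in-memory, for illustration)
_sends = defaultdict(deque)


def request_otp(phone: str, now: float, send_sms: Callable[[str], None]) -> bool:
    """Send an OTP only if this number is under its hourly cap.

    `send_sms` is a hypothetical callback standing in for the paid provider call.
    Returns True if a message was sent, False if the request was rejected.
    """
    recent = _sends[phone]
    while recent and now - recent[0] >= HOUR_SECONDS:
        recent.popleft()
    if len(recent) >= MAX_SENDS_PER_HOUR:
        return False  # rejected before any money is spent
    recent.append(now)
    send_sms(phone)
    return True
```

If an attacker calls this two hundred times for the same number, only three messages go out. The same shape works for any expensive operation: swap the phone number for a user ID and the SMS call for the report or image job.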

Unprotected costly endpoints can bankrupt you

An OTP or email endpoint with no limit is a runaway bill waiting to happen. One bored attacker with a loop can send hundreds of thousands of messages overnight. Always rate limit anything that costs money per call.

🧩 How to Do It Well

Okay, so you’re sold on the idea. Now let’s talk about doing it right, because a sloppy rate limit can be dodged or can hurt real users. Here’s the playbook:

  • Limit per user AND per IP. An IP address is the network address of a machine. Limit per account so one account can’t be hammered with guesses, and per IP so one machine can’t spread its attack across many accounts. You want both, not one.
  • Tighter limits on sensitive endpoints. Your login, signup, password reset, and OTP routes should have much stricter caps than a normal read-only page. That’s where attackers aim, so that’s where you clamp down hardest.
  • Add lockouts after repeated failures. A short temporary freeze after several wrong attempts stops a determined guesser cold.
  • Use CAPTCHAs as a step-up. A CAPTCHA is that little “prove you’re human” puzzle. Show it only after a few failed tries, so real users rarely see it but bots get stuck.
  • Add backoff. After each repeated failure, make the client wait a little longer than the last time. We’ll cover this gentler approach in the next section.
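The first rule in the playbook, limiting per user AND per IP, can be sketched like this in Python. The limits and names are assumptions for illustration, and for simplicity this version counts every login attempt rather than only failed ones. The IP cap is set higher than the account cap because many real users can share one address.

```python
from collections import defaultdict, deque

WINDOW_SECONDS = 60
PER_ACCOUNT_LIMIT = 5   # assumed: attempts per account per minute
PER_IP_LIMIT = 20       # assumed: higher, since many real users may share one IP

_by_account = defaultdict(deque)   # account -> recent attempt timestamps
_by_ip = defaultdict(deque)        # IP address -> recent attempt timestamps


def _recent_count(bucket: deque, now: float) -> int:
    """Drop timestamps that have aged out, then return how many remain."""
    while bucket and now - bucket[0] >= WINDOW_SECONDS:
        bucket.popleft()
    return len(bucket)


def allow_login_attempt(account: str, ip: str, now: float) -> bool:
    """Allow the attempt only if BOTH the account and the IP are under their caps."""
    account_bucket = _by_account[account]
    ip_bucket = _by_ip[ip]
    if _recent_count(account_bucket, now) >= PER_ACCOUNT_LIMIT:
        return False
    if _recent_count(ip_bucket, now) >= PER_IP_LIMIT:
        return False
    account_bucket.append(now)
    ip_bucket.append(now)
    return True
```

One machine spraying guesses across 25 different accounts gets stopped by the IP cap after 20. Many machines ganging up on one account get stopped by the account cap after 5. Neither key alone catches both.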

Here’s how the common attacks line up against these defenses.

| Attack | How rate limiting helps |
| --- | --- |
| Brute force login | Caps guesses per account, so cracking a password would take years instead of seconds. |
| Credential stuffing | Slows the bot to a trickle and triggers lockouts, so leaked pairs can’t be tested at scale. |
| Scraping | Limits pages per minute, catching bots that pull data far faster than any human. |
| OTP / email abuse | Caps sends per number, preventing runaway bills and spam to innocent users. |
| Floods (DoS) | Rejects excess requests at the door, so a single source can’t overwhelm the service. |

⚖️ Don’t Lock Out Real Users

Here’s the balancing act. Crank your limits too tight and you start punishing the very people you’re trying to serve. Security and usability have to live together.

  • A real user who mistypes a password twice shouldn’t get locked out for an hour. That’s a support ticket and an angry customer.
  • When you do block someone, tell them clearly what happened and when to try again. A blank error just confuses honest people.
  • Use gradual backoff instead of a hard ban. Backoff means the client has to wait a bit longer after each repeated failure, like 1 second, then 2, then 4. Real users barely notice, but a bot churning through guesses gets choked.

So the goal isn’t to slam the door. It’s to stay friendly to the fumbling human and brutal to the relentless bot.
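The “1 second, then 2, then 4” pattern is just a doubling delay. Here is a small Python sketch of it. The five-minute cap (`MAX_DELAY_SECONDS`) and the function names are assumptions added so the wait can’t grow without bound.

```python
BASE_DELAY_SECONDS = 1.0
MAX_DELAY_SECONDS = 300.0   # assumed cap so nobody ends up waiting for hours


def backoff_delay(consecutive_failures: int) -> float:
    """Seconds to wait after the Nth consecutive failure: 1, 2, 4, 8, ... up to the cap."""
    if consecutive_failures <= 0:
        return 0.0
    # Clamp the exponent so a huge failure count cannot overflow the float math.
    exponent = min(consecutive_failures - 1, 30)
    return min(BASE_DELAY_SECONDS * (2 ** exponent), MAX_DELAY_SECONDS)


def next_allowed_at(last_failure_at: float, consecutive_failures: int) -> float:
    """Timestamp before which the next attempt should be rejected."""
    return last_failure_at + backoff_delay(consecutive_failures)
```

Two fumbles cost a real user three seconds of waiting in total. A bot on its tenth straight failure is already waiting the full five minutes between guesses.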

Write clear block messages

When you return a 429, include a Retry-After header and a plain message like “Too many attempts, please try again in 60 seconds”. It turns a scary wall into a clear instruction, and it stops good clients from retrying in a tight loop that only makes things worse.
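Here is what building that response could look like, as a framework-neutral Python sketch. The function name `too_many_requests` and the JSON body shape are assumptions. Only the 429 status and the `Retry-After` header (given in whole seconds) come from HTTP itself.

```python
import json
import math
from typing import Dict, Tuple


def too_many_requests(retry_after_seconds: float) -> Tuple[int, Dict[str, str], str]:
    """Build a (status, headers, body) triple for a rate-limited request."""
    # Retry-After takes a whole number of seconds, so round up and never send 0.
    seconds = max(1, math.ceil(retry_after_seconds))
    headers = {
        "Retry-After": str(seconds),
        "Content-Type": "application/json",
    }
    body = json.dumps({
        "error": "too_many_attempts",  # hypothetical machine-readable code
        "message": f"Too many attempts, please try again in {seconds} seconds.",
    })
    return 429, headers, body
```

Well-behaved clients read the header and wait. Humans read the message and know exactly what to do.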

⚠️ Common Mistakes and Misconceptions

A few things trip people up here. Let’s clear them out:

  • “Rate limiting is only for load.” That’s half the story. It’s also one of your cheapest, most effective security tools against brute force, stuffing, scraping, and abuse.
  • “Just limit by IP.” Risky. Many real users share one IP behind a company network or a phone carrier, so you might block a whole office. And attackers hop across many IPs using proxies. Limit per user too.
  • “My login is fine with no limit.” It really isn’t. An open login is an open invitation for brute force and credential stuffing. Sensitive endpoints need the tightest limits you have.
  • “A lockout should be permanent.” No. A permanent lock lets an attacker freeze a real user out on purpose, just by failing their login a few times. Use short, temporary freezes that expire on their own.
  • “One limit fits every endpoint.” A read-only page and a password reset are nothing alike. Match the strictness to the risk and cost of each route.

🛠️ Design Challenge

Try this one on your own to test yourself.

You’re securing a banking app’s login. Real users sometimes mistype their password, but attackers run credential stuffing with millions of leaked pairs. Design the protection:

  • What would you key your limit on, the username, the IP, or both? Why both?
  • How many failed attempts before you slow things down, and how would you back off after that?
  • When would you show a CAPTCHA, and when would you temporarily lock the account?
  • How do you make sure a real user who fumbles twice still gets in smoothly?

There’s no single perfect answer. The point is to reason about the trade-off between security and usability out loud, exactly the way you would in an interview.

🧩 What You’ve Learned

You can now explain rate limiting as a security tool from end to end. Here’s what you’ve picked up:

  • ✅ Rate limiting is a security shield, not just a load tool, because the attacks covered here all need high request volume to work.
  • ✅ It stops brute force by capping login guesses, and slows credential stuffing so leaked pairs can’t be tested at scale.
  • ✅ It protects content from scrapers and guards costly endpoints like OTP and email from abuse and runaway bills.
  • ✅ Do it well by limiting per user AND per IP, setting tighter limits on sensitive routes, and adding lockouts, CAPTCHAs, and backoff.
  • ✅ Balance security with usability through clear messages, gradual backoff, and short temporary lockouts that don’t punish honest users.

Check Your Knowledge

Test what you learned. Try to answer each question before reading the explanation under it.

  1. Why does rate limiting work so well against brute force and credential stuffing attacks?

    Why: Both attacks depend on trying many requests fast, so capping the request rate slows the attack to a crawl.

  2. Why should you rate limit per user and per IP, not just per IP?

    Why: Limiting only by IP can block a whole shared network and still misses attackers who hop across many IPs.

  3. Why do endpoints like 'send OTP' deserve extra-strict limits?

    Why: An unlimited OTP or email endpoint can run up a huge bill and spam innocent users, so it needs a tight cap.

  4. What is a good way to stay friendly to real users while blocking bots?

    Why: Gentle backoff and short, clearly explained freezes barely affect honest fumbles but still choke relentless bots.

🚀 What’s Next?

You’ve got rate limiting as a defense down. Next, go deeper into the attacks and patterns around it.

  • DDoS Protection Basics covers what happens when the flood is too big for a simple limit, and how you defend against many machines at once.
  • Rate Limiting Explained revisits the core algorithms and where the limiter runs, the foundation everything here is built on.

Get comfortable with those, and you’ll be able to talk confidently about protecting systems in any security or system design interview.
