Rate Limiting: What Stops One Client From Melting Your App

Part nine of the full-stack series. You can have a perfectly secured endpoint and still go down because one client hits it ten thousand times a minute — a brute-force login, a greedy scraper, a buggy mobile app stuck in a retry loop. Rate limiting is the valve: it caps how often a given caller can do a thing, protecting both your infrastructure and, when there’s a metered API behind it, your bill.

What it’s actually defending against

A limit per caller blunts several distinct problems at once:

Threat	What the limit does
Brute-force login	a few tries per minute makes online password guessing impractically slow
Scraping	caps how fast someone can vacuum your data
Runaway client	a buggy retry loop hits a wall instead of your database
Cost abuse	calls to a paid downstream API are capped, and so is the bill
Accidental DoS	one client can’t starve everyone else of capacity

Note the framing: rate limiting is about fair use and abuse, a normal-operations control. Defending against a distributed flood (DDoS) is a different, network-layer problem handled upstream by your CDN/provider — don’t expect app-level throttling to absorb that.

The algorithms, briefly

Three common approaches, increasing in smoothness:

Fixed window: “100 requests per minute,” with a counter that resets each minute. Simple, but a client can get up to 200 requests through in about two seconds by straddling a window boundary (100 in the last second of one minute, 100 in the first second of the next).
Sliding window: counts over a rolling 60-second span, so the boundary spike disappears.
Token bucket: a bucket refills at a steady rate and each request spends a token. It allows short legitimate bursts (up to the bucket’s capacity) while enforcing a long-run average. The most flexible of the three, and a common choice inside production limiters.
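To make the boundary problem concrete, here is a minimal plain-PHP sketch of the first two approaches. It is an illustration under assumptions, not Laravel's implementation: the class names `FixedWindowLimiter` and `SlidingWindowLimiter` are hypothetical, state lives in memory, and timestamps are passed in explicitly so the result is deterministic.

```php
<?php

// Hypothetical illustration classes; not part of Laravel.
final class FixedWindowLimiter
{
    /** @var array<string, array{window: int, count: int}> */
    private array $counters = [];

    public function __construct(private int $limit, private int $windowSeconds) {}

    public function allow(string $key, int $now): bool
    {
        // Every timestamp inside the same window maps to the same index.
        $window = intdiv($now, $this->windowSeconds);
        $entry = $this->counters[$key] ?? ['window' => $window, 'count' => 0];
        if ($entry['window'] !== $window) {
            $entry = ['window' => $window, 'count' => 0]; // new window: counter resets
        }
        $allowed = $entry['count'] < $this->limit;
        if ($allowed) {
            $entry['count']++;
        }
        $this->counters[$key] = $entry;
        return $allowed;
    }
}

final class SlidingWindowLimiter
{
    /** @var array<string, list<int>> */
    private array $hits = [];

    public function __construct(private int $limit, private int $windowSeconds) {}

    public function allow(string $key, int $now): bool
    {
        // Keep only the hits from the last $windowSeconds seconds.
        $recent = array_values(array_filter(
            $this->hits[$key] ?? [],
            fn (int $t): bool => $t > $now - $this->windowSeconds
        ));
        $allowed = count($recent) < $this->limit;
        if ($allowed) {
            $recent[] = $now;
        }
        $this->hits[$key] = $recent;
        return $allowed;
    }
}

// Send 100 requests at second 59, then 100 more at second 60.
function countAllowed(object $limiter): int
{
    $allowed = 0;
    foreach ([59, 60] as $second) {
        for ($i = 0; $i < 100; $i++) {
            $allowed += $limiter->allow('client-a', $second) ? 1 : 0;
        }
    }
    return $allowed;
}

$fixedAllowed = countAllowed(new FixedWindowLimiter(100, 60));     // 200: both windows fill
$slidingAllowed = countAllowed(new SlidingWindowLimiter(100, 60)); // 100: second burst refused
```

The sliding version here stores one timestamp per accepted request, which is exact but memory-hungry; real limiters often approximate it with weighted counters instead.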

For most apps a fixed or sliding window is fine. Reach for token bucket when you want to tolerate bursts without raising the overall ceiling.
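A token bucket is small enough to sketch in full. The version below is an in-memory illustration under assumptions: the `TokenBucket` class is hypothetical (not a Laravel API), it tracks a single caller, and the clock is passed in as a float number of seconds so the example is repeatable.

```php
<?php

// Hypothetical illustration class; not part of Laravel.
final class TokenBucket
{
    private float $tokens;
    private float $lastRefill;

    public function __construct(
        private float $capacity,      // maximum burst size
        private float $refillPerSec,  // long-run average rate
        float $now = 0.0
    ) {
        $this->tokens = $capacity;    // start full so an initial burst is allowed
        $this->lastRefill = $now;
    }

    public function allow(float $now): bool
    {
        // Add tokens for the time elapsed, never exceeding capacity.
        $elapsed = max(0.0, $now - $this->lastRefill);
        $this->tokens = min($this->capacity, $this->tokens + $elapsed * $this->refillPerSec);
        $this->lastRefill = $now;

        if ($this->tokens < 1.0) {
            return false;             // bucket empty: reject
        }
        $this->tokens -= 1.0;         // spend one token on this request
        return true;
    }
}

// Capacity 10, refill 1 token per second (a 60/minute average).
$bucket = new TokenBucket(10.0, 1.0);

$burst = 0;
for ($i = 0; $i < 15; $i++) {
    $burst += $bucket->allow(0.0) ? 1 : 0;             // 15 requests at t = 0
}
// $burst is 10: the bucket absorbs a burst up to its capacity, then refuses.

$afterThreeSeconds = 0;
for ($i = 0; $i < 5; $i++) {
    $afterThreeSeconds += $bucket->allow(3.0) ? 1 : 0; // 5 requests at t = 3
}
// $afterThreeSeconds is 3: only the tokens refilled since t = 0 are available.
```

Capacity and refill rate are independent knobs: capacity sets how large a burst is tolerated, the refill rate sets the sustained ceiling.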

Wiring it in Laravel

The framework ships with this. Named limiters live in one place, keyed by user or IP:

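// AppServiceProvider: imports at the top of the file
use Illuminate\Cache\RateLimiting\Limit;
use Illuminate\Http\Request;
use Illuminate\Support\Facades\RateLimiter;

// AppServiceProvider::boot(): define limits once, by name
RateLimiter::for('api', fn (Request $r) =>
    Limit::perMinute(60)->by($r->user()?->id ?: $r->ip()));

// Tighter limit on the expensive/sensitive route
RateLimiter::for('login', fn (Request $r) =>
    Limit::perMinute(5)->by($r->ip()));

// Routes file (needs use Illuminate\Support\Facades\Route;): apply per route group
Route::middleware('throttle:api')->group(function () { /* ... */ });
Route::post('/login', ...)->middleware('throttle:login');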

Over the limit, Laravel returns 429 Too Many Requests automatically, with Retry-After and X-RateLimit-* headers so well-behaved clients know to back off. Key by user ID when authenticated, IP when not — keying purely by IP punishes everyone behind a shared NAT (an office, a campus, a whole mobile carrier).
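The login route has no authenticated user to key on, so an IP-only limiter there still hits the shared-NAT problem. One possible alternative, sketched below, keys on the submitted account plus the address and adds a looser per-address ceiling. It relies on a named limiter being allowed to return an array of `Limit` objects, all of which are enforced. The `email` field name and the figure of 30 are assumptions for illustration, not values from this series.

```php
use Illuminate\Cache\RateLimiting\Limit;
use Illuminate\Http\Request;
use Illuminate\Support\Facades\RateLimiter;

// In AppServiceProvider::boot(). Assumes the login form posts an `email` field.
RateLimiter::for('login', function (Request $r) {
    $email = strtolower((string) $r->input('email'));

    return [
        // Per account + address: slows guessing against one account
        // without locking out everyone else behind the same NAT.
        Limit::perMinute(5)->by($email.'|'.$r->ip()),

        // Looser per-address ceiling (30 is an illustrative number):
        // catches one address cycling through many accounts.
        Limit::perMinute(30)->by($r->ip()),
    ];
});
```

The two keys are deliberately different strings, so the counters for the two limits never collide.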

Tell the client the rules

A limit that clients can’t see leads to confused retries that make the problem worse. Return the headers and document them:

HTTP/1.1 429 Too Many Requests
Retry-After: 30
X-RateLimit-Limit: 60
X-RateLimit-Remaining: 0

A good client reads Retry-After and waits; a good frontend shows “slow down, try again in 30s” instead of spamming. The contract idea from the API post applies here too: 429 is part of the contract.
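On the client side, honoring the header mostly means parsing it defensively. Retry-After can carry either a number of seconds or an HTTP date, so a client needs to handle both. The helper below is a sketch: `retryDelaySeconds`, its fallback of 1 second, and its 300-second cap are all hypothetical choices, and the current time is passed in to keep the example deterministic.

```php
<?php

/**
 * Seconds to wait before retrying, based on a Retry-After header value.
 * Hypothetical helper; the fallback and cap are illustrative defaults.
 */
function retryDelaySeconds(?string $retryAfter, int $now, int $fallback = 1, int $max = 300): int
{
    if ($retryAfter === null || trim($retryAfter) === '') {
        return $fallback;                      // header missing: use a small default
    }
    $value = trim($retryAfter);

    if (preg_match('/^\d+$/', $value) === 1) {
        $delay = (int) $value;                 // delay-seconds form, e.g. "30"
    } else {
        $timestamp = strtotime($value);        // HTTP-date form
        if ($timestamp === false) {
            return $fallback;                  // unparseable: fall back
        }
        $delay = $timestamp - $now;
    }

    // Never return a negative wait, and cap absurdly long ones.
    return max(0, min($delay, $max));
}

$now = gmmktime(7, 28, 0, 10, 21, 2015);       // fixed clock for the example
$fromSeconds = retryDelaySeconds('30', $now);                            // 30
$fromDate    = retryDelaySeconds('Wed, 21 Oct 2015 07:28:30 GMT', $now); // 30
$missing     = retryDelaySeconds(null, $now);                            // 1 (fallback)
```

A real client would call `sleep()` with the result before retrying, and ideally add a little random jitter so many clients do not all retry at the same instant.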

Caveats and best practices

Different limits for different routes. Login and password-reset want tight limits; a read-only public endpoint can be generous. One global number is always wrong somewhere.
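Back it with a shared store such as Redis once you have more than one app server. If each server keeps its own counter (a file or in-memory cache, for instance), each one enforces the full limit independently, so a client spread across N servers gets up to N times the intended limit. A database-backed counter is shared too, but it puts an extra write on the database for every request.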
Limit by the right key. User ID > API token > IP. Pure-IP limiting both over-blocks (shared NAT) and under-blocks (attacker rotating IPs).
Layer it. App-level throttling for fairness, plus edge/CDN limits for volumetric abuse. Neither replaces the other.
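For the shared-counter point, Laravel's limiter keeps its counters in the application cache, so sharing them is usually a cache configuration change rather than a code change. The fragment below is a sketch that assumes a Redis connection is already configured for the app. The environment variable is `CACHE_STORE` on Laravel 11 and later and `CACHE_DRIVER` on earlier versions, so check it against the version you run.

```php
// config/cache.php (fragment)
return [
    // Default cache store; set CACHE_STORE=redis in .env
    // (CACHE_DRIVER on Laravel 10 and earlier).
    'default' => env('CACHE_STORE', 'redis'),

    // Optional: the store used specifically by the rate limiter,
    // for apps that want to keep a different default cache.
    'limiter' => 'redis',

    // ... 'stores' => [...] unchanged
];
```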

Conclusion

Why      → brute force, scrapers, runaway clients, bill shock, fairness
Algos    → fixed window → sliding window → token bucket (burst-friendly)
Laravel  → RateLimiter::for(...), throttle middleware, auto 429
Key by   → user id > token > IP (pure IP only as a fallback)
Tell     → Retry-After + X-RateLimit-* headers, documented

Rate limiting is a cheap valve that turns “one bad client takes everyone down” into “one bad client hits a wall.” Define limits per route, key them by user, and let clients see the rules. Next: caching and CDN — making the requests you do serve fast for everyone.