
EZThrottle: Making Failure Boring Again

By @RahmiPruitt

Modern software doesn't fail in exciting ways.
It fails with 429s, timeouts, and regions quietly going dark.

And somehow we've all accepted this as just the cost of building on other people's APIs.

I built EZThrottle because I got tired of pretending this was normal.

The Problem Nobody Really Talks About

Most systems deal with failure in isolation.

Each service retries on its own schedule, backs off on its own timers, and has no idea what any other service is seeing.

This works... until it doesn't.

At scale, independent retries turn into retry storms: one upstream hiccup triggers a wave of simultaneous retries, the extra load causes more failures, and the failures trigger still more retries.

The real issue isn't that APIs fail. It's that every client is blind to what every other client is experiencing.
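The amplification is easy to put numbers on. A toy model (the figures are purely illustrative):

```python
# Toy model of a retry storm: N uncoordinated clients, each retrying R times,
# turn one upstream blip into N * (1 + R) requests.
def load_during_outage(clients: int, retries_per_client: int) -> int:
    # Each client sends its original request plus all of its retries.
    return clients * (1 + retries_per_client)

# 1,000 services retrying 3 times apiece turn one failure into 4,000 requests,
# all arriving while the upstream is trying to recover.
print(load_during_outage(1000, 3))  # 4000
```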

Failure isn't shared state.

Diagram: without coordination, every service independently hammers the same upstream API.

The Simple Idea That Changed Everything

EZThrottle is built around one opinionated idea:

Retries shouldn't be independent.

Instead of thousands of machines panicking at once, EZThrottle coordinates failure in one place.

All outbound requests flow through EZThrottle, which keeps track of retry state, per-destination rate limits, and region health across every caller.

Once failure becomes shared state, it stops being chaos.
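Here is a minimal sketch of that idea — one shared table of per-destination failure state that every caller consults before sending. This is the concept only, not EZThrottle's actual internals:

```python
import time

class SharedFailureState:
    """One table of per-destination backoff, consulted by every caller.

    A toy sketch of 'failure as shared state', not EZThrottle's internals.
    """
    def __init__(self):
        self._blocked_until = {}  # domain -> monotonic-clock deadline

    def report_failure(self, domain: str, backoff_s: float) -> None:
        # One caller's failure immediately throttles everyone else.
        self._blocked_until[domain] = time.monotonic() + backoff_s

    def may_send(self, domain: str) -> bool:
        return time.monotonic() >= self._blocked_until.get(domain, 0.0)

state = SharedFailureState()
state.report_failure("api.example.com", backoff_s=5.0)
# Every client now sees the same answer, instead of discovering it alone.
print(state.may_send("api.example.com"))  # False
print(state.may_send("api.other.com"))    # True
```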

Why This Runs on the BEAM (and Why That Matters)

EZThrottle is written in Gleam and runs on Erlang/OTP.

That choice wasn't about trends; it was about survival.

The BEAM was designed for telephony systems that could never go down: millions of lightweight processes, supervision trees, and failures that stay isolated instead of cascading.

EZThrottle isn't trying to make one HTTP call fast. It's trying to coordinate millions of them safely. This is exactly what the BEAM is good at.

429s Aren't Errors — They're Signals

A 429 isn't your API yelling at you. It's your API asking you to slow down.

Most systems ignore that signal and keep retrying anyway.

EZThrottle takes the hint.
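Taking the hint mostly means reading the standard Retry-After header instead of retrying blind. A generic sketch of the pattern (not EZThrottle's exact policy):

```python
def backoff_from_429(headers: dict, attempt: int, base_s: float = 1.0) -> float:
    """How long to wait after a 429: honor Retry-After if the upstream sent
    one, otherwise fall back to capped exponential backoff."""
    retry_after = headers.get("Retry-After")
    if retry_after is not None:
        try:
            return float(retry_after)   # seconds form of the header
        except ValueError:
            pass                        # HTTP-date form ignored in this sketch
    return min(base_s * (2 ** attempt), 60.0)

print(backoff_from_429({"Retry-After": "7"}, attempt=0))  # 7.0
print(backoff_from_429({}, attempt=3))                    # 8.0
```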

The boring default (on purpose)

By default, EZThrottle sends 2 requests per second per target domain.

Not globally. Not per account. Per destination.

For example: requests to one domain (say, api.stripe.com) draw from that domain's 2 rps budget, while every other domain keeps its own, completely separate one.

This default exists to prevent your infrastructure from accidentally turning into a distributed denial-of-service attack.

It smooths bursts, stops retry storms, and keeps upstreams healthy.

Yes, it's conservative. That's the point.

Tuning Rate Limits When You Need To

You can tune rate limits when you know more than the default.

Here's what that looks like in code:

from ezthrottle import EZThrottle, Step, StepType

client = EZThrottle(api_key="your_api_key")

result = (
    Step(client)
    .url("https://api.example.com/endpoint")
    .method("POST")
    .type(StepType.PERFORMANCE)
    .rps(10)              # preferred requests per second
    .max_concurrent(25)   # optional concurrency cap
    .webhooks([{"url": "https://your-app.com/webhook"}])
    .execute()
)

Upstreams can also tell EZThrottle how fast they want to be called using response headers:

X-EZTHROTTLE-RPS: 5
X-EZTHROTTLE-MAX-CONCURRENT: 10

The important part isn't the knobs. It's this:

Rate limiting becomes shared state instead of a thousand sleep() calls.
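The two header names come straight from the docs above; how a client might read them off a response is a sketch:

```python
def limits_from_headers(headers: dict) -> dict:
    """Read the EZThrottle pacing headers an upstream may attach to a
    response. Header names are from the post; parsing is illustrative."""
    limits = {}
    if "X-EZTHROTTLE-RPS" in headers:
        limits["rps"] = float(headers["X-EZTHROTTLE-RPS"])
    if "X-EZTHROTTLE-MAX-CONCURRENT" in headers:
        limits["max_concurrent"] = int(headers["X-EZTHROTTLE-MAX-CONCURRENT"])
    return limits

print(limits_from_headers({"X-EZTHROTTLE-RPS": "5",
                           "X-EZTHROTTLE-MAX-CONCURRENT": "10"}))
# {'rps': 5.0, 'max_concurrent': 10}
```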

When Things Actually Break (5xx and Outages)

Regions go down. Configs break. Dependencies flake out.

EZThrottle assumes this will happen.

When a request fails with a 5xx or times out, EZThrottle backs off and retries through a healthy region instead of hammering the one that just failed.

The result isn't "everything is perfect."

The result is: a small latency bump instead of a full outage.
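The failover pattern itself is simple; here is a generic sketch (not EZThrottle's routing logic), where `send(region)` stands in for an HTTP call returning a status code, or `None` on timeout:

```python
def call_with_failover(regions, send):
    """Try regions in order; treat a 5xx or a timeout (None) as a signal
    to move on to the next region rather than retry in place."""
    last_status = None
    for region in regions:
        status = send(region)
        if status is not None and status < 500:
            return region, status          # healthy answer: stop here
        last_status = status               # 5xx or timeout: fail over
    return None, last_status               # every region was unhealthy

# Simulate: iad is down, lax times out, ord answers.
fake = {"iad": 503, "lax": None, "ord": 200}
print(call_with_failover(["iad", "lax", "ord"], fake.get))  # ('ord', 200)
```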

Diagram: EZThrottle coordinates traffic across regions with consensus and controlled rate limiting.

Region Racing: Let the Fastest One Win

Sometimes you don't want to wait. You just want the request to finish.

EZThrottle supports region racing: fire the same request from multiple regions at once, take the first response that completes, and drop the rest.

Here's what that looks like:

from ezthrottle import EZThrottle, Step, StepType

client = EZThrottle(api_key="your_api_key")

result = (
    Step(client)
    .url("https://api.example.com/endpoint")
    .method("POST")
    .type(StepType.PERFORMANCE)
    .regions(["iad", "lax", "ord"])
    .execution_mode("race")
    .webhooks([{"url": "https://your-app.com/webhook"}])
    .execute()
)

This isn't about chasing microbenchmarks. It's about predictable completion when the world is messy.
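The racing idea can be sketched client-side with threads — first region to answer wins. EZThrottle does this server-side on the BEAM; this is just the shape of the technique:

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed

def race(regions, send):
    """Send the same request to every region at once and return the first
    result that comes back, paired with the region that produced it."""
    with ThreadPoolExecutor(max_workers=len(regions)) as pool:
        futures = {pool.submit(send, r): r for r in regions}
        for done in as_completed(futures):
            return futures[done], done.result()  # fastest region wins

def fake_send(region):
    # Simulated latencies: ord answers first in this toy run.
    time.sleep({"iad": 0.05, "lax": 0.03, "ord": 0.01}[region])
    return 200

print(race(["iad", "lax", "ord"], fake_send))
```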

FRUGAL vs PERFORMANCE (How Teams Actually Use This)

Not every request needs maximum reliability.

EZThrottle supports two common patterns: PERFORMANCE, for requests that should win on speed and redundancy, and FRUGAL, for requests where cost matters more than latency.

Here's a FRUGAL example:

from ezthrottle import EZThrottle, Step, StepType

client = EZThrottle(api_key="your_api_key")

result = (
    Step(client)
    .url("https://api.example.com/endpoint")
    .type(StepType.FRUGAL)
    .fallback_on_error([429, 500, 502, 503])
    .webhooks([{"url": "https://your-app.com/webhook"}])
    .execute()
)

This lets teams start cheap and gradually move reliability into infrastructure instead of application code.

The Tradeoff EZThrottle Makes (On Purpose)

EZThrottle prioritizes predictability, coordination, and upstream health over raw per-request throughput.

If you want every request to fire as fast as possible, EZThrottle isn't for you.
If you want to stop waking up to retry storms, it probably is.

What EZThrottle Is (and Isn't)

EZThrottle is a coordination layer for outbound requests: shared retries, shared rate limits, shared region health.

EZThrottle is not a data store, an analytics pipeline, or a place where your payloads linger.

Requests live in memory, move through the system, and disappear. There's nothing to mine, leak, or hoard.

Why This Matters

Most infrastructure complexity exists to compensate for unreliable networks.

EZThrottle doesn't eliminate failure.

It eliminates panic.

By turning retries, rate limits, and region health into shared state, it gives you something rare in distributed systems:

Predictable behavior when things go wrong.

Rahmi Pruitt

Founder, EZThrottle

@RahmiPruitt

Ready to Make Failure Boring?

Start with 1 million free requests. No credit card required.


© 2025 EZThrottle. TCP for APIs. The World's First API Aqueduct™

Built on BEAM by a solo founder who believes engineers deserve to sleep at night.