A Queue Per User at Scale
Announcing AccountQueues: per-user, per-key queue isolation that solves the noisy neighbor problem — plus two response headers that let API providers control their own flow rate.
Rahmi Pruitt
Founder, EZThrottle
The noisy neighbor problem
If you've ever run a multi-tenant API — or called one — you've felt this. One customer sends a thousand requests. Everyone else waits. The aggressive customer doesn't know they're causing harm. The others don't know why their requests are slow. You're stuck in the middle.
This is the noisy neighbor problem. It exists in cloud compute, in databases, in CDNs. And it has always existed in API rate limiting infrastructure. Until now, EZThrottle shared one queue per URL across all customers. Customer A flooding Stripe meant Customer B's Stripe requests waited in line behind them.
Today we're fixing that.
Introducing AccountQueues
An AccountQueue is one dedicated queue per user, per API key — isolated from every other customer, living globally across the EZThrottle cluster via Syn, our distributed process registry.
Each queue is a BEAM process registered globally in Syn. Machine crashes? Syn detects it, removes the registration. Next request for that user spawns a fresh queue wherever it lands. No coordination code. No manual failover. The infrastructure heals itself.
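The get-or-create behavior described above can be sketched in a few lines. This is a single-node Python stand-in, not the actual implementation: a plain dict plays the role of Syn's distributed registry, and a list stands in for a queue process. The class and method names are hypothetical.

```python
import threading

class QueueRegistry:
    """Single-node sketch of the spawn-on-demand pattern described above.
    In EZThrottle the registration is global via Syn; here a plain dict
    stands in for the distributed registry."""

    def __init__(self):
        self._queues = {}
        self._lock = threading.Lock()

    def get_or_spawn(self, user_key: str):
        # The first request for a key spawns a fresh queue; later requests
        # find the existing one. If the owning node died, the registration
        # would be gone and the next call would simply spawn again.
        with self._lock:
            if user_key not in self._queues:
                self._queues[user_key] = []  # stand-in for a queue process
            return self._queues[user_key]

registry = QueueRegistry()
q1 = registry.get_or_spawn("sk_live_abc123")
q2 = registry.get_or_spawn("sk_live_abc123")
```

The key property is that there is no failover logic at all: recovery falls out of "spawn if absent".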
Why this matters for distributed systems
Most teams aren't feeling this acutely today. But the trajectory is clear.
Imagine 50 distributed workers all sharing sk_live_abc123 to call OpenAI. Each worker thinks it has a 60 req/min budget. It doesn't — they share one budget across all 50. The result: workers fight each other, requests get rate-limited unpredictably, and your retry logic makes it worse by creating a thundering herd the moment the limit resets.
With AccountQueues, there is one global queue for sk_live_abc123 across your entire cluster. Every worker routes through it. The rate limit is respected exactly. This is resource contention solved at the infrastructure layer — not in your application code.
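The arithmetic behind the scenario above is worth making explicit. The numbers are the hypothetical ones from the example, not real OpenAI limits:

```python
# Hypothetical numbers from the scenario above: 50 workers share one
# API key whose real budget is 60 requests per minute, enforced by the
# provider per key, not per worker.
workers = 50
per_key_budget = 60  # req/min the provider actually allows for the key

# What the fleet attempts if every worker rate-limits itself locally:
attempted_total = workers * per_key_budget  # each worker sends 60/min
# What the provider actually admits across all workers combined:
actual_total = per_key_budget
# Everything else comes back as 429s and feeds the retry storm:
rejected_per_minute = attempted_total - actual_total
```

Routing every worker through one global queue collapses attempted_total back down to the real budget.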
AccountQueues are opt-in
AccountQueues are off by default. This is intentional.
EZThrottle's default is 2 requests per second per machine. We chose 2 RPS deliberately — not because it's a technical constraint, but because we care about the health of the internet. We're not here to help anyone flood an API. The default is conservative because most APIs have rate limits, and we'd rather be a good citizen than squeeze out every last request per second.
If AccountQueues were on by default for any API key we've ever seen, an attacker could spam requests with thousands of fake keys and turn EZThrottle into a process-spawning machine — consuming memory until the cluster falls over. We're not doing that.
AccountQueues are enabled for APIs where per-user queuing makes clear sense: APIs that already rate limit by API key. The current list — maintained in our open-source ezconfig library — includes OpenAI, Anthropic, Stripe, GitHub, Twitter, Slack, Discord, HubSpot, Notion, Airtable, Shopify, SendGrid, Twilio, and Google APIs.
Want your API added to ezconfig?
Open a GitHub issue with a link to your official rate limit documentation. Issues without supporting evidence will be automatically closed. If you are the API owner, skip the issue — just use the response headers below to opt in directly.
Provider flow control: two headers
If you're an API provider and you want to opt into AccountQueues — or control how fast EZThrottle sends traffic to you — add these headers to your API responses. EZThrottle reads them automatically. No config changes. No dashboard. Just a header.
X-EZTHROTTLE-ACCOUNT-QUEUE
Opt your API into per-user, per-key queue isolation dynamically.
Once enabled, this persists for the life of the URL queue actor — typically until a deployment or machine restart. You don't need to send it on every response, but sending it consistently is a good practice.
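For a provider, opting in is a one-line change to the response path. Here is a minimal stdlib-only Python sketch; the header name comes from this post, but the value "true" is an assumption — check the EZThrottle docs for the exact accepted values:

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
import threading
import urllib.request

class ApiHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        # Opt this API into per-user, per-key queue isolation.
        # The value "true" is an assumption, not a documented spec.
        self.send_header("X-EZTHROTTLE-ACCOUNT-QUEUE", "true")
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(b'{"ok": true}')

    def log_message(self, *args):
        pass  # silence per-request logging in this demo

# Quick local check that the header is actually emitted.
server = HTTPServer(("127.0.0.1", 0), ApiHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
resp = urllib.request.urlopen("http://127.0.0.1:%d/" % server.server_port)
header_value = resp.headers["X-EZTHROTTLE-ACCOUNT-QUEUE"]
server.shutdown()
```

The same pattern applies in any framework: set the header on responses and EZThrottle picks it up.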
X-EZTHROTTLE-RPS
Tell EZThrottle exactly how fast to send requests to each user queue. EZThrottle adjusts in real time. Your API is under load? Lower the value. Capacity recovered? Raise it.
The current minimum is 0.5 RPS — one request every two seconds. As the EZThrottle cluster grows, we plan to support even lower rates for APIs that need them. Values below 0.5 are rejected with a warning.
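One way a provider might drive this header is to scale the advertised rate with current load. The header name is from this post; the load model, function name, and numbers below are entirely hypothetical:

```python
def ezthrottle_rps_header(current_load: float,
                          max_rps: float = 10.0,
                          min_rps: float = 0.5) -> dict:
    """Hypothetical sketch: advertise a lower per-user rate as load rises.
    current_load ranges from 0.0 (idle) to 1.0 (saturated). The advertised
    value never drops below 0.5 RPS, the current EZThrottle minimum."""
    rps = max(min_rps, max_rps * (1.0 - current_load))
    return {"X-EZTHROTTLE-RPS": str(rps)}

ezthrottle_rps_header(0.0)   # idle: advertise the full rate
ezthrottle_rps_header(0.99)  # near saturation: clamped to the 0.5 floor
```

Because EZThrottle reads the header on every response, the feedback loop is as tight as your traffic.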
X-EZTHROTTLE-MAX-CONCURRENT
Limit how many requests from a single user queue can be in-flight simultaneously. Say you set the value to 5. Without AccountQueues, that means 5 per machine — multiply by the cluster size to get the real ceiling. With AccountQueues, it means exactly 5. Per user. Globally. One queue per user per key means the limit you set is the limit actually enforced — not multiplied by however many machines happen to be running.
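The difference is easy to quantify. The cluster size below is made up for illustration:

```python
# Hypothetical numbers: you send X-EZTHROTTLE-MAX-CONCURRENT: 5 and the
# EZThrottle cluster happens to be running 12 machines.
max_concurrent = 5
machines = 12

# Without AccountQueues the cap applies per machine, so the real
# ceiling on simultaneous in-flight requests for one user is:
without_account_queues = max_concurrent * machines
# With AccountQueues there is one global queue per user per key,
# so the ceiling is exactly what you asked for:
with_account_queues = max_concurrent
```

The per-machine multiplier is also why the limit silently changes whenever the cluster scales.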
Default configurations via ezconfig
EZThrottle ships with sensible defaults for popular APIs via ezconfig, a community-maintained open-source Gleam library. It sets RPS limits and concurrency defaults for 14 major APIs so EZThrottle behaves correctly out of the box, without any configuration on your end.
If your API isn't listed, EZThrottle falls back to 2 RPS — conservative and internet-friendly. PRs are welcome.
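Conceptually, the lookup works like the sketch below. This is illustrative Python, not ezconfig's actual Gleam schema; the field names and per-API numbers are assumptions, and only the 2 RPS fallback comes from this post:

```python
# Illustrative only: a dict mirroring the kind of per-API defaults
# ezconfig ships. Field names and per-API values are assumptions,
# not ezconfig's real schema or numbers.
API_DEFAULTS = {
    "api.openai.com": {"rps": 1.0, "max_concurrent": 5, "account_queues": True},
    "api.stripe.com": {"rps": 2.0, "max_concurrent": 5, "account_queues": True},
}

# Unlisted APIs get the conservative, internet-friendly fallback.
FALLBACK = {"rps": 2.0, "max_concurrent": None, "account_queues": False}

def defaults_for(host: str) -> dict:
    return API_DEFAULTS.get(host, FALLBACK)
```

The important behavior is the fallback: an unknown API gets 2 RPS and no AccountQueues until someone adds it with documented limits.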
What's coming next: tier flows
AccountQueues today give every user equal, isolated flow. Next up: tier-based priority within each queue. Reserved customers get guaranteed throughput. Paid tier gets priority over free. Free tier is best-effort. Same per-user isolation — different lanes within each queue.
This is fair queuing for HTTP — the same principle that keeps your Netflix stream smooth while someone else on the network downloads a file. Every customer gets their own lane. Tier flows determine how fast each lane moves.
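The mechanics of tiered lanes can be sketched with a priority queue. This is a hypothetical illustration of the idea, not the shipped implementation; only the tier ordering (reserved over paid over free) comes from the post:

```python
import heapq
import itertools

# Lower number = drained first. Ordering follows the post; everything
# else here is a hypothetical sketch.
TIER_PRIORITY = {"reserved": 0, "paid": 1, "free": 2}

class TieredQueue:
    """One user's queue with tier-based lanes inside it."""

    def __init__(self):
        self._heap = []
        self._seq = itertools.count()  # preserves FIFO order within a tier

    def push(self, request, tier: str):
        heapq.heappush(self._heap,
                       (TIER_PRIORITY[tier], next(self._seq), request))

    def pop(self):
        return heapq.heappop(self._heap)[2]

q = TieredQueue()
q.push("free-1", "free")
q.push("paid-1", "paid")
q.push("reserved-1", "reserved")
order = [q.pop(), q.pop(), q.pop()]
```

Note that strict priority like this can starve the free tier under sustained load, which is presumably why the post calls free tier "best-effort".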
The bigger picture
The internet solved this for TCP decades ago — congestion control in 1988, then per-flow fair queuing, QoS, and routing around failures. Every packet gets a fair share. No single flow starves the rest. HTTP API calls have never had this layer. Every team builds their own retry logic, their own rate limiting, their own per-customer fairness, from scratch, every time.
EZThrottle is that missing layer. The coordination infrastructure between your application and every API it depends on. Bidirectional — protecting your calls going out, and protecting your API from the inside.
AccountQueues are the first step toward making that real.
Try AccountQueues today
Available on all paid tiers. Free tier uses the shared queue. Questions? Want your API added to the default list?
Or email us directly: support@ezthrottle.network