Rate limits
Captchas per minute (CPM) is the only rate limit on /solve. /balance has a separate per-IP limit. Here is how each behaves and how to size against them.
There are exactly two rate limits in play. Knowing them is enough to design a stable client.
| Limit | Where | Scope | Value |
|---|---|---|---|
| Captchas per minute (CPM) | POST /solve | Per API key | max_cpm on the key |
| /balance polling | GET /balance | Per source IP | 30 requests / minute |
There is no fixed concurrency limit. As long as your CPM bucket has tokens, you can run as many parallel /solve requests as you want.
How the CPM bucket works
/solve uses a continuously refilling token bucket, with one bucket per API key:
- Bucket size is max_cpm tokens.
- Refill rate is max_cpm / 60 tokens per second.
- Each /solve call consumes one token. A 4xx validation error returns it. A 503 backend failure returns it.
Practical implications:
- Refill is continuous, not minute-aligned. You don't wait for the next minute boundary; capacity returns the moment the second hand moves.
- At full bucket you can burst max_cpm calls at once.
- At steady state you can sustain exactly max_cpm solves per minute.
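The bucket rules above can be modelled client-side. The following is a minimal sketch, not the server's implementation: the class name `TokenBucket`, its methods, and the explicit `now` argument are all hypothetical, and timestamps are passed in by hand so the behaviour is easy to follow.

```python
from dataclasses import dataclass, field


@dataclass
class TokenBucket:
    """Illustrative client-side model of the CPM bucket (hypothetical names)."""

    max_cpm: int
    tokens: float = field(init=False)
    last: float = 0.0  # timestamp of the last refill, in seconds

    def __post_init__(self) -> None:
        # The bucket starts full: max_cpm tokens.
        self.tokens = float(self.max_cpm)

    def _refill(self, now: float) -> None:
        # Continuous refill at max_cpm / 60 tokens per second, capped at max_cpm.
        elapsed = max(0.0, now - self.last)
        refilled = self.tokens + elapsed * self.max_cpm / 60
        self.tokens = min(float(self.max_cpm), refilled)
        self.last = now

    def try_consume(self, now: float) -> bool:
        """Take one token if available; False corresponds to a 429."""
        self._refill(now)
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

    def refund(self) -> None:
        """Return one token, as after a 4xx validation error or a 503."""
        self.tokens = min(float(self.max_cpm), self.tokens + 1)


bucket = TokenBucket(max_cpm=600)
# A full bucket allows a burst of max_cpm calls; the next one is refused.
burst = sum(bucket.try_consume(now=0.0) for _ in range(601))
print(burst)  # 600
# One second later, 600 / 60 = 10 tokens have returned.
print(sum(bucket.try_consume(now=1.0) for _ in range(11)))  # 10
```

A model like this is only useful for reasoning about pacing; the server's bucket remains the source of truth.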
Check your live state with GET /balance:
{
  "max_cpm": 600,
  "current_cpm": 540,
  "cpm_limit": 600
}

current_cpm is the number of tokens consumed in the rolling window; cpm_limit - current_cpm is roughly what you can still send right now.
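As a small illustration of that subtraction, here is a sketch that takes an already-parsed /balance response and returns the approximate headroom. The function name `remaining_headroom` is hypothetical; only the field names come from the response shown above.

```python
def remaining_headroom(balance: dict) -> int:
    """Approximate number of /solve calls that can be sent right now."""
    # cpm_limit - current_cpm, clamped at zero in case the snapshot is stale.
    return max(0, int(balance["cpm_limit"]) - int(balance["current_cpm"]))


sample = {"max_cpm": 600, "current_cpm": 540, "cpm_limit": 600}
print(remaining_headroom(sample))  # 60
```

The result is a snapshot, so treat it as a rough guide rather than a guarantee.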
429 from /solve
{
  "success": false,
  "error": "Rate limit exceeded: max 600 captchas per minute"
}

When this fires, the request never reached the solver: no token was consumed, no balance was touched. Back off briefly and retry.
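import time
import requests

# url, h (request headers), and payload are defined by your client.
for attempt in range(5):
    r = requests.post(url, headers=h, json=payload, timeout=120)
    if r.status_code != 429:
        break
    # Exponential backoff capped at 8 seconds: 1, 2, 4, 8, 8.
    time.sleep(min(2 ** attempt, 8))

Sizing your worker pool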
A simple rule that avoids 429s entirely:
max_parallel_workers = floor(max_cpm * avg_solve_seconds / 60)

For example, with max_cpm = 600 and 1-second Turnstile solves, you'd run ~10 parallel workers. With 5-second Challenge solves, ~50 workers. Going wider doesn't help; the bucket caps you anyway.
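from concurrent.futures import ThreadPoolExecutor

MAX_CPM = 600
AVG_SOLVE_SECONDS = 1.5
# floor(600 * 1.5 / 60) = 15 workers; never drop below 1.
workers = max(1, int(MAX_CPM * AVG_SOLVE_SECONDS / 60))

# solve(item) and items are defined by your client.
with ThreadPoolExecutor(max_workers=workers) as pool:
    # pool.map yields results (not futures) in input order.
    for result in pool.map(solve, items):
        ...

The same cap in JavaScript, using a small semaphore:

// Minimal counting semaphore to cap in-flight /solve requests.
class Semaphore {
  constructor(n) { this.n = n; this.queue = []; }
  async acquire() {
    if (this.n > 0) { this.n--; return; }
    // No free slot: wait until release() hands one over.
    await new Promise((res) => this.queue.push(res));
  }
  release() {
    this.n++;
    const next = this.queue.shift();
    if (next) { this.n--; next(); }
  }
}
const sem = new Semaphore(Math.max(1, Math.floor(600 * 1.5 / 60)));
async function solve(payload) {
  await sem.acquire();
  try { /* fetch */ } finally { sem.release(); }
}

/balance limit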
/balance is per-IP, 30 requests/minute. Treat it as a startup or monitoring call — never a per-solve check. The balance you receive is stale the moment you read it; the in-memory ledger on the server is always authoritative.
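One way to keep monitoring code under the per-IP limit is to cache the last response for a few seconds. The sketch below assumes a caller-supplied `fetch` callable that performs GET /balance and returns the parsed JSON; the class name `CachedBalance` and the 5-second default are illustrative choices, not part of the API. At 30 requests per minute the floor is one call every 2 seconds per IP, so 5 seconds leaves room for other processes behind the same address.

```python
import time
from typing import Callable, Optional


class CachedBalance:
    """Serve a cached /balance response until min_interval seconds have passed."""

    def __init__(
        self,
        fetch: Callable[[], dict],
        min_interval: float = 5.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch            # hypothetical: does the real GET /balance
        self._min_interval = min_interval
        self._clock = clock            # injectable so the class is easy to test
        self._value: Optional[dict] = None
        self._fetched_at = 0.0

    def get(self) -> dict:
        now = self._clock()
        stale = now - self._fetched_at >= self._min_interval
        if self._value is None or stale:
            # Only hit the network when there is no value or it has expired.
            self._value = self._fetch()
            self._fetched_at = now
        return self._value
```

Calling `get()` from a dashboard or health check then costs at most one request per interval, however often it is invoked.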
When the limit is exceeded, the response is:
{ "success": false, "error": "Rate limit exceeded" }

Request body size
Every endpoint accepts request bodies up to 1 MB. Real solve requests are well under 1 KB, so you only hit this if something is going wrong upstream.
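If you want to catch an oversized payload before it leaves the client, a pre-flight size check is enough. This sketch assumes "1 MB" means 1,000,000 bytes (the more conservative reading) and that the body is JSON encoded as UTF-8; the names `MAX_BODY_BYTES`, `encoded_size`, and `check_body_size` are hypothetical. The exact byte count your HTTP library sends may differ slightly from this estimate.

```python
import json

# Assumption: "1 MB" read as 10**6 bytes; the server's exact cutoff may differ.
MAX_BODY_BYTES = 1_000_000


def encoded_size(payload: dict) -> int:
    """Size in bytes of the payload serialized as UTF-8 JSON."""
    return len(json.dumps(payload).encode("utf-8"))


def check_body_size(payload: dict, limit: int = MAX_BODY_BYTES) -> int:
    """Return the encoded size, or raise ValueError if it exceeds the limit."""
    size = encoded_size(payload)
    if size > limit:
        raise ValueError(f"request body is {size} bytes, limit is {limit}")
    return size


print(check_body_size({"a": "b"}))  # 10
```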
Don't retry blindly
Retrying 429 in a tight loop just wastes attempts. The bucket fills at a fixed rate — back off and let it refill, or cap your concurrency client-side.
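Because the refill rate is known, a retry delay can be tied to it: one token returns every 60 / max_cpm seconds, so waiting less than that cannot help. The sketch below is one possible policy, not a prescribed one; the function names, the 8-second cap, and the jitter term are assumptions for illustration.

```python
import random
from typing import Callable


def seconds_per_token(max_cpm: int) -> float:
    """Time for the bucket to refill a single token."""
    if max_cpm <= 0:
        raise ValueError("max_cpm must be positive")
    return 60.0 / max_cpm


def backoff_delay(
    attempt: int,
    max_cpm: int,
    cap: float = 8.0,
    jitter: Callable[[], float] = random.random,
) -> float:
    """Delay before retry number `attempt` (0-based) after a 429."""
    floor = seconds_per_token(max_cpm)
    # Start at one refill interval and double per attempt, up to the cap.
    base = min(cap, floor * (2 ** attempt))
    # Add up to one extra refill interval so parallel workers do not retry in lockstep.
    return base + jitter() * floor


# With max_cpm = 600 a token returns every 0.1 s.
print(seconds_per_token(600))  # 0.1
```

Pairing a delay like this with a client-side concurrency cap keeps workers from competing for the same freshly refilled tokens.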
Need more headroom?
Larger CPM ceilings are available on request. Contact support with your typical solve mix and target throughput.