Rate limits & quota

429 responses from AACsearch — diagnose per-key rate limits, monthly quota exhaustion, and the headers that tell you when to retry.

Two distinct conditions both return HTTP 429:

`error` field	Meaning	Resets when
`rate_limit_exceeded`	Per-key request rate exceeded	60-second sliding window
`quota_exceeded`	Monthly search-unit quota for the org exhausted	1st of next calendar month, or on plan upgrade

The fix is different for each. The first is "wait and retry"; the second is "spend more or wait for the reset."
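Because both conditions share a status code, a client has to branch on the `error` field of the body. A minimal sketch of that branching follows; the two `error` values come from the table above, while the function and type names are illustrative and not part of any AACsearch SDK.

```typescript
// Illustrative names only: `classify429` and `shouldRetryAutomatically`
// are hypothetical helpers, not AACsearch SDK functions.
type TooManyRequestsKind = "rate_limit_exceeded" | "quota_exceeded" | "unknown";

interface ErrorBody {
  error?: string;
  message?: string;
}

// Returns null when the response is not a 429 at all.
function classify429(status: number, body: ErrorBody): TooManyRequestsKind | null {
  if (status !== 429) return null;
  if (body.error === "rate_limit_exceeded") return "rate_limit_exceeded";
  if (body.error === "quota_exceeded") return "quota_exceeded";
  return "unknown";
}

// Only the per-key rate limit clears by itself within the 60-second window;
// quota exhaustion needs a plan change or the monthly reset.
function shouldRetryAutomatically(kind: TooManyRequestsKind | null): boolean {
  return kind === "rate_limit_exceeded";
}
```

Treating an unrecognized `error` value as non-retryable is a conservative choice here: it avoids looping on a condition that waiting will not clear.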

Rate limit (per key)

Each API key has a rateLimitPerMinute value (default 60). Requests above that count return:

HTTP/1.1 429 Too Many Requests
Retry-After: 15
X-RateLimit-Limit: 60
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1717201234

{
  "error": "rate_limit_exceeded",
  "message": "Rate limit of 60 req/min exceeded. Retry after 15 seconds."
}

Checks

Inspect the headers. X-RateLimit-Limit is the per-key cap, X-RateLimit-Remaining is what is left in the current window, Retry-After is the seconds to wait.
Identify the key that is hitting the limit. Multiple frontend instances often share one widget key. Open Search → API Keys → the key → Last 24 hours and look for spikes.
Are you doing one-search-per-keystroke? A naive autocomplete can fire 5–10 requests per second per user. With 12 active users that is 3,600 to 7,200 req/min from a single key, far above the default limit of 60.
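For the first check, the headers can be read into a small structure for logging or for deciding how long to wait. This is a sketch under assumptions: `HeaderReader` is a hypothetical minimal interface that the standard `Headers` object happens to satisfy, and `X-RateLimit-Reset` is assumed to be Unix epoch seconds based on the sample value shown earlier.

```typescript
// Minimal interface so the helper works with fetch's Headers or a test stub.
interface HeaderReader {
  get(name: string): string | null;
}

interface RateLimitInfo {
  limit: number | null;          // X-RateLimit-Limit: per-key cap
  remaining: number | null;      // X-RateLimit-Remaining: left in the current window
  resetEpochSec: number | null;  // X-RateLimit-Reset: assumed Unix epoch seconds
  retryAfterSec: number | null;  // Retry-After: seconds to wait
}

// Converts a header value to a number, or null when absent or non-numeric.
function toNumberOrNull(value: string | null): number | null {
  if (value === null || value.trim() === "") return null;
  const n = Number(value);
  return Number.isFinite(n) ? n : null;
}

function readRateLimitHeaders(headers: HeaderReader): RateLimitInfo {
  return {
    limit: toNumberOrNull(headers.get("X-RateLimit-Limit")),
    remaining: toNumberOrNull(headers.get("X-RateLimit-Remaining")),
    resetEpochSec: toNumberOrNull(headers.get("X-RateLimit-Reset")),
    retryAfterSec: toNumberOrNull(headers.get("Retry-After")),
  };
}

// Milliseconds to wait before retrying; falls back when Retry-After is unusable.
function waitMs(info: RateLimitInfo, fallbackSec = 5): number {
  const sec = info.retryAfterSec !== null && info.retryAfterSec >= 0 ? info.retryAfterSec : fallbackSec;
  return sec * 1000;
}
```

With the sample response from this section, `readRateLimitHeaders` yields a limit of 60, 0 remaining, and a 15-second wait.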

Fix

Cause	Fix
Naive autocomplete	Debounce in the client — 200 ms is a good default. Or use `useDeferredValue` (React 18+). See browser SDK autocomplete pattern
Crawler / unintended traffic	Check `X-Forwarded-For` in your edge logs; block bots at the edge before the request reaches AACsearch
Genuine high traffic	Raise the per-key limit in Search → API Keys (also subject to plan quota), or split into multiple keys with separate limits
Burst from a one-off script	Wait `Retry-After` seconds and retry; do not loop without backoff
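The debounce fix for naive autocomplete can be sketched in a few lines. This is a generic trailing-edge debounce, not the browser SDK's own helper; `debounce` and `debouncedSearch` are illustrative names, and the 200 ms default mirrors the value suggested in the table.

```typescript
// Generic trailing-edge debounce: only the last call within `waitMs` runs.
function debounce<A extends unknown[]>(
  fn: (...args: A) => void,
  waitMs = 200,
): (...args: A) => void {
  let timer: ReturnType<typeof setTimeout> | undefined;
  return (...args: A) => {
    // Each new call cancels the pending one and restarts the timer.
    if (timer !== undefined) clearTimeout(timer);
    timer = setTimeout(() => {
      timer = undefined;
      fn(...args);
    }, waitMs);
  };
}

// Usage sketch: wire the debounced function to the input's change handler.
// The callback body is a placeholder for your own search call.
const debouncedSearch = debounce((query: string) => {
  void query; // e.g. pass `query` to your search client here
}, 200);
```

With this in place, a user typing a ten-character query quickly triggers one request once typing pauses, instead of one per keystroke.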

Retry pattern

Honor Retry-After. Do not retry before it elapses: retrying sooner only adds more rejected requests and keeps the key throttled.

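// Assumes `client` and `AacSearchError` come from the AACsearch SDK already set up in your app.
async function searchWithRateLimitRetry(query: string) {
	for (let attempt = 0; attempt < 3; attempt++) {
		try {
			return await client.search({ q: query });
		} catch (err) {
			// Only rate-limit errors are retried; everything else propagates.
			if (err instanceof AacSearchError && err.code === "rate_limit") {
				// Fall back to 5 s when Retry-After is missing or not a number of seconds.
				const header = err.response?.headers.get("Retry-After");
				const parsed = header == null ? NaN : Number(header);
				const retryAfterSec = Number.isFinite(parsed) && parsed >= 0 ? parsed : 5;
				await new Promise((r) => setTimeout(r, retryAfterSec * 1000));
				continue;
			}
			throw err;
		}
	}
	throw new Error("rate-limited after 3 attempts");
}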

Quota exceeded (monthly)

When the org consumes its plan's monthly search-unit allowance:

HTTP/1.1 429 Too Many Requests

{
  "error": "quota_exceeded",
  "message": "Monthly search-unit quota exhausted. Upgrade your plan or wait for reset."
}

One search-unit = 1 search request OR 1 document indexed. See Plans and limits for the per-plan numbers.
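As a quick worked example of that accounting, the sketch below adds searches and indexed documents into a single monthly total and compares it with a quota. It reads the definition as "each search request and each indexed document costs one unit"; the function names and the quota figure in the usage note are illustrative, not real plan values.

```typescript
// Each search request and each indexed document is counted as one search-unit.
function searchUnits(searchRequests: number, documentsIndexed: number): number {
  return searchRequests + documentsIndexed;
}

// Fraction of the monthly quota consumed (1 or more means exhausted).
function quotaFraction(unitsUsed: number, monthlyQuota: number): number {
  if (monthlyQuota <= 0) throw new RangeError("monthlyQuota must be positive");
  return unitsUsed / monthlyQuota;
}
```

For instance, 40,000 searches plus a 10,000-document reindex come to 50,000 units, which would be half of a hypothetical 100,000-unit quota. Large reindexing jobs are therefore worth checking when the quota runs out earlier than search traffic alone would explain.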

Checks

Where are you on the plan? Open Settings → Billing → Usage. The bar shows current consumption vs cap.
Is the cause expected? If you launched a new storefront yesterday, a traffic spike is expected. If you did not, scan recent logs for runaway scripts.
Has the wallet overage budget been enabled? Paid plans can opt in to overage billing — once enabled, requests above the soft cap deduct from the wallet instead of returning 429.

Fix

Plan	Action
Free	Either upgrade to Starter+, or wait for 1st-of-month reset
Starter / Pro / Business	Enable wallet overage in Settings → Billing → Overage; or upgrade plan tier
Enterprise	Contact your account manager for a true-up; quota is contractual, not enforced

After enabling overage, the next request after the soft cap deducts from the wallet (priced per 1k search-units). The hard cap blocks only when the wallet itself is empty.

Difference between soft and hard cap

Cap	Trigger	Behavior
Soft	Plan quota reached	Alerts fire to billing email; requests still served if overage budget enabled
Hard	Plan quota + overage budget both exhausted	429 returned, search stops

Without overage enabled, the soft and hard caps coincide.
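The soft and hard cap rules can be written as a small decision function, which may help when reasoning about what a given org will see. This is a model of the behavior described in the table, not AACsearch's actual enforcement code; all names are hypothetical, the wallet is simplified to a remaining balance in search-units, and Enterprise plans (where quota is not enforced) are not modeled.

```typescript
type Outcome = "served" | "served_from_wallet" | "blocked_429";

interface OrgUsage {
  unitsUsed: number;          // search-units consumed this month
  planQuota: number;          // monthly allowance of the plan
  overageEnabled: boolean;    // wallet overage switched on in billing settings
  walletBalanceUnits: number; // simplification: wallet expressed in search-units
}

function decide(u: OrgUsage): Outcome {
  // Below the soft cap: served from the plan allowance.
  if (u.unitsUsed < u.planQuota) return "served";
  // At or past the soft cap: served only while the wallet can cover overage.
  if (u.overageEnabled && u.walletBalanceUnits > 0) return "served_from_wallet";
  // Hard cap: quota and overage budget both exhausted (or overage disabled).
  return "blocked_429";
}
```

Note how the last branch covers both rows at once: with `overageEnabled` set to false, the request is blocked as soon as the plan quota is reached, which is the case where the two caps coincide.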

Diagnostics packet

Field	Notes
Organization ID	required
Plan tier	from Settings → Billing
Window	UTC start/end of the spike
Affected key prefix	first 12 chars
Sample request response	full response: status line, `X-RateLimit-*` headers, and body
Recent volume	requests per minute over the last hour
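If you collect these fields programmatically, a typed shape keeps the packet consistent. The sketch below is one possible layout; the interface and helper names are illustrative and no AACsearch endpoint is implied. It also shows two of the derived fields: the 12-character key prefix and per-minute request counts for the last hour.

```typescript
interface DiagnosticsPacket {
  organizationId: string;                       // required
  planTier: string;                             // from Settings → Billing
  windowUtc: { start: string; end: string };    // ISO 8601 timestamps in UTC
  affectedKeyPrefix: string;                    // first 12 chars only, never the full key
  sampleResponse: { status: number; headers: Record<string, string>; body: string };
  recentVolumePerMinute: number[];              // 60 entries, oldest first
}

// Keeps only the first 12 characters so the full secret is never shared.
function keyPrefix(apiKey: string): string {
  return apiKey.slice(0, 12);
}

// Buckets request timestamps (ms since epoch) into the last 60 one-minute slots.
// Index 59 is the most recent minute; older or future timestamps are ignored.
function requestsPerMinute(timestampsMs: number[], nowMs: number): number[] {
  const buckets = new Array<number>(60).fill(0);
  for (const t of timestampsMs) {
    const ageMin = Math.floor((nowMs - t) / 60_000);
    if (ageMin >= 0 && ageMin < 60) buckets[59 - ageMin] += 1;
  }
  return buckets;
}
```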

Errors and rate limits — error code matrix
Plans and limits — quota tiers
Billing limits — billing-specific issues
Browser SDK — debounce / useDeferredValue pattern
