Plans & Limits

The canonical AACsearch plan matrix, Search-Unit definition, quota catalog, soft/hard semantics, and the code paths that enforce them.

AACsearch enforces capacity through a single config source, @repo/payments/lib/entitlements. The same data drives the marketing pricing page, the dashboard plan card, the Billing → Plans doc, and the gates that fire on every request. This page is the canonical reference for what the numbers mean and how the gates work.

If you are a customer trying to pick or compare plans, Billing → Plans is the friendlier landing. This page is the engineering deep dive — read it when you need to know exactly what counts, when the gate fires, and what the response shape is.

Plan matrix

Plan	Search units / month	Indexed docs	Indexes	Seats	API keys / index	Connector syncs / mo	Analytics retention	Support
Free	10,000	1,000	1	3	3	30	7 days	Community
Starter	100,000	10,000	3	10	5	300	30 days	Email
Pro	1,000,000	100,000	10	25	10	3,000	90 days	Priority email
Business	5,000,000	1,000,000	50	100	20	30,000	365 days	Dedicated
Enterprise	Custom	Custom	Custom	Custom	Custom	Custom	Custom	SLA 99.95%

These exact numbers live in PLAN_LIMITS in packages/payments/lib/entitlements.ts. The marketing page, the dashboard, and billing/plans.mdx pull from the same constants. If a number needs to change, edit the source; the other surfaces follow on the next release.

Search Units

The headline quota is the Search Unit. The definition is deliberately simple:

One Search Unit = one search request OR one document write.

The two are charged from the same bucket because they have similar infrastructure costs and similar customer load profiles. Mixing them in one cap means customers can flex between query-heavy and ingest-heavy workloads inside a single envelope.

What emits a Search Unit?

Action	Units	Event row
`POST /search` (any filter/sort/pagination)	1	`SearchUsageEvent { type: "search" }`
`POST /multi-search` with N queries	N	One row per query
`POST /search/suggest` / autocomplete	1	`SearchUsageEvent { type: "suggest" }`
Aggregation-only query (`per_page: 0`)	1	Counted as a search
`PUT /indexes/{id}/documents` (single)	1	`SearchUsageEvent { type: "ingest" }`
`POST /indexes/{id}/documents:batch` (N docs)	N	One row per successful line
`DELETE /indexes/{id}/documents/{external_id}`	1	`SearchUsageEvent { type: "ingest_delete" }`
`DELETE …documents:byFilter` matching M docs	M	One row per matched doc
Full reindex (`POST /indexes/{id}/sync`)	M	One row per re-emitted document
Connector delta sync touching M docs	M	One row per doc + 1 connector-sync unit (separate quota)

Failed requests (4xx, 5xx, 429) do not consume units. Successful zero-result searches do consume a unit — they cost the engine the same as any other query.
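The counting rules in the table can be summarized as a small function. This is a minimal sketch, not the metering code: the `MeteredAction` type, its field names, and `searchUnitsFor` are hypothetical names introduced for illustration; only the unit counts come from the table.

```typescript
// Hypothetical action shapes; the unit counts follow the table above.
type MeteredAction =
  | { kind: "search" }                               // POST /search, incl. per_page: 0
  | { kind: "suggest" }                              // POST /search/suggest
  | { kind: "multiSearch"; queries: number }         // one unit per query
  | { kind: "upsert" }                               // single-document PUT
  | { kind: "batchUpsert"; successfulLines: number } // one unit per successful line
  | { kind: "delete" }                               // single-document DELETE
  | { kind: "deleteByFilter"; matchedDocs: number }  // one unit per matched doc
  | { kind: "reindex"; documents: number }           // one unit per re-emitted doc
  | { kind: "connectorDelta"; documents: number };   // plus 1 connector-sync unit, tracked separately

function searchUnitsFor(action: MeteredAction, httpStatus: number): number {
  // Failed requests (4xx, 5xx, including 429) consume nothing.
  if (httpStatus >= 400) return 0;
  switch (action.kind) {
    case "search":
    case "suggest":
    case "upsert":
    case "delete":
      return 1;
    case "multiSearch":
      return action.queries;
    case "batchUpsert":
      return action.successfulLines;
    case "deleteByFilter":
      return action.matchedDocs;
    case "reindex":
    case "connectorDelta":
      return action.documents;
  }
}
```

For example, `searchUnitsFor({ kind: "multiSearch", queries: 3 }, 200)` returns 3, and any action with status 429 returns 0.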

What does NOT emit a Search Unit?

AI calls (Knowledge / RAG, embeddings, rerank, summarize, chat) — these run on the wallet, a separate ledger.
Heartbeats from connector tokens.
Dashboard reads (server-side, not metered).
Admin operations (schema edits, key creation, member invites).

The full quota catalog

There are six quotas in AACsearch. Five roll up into the plan; the sixth (wallet) is independent.

1. Search Units (per month)

Defined above. The counter resets at 00:00:00 UTC on the first day of your billing period. For annual billing, that is the monthly anniversary of the first payment, not the calendar month. See Billing → Quotas for the customer-facing version.

2. Indexed documents (steady-state)

Caps the total document count across all your indexes at any moment. Checked at ingest time, not in a periodic batch. Unlike Search Units, this is a steady-state cap — there is no monthly reset because document count is intrinsically a snapshot, not a flow.

If you reach the cap, new upserts return quota_exceeded. Reads continue. Deletes free space immediately.

A document is one row regardless of field count or length, but very large documents (>1 MB) hit per-document size limits in the search engine. Target <64 KB per doc.

3. Indexes per organization

Hard cap on the number of SearchIndex rows for the org. Hitting this returns index_limit_reached on create. Existing indexes continue to work; deleting one frees a slot immediately.

4. Connector syncs (per month)

Each full or delta sync run by an ss_connector_* token costs one connector-sync unit plus N search units (one per document touched).

Heartbeats are free. This cap is the quota most likely to bind for high-frequency connectors (e.g. five-minute PrestaShop polling): one delta every 5 minutes is 288 syncs/day, or 8,640 syncs in a 30-day month, well over Pro's 3,000 cap.

Decide between a longer polling interval and a higher plan, or move to a webhook-driven push from the source CMS (no polling overhead).
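The polling arithmetic can be checked with a short helper. This is a sketch that assumes a 30-day month; the function names are illustrative and do not come from the codebase.

```typescript
const MINUTES_PER_30_DAY_MONTH = 30 * 24 * 60; // 43,200

// Sync runs consumed per month at a fixed polling interval.
function syncsPerMonth(pollIntervalMinutes: number): number {
  return Math.floor(MINUTES_PER_30_DAY_MONTH / pollIntervalMinutes);
}

// Shortest whole-minute interval that stays within a monthly sync cap.
function minPollIntervalMinutes(monthlySyncCap: number): number {
  return Math.ceil(MINUTES_PER_30_DAY_MONTH / monthlySyncCap);
}

const fiveMinutePolling = syncsPerMonth(5);        // 8640
const proMinInterval = minPollIntervalMinutes(3000); // 15
```

Under these assumptions, a connector on Pro would need to poll no more often than every 15 minutes (2,880 syncs/month) to stay under the 3,000 cap.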

5. Seats

Active members in the org. Owner / admin / member / viewer all count; removed members free their seat immediately; pending invites don't count.

Seat caps live in PLAN_LIMITS.maxSeats. Hitting the cap disables the Invite button with a tooltip explaining the limit.

6. AI usage (wallet, plan-independent)

The wallet is a pre-paid balance in micro-USD (or kopecks) drawn down by AI calls. The plan does not cap AI usage — the wallet balance does. The plan only gates whether AI features are available at all (Pro+ for Knowledge and AI rerank).

Full mechanics: Billing → Wallet & AI credits.

Analytics retention (not a quota, a plan attribute)

The Analytics page's period selector is gated by plan retention — Free / Starter cannot pick 90d / 365d. This is enforced on the analytics aggregation procedures via featureGate("analyticsRetention", "30d") etc.

It is included here for completeness — it is a plan limit even though it is not a counter you can "use up."

Soft caps and hard caps

A soft cap is a threshold that triggers warnings without blocking traffic. A hard cap rejects requests outright. AACsearch combines both:

80% of quota → soft cap fires. Banner in dashboard; advisory header on every API response; one-time email to billing contacts.
100% of quota → hard cap fires. API returns 429 (search_quota_exceeded); ingest blocks; new sync runs queued.
Above 100% → overage. Off by default. When enabled on Pro/Business, requests continue at a metered post-paid rate. See Billing → Quotas → Overage.

Code paths:

// packages/api/modules/entitlements/middleware/quota-check.ts
const result = await checkQuota(orgId, "search");
// result.allowed        — true if under 100%, or if overage is enabled
// result.isSoftCap      — true if at or above 80% and below 100%
// result.isHardCap      — true if at or above 100% (and overage disabled)
// result.percentUsed    — exact percent (for display)
// result.remaining      — units left in the period
// result.overageRateUsdMicrosPerSearch — only present if overage enabled

When isHardCap === true and overage is disabled, the request is rejected:

HTTP/1.1 429 Too Many Requests
Content-Type: application/json

{
  "error": "search_quota_exceeded",
  "detail": "Monthly search quota reached. Upgrade or wait for the period reset.",
  "quota": "search",
  "limit": 1000000,
  "used": 1000000,
  "resetsAt": "2025-11-01T00:00:00Z"
}

When usage is at or above 100% and overage is enabled, the request proceeds; an OverageTransaction row records the unit at the configured per-unit price.
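The threshold logic described in this section can be sketched as a pure function. The field names mirror the `checkQuota` result shown earlier; the function `classifyQuota` itself is hypothetical, and it assumes that exactly 80% already counts as the soft cap and that `isHardCap` is false whenever overage is enabled.

```typescript
interface QuotaResult {
  allowed: boolean;
  isSoftCap: boolean;
  isHardCap: boolean;
  percentUsed: number;
  remaining: number;
}

// Illustrative only: derives the flags from raw counters.
function classifyQuota(used: number, limit: number, overageEnabled: boolean): QuotaResult {
  const percentUsed = (used / limit) * 100;
  const atLimit = used >= limit;
  return {
    allowed: !atLimit || overageEnabled,       // overage lets traffic continue past 100%
    isSoftCap: percentUsed >= 80 && !atLimit,  // warning band: 80% up to, not including, 100%
    isHardCap: atLimit && !overageEnabled,     // reject only when overage is off
    percentUsed,
    remaining: Math.max(0, limit - used),
  };
}
```

With 847,352 of 1,000,000 units used, this yields `isSoftCap: true` and `allowed: true`; at 1,000,000 with overage off it yields `isHardCap: true` and `allowed: false`.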

Why soft + hard, not just one?

Soft-only would let runaway widget traffic bill customers into the ground; we hard-cap by default.
Hard-only would cut customers off without warning, which drives plan churn.
The 80% banner is the customer's chance to upgrade or trim before traffic stops.

Quota reset behavior

Resets happen on:

Quota	Reset trigger	Notes
Search Units	First day of billing period at `00:00:00 UTC`	Counter to 0; overage from prior period billed on next invoice
Connector syncs	Same anchor as Search Units	—
Indexed documents	No reset — steady-state cap, not a flow	Trim or upgrade to free room
Indexes	No reset — steady-state	Delete an index to free a slot
Seats	No reset — steady-state	Remove members to free seats
Wallet spending limit	First day of calendar month	The wallet balance itself does not reset

There is no quota rollover. Unused Search Units do not carry forward.

For annual customers, the reset anchor is the monthly anniversary of the first payment, not calendar months. This avoids the trap where a customer who pays on the 15th finds their counter reset on the 1st (with 14 days of usage "lost").
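A minimal sketch of the anniversary-anchored reset follows. It assumes the anchor is stored as a day of the month and that anchors past the end of a short month are clamped to its last day; neither detail is stated on this page, and `nextResetUtc` is a hypothetical name.

```typescript
// Next reset at 00:00:00 UTC on the anchor day-of-month.
// Assumption: an anchor of 29 to 31 is clamped to the last day of shorter months.
function nextResetUtc(anchorDay: number, now: Date): Date {
  const resetIn = (year: number, month: number): Date => {
    // Day 0 of the following month is the last day of `month`.
    const daysInMonth = new Date(Date.UTC(year, month + 1, 0)).getUTCDate();
    return new Date(Date.UTC(year, month, Math.min(anchorDay, daysInMonth)));
  };
  const thisMonth = resetIn(now.getUTCFullYear(), now.getUTCMonth());
  if (thisMonth.getTime() > now.getTime()) return thisMonth;
  // Date.UTC normalizes month 12 to January of the next year.
  return resetIn(now.getUTCFullYear(), now.getUTCMonth() + 1);
}
```

A customer whose first payment landed on the 15th and who checks on 2025-10-20 gets `2025-11-15T00:00:00.000Z`, not the 1st of November.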

The quota gate sequence

Every search / ingest request runs through this gate stack in packages/api/modules/search/public-handler.ts:

Request arrives
  ↓
Auth gate          — public-auth.ts. Verify API key prefix + hash + scopes.
  ↓
Origin gate        — Verify request Origin matches allowedOrigins (if set).
  ↓
Tenant gate        — Confirm key belongs to the org claimed in the request.
  ↓
Feature gate       — feature-gate.ts → resolveOrgPlan(orgId) → featureGate("synonyms")
  ↓
Quota gate         — quota-check.ts → checkQuota(orgId, "search")
  ↓                    if allowed=false → 429 quota_exceeded (return)
                       if isSoftCap=true → set X-Aacsearch-Quota-Warning header
  ↓
Rate gate          — rate-limit.ts → sliding window on SearchRateLimitBucket
  ↓                    if exceeded → 429 rate_limit_exceeded (return)
  ↓
Search / ingest    — Typesense client call
  ↓
Emit usage event   — SearchUsageEvent row (best-effort; failure does not block response)
  ↓
Response

Failures at any gate are typed JSON errors, never raw upstream messages (Invariant 6). Failure at the auth or tenant gate emits no usage event; failures at feature/quota/rate gates emit a SearchUsageEvent with type: "{gate}_block" so they appear in the Analytics → Failed tab.
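The gate stack and its usage-event rule can be sketched as a simple short-circuiting loop. Everything here is illustrative: `Gate`, `GateOutcome`, `PipelineResult`, and `runGates` are hypothetical names, the gates are injected as plain functions, and the treatment of an origin-gate failure (no usage event) is an assumption because the text only specifies auth, tenant, feature, quota, and rate.

```typescript
type GateName = "auth" | "origin" | "tenant" | "feature" | "quota" | "rate";

type GateOutcome =
  | { ok: true; warning?: string }
  | { ok: false; status: number; error: string };

interface Gate {
  name: GateName;
  check: () => GateOutcome;
}

interface PipelineResult {
  status: number;
  body: { error: string } | { ok: true };
  headers: Record<string, string>;
  usageEventType: string | null; // null means no SearchUsageEvent row is written
}

function runGates(gates: Gate[], requestType: "search" | "ingest"): PipelineResult {
  const headers: Record<string, string> = {};
  for (const gate of gates) {
    const outcome = gate.check();
    if (!outcome.ok) {
      // Feature/quota/rate failures are recorded as "{gate}_block"; auth and tenant are not.
      const emits = gate.name === "feature" || gate.name === "quota" || gate.name === "rate";
      return {
        status: outcome.status,
        body: { error: outcome.error }, // typed error, never a raw upstream message
        headers,
        usageEventType: emits ? `${gate.name}_block` : null,
      };
    }
    // A passing quota gate may still carry the soft-cap advisory.
    if (outcome.warning) headers["X-Aacsearch-Quota-Warning"] = outcome.warning;
  }
  return { status: 200, body: { ok: true }, headers, usageEventType: requestType };
}
```

The first failing gate ends the request, so a quota rejection never reaches the rate gate or the engine.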

Plan resolution

resolveOrgPlan(orgId) in @repo/payments:

Read the latest active Purchase (subscription) for the org.
Map the provider's priceId → planId via packages/payments/config.ts.
Cache the result for 60 seconds in process memory (per-server cache).
invalidatePlanCache(orgId) is called from the provider webhook so changes propagate within seconds.
Fail open — if the provider is unreachable, return Free-plan limits instead of blocking traffic.

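The 60-second cache TTL is the upper bound on plan-change propagation: a server that misses the webhook invalidation still picks up the new plan once its cached entry expires. A customer who upgrades normally sees the new limits as soon as the webhook fires and the next request misses the per-server cache, typically within a few seconds.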
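The caching and fail-open behavior can be sketched as follows. This is not the real `resolveOrgPlan`: the loader is injected and synchronous for brevity (the real lookup would be asynchronous), and the names `resolveOrgPlanSketch`, `invalidatePlanCacheSketch`, and `PlanId` are hypothetical.

```typescript
type PlanId = "free" | "starter" | "pro" | "business" | "enterprise";

const PLAN_CACHE_TTL_MS = 60_000;
// Per-process cache: each server holds its own copy.
const planCache = new Map<string, { planId: PlanId; expiresAt: number }>();

// `loadPlan` stands in for the Purchase lookup plus priceId mapping and may throw.
function resolveOrgPlanSketch(
  orgId: string,
  loadPlan: (orgId: string) => PlanId,
  now: number = Date.now(),
): PlanId {
  const hit = planCache.get(orgId);
  if (hit && hit.expiresAt > now) return hit.planId;
  try {
    const planId = loadPlan(orgId);
    planCache.set(orgId, { planId, expiresAt: now + PLAN_CACHE_TTL_MS });
    return planId;
  } catch {
    // Provider unreachable: serve Free-plan limits rather than block traffic.
    // The fallback is not cached, so the next request retries the lookup.
    return "free";
  }
}

// Called from the provider webhook so the next request reloads the plan.
function invalidatePlanCacheSketch(orgId: string): void {
  planCache.delete(orgId);
}
```

Not caching the fallback is a design choice of this sketch: it keeps a brief provider outage from pinning a paying org to Free limits for a full TTL.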

oRPC entitlements procedure

entitlements.getPlanInfo returns the live plan + usage snapshot:

const planInfo = await orpc.entitlements.getPlanInfo.call({ organizationId });
// {
//   planId: "pro",
//   planName: "Pro",
//   searchUnitsUsed: "847352",     // BigInt as string (Invariant 7)
//   searchUnitsLimit: "1000000",   // BigInt as string
//   percentUsed: 84.7,
//   isSoftCap: true,
//   isHardCap: false,
//   indexCount: 7,
//   indexLimit: 10,
//   seatCount: 18,
//   seatLimit: 25,
//   features: { synonyms: true, curations: true, scopedTokens: true, ... },
//   resetsAt: "2025-11-01T00:00:00.000Z"
// }

Use this in the dashboard and in any custom admin tooling. BigInt fields are transformed to strings over oRPC (Invariant 7) — convert back to BigInt on the client if you need arithmetic.
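Converting the string fields back for arithmetic might look like the sketch below. `PlanUsageWire` is a hypothetical subset of the response shape shown above, and `remainingSearchUnits` is an illustrative helper, not part of the oRPC client.

```typescript
// Subset of the getPlanInfo response; BigInt values arrive as decimal strings.
interface PlanUsageWire {
  searchUnitsUsed: string;
  searchUnitsLimit: string;
}

function remainingSearchUnits(info: PlanUsageWire): bigint {
  const used = BigInt(info.searchUnitsUsed);
  const limit = BigInt(info.searchUnitsLimit);
  // Clamp at zero so overage usage never yields a negative remainder.
  return limit > used ? limit - used : BigInt(0);
}

const remaining = remainingSearchUnits({
  searchUnitsUsed: "847352",
  searchUnitsLimit: "1000000",
});
// remaining.toString() === "152648"
```

Keeping the values as `bigint` avoids precision loss if a counter ever exceeds `Number.MAX_SAFE_INTEGER`; convert to `Number` only for display.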

Rate limiting (separate from quota)

Rate limiting is per API key; quota is per organization. Both can fire on the same request:

Per-key: sliding-window bucket in SearchRateLimitBucket, default 600 req/min (configurable per key, lower or higher).
Per-org: monthly Search-Unit quota.

Hitting the rate limit returns 429 with rate_limit_exceeded + Retry-After header. The bucket is keyed by (keyId, windowStart) with the window rotating every 60 seconds.

Rate-limit failures do not consume Search Units. The quota check happens first; the rate gate is the last filter before the engine.
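A bucket keyed by (keyId, windowStart) with a 60-second rotation can be sketched as below. This is a simplified counter, not the production limiter: an in-memory `Map` stands in for the `SearchRateLimitBucket` table, old buckets are never evicted, and `checkRateLimit` is a hypothetical name.

```typescript
const WINDOW_MS = 60_000;
// Stand-in for SearchRateLimitBucket rows: "keyId:windowStart" -> request count.
const buckets = new Map<string, number>();

function checkRateLimit(
  keyId: string,
  limitPerMinute: number,
  nowMs: number,
): { allowed: boolean; retryAfterSeconds: number } {
  // Align to the start of the current 60-second window.
  const windowStart = Math.floor(nowMs / WINDOW_MS) * WINDOW_MS;
  const bucketKey = `${keyId}:${windowStart}`;
  const count = buckets.get(bucketKey) ?? 0;
  if (count >= limitPerMinute) {
    // Value for the Retry-After header: seconds until the window rotates.
    const retryAfterSeconds = Math.ceil((windowStart + WINDOW_MS - nowMs) / 1000);
    return { allowed: false, retryAfterSeconds };
  }
  buckets.set(bucketKey, count + 1);
  return { allowed: true, retryAfterSeconds: 0 };
}
```

With a limit of 2, the third call inside the same window is rejected with the remaining seconds of that window, and the first call after rotation is allowed again.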

See Search API → Errors and rate limits for headers, response shape, and tuning advice.

Worked billing examples

Small e-commerce shop

5,000 SKUs, one index.
30,000 monthly visitors × 4 searches each = 120,000 Search Units.
Daily inventory sync = 30 connector syncs/month.
No AI features.

Cap check:

Search Units: 120k → over Starter (100k), under Pro (1M) → Pro.
Documents: 5k → fits Starter (10k) and Pro (100k).
Syncs: 30 → fits any paid plan.

Pick Pro ($99/mo at the draft rate). Wallet: zero.
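The cap check above amounts to finding the smallest plan whose counters all fit. The sketch below copies the numbers from the plan matrix on this page; the `PlanCaps` field names and `smallestFittingPlan` are illustrative rather than the real `PLAN_LIMITS` shape, and feature gating (for example Pro+ modules) and seats are not modeled.

```typescript
interface PlanCaps {
  name: string;
  searchUnits: number;
  documents: number;
  indexes: number;
  connectorSyncs: number;
}

// Ordered from smallest to largest; values copied from the plan matrix.
const PLANS: PlanCaps[] = [
  { name: "Free", searchUnits: 10_000, documents: 1_000, indexes: 1, connectorSyncs: 30 },
  { name: "Starter", searchUnits: 100_000, documents: 10_000, indexes: 3, connectorSyncs: 300 },
  { name: "Pro", searchUnits: 1_000_000, documents: 100_000, indexes: 10, connectorSyncs: 3_000 },
  { name: "Business", searchUnits: 5_000_000, documents: 1_000_000, indexes: 50, connectorSyncs: 30_000 },
];

type Workload = Omit<PlanCaps, "name">;

function smallestFittingPlan(w: Workload): string {
  const fit = PLANS.find(
    (p) =>
      w.searchUnits <= p.searchUnits &&
      w.documents <= p.documents &&
      w.indexes <= p.indexes &&
      w.connectorSyncs <= p.connectorSyncs,
  );
  // Anything beyond Business needs custom (Enterprise) limits.
  return fit ? fit.name : "Enterprise";
}

const shopPlan = smallestFittingPlan({
  searchUnits: 120_000,
  documents: 5_000,
  indexes: 1,
  connectorSyncs: 30,
}); // "Pro"
```

The shop lands on Pro because Search Units is the binding constraint; every other counter would fit Starter.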

Knowledge base with AI Q&A

2,000 articles indexed.
50,000 search requests/month (well under Pro 1M).
5,000 Knowledge queries/month × 1,500 tokens average = 7.5M tokens.
Blended rate ≈ $0.005 per 1k tokens → ~$37.50/month wallet draw.

Plan: Pro (Knowledge module is Pro+). Wallet: top up $40 monthly with auto-recharge at $5.
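The wallet estimate can be reproduced in integer micro-USD to avoid floating-point drift. This is a sketch: the function name is hypothetical, and the blended rate of $0.005 per 1k tokens is expressed as 5,000 micro-USD.

```typescript
// Estimated monthly wallet draw in micro-USD (1 USD = 1,000,000 micro-USD).
function monthlyWalletDrawMicroUsd(
  queries: number,
  avgTokensPerQuery: number,
  microUsdPer1kTokens: number,
): number {
  const totalTokens = queries * avgTokensPerQuery;
  return Math.round((totalTokens / 1000) * microUsdPer1kTokens);
}

// 5,000 queries x 1,500 tokens = 7.5M tokens at $0.005 per 1k tokens.
const drawMicroUsd = monthlyWalletDrawMicroUsd(5_000, 1_500, 5_000); // 37500000
const drawUsd = drawMicroUsd / 1_000_000; // 37.5
```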

Marketplace with scoped tokens

8 tenants isolated via scoped tokens (Pro+).
80,000 documents across one shared index.
2M searches/month.

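Plan: Business. Scoped tokens are available from Pro and 80,000 documents fit Pro's 100k cap, but 2M searches/month exceed Pro's 1M Search Units and fit Business's 5M. Pro would only work with overage enabled for the extra 1M units. Wallet: zero unless AI rerank is enabled.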

Black Friday spike on Business

Normal month: 4M searches.
Black Friday week: extra 2M searches → 6M total → 1M over Business cap.
Overage enabled at $0.00008 per Search Unit → 1M × $0.00008 = $80 overage on the invoice.

Cheaper than upgrading to Enterprise for one week; the spending limit (set to $200 to be safe) caps the worst case.
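The overage arithmetic can be sketched in integer micro-USD, where $0.00008 per Search Unit is 80 micro-USD. The function name is hypothetical, and capping the charge at the spending limit is an assumption about how the limit applies; what happens to traffic once the limit is reached is not covered here.

```typescript
// Overage charge in micro-USD, capped by the spending limit (assumed behavior).
function overageChargeMicroUsd(
  used: number,
  limit: number,
  rateMicroUsdPerUnit: number,
  spendingLimitMicroUsd: number,
): number {
  const overUnits = Math.max(0, used - limit);
  return Math.min(overUnits * rateMicroUsdPerUnit, spendingLimitMicroUsd);
}

// Black Friday month: 6M used against Business's 5M, $200 spending limit.
const blackFriday = overageChargeMicroUsd(6_000_000, 5_000_000, 80, 200_000_000); // 80000000, i.e. $80
```

If the month had instead reached 10M units, the uncapped charge would be $400 and the $200 spending limit would bound it.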

Admin overrides

Org admins see plan + usage in /[orgSlug]/settings/billing.
Platform admins at /admin/organizations can inspect any org's entitlements and manually adjust plan assignment, useful for migrations and edge cases.

Manual adjustments emit admin_override_plan audit events (one of the recorded actions in AuditLog). See Security → Audit logs for retention and export.

Where the numbers live in code

Thing	File
Plan matrix (`PLAN_LIMITS`)	`packages/payments/lib/entitlements.ts`
Feature matrix	`packages/payments/lib/entitlements.ts`
Quota middleware	`packages/api/modules/entitlements/middleware/quota-check.ts`
Feature-gate middleware	`packages/api/modules/entitlements/middleware/feature-gate.ts`
Plan resolution	`packages/payments/lib/entitlements.ts` (`resolveOrgPlan`)
Provider price → plan mapping	`packages/payments/lib/provider-price-ids.ts`
Rate-limit bucket model	`packages/database/prisma/schema.prisma` (`SearchRateLimitBucket`)
Usage event model	`packages/database/prisma/schema.prisma` (`SearchUsageEvent`)
Overage transactions	`packages/database/prisma/schema.prisma` (`OverageTransaction`)
Wallet ledger	`packages/database/prisma/schema.prisma` (`WalletLedgerEntry`)

Billing → Plans — customer-facing plan matrix with right-sizing examples.
Billing → Usage units — unit-by-unit reference.
Billing → Quotas — soft/hard cap behavior from the customer angle.
Billing → Wallet & AI credits — pay-as-you-go AI billing.
Search API → Errors and rate limits — HTTP error catalog.
Troubleshooting → Billing limits — what to do when a quota fires unexpectedly.
Security → Audit logs — recorded plan-change events.
