Write path
How a document write enters AACsearch — from POST /v1/indexes/.../documents through SearchIngestBuffer / SearchSyncOutbox and the worker, into the Typesense alias.
AACsearch is DB-first: every write lands in PostgreSQL before it is ever projected to Typesense. The HTTP layer never calls Typesense synchronously from a customer request. This is Hard Invariant #2 — durability, partial-fail handling, and zero-downtime reindex all depend on it.
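As a rough illustration of the DB-first rule, the sketch below shows a request handler that only enqueues rows and then answers 202, without touching the search engine. It is a minimal sketch, not the actual implementation: InMemoryOutbox, OutboxRow and handleBatchWrite are hypothetical names, and the in-memory array stands in for a PostgreSQL transaction.

```typescript
// Hypothetical stand-ins for illustration only; not the real AACsearch API.
type OutboxStatus = "pending" | "done" | "failed";

interface OutboxRow {
  id: number;
  jobId: string;
  indexId: string;
  document: Record<string, unknown>;
  status: OutboxStatus;
}

class InMemoryOutbox {
  private rows: OutboxRow[] = [];
  private nextId = 1;

  // Stand-in for one PostgreSQL transaction that inserts every row of the batch.
  enqueueMany(
    jobId: string,
    indexId: string,
    docs: Record<string, unknown>[],
  ): OutboxRow[] {
    const inserted: OutboxRow[] = docs.map((document) => ({
      id: this.nextId++,
      jobId,
      indexId,
      document,
      status: "pending" as OutboxStatus,
    }));
    this.rows.push(...inserted);
    return inserted;
  }

  // Rows a background worker would still have to project to the search engine.
  pending(): OutboxRow[] {
    return this.rows.filter((r) => r.status === "pending");
  }
}

interface WriteResponse {
  status: number;
  body: { jobId: string; accepted: number };
}

// The handler writes to the outbox and returns; it never calls the search engine.
function handleBatchWrite(
  outbox: InMemoryOutbox,
  indexId: string,
  docs: Record<string, unknown>[],
  jobId: string,
): WriteResponse {
  const rows = outbox.enqueueMany(jobId, indexId, docs);
  return { status: 202, body: { jobId, accepted: rows.length } };
}
```

The point of the shape is that the response depends only on the enqueue succeeding, so a search-engine outage cannot fail a customer write.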
Flow
Description
The diagram shows a synchronous-from-the-client HTTP write being authenticated, rate-limited and quota-checked, then durably enqueued into PostgreSQL (SearchIngestBuffer / SearchSyncOutbox) — the customer receives 202 Accepted immediately. A background worker later claims pending rows, attaches embeddings, and imports the batch into the Typesense alias, reconciling per-row success or failure with exponential backoff.
sequenceDiagram
autonumber
participant Client as Customer / Connector
participant API as packages/api/v1/documents.ts
participant Auth as verifySearchApiKey (scope=ingest)
participant DB as PostgreSQL (SearchIngestBuffer + SearchSyncOutbox)
participant W as Sync worker (sync-worker.ts)
participant Embed as autoEmbedDocuments
participant TS as Typesense alias_name(orgShortId_slug)
Client->>API: POST /v1/indexes/:indexId/documents:batch
API->>Auth: Bearer ss_search_* / ss_connector_*
Auth-->>API: VerifiedSearchKey { organizationId, indexId }
API->>API: rate-limit (per-key, 1m sliding bucket)
API->>API: enforceQuota (plan / overage)
API->>DB: enqueueManySearchIngest() (or SearchSyncOutbox doc_upsert)
API-->>Client: 202 Accepted (jobId)
loop worker tick
W->>DB: claim pending rows (atomic updateMany + lockedBy)
W->>Embed: autoEmbedDocuments(batch)
Embed-->>W: vectors attached
W->>TS: collection.documents().import(batch, action=upsert)
alt all green
W->>DB: markIngestRowsSuccess / outbox.status=done
else partial fail
W->>DB: markIngestRowsFailure + nextRetryAt (exp. backoff)
end
end
What each step guarantees
- Auth (verifySearchApiKey). The token is hashed (sha256) and compared against the SearchApiKey.hash column. Connector keys (ss_connector_*) and search keys (ss_search_*) share hash space; the scope column drives authorization.
- Rate limit. Per-key sliding window from SearchRateLimitBucket. Exceeding rateLimitPerMinute returns 429 before any DB write.
- Quota gate. Plan entitlements and wallet overage are checked once per request; write quota is consumed atomically with the enqueue.
- Durable enqueue. Rows are written to SearchIngestBuffer (legacy path) or SearchSyncOutbox (canonical, idempotent path). The HTTP response is 202: the document does not need to be in Typesense before the response is sent.
- Worker projection. A background process claims rows with lockedBy = WORKER_ID, runs auto-embedding when the index has a vector field, calls collection.documents().import(), and reconciles per-row success / failure.
- Alias targeting. The worker always writes to aliasName(organizationId, slug), which points at the current physical collection version. Reindex swaps the alias atomically; in-flight writes follow the new pointer.
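The first two steps above (key hashing and the per-key sliding window) can be sketched as follows. This is an illustrative sketch under assumptions: hashApiKey and SlidingWindowLimiter are hypothetical names, hex encoding of the digest is assumed (the text only names sha256), and the in-memory map stands in for SearchRateLimitBucket.

```typescript
import { createHash } from "node:crypto";

// Hash a bearer token with sha256. Hex output is an assumption for illustration.
function hashApiKey(token: string): string {
  return createHash("sha256").update(token, "utf8").digest("hex");
}

// Hypothetical in-memory sliding-window limiter keyed by API key hash.
class SlidingWindowLimiter {
  private hits = new Map<string, number[]>();
  private limitPerMinute: number;
  private windowMs: number;

  constructor(limitPerMinute: number, windowMs: number = 60_000) {
    this.limitPerMinute = limitPerMinute;
    this.windowMs = windowMs;
  }

  // Returns true when the request is allowed, false when it should get a 429.
  allow(keyHash: string, nowMs: number): boolean {
    const cutoff = nowMs - this.windowMs;
    // Keep only the timestamps that still fall inside the window.
    const recent = (this.hits.get(keyHash) ?? []).filter((t) => t > cutoff);
    if (recent.length >= this.limitPerMinute) {
      this.hits.set(keyHash, recent);
      return false;
    }
    recent.push(nowMs);
    this.hits.set(keyHash, recent);
    return true;
  }
}
```

Passing the clock in as nowMs keeps the limiter deterministic in tests; a rejected request is not recorded, so it does not extend the caller's own lockout.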
Why DB-first
- Durability. Server restarts or Typesense outages do not lose writes.
- Partial-fail recovery. Only failed rows retry; successful ones are not duplicated.
- Tenant isolation. The worker tags every document with the tenantId field before import; the alias enforces filter_by on read.
- Backpressure. The buffer absorbs bursty CMS full-syncs without overloading Typesense.
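To make the partial-fail recovery point concrete, here is a minimal sketch of a worker tick: claim eligible rows, then reconcile per-row import results so that only failures are rescheduled. Everything here is an assumption for illustration: IngestRow, claimRows, reconcile and backoffMs are hypothetical names, the backoff constants (1 s base, 5 min cap) are invented, and the array loop stands in for an atomic UPDATE in PostgreSQL. The ImportResult shape mirrors the per-document success flag that a Typesense bulk import reports.

```typescript
type RowStatus = "pending" | "processing" | "done";

interface IngestRow {
  id: number;
  status: RowStatus;
  lockedBy: string | null;
  attempts: number;
  nextRetryAt: number; // epoch ms; the row is eligible once now >= nextRetryAt
}

interface ImportResult {
  success: boolean;
  error?: string;
}

// Exponential backoff: base * 2^(attempts - 1), capped. Constants are assumptions.
function backoffMs(attempts: number, baseMs = 1_000, capMs = 300_000): number {
  return Math.min(capMs, baseMs * 2 ** Math.max(0, attempts - 1));
}

// Claim up to `limit` eligible rows for one worker.
// In a real database this would be one atomic statement, not a loop.
function claimRows(
  rows: IngestRow[],
  workerId: string,
  nowMs: number,
  limit: number,
): IngestRow[] {
  const claimed: IngestRow[] = [];
  for (const row of rows) {
    if (claimed.length >= limit) break;
    const eligible =
      row.status === "pending" && row.lockedBy === null && row.nextRetryAt <= nowMs;
    if (eligible) {
      row.status = "processing";
      row.lockedBy = workerId;
      claimed.push(row);
    }
  }
  return claimed;
}

// Reconcile results (same order as `claimed`): successes finish,
// failures return to pending with a later nextRetryAt.
function reconcile(claimed: IngestRow[], results: ImportResult[], nowMs: number): void {
  claimed.forEach((row, i) => {
    row.lockedBy = null;
    if (results[i]?.success) {
      row.status = "done";
    } else {
      row.attempts += 1;
      row.status = "pending";
      row.nextRetryAt = nowMs + backoffMs(row.attempts);
    }
  });
}
```

Because successful rows move to done in the same reconcile pass, a later tick can only pick up the failed rows, and only after their nextRetryAt has passed.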
Related
- Connector lifecycle — how CMS modules feed this same write path.
- Read path — the mirror flow for queries.
- Reindexing & zero-downtime — alias swap mechanics.