Backups and retention

What AACsearch backs up, how long we keep it, and how a restore actually works.

This page is your reference for what gets backed up, where the backups live, how long they're kept, and what a restore does and doesn't do for you. For the engineer-on-call procedure during an actual incident, see the DR recovery runbook.

What we back up

| Component | Method | Frequency | Location |
|---|---|---|---|
| PostgreSQL (primary database) | WAL-G continuous archiving + base backups | Continuous WAL + daily base | Encrypted S3 bucket in your residency region |
| Typesense (search index data) | Snapshot API | Every 6 hours | Encrypted S3 bucket in your residency region |
| Uploaded files (knowledge base, attachments) | Direct S3 with cross-region replication | Real-time | S3 in the region + cross-region replica |
| Application code & configuration | Git (versioned) | Per deploy | GitHub |
| Audit log | Same as PostgreSQL (it's in PG) | Continuous | Same as PG |

PostgreSQL backups are the ground truth. Every other backup either points at PostgreSQL or can be reconstructed from it.

Retention

| Backup type | Retention |
|---|---|
| PostgreSQL WAL | 30 days (rolling) |
| PostgreSQL daily base | 30 days (rolling) |
| Typesense snapshots | 30 days (rolling) |
| Uploaded files | Versioned in S3; previous versions retained for 30 days |
| Audit log rows (in DB) | 365 days (Business+) / 90 days (Starter), then purged |
| Deletion-request residue | Deletion certificate within 30 days of last day of service |

After the retention window, the data is gone. We will not have it for you. Plan exports accordingly — see the audit log export.
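To plan exports against these windows, it can help to compute when a given artifact ages out. The following is a minimal sketch, not an AACsearch API: the dictionary keys and function names are made up for illustration, and treating the expiry instant itself as already expired is an assumption.

```python
from datetime import datetime, timedelta, timezone

# Retention windows in days, copied from the table above.
# The key names are hypothetical labels, not AACsearch identifiers.
RETENTION_DAYS = {
    "postgres_wal": 30,
    "postgres_base": 30,
    "typesense_snapshot": 30,
    "uploaded_file_version": 30,
    "audit_log_business_plus": 365,
    "audit_log_starter": 90,
}


def expires_at(kind: str, created_at: datetime) -> datetime:
    """Moment at which an artifact of this kind leaves its rolling window."""
    return created_at + timedelta(days=RETENTION_DAYS[kind])


def is_retained(kind: str, created_at: datetime, now: datetime) -> bool:
    """True while the artifact is still inside its retention window."""
    return now < expires_at(kind, created_at)
```

For example, a Starter-plan audit row written on 1 January has to be exported before `expires_at("audit_log_starter", created_at)`, 90 days later.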

What restore means

A restore replays WAL up to a target time, then brings the database online at that point. Everything in the database is rolled back together. You cannot restore one organization's data without restoring everyone's data on that database.
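The selection logic behind a point-in-time restore can be sketched as follows. This is generic PostgreSQL point-in-time-recovery reasoning (start from the newest base backup at or before the target, then replay WAL forward), not AACsearch's internal tooling; the function names and the exact rejection rules are assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional, Sequence, Tuple

WAL_RETENTION = timedelta(days=30)  # rolling window from the Retention table


def pick_base_backup(base_backups: Sequence[datetime], target: datetime) -> Optional[datetime]:
    """Newest base backup taken at or before the target time, or None."""
    candidates = [b for b in base_backups if b <= target]
    return max(candidates) if candidates else None


def plan_restore(base_backups: Sequence[datetime], target: datetime,
                 now: datetime) -> Tuple[datetime, timedelta]:
    """Return (base backup to start from, span of WAL to replay) for a restore to `target`.

    The plan covers the whole database: there is no per-organization variant.
    """
    if target > now:
        raise ValueError("target time is in the future")
    if now - target > WAL_RETENTION:
        raise ValueError("target time is older than the 30-day retention window")
    base = pick_base_backup(base_backups, target)
    if base is None:
        raise ValueError("no base backup precedes the target time")
    return base, target - base
```

With daily base backups, the WAL span to replay is at most about one day, which is one reason the base backup frequency matters for recovery time.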

If you only need your data back, see Per-tenant document recovery below — that uses the snapshot path, not a full database restore.

RTO and RPO

  • RTO (target maximum time to recover): 1 hour for the shared cluster.
  • RPO (target maximum data loss): 15 minutes for PostgreSQL, 6 hours for Typesense.

Typesense RPO is higher because Typesense is recovered from the latest snapshot and then reindexed from PostgreSQL. The gap between the last snapshot and the incident is replayed from the database, which has the shorter RPO.
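The arithmetic behind the two RPO figures can be sketched like this. The function and field names are illustrative assumptions; the only values taken from this page are the 15-minute and 6-hour targets.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict

PG_RPO = timedelta(minutes=15)      # PostgreSQL target from this page
TYPESENSE_RPO = timedelta(hours=6)  # matches the 6-hour snapshot interval


def typesense_recovery_windows(last_snapshot: datetime, incident: datetime,
                               pg_recovered_to: datetime) -> Dict[str, timedelta]:
    """Split the time since the last snapshot into what reindex replays and what is lost."""
    if not last_snapshot <= pg_recovered_to <= incident:
        raise ValueError("expected last_snapshot <= pg_recovered_to <= incident")
    return {
        "snapshot_gap": incident - last_snapshot,      # what the snapshot alone misses
        "reindexed": pg_recovered_to - last_snapshot,  # replayed from PostgreSQL
        "lost": incident - pg_recovered_to,            # bounded by the PostgreSQL RPO
    }
```

If the last snapshot was taken at 06:00, the incident hits at 11:00, and PostgreSQL recovers to 10:50, the snapshot alone misses five hours, but reindexing restores all but the last ten minutes.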

Enterprise customers can negotiate tighter RTO/RPO on a dedicated cluster — see Dedicated cluster.

Per-tenant document recovery

If you accidentally deleted documents from your own index and want them back, the path depends on whether you still hold the source of truth:

  1. You have the source elsewhere (e.g. a product DB). Re-emit them through your normal ingest path. The buffer flushes them, the alias keeps serving the existing collection.
  2. You don't have the source. Open a support ticket within 30 days of the deletion, while the relevant Typesense snapshot is still retained. We can replay from that snapshot into a new collection in your project. This is not an atomic rollback: the documents come back with their snapshot-time values, so any updates you made between the snapshot and the deletion are not recovered.

We will not, under any circumstances, do this for another tenant's data on your behalf. The two-person rule applies to recovery operations as much as it does to deploys.
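The "snapshot-time values" caveat in option 2 is easiest to see with a toy model. The sketch below represents a snapshot as a plain dictionary; it illustrates the semantics only and makes no claim about how the real replay is implemented. All names and sample documents are hypothetical.

```python
from typing import Dict, Iterable

Document = Dict[str, object]


def replay_from_snapshot(snapshot: Dict[str, Document],
                         wanted_ids: Iterable[str]) -> Dict[str, Document]:
    """Copy the requested documents out of a snapshot into a new collection."""
    return {doc_id: dict(snapshot[doc_id]) for doc_id in wanted_ids if doc_id in snapshot}


# Snapshot taken at 06:00.
snapshot = {"sku-1": {"title": "Desk lamp", "price": 10}}

# 08:00: sku-1 is updated to price 12 and sku-2 is created.
# 09:00: both documents are deleted by mistake.
recovered = replay_from_snapshot(snapshot, ["sku-1", "sku-2"])

# sku-1 comes back at price 10 (its snapshot-time value);
# sku-2 never made it into a snapshot, so it cannot be replayed.
print(recovered)
```

Documents created and deleted between two snapshots are only recoverable from your own source, which is why option 1 is preferable whenever it is available.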

Where backups live

Backups live in encrypted S3 buckets in the same residency region as the cluster they back up. There is no cross-region backup replication unless you contractually request one. This is the right default — moving backups across regions changes the data-residency picture and almost everyone is surprised by it.

If you have a contractual requirement for off-region backup, request it explicitly through Procurement.

What is not backed up

  • Browser / mobile state. localStorage, in-memory caches, scoped tokens that the browser is holding. These are ephemeral.
  • Your application data. AACsearch backs up the search index and our database. Your e-commerce DB, CRM, or CMS is yours to back up.
  • API keys after revocation. Once revoked, a key cannot be un-revoked even from a restore. This is intentional — restoring a key would be a security regression.
  • In-flight ingest rows past 90 days. The ingest buffer is rolling; once a row is flushed successfully or has been failed for 90 days, it's purged from the buffer (the audit log still reflects the action).
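The ingest buffer rule in the last bullet amounts to a small predicate. This is a sketch of the rule as stated, with hypothetical status names ("flushed", "failed", "pending") and the assumption that the 90 days are counted from the moment the row first failed.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

FAILED_ROW_TTL = timedelta(days=90)  # from the bullet above


def is_purged(status: str, failed_since: Optional[datetime], now: datetime) -> bool:
    """True if a buffer row would no longer be present in the ingest buffer."""
    if status == "flushed":
        # Successfully flushed rows leave the buffer.
        return True
    if status == "failed" and failed_since is not None:
        # Failed rows are kept for 90 days, then dropped.
        return now - failed_since >= FAILED_ROW_TTL
    # Anything else (for example "pending") is still in flight.
    return False
```

In practice this means a failed row that you want to retry has to be re-emitted from your side within that window.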

How to verify backups (your side)

We do not surface backup verification as a self-serve operation. Internally, we exercise restore weekly on a copy of production. You can request the most recent verification report under NDA — see Procurement.

If you want your own verification, the supported path is to schedule a periodic export:

  • Documents → searchIndex.export (CSV or NDJSON of all documents per index)
  • Audit log → auditLog.export (see Audit log export)
  • Synonyms, curations, analytics → via the corresponding admin endpoints

Drop those into your own object storage. This is what we recommend for any compliance program that asks for "customer-controlled backups" — restoring our backups never gives you your data in your own region.
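A periodic export job might look roughly like the sketch below. It assumes you already have the exported documents as an iterable of dictionaries; how you obtain them from searchIndex.export is not shown, and the function names, file naming scheme, and checksum step are illustrative choices rather than anything AACsearch prescribes. The sketch writes to a local directory; uploading the file to your own object storage is left to your tooling.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path
from typing import Iterable, Mapping, Tuple


def write_ndjson_export(docs: Iterable[Mapping[str, object]], out_dir: Path,
                        index_name: str, taken_at: datetime) -> Tuple[Path, str]:
    """Write documents as NDJSON and return (file path, SHA-256 of the file contents)."""
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / f"{index_name}-{taken_at:%Y%m%dT%H%M%SZ}.ndjson"
    digest = hashlib.sha256()
    with path.open("wb") as fh:
        for doc in docs:
            # One JSON object per line; sorted keys keep the output stable across runs.
            line = (json.dumps(doc, sort_keys=True, ensure_ascii=False) + "\n").encode("utf-8")
            fh.write(line)
            digest.update(line)
    return path, digest.hexdigest()


def verify_export(path: Path, expected_sha256: str) -> bool:
    """Re-read the file and confirm it still matches the recorded checksum."""
    return hashlib.sha256(path.read_bytes()).hexdigest() == expected_sha256
```

Storing the checksum next to each export gives you a cheap verification step of your own: a later `verify_export` call confirms the file you would restore from is the one you wrote.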

Common mistakes

  • Treating reindex as restore. Reindex builds from your current source of truth. If your source of truth lost the data, reindex won't bring it back.
  • Assuming we keep backups forever. We don't. 30 days is the rolling window. If you need year-long retention, export.
  • Asking for backup of an organization deleted yesterday. Account deletion is final after 30 days. During those 30 days, contact sales urgently — recovery is possible but requires a contract amendment.
