Index Schema Reference
Complete reference for index schemas — required fields, field types, searchable/facet/sort flags, slug rules, schema validation errors, and product/content catalog examples.
An index schema declares the fields a search index stores, their types, and how each field is searched, filtered, sorted and faceted. The schema is fixed at index-creation time; adding or removing fields later requires a reindex (zero downtime via alias swap).
This page is the reference for the schema you pass to search.createIndex and to the underlying createPhysicalCollection() helper in @repo/search. For a guided 2-minute quickstart, see Create your first index.
Index lifecycle
draft → created → ingesting → searching → reindexing → searching
                                  ↑                        ↓
                                  └────── alias swap ──────┘
- Created: the index row is written to Postgres (SearchIndex model) and a versioned collection (<prefix>_<org>_<slug>_v1) plus an alias (<prefix>_<org>_<slug>) are provisioned in Typesense.
- Ingesting: documents flow into SearchIngestBuffer and are forwarded to Typesense by the background worker. Writes are always DB-first (Invariant 2): public clients never write directly to Typesense.
- Searching: queries always target the alias, never the versioned collection. This makes future reindexes transparent to clients.
- Reindexing: when the schema changes, a new version (_v2, _v3, …) is built in parallel. After the new collection is fully populated, the alias is atomically swapped. The old versioned collection can then be dropped.
All collection / alias names are namespaced per organization, so tenants are isolated at the engine level (Invariant 5).
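The naming scheme can be sketched as a pair of small helpers. This is an illustration only: `aliasName`, `versionedName`, and the sample prefix `search` are hypothetical and are not the actual helpers in @repo/search.

```typescript
// Hypothetical helpers illustrating the <prefix>_<org>_<slug>[_vN] scheme.
// Function names and the sample prefix are assumptions made for this sketch.
interface IndexRef {
  prefix: string; // deployment-wide prefix
  org: string;    // organization identifier
  slug: string;   // index slug
}

// The alias is the name every search query targets.
function aliasName({ prefix, org, slug }: IndexRef): string {
  return `${prefix}_${org}_${slug}`;
}

// Each (re)index builds a physical collection with a version suffix.
function versionedName(ref: IndexRef, version: number): string {
  if (!Number.isInteger(version) || version < 1) {
    throw new Error("version must be a positive integer");
  }
  return `${aliasName(ref)}_v${version}`;
}

const ref: IndexRef = { prefix: "search", org: "org123", slug: "products" };
console.log(aliasName(ref));        // search_org123_products
console.log(versionedName(ref, 2)); // search_org123_products_v2
```

Because the organization is part of both names, two tenants using the same slug never share a collection or an alias.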
Slug rules
Each index has a slug that identifies it inside an organization. The slug must satisfy:
| Rule | Value |
|---|---|
| Length | 1–64 characters |
| Charset | lowercase letters, digits, dashes (a-z0-9-) |
| Start | letter or digit (no leading dash) |
| Regex | ^[a-z0-9][a-z0-9-]*$ |
Valid: products, articles, help-center, catalog-v2.
Invalid: Products (uppercase), -articles (leading dash), news_2024 (underscore), my products (space), 🔥-hot (non-ASCII).
The slug is sanitized again at the collection layer (packages/search/lib/collections.ts sanitize()), but the API-level validation above is what your callers hit first: a slug that fails it returns a BAD_REQUEST error with the message "slug must be lowercase letters, digits, and dashes".
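If you want to reject bad slugs before calling the API, the rules in the table can be checked client-side. The sketch below restates the documented regex and length limit; `isValidSlug` is a hypothetical helper, not the real searchIndexSlugSchema.

```typescript
// Client-side restatement of the slug rules; not the actual Zod schema.
const SLUG_PATTERN = /^[a-z0-9][a-z0-9-]*$/;
const SLUG_MAX_LENGTH = 64;

function isValidSlug(slug: string): boolean {
  return (
    slug.length >= 1 &&
    slug.length <= SLUG_MAX_LENGTH &&
    SLUG_PATTERN.test(slug)
  );
}

const validSlugs = ["products", "articles", "help-center", "catalog-v2"];
const invalidSlugs = ["Products", "-articles", "news_2024", "my products", "🔥-hot", ""];

console.log(validSlugs.every((s) => isValidSlug(s)));  // true
console.log(invalidSlugs.some((s) => isValidSlug(s))); // false
```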
Required fields
You don't declare these — the system injects them:
| Field | Type | Why it's required |
|---|---|---|
| id | string | Typesense document id. Must be unique within the collection. Use a stable external id (SKU, slug, UUID) so re-ingest is idempotent. |
| organization_id | string | Tenant key. Injected by createPhysicalCollection() and AND-combined into every public search call (Invariant 5). You never set it directly. |
If you pass an organization_id field in your schema input, it is filtered out and replaced with the canonical tenant field — there is no way to bypass this.
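The "filter out, then inject" behaviour can be pictured as follows. This is a sketch under assumptions: `withTenantField` is a hypothetical name, and the flags on the injected field (here `facet: true`) are a guess, since the page does not list them.

```typescript
// Illustrative only: models the documented behaviour, not createPhysicalCollection().
interface FieldInput {
  name: string;
  type: string;
  facet?: boolean;
  optional?: boolean;
}

// Assumed shape of the canonical tenant field.
const TENANT_FIELD: FieldInput = { name: "organization_id", type: "string", facet: true };

function withTenantField(userFields: FieldInput[]): FieldInput[] {
  // Drop any caller-supplied organization_id, whatever its type or flags.
  const cleaned = userFields.filter((field) => field.name !== TENANT_FIELD.name);
  // Append the canonical tenant field so it always wins.
  return [...cleaned, { ...TENANT_FIELD }];
}

const result = withTenantField([
  { name: "title", type: "string" },
  { name: "organization_id", type: "int64", optional: true }, // discarded
]);
console.log(result.map((f) => `${f.name}:${f.type}`).join(", "));
// title:string, organization_id:string
```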
Field types
The schema supports the following types (full list in packages/api/modules/search/types.ts searchFieldSchema):
| Type | When to use |
|---|---|
| string | Text values: names, descriptions, identifiers, enum-style strings |
| string[] | Multi-valued strings: tags, categories, brand list |
| string* | Any string (string OR string[]). Less strict; prefer the specific type when you can. |
| int32 | 32-bit signed integer. Counts and scores up to 2³¹ − 1 |
| int64 | 64-bit signed integer. Unix timestamps (use seconds, not ms), large counts, money in minor units |
| float | 64-bit floating point. Ratings, scores, and other non-monetary decimals. Store money as int64 minor units, not float. |
| bool | Boolean |
| int32[], int64[], float[], bool[] | Array variants |
| object | Nested JSON object; enabled because enable_nested_fields: true is set |
| object[] | Array of nested objects |
| auto | Type inferred at first ingest. Avoid for production schemas; use only for exploratory indexes. |
| geopoint | [lat, lng] pair for geo search |
| geopoint[] | Multiple geo points per document |
| geopolygon | GeoJSON polygon for region filters |
| geojson | Arbitrary GeoJSON geometry |
| image | Image field for image search (vector-backed) |
Vector fields use type: "float[]" plus num_dim (and optional hnsw_params, vec_dist). For embeddings see AI Search.
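A vector field declaration might look like the sketch below. The field name `embedding` and the dimension 384 are placeholders, and `hasExpectedDimension` is a hypothetical pre-ingest check rather than part of the API.

```typescript
// A possible vector field declaration. Name and dimension are placeholders;
// use the output dimension of your own embedding model.
interface VectorField {
  name: string;
  type: "float[]";
  num_dim: number;
  vec_dist?: "cosine" | "ip";
}

const embeddingField: VectorField = {
  name: "embedding",
  type: "float[]",
  num_dim: 384,
  vec_dist: "cosine",
};

// Hypothetical check: a document's vector must have exactly num_dim finite entries.
function hasExpectedDimension(field: VectorField, vector: number[]): boolean {
  return vector.length === field.num_dim && vector.every((x) => Number.isFinite(x));
}

const goodVector = Array.from({ length: 384 }, () => 0.1);
console.log(hasExpectedDimension(embeddingField, goodVector)); // true
console.log(hasExpectedDimension(embeddingField, [0.1, 0.2])); // false
```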
Money, dates, booleans
- Money: store as int64 minor units (kopecks, cents). Format in the UI. Invariant 16 forbids decimal/float money in oRPC outputs.
- Dates: store as int64 Unix seconds. created_at, updated_at, published_at are conventional.
- Booleans: explicit true/false. Avoid "yes"/"no" strings.
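The conversions above could be done at ingest time with helpers along these lines. `toMinorUnits` and `toUnixSeconds` are hypothetical names; the money helper parses a decimal string so that no floating-point arithmetic touches the amount.

```typescript
// Hypothetical ingest helpers, not part of the search API.
// Converts a decimal string such as "999.99" to integer minor units.
function toMinorUnits(amount: string, fractionDigits = 2): number {
  const match = /^(-?)(\d+)(?:\.(\d+))?$/.exec(amount.trim());
  if (!match) throw new Error(`not a decimal amount: ${amount}`);
  const [, sign = "", whole = "", fraction = ""] = match;
  if (fraction.length > fractionDigits) {
    throw new Error(`too many fraction digits: ${amount}`);
  }
  // Pad the fraction ("5" -> "50") and join the digits: pure string work.
  const value = Number(whole + fraction.padEnd(fractionDigits, "0"));
  if (!Number.isSafeInteger(value)) throw new Error(`amount too large: ${amount}`);
  return sign === "-" ? -value : value;
}

// Converts a Date to whole Unix seconds (not milliseconds).
function toUnixSeconds(date: Date): number {
  return Math.floor(date.getTime() / 1000);
}

console.log(toMinorUnits("999.99"));                            // 99999
console.log(toMinorUnits("10"));                                // 1000
console.log(toUnixSeconds(new Date("2024-01-01T00:00:00Z")));   // 1704067200
```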
Field flags
Each field accepts a small set of flags. They control how Typesense builds its internal indexes, which directly affects query latency and memory.
| Flag | Default | Effect |
|---|---|---|
| facet | false | Field becomes a filterable + facet-countable field. Required for filterBy: and facetBy: to work on this field. Cheap on strings; expensive on high-cardinality numerics. |
| sort | true for numeric, false for string | When true, the field becomes sortable. Numeric fields are sortable by default. String fields need sort: true explicitly if you want sortBy: "title:asc". |
| optional | false | When true, documents that omit the field are still accepted. Without it, missing fields fail ingest validation. |
| index | true | When false, the field is stored but not indexed. You cannot search, filter or facet on it. Useful for fields you only fetch back in the result. |
| store | true | When false, the field is indexed but not stored. The field will be searchable / filterable but won't appear in the returned document. Saves disk. |
| range_index | false | When true, builds an explicit range index for numeric fields. Speeds up price:[10..100] style filters on large indexes. Costs additional memory. |
| stem | false | When true, applies stemming at index time (English: "running" → "run"). Combine with the matching language stemmer. |
| truncate | false | When true, long values are truncated to fit Typesense's token limit. Use for fields that may exceed length limits but where truncation is acceptable. |
| truncate_len | unset | Per-field truncation length cap (Typesense v30+). 1–16384. |
| num_dim | unset | Required when float[] is used as a vector field. Sets the embedding dimension. |
| hnsw_params | unset | ef_construction and M HNSW tuning knobs for vector fields. Defaults are safe; tune only after you have a benchmark. |
| vec_dist | "cosine" | "cosine" or "ip" (inner product). Pick to match your embedding model. |
| locale | unset | On a string/facet field, treats values as hierarchical paths. Example: "Electronics/Phones/Smartphones" with locale: "/" becomes drill-down faceted. |
Picking flags efficiently
Three rules that catch most mistakes:
- Don't facet what you don't filter. Faceting on a high-cardinality field (e.g. a free-text description) bloats memory without benefit.
- Mark string sort fields explicitly. title is not sortable unless you set sort: true.
- Use optional: true for sparse fields. Sale prices, deprecated flags, locale-specific fields: anything that isn't on every document.
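To see why sparse fields need optional: true, the sketch below lists the non-optional fields a document is missing. It models the documented ingest rule only; `missingRequiredFields` is a hypothetical helper, and treating null like a missing value is an assumption.

```typescript
// Models the rule "missing non-optional fields fail ingest validation".
// Not the actual ingest validator.
interface FieldSpec {
  name: string;
  type: string;
  optional?: boolean;
}

function missingRequiredFields(
  fields: FieldSpec[],
  doc: Record<string, unknown>,
): string[] {
  return fields
    .filter((field) => !field.optional)
    // Assumption: null is treated the same as an absent value.
    .filter((field) => doc[field.name] === undefined || doc[field.name] === null)
    .map((field) => field.name);
}

const productFields: FieldSpec[] = [
  { name: "title", type: "string" },
  { name: "price", type: "int64" },
  { name: "sale_price", type: "int64", optional: true },
];

// sale_price is absent in both documents, but only a missing price is reported.
console.log(missingRequiredFields(productFields, { title: "Headphones", price: 99999 }));
console.log(missingRequiredFields(productFields, { title: "Headphones" }));
```

The first call yields an empty list and the second yields only price, because sale_price is declared optional.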
default_sorting_field
Top-level collection setting (not per-field). Used when a search omits sortBy. Common choices:
- "_text_match": relevance score (most search use cases)
- "popularity_score:desc": popularity-weighted (e-commerce)
- "created_at:desc": newest first (content / news)
The field must be sortable (numeric or sort: true).
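The sortability rule can be checked before creating an index. The sketch below applies the documented defaults (numeric fields sortable unless sort is set, strings only with sort: true). `isValidDefaultSort` is a hypothetical helper, and accepting "_text_match" unconditionally is an assumption based on the list above.

```typescript
// Pre-flight check for default_sorting_field; not part of the API.
interface SortableFieldSpec {
  name: string;
  type: string;
  sort?: boolean;
}

const NUMERIC_TYPES = new Set(["int32", "int64", "float"]);

// An explicit sort flag wins; otherwise numeric types default to sortable.
function isSortable(field: SortableFieldSpec): boolean {
  return field.sort ?? NUMERIC_TYPES.has(field.type);
}

function isValidDefaultSort(
  fields: SortableFieldSpec[],
  defaultSortingField: string,
): boolean {
  // Assumption: the built-in relevance score is always accepted.
  if (defaultSortingField === "_text_match") return true;
  // Strip an optional ":asc" / ":desc" suffix.
  const fieldName = defaultSortingField.split(":")[0];
  const field = fields.find((f) => f.name === fieldName);
  return field !== undefined && isSortable(field);
}

const catalogFields: SortableFieldSpec[] = [
  { name: "title", type: "string" },
  { name: "brand", type: "string", sort: true },
  { name: "popularity_score", type: "int32" },
];

console.log(isValidDefaultSort(catalogFields, "popularity_score:desc")); // true
console.log(isValidDefaultSort(catalogFields, "brand"));                 // true
console.log(isValidDefaultSort(catalogFields, "title"));                 // false
```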
Schema validation errors
The Zod validator (searchFieldSchema and searchIndexSlugSchema in packages/api/modules/search/types.ts) returns structured errors. Common ones you'll see on BAD_REQUEST:
| Error | What went wrong |
|---|---|
| slug must be lowercase letters, digits, and dashes | Slug failed the regex. Lowercase only, no underscores, no leading dash. |
| String must contain at least 1 character(s) at "fields.0.name" | Empty field name. |
| String must contain at most 64 character(s) at "fields.0.name" | Field name longer than 64 chars. |
| Invalid enum value at "fields.0.type" | Field type not in the supported list. Check spelling: "integer" is not valid; use "int32" or "int64". |
| Number must be a positive integer at "fields.0.num_dim" | Vector fields need num_dim > 0. |
| Number must be greater than or equal to 1 at "fields.0.truncate_len" | truncate_len is 1–16384. |
| Typesense error: Field <name> should be set as sortable | Returned at query time (not at schema time) when you sortBy: a string field without sort: true. Fix the schema and reindex. |
| Typesense error: default_sorting_field <f> is not a sortable type | The default sort field isn't sortable. Either pick a numeric field or add sort: true. |
Validation runs before any Typesense call. If you see a Typesense-side error (e.g. malformed hnsw_params), the upstream message is normalised by public-handler.ts before reaching the client — never echoed raw (Invariant 6).
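The schema-time checks in the table can be approximated with a hand-rolled validator, which is handy for failing fast in tests. This is a sketch only: the real validator is the Zod schema in packages/api/modules/search/types.ts, and `validateFields`, its `Issue` shape, and its messages are hypothetical.

```typescript
// Hand-rolled approximation of the documented limits; messages differ from Zod's.
const FIELD_TYPES = new Set([
  "string", "string[]", "string*", "int32", "int32[]", "int64", "int64[]",
  "float", "float[]", "bool", "bool[]", "object", "object[]", "auto",
  "geopoint", "geopoint[]", "geopolygon", "geojson", "image",
]);

interface RawField {
  name: string;
  type: string;
  num_dim?: number;
  truncate_len?: number;
}

interface Issue {
  path: string;
  message: string;
}

function validateFields(fields: RawField[]): Issue[] {
  const issues: Issue[] = [];
  fields.forEach((field, i) => {
    const at = (key: string) => `fields.${i}.${key}`;
    if (field.name.length < 1 || field.name.length > 64) {
      issues.push({ path: at("name"), message: "name must be 1-64 characters" });
    }
    if (!FIELD_TYPES.has(field.type)) {
      issues.push({ path: at("type"), message: `unsupported type "${field.type}"` });
    }
    if (field.num_dim !== undefined && !(Number.isInteger(field.num_dim) && field.num_dim > 0)) {
      issues.push({ path: at("num_dim"), message: "num_dim must be a positive integer" });
    }
    const len = field.truncate_len;
    if (len !== undefined && !(Number.isInteger(len) && len >= 1 && len <= 16384)) {
      issues.push({ path: at("truncate_len"), message: "truncate_len must be 1-16384" });
    }
  });
  return issues;
}

const issues = validateFields([
  { name: "", type: "integer" },                          // empty name, bad type
  { name: "embedding", type: "float[]", num_dim: 0 },     // num_dim not positive
  { name: "body", type: "string", truncate_len: 20000 },  // above the cap
]);
console.log(issues.map((issue) => issue.path).join(", "));
// fields.0.name, fields.0.type, fields.1.num_dim, fields.2.truncate_len
```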
Example: product catalog
A schema tuned for an e-commerce product index. Searchable text fields are weighted via queryBy at query time, not declared in the schema (see Search core relevance).
    import { orpc } from "@shared/lib/orpc-query-utils";

    await orpc.search.createIndex.call({
      organizationId: "org_...",
      slug: "products",
      name: "Product catalog",
      fields: [
        { name: "id", type: "string" },
        { name: "title", type: "string", sort: true },
        { name: "sku", type: "string" },
        { name: "brand", type: "string", facet: true, sort: true },
        { name: "categories", type: "string[]", facet: true },
        { name: "description", type: "string", optional: true },
        { name: "price", type: "int64", facet: true, range_index: true },
        { name: "sale_price", type: "int64", optional: true, facet: true },
        { name: "currency", type: "string", facet: true },
        { name: "availability", type: "string", facet: true },
        { name: "rating", type: "float", facet: true, optional: true },
        { name: "locale", type: "string", facet: true },
        { name: "created_at", type: "int64", sort: true },
        { name: "image_url", type: "string", index: false },
      ],
      defaultSortingField: "_text_match",
    });

Notes:
- price and sale_price are stored as int64 minor units. 999_99 is 999.99 in display currency.
- image_url uses index: false, so it's returned in results but not searchable.
- categories is string[] with facet: true so it works in both filterBy: "categories:=Audio" and facetBy: "categories".
- description is optional: true, so products without a description still ingest.
Example: content catalog
A schema tuned for articles, help-center entries, or a blog index. Sortable strings, full-text body, hierarchical categories, and timestamp sorting.
    await orpc.search.createIndex.call({
      organizationId: "org_...",
      slug: "help-center",
      name: "Help center articles",
      fields: [
        { name: "id", type: "string" },
        { name: "title", type: "string", sort: true },
        { name: "excerpt", type: "string", optional: true },
        { name: "body", type: "string", stem: true },
        { name: "author", type: "string", facet: true, sort: true, optional: true },
        { name: "section", type: "string", facet: true, locale: "/" },
        { name: "tags", type: "string[]", facet: true, optional: true },
        { name: "locale", type: "string", facet: true },
        { name: "reading_time", type: "int32", facet: true, optional: true },
        { name: "published_at", type: "int64", sort: true },
        { name: "updated_at", type: "int64", sort: true },
      ],
      defaultSortingField: "published_at",
    });

Notes:
- body has stem: true, so searching for "installs" matches "installation".
- section uses locale: "/" so values like "Getting Started/First Index/Schema" become drill-down facets at the dashboard.
- Default sort is published_at so a blank query lists newest articles first.
- No description flag-soup: explicit optional on each sparse field.
Changing a schema after creation
Field additions, deletions, and flag changes require a reindex: a new versioned collection is built with the new schema, documents are re-ingested, and the alias is atomically swapped. Old data stays queryable throughout. See Ingest and reindex for the trigger and Reindexing and zero downtime for the underlying mechanism.
The shortcut: never edit the schema in-place. Always go through the reindex path.
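The reindex path can be modelled in memory to show why clients never notice the change. In the sketch below `FakeEngine` and `reindex` are stand-ins invented for illustration; they are not the Typesense client or the @repo/search helpers, and the sample names are assumptions.

```typescript
// In-memory model of "build new version, swap alias, drop old version".
type Doc = { id: string } & Record<string, unknown>;

class FakeEngine {
  collections = new Map<string, Doc[]>();
  aliases = new Map<string, string>();

  // Queries resolve the alias first, so they never name a versioned collection.
  search(alias: string): Doc[] {
    const target = this.aliases.get(alias);
    return target ? (this.collections.get(target) ?? []) : [];
  }
}

function reindex(
  engine: FakeEngine,
  alias: string,
  docs: Doc[],
  migrate: (doc: Doc) => Doc,
): string {
  const current = engine.aliases.get(alias);
  if (!current) throw new Error(`unknown alias: ${alias}`);
  const version = Number(/_v(\d+)$/.exec(current)?.[1] ?? "0");
  const next = `${alias}_v${version + 1}`;
  engine.collections.set(next, docs.map(migrate)); // 1. build and populate the new version
  engine.aliases.set(alias, next);                 // 2. repoint the alias in a single step
  engine.collections.delete(current);              // 3. the old version can now be dropped
  return next;
}

const engine = new FakeEngine();
const sourceDocs: Doc[] = [{ id: "sku-1", price: 99999 }];
engine.collections.set("search_org123_products_v1", sourceDocs);
engine.aliases.set("search_org123_products", "search_org123_products_v1");

// Schema change: every document gains a currency field.
const newCollection = reindex(engine, "search_org123_products", sourceDocs, (doc) => ({
  ...doc,
  currency: "USD",
}));
console.log(newCollection); // search_org123_products_v2
console.log(engine.search("search_org123_products"));
```

Until step 2 runs, searches through the alias keep hitting _v1, which is why old data stays queryable for the whole rebuild.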
Related pages
- Create your first index: 2-minute quickstart
- Ingest and reindex: bulk ingest, schema migrations
- Filters, sorting & pagination: how field flags drive query syntax
- Multi-search and querying: queryBy and field weighting
- Reindexing and zero downtime: alias-swap internals