Index Schema Reference

Complete reference for index schemas — required fields, field types, searchable/facet/sort flags, slug rules, schema validation errors, and product/content catalog examples.

An index schema declares the fields a search index stores, their types, and how each field is searched, filtered, sorted and faceted. The schema is fixed at index-creation time; adding or removing fields later requires a reindex (zero downtime via alias swap).

This page is the reference for the schema you pass to search.createIndex and to the underlying createPhysicalCollection() helper in @repo/search. For a guided 2-minute quickstart, see Create your first index.

Index lifecycle

draft → created → ingesting → searching → reindexing → searching
                                              ↑                ↓
                                              └──── alias swap ┘

Created — the index row is written to Postgres (SearchIndex model) and a versioned collection (<prefix>_<org>_<slug>_v1) plus an alias (<prefix>_<org>_<slug>) are provisioned in Typesense.
Ingesting — documents flow into SearchIngestBuffer and are forwarded to Typesense by the background worker. Writes are always DB-first (Invariant 2): public clients never write directly to Typesense.
Searching — queries always target the alias, never the versioned collection. This keeps future reindexes transparent to clients.
Reindexing — when the schema changes, a new version (_v2, _v3, …) is built in parallel. After the new collection is fully populated, the alias is atomically swapped. The old versioned collection can then be dropped.

All collection / alias names are namespaced per organization, so tenants are isolated at the engine level (Invariant 5).
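The naming scheme above can be sketched as a few string helpers. This is a minimal illustration rather than the code in @repo/search: the function names and the ts / acme values are hypothetical, and the sketch assumes the org and slug segments have already been sanitized.

```typescript
// Hypothetical helpers illustrating the naming scheme; not the @repo/search API.

/** Alias that queries target: <prefix>_<org>_<slug>. */
function aliasName(prefix: string, org: string, slug: string): string {
	return `${prefix}_${org}_${slug}`;
}

/** Versioned physical collection behind the alias: <alias>_v<N>. */
function versionedName(alias: string, version: number): string {
	return `${alias}_v${version}`;
}

/** Name of the collection a reindex would build next (_v1 -> _v2). */
function nextVersionName(current: string): string {
	const match = /^(.*)_v(\d+)$/.exec(current);
	if (!match) {
		throw new Error(`not a versioned collection name: ${current}`);
	}
	return versionedName(match[1], Number(match[2]) + 1);
}

const helpCenterAlias = aliasName("ts", "acme", "help-center");
const helpCenterV1 = versionedName(helpCenterAlias, 1);
console.log(helpCenterAlias, helpCenterV1, nextVersionName(helpCenterV1));
```

Because the organization segment is part of every name, two tenants that pick the same slug still end up with distinct collections and aliases.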

Slug rules

Each index has a slug that identifies it inside an organization. The slug must satisfy:

Rule	Value
Length	1–64 characters
Charset	lowercase letters, digits, dashes (`a-z0-9-`)
Start	letter or digit (no leading dash)
Regex	`^[a-z0-9][a-z0-9-]*$`

Valid: products, articles, help-center, catalog-v2. Invalid: Products (uppercase), -articles (leading dash), news_2024 (underscore), my products (space), 🔥-hot (non-ASCII).

The slug is sanitized again at the collection layer (packages/search/lib/collections.ts sanitize()), but the API-level validation above is what your callers will hit first. A request whose slug fails validation returns a BAD_REQUEST error with the message slug must be lowercase letters, digits, and dashes.
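If you want to reject bad slugs before calling the API, the rules in the table can be checked client-side. The sketch below mirrors the documented regex and length limit; isValidSlug is a hypothetical helper, and the authoritative check remains searchIndexSlugSchema on the server.

```typescript
// Mirrors the documented slug rules; the server-side validator stays authoritative.
const SLUG_PATTERN = /^[a-z0-9][a-z0-9-]*$/;
const SLUG_MAX_LENGTH = 64;

function isValidSlug(slug: string): boolean {
	// The regex covers charset, first character and the 1-character minimum;
	// the upper length bound is checked separately.
	return slug.length <= SLUG_MAX_LENGTH && SLUG_PATTERN.test(slug);
}

const candidates = ["products", "catalog-v2", "Products", "-articles", "news_2024", "my products"];
for (const slug of candidates) {
	console.log(slug, isValidSlug(slug));
}
```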

Required fields

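Every index carries these two fields. You supply id values yourself (the examples below declare the field explicitly); organization_id is injected by the system: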

Field	Type	Why it's required
`id`	`string`	Typesense document id. Must be unique within the collection. Use a stable external id (SKU, slug, UUID) so re-ingest is idempotent.
`organization_id`	`string`	Tenant key. Injected by `createPhysicalCollection()` and AND-combined into every public search call (Invariant 5). You never set it directly.

If you pass an organization_id field in your schema input, it is filtered out and replaced with the canonical tenant field — there is no way to bypass this.
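The filter-and-replace behavior can be pictured as follows. This is a sketch under assumptions: the FieldInput shape, the withTenantField name, and the flags on the tenant field are all hypothetical, since the real definition lives inside createPhysicalCollection().

```typescript
// Illustrative only: field shape and tenant-field flags are assumptions.
interface FieldInput {
	name: string;
	type: string;
	facet?: boolean;
	optional?: boolean;
}

// Assumed canonical definition of the tenant field.
const TENANT_FIELD: FieldInput = { name: "organization_id", type: "string", facet: true };

function withTenantField(fields: FieldInput[]): FieldInput[] {
	// Drop anything the caller supplied under the reserved name,
	// then append the canonical field so it always wins.
	const userFields = fields.filter((field) => field.name !== TENANT_FIELD.name);
	return [...userFields, TENANT_FIELD];
}

const mergedFields = withTenantField([
	{ name: "title", type: "string" },
	{ name: "organization_id", type: "int32", optional: true }, // caller-supplied, discarded
]);
console.log(mergedFields);
```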

Field types

The schema supports the following types (full list in packages/api/modules/search/types.ts searchFieldSchema):

Type	When to use
`string`	Text values — names, descriptions, identifiers, enum-style strings
`string[]`	Multi-valued strings — tags, categories, brand list
`string*`	Any string (string OR string[]). Less strict; prefer the specific type when you can.
`int32`	32-bit signed integer. Counts, scores up to 2³¹ − 1
`int64`	64-bit signed integer. Unix timestamps (use seconds, not ms), large counts
`float`	Floating-point number. Ratings, scores, measurements. Not for money: store money as `int64` minor units.
`bool`	Boolean
`int32[]`, `int64[]`, `float[]`, `bool[]`	Array variants
`object`	Nested JSON object — enabled because `enable_nested_fields: true` is set
`object[]`	Array of nested objects
`auto`	Type inferred at first ingest. Avoid for production schemas — use only for exploratory indexes.
`geopoint`	`[lat, lng]` pair for geo search
`geopoint[]`	Multiple geo points per document
`geopolygon`	GeoJSON polygon for region filters
`geojson`	Arbitrary GeoJSON geometry
`image`	Image field for image search (vector-backed)

Vector fields use type: "float[]" plus num_dim (and optional hnsw_params, vec_dist). For embeddings see AI Search.
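A vector field declaration might look like the sketch below. The VectorField type is inferred from the flags listed on this page rather than copied from searchFieldSchema, and the field name, the dimension, and the HNSW values are illustrative choices.

```typescript
// Field shape inferred from the flags on this page; values are illustrative.
interface VectorField {
	name: string;
	type: "float[]";
	num_dim: number;
	vec_dist?: "cosine" | "ip";
	hnsw_params?: { ef_construction?: number; M?: number };
}

const embeddingField: VectorField = {
	name: "embedding", // hypothetical field name
	type: "float[]",
	num_dim: 384, // must equal the length of the vectors your model emits
	vec_dist: "cosine",
	hnsw_params: { ef_construction: 200, M: 16 }, // example values; tune only with a benchmark
};

/** True when a document's vector matches the declared dimension. */
function hasExpectedDimension(field: VectorField, vector: number[]): boolean {
	return vector.length === field.num_dim;
}

console.log(hasExpectedDimension(embeddingField, Array.from({ length: 384 }, () => 0)));
```

Checking the vector length before ingest is a cheap way to catch a mismatch between the schema and the embedding model.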

Money, dates, booleans

Money — store as int64 minor units (kopecks, cents). Format in the UI. Invariant 16 forbids decimal/float money in oRPC outputs.
Dates — store as int64 Unix seconds. created_at, updated_at, published_at are conventional.
Booleans — explicit true / false. Avoid "yes" / "no" strings.
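The conversions above are easy to get subtly wrong, so here is a minimal sketch. Both function names are hypothetical, and toMinorUnits assumes a currency with exactly two decimal places; it parses a decimal string so that no floating-point multiplication is involved.

```typescript
// Assumes a two-decimal currency; function names are hypothetical.

/** Converts a decimal string such as "999.99" to integer minor units. */
function toMinorUnits(amount: string): number {
	const match = /^(\d+)(?:\.(\d{1,2}))?$/.exec(amount);
	if (!match) {
		throw new Error(`unsupported amount: ${amount}`);
	}
	const whole = Number(match[1]);
	// Right-pad the fraction to two digits: "5" -> "50", "" -> "00".
	const fraction = Number(((match[2] || "") + "00").slice(0, 2));
	return whole * 100 + fraction;
}

/** Converts a Date to whole Unix seconds (not milliseconds). */
function toUnixSeconds(date: Date): number {
	return Math.floor(date.getTime() / 1000);
}

console.log(toMinorUnits("999.99"), toUnixSeconds(new Date("2024-01-01T00:00:00Z")));
```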

Field flags

Each field accepts a small set of flags. They control how Typesense builds its internal indexes, which directly affects query latency and memory.

Flag	Default	Effect
`facet`	`false`	Field becomes a filterable + facet-countable field. Required for `filterBy:` and `facetBy:` to work on this field. Cheap on strings; expensive on high-cardinality numerics.
`sort`	`true` for numeric, `false` for string	When `true`, the field becomes sortable. Numeric fields are sortable by default. String fields need `sort: true` explicitly if you want `sortBy: "title:asc"`.
`optional`	`false`	When `true`, documents that omit the field are still accepted. Without it, missing fields fail ingest validation.
`index`	`true`	When `false`, the field is stored but not indexed. You cannot search, filter or facet on it. Useful for fields you only fetch back in the result.
`store`	`true`	When `false`, the field is indexed but not stored. The field will be searchable / filterable but won't appear in the returned document. Saves disk.
`range_index`	`false`	When `true`, builds an explicit range index for numeric fields. Speeds up `price:[10..100]` style filters on large indexes. Costs additional memory.
`stem`	`false`	When `true`, applies stemming at index time (English: `"running"` → `"run"`). Combine with the matching language stemmer.
`truncate`	`false`	When `true`, long values are truncated to fit Typesense's token limit. Use for fields that may exceed length limits but where truncation is acceptable.
`truncate_len`	unset	Per-field truncation length cap (Typesense v30+). 1–16384.
`num_dim`	unset	Required when `float[]` is used as a vector field. Sets the embedding dimension.
`hnsw_params`	unset	`ef_construction` and `M` HNSW tuning knobs for vector fields. Defaults are safe; tune only after you have a benchmark.
`vec_dist`	`"cosine"`	`"cosine"` or `"ip"` (inner product). Pick to match your embedding model.
`locale`	unset	On a string/facet field, treats values as hierarchical paths. Example: `"Electronics/Phones/Smartphones"` with `locale: "/"` becomes a drill-down facet.

Picking flags efficiently

Three rules that catch most mistakes:

Don't facet what you don't filter. Faceting on a high-cardinality field (e.g. a free-text description) bloats memory without benefit.
Mark string sort fields explicitly. title is not sortable unless you set sort: true.
Use optional: true for sparse fields. Sale prices, deprecated flags, locale-specific fields — anything that isn't on every document.

`default_sorting_field`

Top-level collection setting (not per-field). Used when a search omits sortBy. Common choices:

"_text_match" — relevance score (most search use cases)
"popularity_score:desc" — popularity-weighted (e-commerce)
"created_at:desc" — newest first (content / news)

The field must be sortable (numeric or sort: true).
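The sortability rule can be checked before you submit a schema. The sketch below is hypothetical: it treats int32, int64 and float as the numeric types, and it assumes that _text_match is always allowed and that an optional :asc / :desc suffix may follow the field name.

```typescript
// A sketch of the sortability rule; the special-casing below is an assumption.
interface SortCandidate {
	name: string;
	type: string;
	sort?: boolean;
}

const NUMERIC_TYPES = new Set(["int32", "int64", "float"]);

/** Numeric fields sort by default; other types need an explicit sort: true. */
function isSortable(field: SortCandidate): boolean {
	return field.sort ?? NUMERIC_TYPES.has(field.type);
}

/** Returns an error message, or null when the default sort is acceptable. */
function checkDefaultSort(fields: SortCandidate[], defaultSort: string): string | null {
	// Assumption: "_text_match" is built in and a direction suffix is allowed.
	const name = defaultSort.replace(/:(asc|desc)$/, "");
	if (name === "_text_match") return null;
	const field = fields.find((candidate) => candidate.name === name);
	if (!field) return `unknown field: ${name}`;
	return isSortable(field) ? null : `field is not sortable: ${name}`;
}

const sortFields: SortCandidate[] = [
	{ name: "title", type: "string" },
	{ name: "brand", type: "string", sort: true },
	{ name: "created_at", type: "int64" },
];
console.log(checkDefaultSort(sortFields, "created_at:desc"), checkDefaultSort(sortFields, "title"));
```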

Schema validation errors

The Zod validator (searchFieldSchema and searchIndexSlugSchema in packages/api/modules/search/types.ts) returns structured errors. Common ones you'll see on BAD_REQUEST:

Error	What went wrong
`slug must be lowercase letters, digits, and dashes`	Slug failed the regex. Lowercase only, no underscores, no leading dash.
`String must contain at least 1 character(s) at "fields.0.name"`	Empty field name.
`String must contain at most 64 character(s) at "fields.0.name"`	Field name longer than 64 chars.
`Invalid enum value at "fields.0.type"`	Field type not in the supported list. Check spelling — `"integer"` is not valid; use `"int32"` or `"int64"`.
`Number must be a positive integer at "fields.0.num_dim"`	Vector fields need `num_dim > 0`.
`Number must be greater than or equal to 1 at "fields.0.truncate_len"`	`truncate_len` is 1–16384.
Typesense error: ``Field `<name>` should be set as sortable``	Returned at query time (not at schema time) when you `sortBy:` a string field without `sort: true`. Fix the schema and reindex.
Typesense error: ``default_sorting_field `<f>` is not a sortable type``	The default sort field isn't sortable. Either pick a numeric field or add `sort: true`.

Validation runs before any Typesense call. If you see a Typesense-side error (e.g. malformed hnsw_params), the upstream message is normalised by public-handler.ts before reaching the client — never echoed raw (Invariant 6).
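The path suffix in the messages above points at the offending input: fields.0.name is the name of the first entry in fields. A minimal sketch of that mapping, using a hypothetical ValidationIssue shape rather than the actual validator output:

```typescript
// Hypothetical issue shape and formatter; shows how a path maps to the messages above.
interface ValidationIssue {
	path: Array<string | number>;
	message: string;
}

function formatIssue(issue: ValidationIssue): string {
	return `${issue.message} at "${issue.path.join(".")}"`;
}

/** Index of the field a path refers to, or null if the path is not under fields. */
function fieldIndexOf(issue: ValidationIssue): number | null {
	const [root, index] = issue.path;
	return root === "fields" && typeof index === "number" ? index : null;
}

const emptyName: ValidationIssue = {
	path: ["fields", 0, "name"],
	message: "String must contain at least 1 character(s)",
};
console.log(formatIssue(emptyName), fieldIndexOf(emptyName));
```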

Example: product catalog

A schema tuned for an e-commerce product index. Searchable text fields are weighted via queryBy at query time, not declared in the schema (see Search core relevance).

import { orpc } from "@shared/lib/orpc-query-utils";

await orpc.search.createIndex.call({
	organizationId: "org_...",
	slug: "products",
	name: "Product catalog",
	fields: [
		{ name: "id", type: "string" },
		{ name: "title", type: "string", sort: true },
		{ name: "sku", type: "string" },
		{ name: "brand", type: "string", facet: true, sort: true },
		{ name: "categories", type: "string[]", facet: true },
		{ name: "description", type: "string", optional: true },
		{ name: "price", type: "int64", facet: true, range_index: true },
		{ name: "sale_price", type: "int64", optional: true, facet: true },
		{ name: "currency", type: "string", facet: true },
		{ name: "availability", type: "string", facet: true },
		{ name: "rating", type: "float", facet: true, optional: true },
		{ name: "locale", type: "string", facet: true },
		{ name: "created_at", type: "int64", sort: true },
		{ name: "image_url", type: "string", index: false },
	],
	defaultSortingField: "_text_match",
});

Notes:

price and sale_price are stored as int64 minor units. The literal 999_99 (99999 minor units) is 999.99 in the display currency.
image_url uses index: false — it's returned in results but not searchable.
categories is string[] with facet: true so it works in both filterBy: "categories:=Audio" and facetBy: "categories".
description is optional: true — products without a description still ingest.
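A document that fits this schema could look like the sketch below. The ProductDocument type is derived by hand from the field list above, and every value is made up; optional fields (description, sale_price, rating) are simply left out.

```typescript
// Document shape derived from the product schema above; all values are made up.
interface ProductDocument {
	id: string;
	title: string;
	sku: string;
	brand: string;
	categories: string[];
	description?: string;
	price: number; // int64 minor units
	sale_price?: number; // int64 minor units
	currency: string;
	availability: string;
	rating?: number;
	locale: string;
	created_at: number; // Unix seconds
	image_url: string; // stored but not indexed
}

const headphones: ProductDocument = {
	id: "SKU-1042", // stable external id keeps re-ingest idempotent
	title: "Wireless headphones",
	sku: "SKU-1042",
	brand: "Acme",
	categories: ["Audio", "Headphones"],
	price: 999_99, // 999.99 in display currency
	currency: "USD",
	availability: "in_stock",
	locale: "en",
	created_at: 1704067200,
	image_url: "https://example.com/images/sku-1042.jpg",
};

console.log(headphones.price, headphones.categories.includes("Audio"));
```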

Example: content catalog

A schema tuned for articles, help-center entries, or a blog index. Sortable strings, full-text body, hierarchical categories, and timestamp sorting.

await orpc.search.createIndex.call({
	organizationId: "org_...",
	slug: "help-center",
	name: "Help center articles",
	fields: [
		{ name: "id", type: "string" },
		{ name: "title", type: "string", sort: true },
		{ name: "excerpt", type: "string", optional: true },
		{ name: "body", type: "string", stem: true },
		{ name: "author", type: "string", facet: true, sort: true, optional: true },
		{ name: "section", type: "string", facet: true, locale: "/" },
		{ name: "tags", type: "string[]", facet: true, optional: true },
		{ name: "locale", type: "string", facet: true },
		{ name: "reading_time", type: "int32", facet: true, optional: true },
		{ name: "published_at", type: "int64", sort: true },
		{ name: "updated_at", type: "int64", sort: true },
	],
	defaultSortingField: "published_at",
});

Notes:

body has stem: true — searching for "installs" matches "installation".
section uses locale: "/" so values like "Getting Started/First Index/Schema" become drill-down facets at the dashboard.
Default sort is published_at so a blank query lists newest articles first.
Sparse fields (excerpt, author, tags, reading_time) are each marked optional: true explicitly.
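To picture what drill-down means for a hierarchical value, the sketch below expands one path into the levels a user could click through. drillDownLevels is a hypothetical helper for illustration; it is not a description of how the engine or the dashboard builds facets internally.

```typescript
// Hypothetical helper: expands a hierarchical value into its drill-down levels.
function drillDownLevels(value: string, separator = "/"): string[] {
	const parts = value.split(separator);
	// Level N is the first N segments joined back together.
	return parts.map((_, index) => parts.slice(0, index + 1).join(separator));
}

const sectionLevels = drillDownLevels("Getting Started/First Index/Schema");
console.log(sectionLevels);
```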

Changing a schema after creation

Field additions, deletions, and flag changes require a reindex: a new versioned collection is built with the new schema, documents are re-ingested, and the alias is atomically swapped. Old data stays queryable throughout. See Ingest and reindex for the trigger and Reindexing and zero downtime for the underlying mechanism.

The rule of thumb: never edit the schema in place. Always go through the reindex path.
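The build, swap, drop sequence can be modeled with a toy in-memory engine. Nothing here is the Typesense client or the @repo/search API: ToyEngine and reindex are hypothetical, and the documents are passed in directly to stand in for the re-ingest step.

```typescript
// Toy in-memory model of the reindex path; not the Typesense client or @repo/search.
type Doc = { id: string } & Record<string, unknown>;

class ToyEngine {
	collections = new Map<string, Doc[]>();
	aliases = new Map<string, string>();

	search(alias: string): Doc[] {
		// Queries resolve the alias first, so clients never see version names.
		const target = this.aliases.get(alias);
		return target ? this.collections.get(target) ?? [] : [];
	}
}

function reindex(engine: ToyEngine, alias: string, docs: Doc[]): string {
	const current = engine.aliases.get(alias);
	if (!current) throw new Error(`unknown alias: ${alias}`);
	const version = Number(current.slice(current.lastIndexOf("_v") + 2));
	const next = `${alias}_v${version + 1}`;

	// 1. Build the new version; the alias still points at the old one.
	engine.collections.set(next, docs);
	// 2. Swap the alias in a single step.
	engine.aliases.set(alias, next);
	// 3. Drop the old versioned collection.
	engine.collections.delete(current);
	return next;
}

const toyEngine = new ToyEngine();
toyEngine.collections.set("ts_acme_products_v1", [{ id: "1", title: "Headphones" }]);
toyEngine.aliases.set("ts_acme_products", "ts_acme_products_v1");

const activeCollection = reindex(toyEngine, "ts_acme_products", [
	{ id: "1", title: "Headphones", in_stock: true }, // same document, new schema
]);
console.log(activeCollection, toyEngine.search("ts_acme_products"));
```

Until step 2 runs, searches keep reading the old collection, which is why clients never observe a half-built index.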

Create your first index — 2-minute quickstart
Ingest and reindex — bulk ingest, schema migrations
Filters, sorting & pagination — how field flags drive query syntax
Multi-search and querying — queryBy and field weighting
Reindexing and zero downtime — alias-swap internals
