AI Search Overview
Where AI Search fits in AACSearch OS — semantic search, AI answers, suggestions — and how it compares to keyword search and Knowledge RAG / GraphRAG.
AACSearch OS layers four search surfaces on top of the same indexed catalog:
| Surface | What it does | Where it lives |
|---|---|---|
| Keyword search | Typo-tolerant lexical match across declared fields. Fast, deterministic, no LLM call. | POST /api/search and POST /api/search/multi |
| Semantic search | Vector match using embeddings. Helps when intent ≠ exact keywords ("running shoes" → "sport sneakers"). | POST /api/search with queryByEmbedding (Beta) / Semantic search |
| AI answers | Natural-language summary above the result list, citing the matched documents. | POST /api/search/ai/answer / AI answers |
| Suggestions | Autocomplete / "did-you-mean" / popular-query hints surfaced as the user types. | Suggestions |
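As a rough illustration of how the surfaces in the table map onto endpoints, the sketch below builds a POST request for a chosen surface. Only the paths come from this page; the helper name `buildSearchRequest`, the surface keys, the example host, and the JSON content type are assumptions, and the request body is passed through untouched because its schema is not described here.

```typescript
// Paths are taken from this page; everything else is illustrative.
const SEARCH_ENDPOINTS = {
  keyword: "/api/search",
  multi: "/api/search/multi",
  aiAnswer: "/api/search/ai/answer",
  imageSearch: "/api/search/ai/image",
} as const;

type SearchSurface = keyof typeof SEARCH_ENDPOINTS;

interface BuiltRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

// Hypothetical helper: joins the base URL with the surface's path and
// serialises the caller-supplied body. The body shape is NOT defined here.
function buildSearchRequest(
  baseUrl: string,
  surface: SearchSurface,
  body: unknown,
): BuiltRequest {
  const trimmedBase = baseUrl.replace(/\/+$/, ""); // drop trailing slashes
  return {
    url: trimmedBase + SEARCH_ENDPOINTS[surface],
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" }, // assumed content type
      body: JSON.stringify(body),
    },
  };
}
```

The result can be handed to `fetch(req.url, req.init)`; semantic search would reuse the `keyword` path with `queryByEmbedding` set in the body, per the table above.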
The deeper layer — Q&A over your own internal documents — lives in Knowledge RAG and GraphRAG:
| Surface | What it does | Docs |
|---|---|---|
| Knowledge RAG | Retrieval-augmented Q&A over uploaded files / URLs in a Knowledge space. | Knowledge RAG |
| GraphRAG | Graph-aware retrieval that follows entity/relation paths for multi-document reasoning. | GraphRAG |
Feature status
| Capability | Status |
|---|---|
| Keyword search (typo-tolerance, facets, sort) | ✅ Available |
| AI answer over search results (/api/search/ai/answer) | ✅ Available |
| Image-to-vector search (/api/search/ai/image) | ✅ Available |
| Knowledge RAG (file / URL ingest, ask) | ✅ Available |
| Knowledge RAG streaming (askStream) | ✅ Available |
| GraphRAG entity + relation graph | ✅ Available |
| GraphRAG community detection (Louvain) | ✅ Available |
| GraphRAG drill-down explain (graphragExplain) | ✅ Available |
| Semantic search with custom embedding model | 🟡 Beta — model selection per space |
| Auto-embedding on ingest | 🟡 Beta |
| Connector-driven Knowledge sources (Confluence, Notion, Google Drive) | ⏳ Roadmap |
| Per-tenant fine-tuned model | ⏳ Roadmap (Enterprise) |
Feature flags and paywall tiers are listed in Plans and limits. The same statuses appear on the marketing feature pages; if the two ever disagree, this docs page is the source of truth.
When to use which surface
A short decision guide. None of these is "smarter" than the others — they trade off latency, cost, and accuracy.
| Need | Pick |
|---|---|
| "Show me products matching this query" | Keyword search |
| "Show me products even when the customer phrases it differently" | Semantic + keyword (hybrid) |
| "Above the list, summarise what these 5 products are and answer the question" | AI answer |
| "Answer a question from my support documentation, with citations" | Knowledge RAG |
| "Answer a question that spans multiple documents and concepts" | GraphRAG |
| "Suggest queries as the user types" | Suggestions / multi-search |
When not to use AI answers
AI answers are powerful but not free, not always correct, and not always the right UX. Skip them when:
- The query is a navigation lookup ("login page", "checkout") — answer the query with the link, not a paragraph.
- The answer must be authoritative (legal, medical, pricing commitments). Classic search returns the source; the user reads it.
- Latency matters more than smoothness. AI answers add 500–2000 ms over the underlying search call.
- The catalog has fewer than ~5 documents that match — the model will fabricate context to fill the gap.
- You can't measure or display the citations. Without citations, AI answers are unverifiable and a support liability.
When any of these conditions applies, render keyword results with facets and let the user pick. The AI answer is an additive surface, not a replacement for the result list.
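The skip conditions above can be folded into a single gate that runs before the AI answer call. This is a minimal sketch: the input shape, field names, and reason strings are hypothetical, and the two thresholds simply reuse the figures quoted in the list (up to 2000 ms of added latency, roughly 5 matching documents).

```typescript
// Hypothetical input shape; none of these field names come from the AACSearch OS API.
interface AnswerGateInput {
  isNavigationQuery: boolean;           // e.g. "login page", "checkout"
  requiresAuthoritativeAnswer: boolean; // legal, medical, pricing commitments
  extraLatencyBudgetMs: number;         // added latency the page can tolerate
  matchedDocumentCount: number;         // hits from the underlying search call
  canDisplayCitations: boolean;
}

// Figures quoted in the list above, used here as conservative defaults.
const WORST_CASE_ANSWER_LATENCY_MS = 2000;
const MIN_MATCHED_DOCUMENTS = 5;

// Returns every reason to skip the AI answer; an empty array means "render it".
function aiAnswerSkipReasons(input: AnswerGateInput): string[] {
  const reasons: string[] = [];
  if (input.isNavigationQuery) reasons.push("navigation-lookup");
  if (input.requiresAuthoritativeAnswer) reasons.push("authoritative-answer-required");
  if (input.extraLatencyBudgetMs < WORST_CASE_ANSWER_LATENCY_MS) reasons.push("latency-budget");
  if (input.matchedDocumentCount < MIN_MATCHED_DOCUMENTS) reasons.push("too-few-matches");
  if (!input.canDisplayCitations) reasons.push("no-citations");
  return reasons;
}

function shouldRenderAiAnswer(input: AnswerGateInput): boolean {
  return aiAnswerSkipReasons(input).length === 0;
}
```

Returning the reasons rather than a bare boolean makes it easier to log why an answer was suppressed; the result list is rendered either way.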
Cost shape
Every AI surface is metered through the AI Wallet in BigInt kopecks (Invariant 8). The public AI endpoints follow a reserve → call → commit/release pattern:
- Reserve credits before any paid operation (reserveCreditsForPublicHandler).
- Run the LLM / embedding call.
- On success, commit the actual usage (commitFlatFeeUsage); on cancellation or error, release the reservation.
The per-call rates live in packages/api/modules/entitlements/credit-rates.ts (CREDIT_RATES.ai_answer, CREDIT_RATES.ai_image_search, …). Reservation failures return 402 Payment Required and never bill.
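The reserve → call → commit/release flow can be sketched with an in-memory wallet. This is an illustration under stated assumptions, not the actual implementation: `InMemoryWallet`, `InsufficientCreditsError`, and `runMetered` are hypothetical names, and the real `reserveCreditsForPublicHandler` and `commitFlatFeeUsage` signatures are not shown on this page. Amounts are `bigint` kopecks, and a failed reservation surfaces as a 402 before any paid work runs.

```typescript
type Kopecks = bigint;

// Hypothetical error type carrying the 402 status mentioned above.
class InsufficientCreditsError extends Error {
  readonly status = 402;
  constructor() {
    super("Payment Required");
  }
}

// Hypothetical in-memory stand-in for the AI Wallet.
class InMemoryWallet {
  private reserved: Kopecks = 0n;
  constructor(private balance: Kopecks) {}

  available(): Kopecks {
    return this.balance - this.reserved;
  }
  total(): Kopecks {
    return this.balance;
  }
  reserve(amount: Kopecks): void {
    if (this.available() < amount) throw new InsufficientCreditsError();
    this.reserved += amount; // held, not yet billed
  }
  commit(amount: Kopecks): void {
    this.reserved -= amount;
    this.balance -= amount; // billing happens only here
  }
  release(amount: Kopecks): void {
    this.reserved -= amount; // hold dropped, nothing billed
  }
}

// Reserve first, run the paid call, then commit on success or release on failure.
async function runMetered<T>(
  wallet: InMemoryWallet,
  fee: Kopecks,
  call: () => Promise<T>,
): Promise<T> {
  wallet.reserve(fee); // throws (402) before the LLM / embedding call starts
  let result: T;
  try {
    result = await call();
  } catch (err) {
    wallet.release(fee);
    throw err;
  }
  wallet.commit(fee);
  return result;
}
```

Keeping `commit` outside the `try` block means a failure while committing never triggers a second release of the same hold.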
Privacy and data flow
- AI answers and Knowledge RAG send retrieved snippets (not full documents) to the configured LLM provider.
- The provider is OpenAI by default; per-organization model selection (Beta) lives in KnowledgeSpace.ragConfig.
- Image-to-vector search uses gpt-4o-mini's vision call to describe the input image, then embeds the description; the raw image is not retained beyond the request.
- Tenant isolation (Invariant 5) holds across all AI calls: every retrieval is scoped to a single organizationId (and within Knowledge, to a single knowledgeSpaceId).
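The scoping and snippet rules above can be illustrated with a small filter that runs before anything is sent to the LLM provider. The types, function names, and the 280-character snippet cap are all hypothetical; only the `organizationId` and `knowledgeSpaceId` identifiers come from this page.

```typescript
// Hypothetical shapes for illustration only.
interface RetrievedChunk {
  organizationId: string;
  knowledgeSpaceId?: string;
  text: string;
}

interface RetrievalScope {
  organizationId: string;
  knowledgeSpaceId?: string; // set for Knowledge RAG calls
}

// A chunk is in scope only if it belongs to the same organization and,
// when a Knowledge space is given, to that same space.
function inScope(chunk: RetrievedChunk, scope: RetrievalScope): boolean {
  if (chunk.organizationId !== scope.organizationId) return false;
  if (scope.knowledgeSpaceId !== undefined) {
    return chunk.knowledgeSpaceId === scope.knowledgeSpaceId;
  }
  return true;
}

// Only truncated snippets of in-scope chunks are forwarded, never full documents.
function snippetsForProvider(
  chunks: RetrievedChunk[],
  scope: RetrievalScope,
  maxChars = 280, // assumed cap, not a documented limit
): string[] {
  return chunks
    .filter((chunk) => inScope(chunk, scope))
    .map((chunk) => chunk.text.slice(0, maxChars));
}
```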
For SOC 2 / DPA scope, see Security & Compliance.
Related pages
- AI answers — endpoint, citations, prompt shape, limitations
- Semantic search — embeddings, hybrid mode, model selection
- Suggestions — autocomplete and "did-you-mean"
- Knowledge RAG — Q&A over uploaded documents
- GraphRAG — multi-document reasoning over the entity graph
- Plans and limits — entitlements and quotas