
Knowledge RAG Overview

Retrieval-augmented Q&A over your own documents — uploaded files, URLs, internal knowledge bases. How spaces, sources, ingestion jobs, and the ask endpoint fit together.

The Knowledge module is the second AACSearch surface: instead of searching a catalog of structured documents, it answers natural-language questions over your own unstructured content — PDFs, DOCX files, scraped URLs, internal articles. It is intentionally separate from storefront search; the two surfaces share no indexes and serve different jobs.

| Surface | Use case | Backed by |
| --- | --- | --- |
| Storefront search | Find products in a catalog | SearchIndex + Typesense alias |
| Knowledge RAG | Answer a question with citations from internal documents | KnowledgeSpace + KnowledgeChunk (Postgres + vectors) |
| GraphRAG (extension) | Answer a multi-document / multi-concept question | GraphNode / GraphEdge over the same chunks |

This page is the orientation for the Knowledge module. For ingestion details see Sources; for measuring answer quality see Evaluation.

Status

| Capability | Status |
| --- | --- |
| File upload ingest (PDF / DOCX / TXT / Markdown) | ✅ Available |
| URL ingest (single page) | ✅ Available |
| Ingestion job tracking + retry | ✅ Available |
| Chunking (fixed / semantic / markdown / code strategies) | ✅ Available |
| Embedding (default model) | ✅ Available |
| Embedding model selection per space | 🟡 Beta (KnowledgeSpace.ragConfig.embeddingModel) |
| ask (RAG retrieval + LLM answer, with citations) | ✅ Available |
| askStream (server-sent events) | ✅ Available |
| graphragExplain (graph-aware answer with paths) | ✅ Available (see GraphRAG) |
| Connector-driven sources (Confluence, Notion, GDrive) | ⏳ Roadmap |
| Multi-modal sources (images, audio) | ⏳ Roadmap |
| Per-tenant fine-tuned answer model | ⏳ Roadmap (Enterprise) |

Spaces

A Knowledge space is the unit of isolation. Each space:

  • Has exactly one owner — an organization (organizationId) OR a user (userId). The owner type is enforced by an XOR CHECK at the DB level, mirroring the Purchase model pattern.
  • Holds its own data sources, documents, chunks, embeddings, and graph nodes.
  • Has its own slug, unique per (ownerType, userId|organizationId).
  • Has its own RAG config — model, chunking strategy, top-k retrieval — at KnowledgeSpace.ragConfig (Beta).
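
The XOR ownership rule can also be mirrored in application code. The following is a minimal sketch under stated assumptions, not the actual schema or validation logic: `SpaceOwner`, `OwnerColumns`, and `parseOwner` are hypothetical names, and only `organizationId` and `userId` come from the description above.

```typescript
// Hypothetical sketch: exactly one of organizationId / userId must be set.
type SpaceOwner =
  | { ownerType: "organization"; organizationId: string }
  | { ownerType: "user"; userId: string };

interface OwnerColumns {
  organizationId?: string | null;
  userId?: string | null;
}

// Mirrors an XOR CHECK constraint: rejects rows with both owners or neither.
function parseOwner(row: OwnerColumns): SpaceOwner {
  const hasOrg =
    typeof row.organizationId === "string" && row.organizationId.length > 0;
  const hasUser = typeof row.userId === "string" && row.userId.length > 0;
  if (hasOrg === hasUser) {
    throw new Error(
      "KnowledgeSpace must have exactly one owner (organization XOR user)"
    );
  }
  return hasOrg
    ? { ownerType: "organization", organizationId: row.organizationId as string }
    : { ownerType: "user", userId: row.userId as string };
}
```

Modelling the owner as a discriminated union means downstream code has to handle both cases explicitly, which complements (but does not replace) the database-level check.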

Routes:

| Scope | Path |
| --- | --- |
| Organization | /[orgSlug]/knowledge/[spaceSlug] |
| User (personal) | /knowledge/[spaceSlug] |

The same dashboard UI is rendered for both scopes; the ownerType is inferred from the URL.
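
To illustrate how an owner type could be inferred from the two route shapes, here is a small sketch. The `inferScope` function and `InferredScope` type are hypothetical, and the real router may resolve scopes differently (for example, through framework route params).

```typescript
// Hypothetical sketch: derive the owner scope from a dashboard pathname.
type InferredScope =
  | { ownerType: "organization"; orgSlug: string; spaceSlug: string }
  | { ownerType: "user"; spaceSlug: string };

function inferScope(pathname: string): InferredScope | null {
  const parts = pathname.split("/").filter((p) => p.length > 0);
  const [first = "", second = "", third = ""] = parts;

  // /knowledge/[spaceSlug] -> personal space
  if (parts.length === 2 && first === "knowledge") {
    return { ownerType: "user", spaceSlug: second };
  }
  // /[orgSlug]/knowledge/[spaceSlug] -> organization space
  if (parts.length === 3 && second === "knowledge") {
    return { ownerType: "organization", orgSlug: first, spaceSlug: third };
  }
  return null; // not a Knowledge route
}
```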

Data model

| Model | Purpose |
| --- | --- |
| KnowledgeSpace | Top-level container; XOR owner (user or org). |
| DataSource | Source descriptor (file upload, URL, future connector). Holds sync config and credential ref. |
| IngestionJob | One ingest run. Tracks QUEUED → RUNNING → SUCCEEDED / FAILED, processed/failed counts, error message. |
| KnowledgeDocument | A parsed source (one PDF, one URL, one DOCX). Stores extracted text + checksum. |
| KnowledgeChunk | A retrievable text slice: chunkIndex, text, tokenCount, embedding (Json vector). |
| GraphNode | An entity / concept extracted from chunks for GraphRAG. Carries canonicalName and nodeType. |
| GraphEdge | A relation between two GraphNodes with relationType, weight, and evidenceChunkId. |

All seven are namespaced by knowledgeSpaceId. Every retrieval scopes to a single space — cross-space reads are not allowed (Invariant 5 extended to Knowledge).
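
The IngestionJob lifecycle listed above can be read as a small state machine. The sketch below is illustrative only: `ALLOWED_TRANSITIONS` and `canTransition` are hypothetical names, and the FAILED → QUEUED retry edge is an assumption based on the "retry" capability, not a documented rule.

```typescript
type JobStatus = "QUEUED" | "RUNNING" | "SUCCEEDED" | "FAILED";

// Assumed transition table. The retry edge (FAILED -> QUEUED) is a guess.
const ALLOWED_TRANSITIONS: Record<JobStatus, readonly JobStatus[]> = {
  QUEUED: ["RUNNING"],
  RUNNING: ["SUCCEEDED", "FAILED"],
  SUCCEEDED: [],
  FAILED: ["QUEUED"],
};

// Returns true when moving a job from `from` to `to` is allowed by the table.
function canTransition(from: JobStatus, to: JobStatus): boolean {
  return ALLOWED_TRANSITIONS[from].includes(to);
}
```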

End-to-end flow

DataSource (file or URL)
│
▼
IngestionJob ──► parsers ──► KnowledgeDocument ──► chunker ──► KnowledgeChunk (text + embedding)
                                                                      │
                                                  ┌───────────────────┴─────────┐
                                                  ▼                             ▼
                                          ask / askStream                buildGraphFromChunks
                                          (retrieval → LLM)              (GraphNode + GraphEdge)

  1. DataSource is created (createSource for connectors, ingestFile / ingestUrl for ad-hoc uploads).
  2. IngestionJob is enqueued. The worker parses the source (packages/api/modules/knowledge/lib/parsers.ts), splits into chunks (chunking.ts), embeds each chunk, and writes the rows.
  3. Retrieval at query time: embed the question, fetch the top-k chunks by cosine similarity in retrieval.ts, and pass them as numbered context to the LLM in rag-pipeline.ts.
  4. The answer includes sources — the documents the cited chunks came from.
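
Step 3 can be sketched as a brute-force cosine-similarity ranking followed by numbered context assembly. This is a minimal in-memory illustration, not the contents of retrieval.ts or rag-pipeline.ts: the `Chunk` shape and the function names are assumptions, and a production system may well rank inside the database instead.

```typescript
// Hypothetical, simplified chunk shape for illustration.
interface Chunk {
  id: string;
  text: string;
  embedding: number[];
}

// Cosine similarity of two equal-length vectors; 0 if either has zero norm.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("embedding dimensions differ");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Scores every chunk against the query embedding and keeps the k best.
function topKChunks(query: number[], chunks: Chunk[], k: number): Chunk[] {
  return chunks
    .map((chunk) => ({ chunk, score: cosineSimilarity(query, chunk.embedding) }))
    .sort((left, right) => right.score - left.score)
    .slice(0, k)
    .map((scored) => scored.chunk);
}

// Formats chunks as "[1] ...", "[2] ..." so the LLM can cite them by number.
function buildNumberedContext(chunks: Chunk[]): string {
  return chunks.map((c, i) => `[${i + 1}] ${c.text}`).join("\n");
}
```

Numbering the passages gives the model a stable handle for each chunk, which is what makes it possible to map citations in the answer back to source documents.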

For GraphRAG, step 2 also runs buildGraphFromChunks (LLM-based entity resolution + relation typing). Retrieval at step 3 then has the option of traversing the graph instead of (or in addition to) vector similarity.
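
One way to picture "in addition to vector similarity" is a one-hop expansion from the vector hits through the graph. The sketch below is speculative: `expandViaGraph` is a hypothetical function, and the `sourceNodeId` / `targetNodeId` field names are assumptions (only relationType, weight, and evidenceChunkId appear in the data model above). The actual traversal strategy is not described here.

```typescript
// Simplified edge shape. sourceNodeId / targetNodeId are assumed field names.
interface GraphEdgeLike {
  sourceNodeId: string;
  targetNodeId: string;
  relationType: string;
  weight: number;
  evidenceChunkId: string;
}

// Returns IDs of chunks one hop away from the vector hits:
// 1. collect nodes on edges whose evidence chunk is a hit,
// 2. collect evidence chunks of other edges touching those nodes.
function expandViaGraph(hitChunkIds: string[], edges: GraphEdgeLike[]): string[] {
  const hits = new Set(hitChunkIds);
  const seedNodes = new Set<string>();
  for (const edge of edges) {
    if (hits.has(edge.evidenceChunkId)) {
      seedNodes.add(edge.sourceNodeId);
      seedNodes.add(edge.targetNodeId);
    }
  }
  const extra = new Set<string>();
  for (const edge of edges) {
    const touchesSeed =
      seedNodes.has(edge.sourceNodeId) || seedNodes.has(edge.targetNodeId);
    if (touchesSeed && !hits.has(edge.evidenceChunkId)) {
      extra.add(edge.evidenceChunkId);
    }
  }
  return Array.from(extra);
}
```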

The ask endpoint

const result = await orpc.knowledge.ask.call({
  spaceId: "ks_…",
  question: "How do I reset my password?",
  topK: 5,            // default 5
  includeGraph: false // true to mix in GraphRAG hits
});

Returns { answer, sources: KnowledgeDocument[], chunks: KnowledgeChunk[] }. Cite sources in the UI; show chunks[].text snippets if you want a "view passages" affordance.
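
As a rough illustration of the "cite sources, optionally show passages" pattern, the sketch below renders a result as plain text. The field names `title` and `documentId` are assumptions made for the example; check the real KnowledgeDocument and KnowledgeChunk types before relying on them.

```typescript
// Simplified result shapes. `title` and `documentId` are assumed fields.
interface SourceDoc {
  id: string;
  title: string;
}
interface Passage {
  documentId: string;
  chunkIndex: number;
  text: string;
}
interface AskResult {
  answer: string;
  sources: SourceDoc[];
  chunks: Passage[];
}

// Renders the answer, then each source with its supporting passages.
function renderAnswer(result: AskResult, snippetLength = 120): string {
  const lines: string[] = [result.answer, "", "Sources:"];
  result.sources.forEach((doc, i) => {
    lines.push(`${i + 1}. ${doc.title}`);
    for (const passage of result.chunks) {
      if (passage.documentId !== doc.id) continue;
      const snippet =
        passage.text.length > snippetLength
          ? `${passage.text.slice(0, snippetLength)}...`
          : passage.text;
      lines.push(`   - (chunk ${passage.chunkIndex}) ${snippet}`);
    }
  });
  return lines.join("\n");
}
```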

For streaming, use askStream. It delivers the same payload as server-sent events, so you can render the answer incrementally as it arrives.
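
If you consume the stream by hand rather than through a client helper, the server-sent events framing has to be parsed first. The sketch below handles the standard SSE wire format (events separated by a blank line, payload on `data:` lines). The JSON event shape `{ "delta": "..." }` is purely an assumption for illustration; the actual askStream event schema is not specified here.

```typescript
// Splits a raw SSE buffer into complete event payloads and returns the
// trailing partial event so it can be prepended to the next network read.
function parseSse(buffer: string): { events: string[]; rest: string } {
  const blocks = buffer.replace(/\r\n/g, "\n").split("\n\n");
  const rest = blocks.pop() ?? "";
  const events: string[] = [];
  for (const block of blocks) {
    const data = block
      .split("\n")
      .filter((line) => line.startsWith("data:"))
      .map((line) => line.slice(5).replace(/^ /, ""))
      .join("\n");
    if (data.length > 0) events.push(data);
  }
  return { events, rest };
}

// Appends text deltas to the answer so far. ASSUMPTION: each event is JSON
// of the form { "delta": string }; other events are ignored.
function applyEvents(answer: string, events: string[]): string {
  let out = answer;
  for (const raw of events) {
    const parsed: unknown = JSON.parse(raw);
    if (typeof parsed === "object" && parsed !== null && "delta" in parsed) {
      const delta = (parsed as { delta: unknown }).delta;
      if (typeof delta === "string") out += delta;
    }
  }
  return out;
}
```

Keeping `rest` between reads matters because a network chunk can end in the middle of an event.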

Privacy and tenant isolation

  • Each ask call is scoped to exactly one KnowledgeSpace.
  • The org / user check is done by requireKnowledgeSpaceAccess (packages/api/modules/knowledge/lib/access.ts).
  • Retrieval never crosses spaces, even within the same organization.
  • Chunks are sent to the configured LLM provider (OpenAI by default). Per-space model selection is Beta.
  • Source files are stored in object storage; chunks + embeddings live in Postgres alongside the rest of the schema.
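
The isolation rules above boil down to two checks: may this caller use the space, and does every row belong to that space. The sketch below is a hypothetical illustration of that idea and is not the code in access.ts; `SpaceRecord`, `Actor`, `hasSpaceAccess`, and `scopeToSpace` are invented names.

```typescript
// Hypothetical shapes for illustration only.
interface SpaceRecord {
  id: string;
  organizationId: string | null;
  userId: string | null;
}
interface Actor {
  userId: string;
  organizationIds: string[]; // organizations the caller is a member of
}

// Personal spaces require the same user; org spaces require membership.
function hasSpaceAccess(space: SpaceRecord, actor: Actor): boolean {
  if (space.userId !== null) return space.userId === actor.userId;
  if (space.organizationId !== null) {
    return actor.organizationIds.includes(space.organizationId);
  }
  return false;
}

// Drops any row that is not namespaced to the requested space.
function scopeToSpace<T extends { knowledgeSpaceId: string }>(
  rows: T[],
  spaceId: string
): T[] {
  return rows.filter((row) => row.knowledgeSpaceId === spaceId);
}
```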

For DPA / SOC 2 scope, see Security & Compliance.

When to use Knowledge RAG

  • Internal Q&A — onboarding docs, runbooks, policies.
  • Product knowledge base — FAQs, support articles, integration guides.
  • Compliance and audit — questions where the answer must cite a specific document.

When not to use it

  • Product catalog Q&A — use AI answers on top of storefront search instead. Knowledge RAG doesn't see your SearchIndex documents.
  • Live data — Knowledge sources are snapshots taken at ingest time. For freshness, schedule recurring ingestion jobs or wait for the connectors roadmap.
  • Numerical computation — RAG is a retrieval layer, not a calculator. Don't ask it for live inventory counts.
  • Adversarial or PII-heavy content — see Security docs before ingesting; per-tenant model isolation is on the roadmap, not shipped today.

Cost shape

Every ask, askStream, and ingest job is metered through the AI Wallet (Invariant 8). Each call follows a reserve-then-commit pattern: credits are reserved before the call and committed on success, and a failed call releases its reservation. Rates live in packages/api/modules/entitlements/credit-rates.ts.
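
The reserve-then-commit flow can be pictured with a toy in-memory wallet. This is a sketch of the general pattern only: `CreditWallet` and `metered` are hypothetical names, the "charge actual usage, refund the remainder" behaviour on commit is an assumption, and the real AI Wallet is presumably persisted and concurrency-safe.

```typescript
// Toy in-memory wallet illustrating reserve / commit / release.
class CreditWallet {
  private balance: number;
  private reserved = new Map<string, number>();
  private nextId = 1;

  constructor(initialBalance: number) {
    this.balance = initialBalance;
  }

  available(): number {
    return this.balance;
  }

  // Holds `amount` credits and returns a reservation ID.
  reserve(amount: number): string {
    if (amount > this.balance) throw new Error("insufficient credits");
    this.balance -= amount;
    const id = `res_${this.nextId++}`;
    this.reserved.set(id, amount);
    return id;
  }

  // Charges actual usage (capped at the held amount), refunds the rest.
  commit(id: string, actualCost: number): void {
    const held = this.take(id);
    this.balance += held - Math.min(actualCost, held);
  }

  // Returns the full held amount, e.g. after a failed call.
  release(id: string): void {
    this.balance += this.take(id);
  }

  private take(id: string): number {
    const held = this.reserved.get(id);
    if (held === undefined) throw new Error(`unknown reservation ${id}`);
    this.reserved.delete(id);
    return held;
  }
}

// Wraps a metered call: reserve first, commit on success, release on failure.
async function metered<T>(
  wallet: CreditWallet,
  estimate: number,
  call: () => Promise<{ value: T; cost: number }>
): Promise<T> {
  const id = wallet.reserve(estimate);
  try {
    const { value, cost } = await call();
    wallet.commit(id, cost);
    return value;
  } catch (err) {
    wallet.release(id);
    throw err;
  }
}
```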

Watch the Activity tab for ai_overage_reached events when running large bulk ingests — the per-document embedding cost dominates ingest spend.
