Import Jobs

Bulk import history, JSONL payload shape, error handling, and how import jobs relate to the ingest buffer and quota.

The Import Jobs page at /[orgSlug]/import-jobs lists every bulk import operation across all indexes in your organization. Use it to track in-flight imports, debug failures, and audit data movement.

Table showing import job history, status, line counts, and duration

When import jobs are created

An import job is created any time you bulk-load documents into an index outside the normal Connector API delta path. Sources include:

Indexes tab → row actions → Import Documents — JSONL upload from the dashboard.
v1 REST POST /api/v1/indexes/{indexId}/documents:batch — server-side batch ingest.
Connector full sync — when a CMS module triggers a fresh full sync (incremental syncs do not create an import job; they go through the ingest buffer).

Single-document writes through the public ingest endpoint are not import jobs — they flow through SearchIngestBuffer and the worker.

JSONL payload format

The dashboard import accepts a JSONL file (one JSON object per line). Each object is a document in the shape your index schema expects.

{"external_id":"sku-1","title":"Linen shirt","price":4900,"in_stock":true}
{"external_id":"sku-2","title":"Cotton tee","price":1900,"in_stock":false}
{"external_id":"sku-3","title":"Wool jumper","price":7900,"in_stock":true}

Rules:

One JSON object per line. Blank lines are ignored.
Each line must include external_id (string). This is the stable upsert key.
All other fields must match the index schema. Unknown fields are rejected line-by-line.
Maximum file size: 100 MB. Split larger imports into multiple files.
Maximum lines per file: 1,000,000. Lines beyond the cap are not processed.
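
The per-line rules above can be checked locally before an upload. The following is a minimal Python sketch, not the service's own validator: `preflight` is a hypothetical helper, it borrows the error-code names from this page only as labels, and it does not check the index schema, the 100 MB size limit, or the 1,000,000-line cap.

```python
import json


def preflight(jsonl_text: str) -> list[tuple[int, str]]:
    """Return (line_number, problem) pairs found in a JSONL payload."""
    problems: list[tuple[int, str]] = []
    first_seen: dict[str, int] = {}  # external_id -> first line that used it
    # Split on "\n" only, so separators that are legal inside JSON strings
    # (for example U+2028) do not break a document in two.
    for number, line in enumerate(jsonl_text.split("\n"), start=1):
        if not line.strip():
            continue  # blank lines are ignored
        try:
            doc = json.loads(line)
        except json.JSONDecodeError:
            problems.append((number, "invalid_json"))
            continue
        if not isinstance(doc, dict):
            # Local label, not a documented error code.
            problems.append((number, "not_an_object"))
            continue
        external_id = doc.get("external_id")
        if not isinstance(external_id, str):
            # Assumption: a non-string value is treated like a missing key.
            problems.append((number, "missing_external_id"))
            continue
        if external_id in first_seen:
            problems.append((number, "duplicate_external_id"))
        else:
            first_seen[external_id] = number
    return problems
```

An empty result means the file passed these local checks only; schema validation still happens on the server.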

Job lifecycle and statuses

Status	Meaning
`pending`	Queued, worker has not picked it up yet
`running`	Worker is processing batches
`completed`	All lines processed (some may have failed — see failure count)
`failed`	The job aborted before completion (e.g. invalid file, quota blocked)
`canceled`	Operator canceled the job from the row actions menu
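
Because `completed` only means that every line was processed, a script that gates on an import should look at the failure count as well. A small Python sketch of that check follows; the `JobStatus` enum and both helper functions are illustrative names, and treating `completed`, `failed`, and `canceled` as end states is an inference from the table above rather than a documented API.

```python
from enum import Enum


class JobStatus(str, Enum):
    PENDING = "pending"
    RUNNING = "running"
    COMPLETED = "completed"
    FAILED = "failed"
    CANCELED = "canceled"


# Assumption: these three statuses never change again once reached.
END_STATES = {JobStatus.COMPLETED, JobStatus.FAILED, JobStatus.CANCELED}


def is_finished(status: JobStatus) -> bool:
    """True once the worker will do no further work on the job."""
    return status in END_STATES


def is_clean_success(status: JobStatus, failure_count: int) -> bool:
    """True only when the job completed and no line failed."""
    return status is JobStatus.COMPLETED and failure_count == 0
```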

Each row in the list shows:

Job ID and source (UI / API / connector)
Index name and slug
Status with a colored badge
Total lines processed
Failure count (with View errors action)
Started at / finished at / duration
Initiator (user or connector token name)

Error handling

Failures are recorded per line, not per job. A job with 1,000 lines and 7 invalid rows shows completed status with 7 in the Failures column. Click View errors to see the first 100 failure rows with:

Line number
Raw line content (truncated to 200 characters)
Error code (validation_error, schema_mismatch, duplicate_external_id, etc.)
Human-readable detail

Failures beyond the first 100 are counted but their details are not stored, which keeps error payloads bounded.
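
The bounded-storage behavior can be pictured with a short Python sketch. `FailureLog` is a hypothetical container written for illustration, not the worker's actual data structure; only the two limits (100 stored rows, 200 characters of raw content) come from this page.

```python
from dataclasses import dataclass, field

MAX_STORED_FAILURES = 100  # only the first 100 failures keep their detail
MAX_RAW_CHARS = 200        # raw line content is truncated to 200 characters


@dataclass
class FailureLog:
    """Counts every failure but stores a bounded sample of them."""
    count: int = 0
    rows: list = field(default_factory=list)

    def record(self, line_number: int, raw: str, code: str, detail: str) -> None:
        self.count += 1  # the Failures column reflects every failed line
        if len(self.rows) < MAX_STORED_FAILURES:
            self.rows.append({
                "line": line_number,
                "raw": raw[:MAX_RAW_CHARS],
                "code": code,
                "detail": detail,
            })
```

With this shape, a job that has 150 failing lines reports a failure count of 150 while View errors can show only the first 100.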

Common error codes

Error code	Cause	How to fix
`invalid_json`	Line is not valid JSON	Validate the file with `jq -c . file.jsonl` before upload
`missing_external_id`	The line lacks `external_id`	Add `external_id` to every row
`validation_error`	Field value fails type or format check (e.g. price as string)	Cast values to the schema types defined in the index settings
`schema_mismatch`	Field appears in the line but not in the index schema	Either add the field to the schema or strip it from the export
`duplicate_external_id`	Two lines share the same `external_id` within the same file	De-duplicate before upload — later lines silently overwrite earlier ones if both pass validation
`quota_exceeded`	The ingest count would exceed the plan's monthly indexed-document cap	Upgrade your plan or wait for the next billing period — see Plans & Limits
`index_not_found`	The target index was deleted while the job was queued	Recreate the index and re-run the import

How an import interacts with quota

Every successfully processed line consumes one search unit of the indexed_documents budget. Failures do not consume units. The pre-flight check in the worker rejects the entire job with quota_exceeded if the line count would push you past your monthly cap; partial imports do not happen.
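
The all-or-nothing pre-flight check amounts to one comparison. The Python sketch below assumes the check uses the job's total line count and that landing exactly on the cap is allowed; the function and parameter names are illustrative.

```python
from typing import Optional


def preflight_quota(used_this_month: int, monthly_cap: int, line_count: int) -> Optional[str]:
    """Return "quota_exceeded" if the whole job must be rejected, else None."""
    if used_this_month + line_count > monthly_cap:
        return "quota_exceeded"  # the job is rejected as a whole, nothing is imported
    return None
```

For example, with 9,500 documents already indexed against a cap of 10,000, a 600-line file is rejected outright, while a 500-line file is accepted.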

For exact unit definitions, see Plans & Limits and Billing → Usage units.

How an import becomes searchable

For incremental imports (delta sync), processed lines flow through SearchIngestBuffer and are committed to the live alias by the worker as soon as each batch finishes. Latency is typically a few seconds.

For full reindex imports (job source = connector_full_sync or manual_reindex), the worker writes into a new versioned index ({orgShortId}_{slug}_v{n+1}) and atomically swaps the alias after verification. The previous version stays live until the swap completes — no downtime, no half-written state.

See Index Management → Reindex for the reindex flow.
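
As an illustration of the full-reindex path, the Python sketch below builds the next versioned index name from the pattern above and models the alias swap as a single reassignment that only happens after verification. `AliasTable` is a toy in-memory stand-in, not the search backend's API, and using the index slug as the alias name is an assumption.

```python
def next_index_name(org_short_id: str, slug: str, current_version: int) -> str:
    """Build {orgShortId}_{slug}_v{n+1} from the current version n."""
    return f"{org_short_id}_{slug}_v{current_version + 1}"


class AliasTable:
    """Toy alias registry: each alias points at exactly one index."""

    def __init__(self) -> None:
        self._targets = {}

    def resolve(self, alias: str):
        return self._targets.get(alias)

    def swap(self, alias: str, new_index: str, verified: bool) -> bool:
        if not verified:
            return False  # the previous version keeps serving queries
        # One assignment: readers see the old index or the new one, never a mix.
        self._targets[alias] = new_index
        return True
```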

Canceling a job

A running job can be canceled from the row actions menu. Cancellation is cooperative: the worker stops picking up new batches but lets in-flight batches finish. Documents already written remain in the index. The job ends with status canceled and a "Stopped at line N" detail.

You cannot resume a canceled job. Re-upload the remaining lines if needed.
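
Cooperative cancellation can be sketched as a loop that checks a flag between batches and never interrupts a batch that has started. This Python example is a simplified single-threaded model under that assumption; `run_job`, `write_batch`, and the use of `threading.Event` are illustrative choices, not the worker's real implementation.

```python
import threading
from typing import Callable, Iterable, List, Tuple


def run_job(
    batches: Iterable[List[dict]],
    write_batch: Callable[[List[dict]], None],
    cancel_requested: threading.Event,
) -> Tuple[str, int]:
    """Process batches until done or canceled; return (status, lines_written)."""
    lines_written = 0
    for batch in batches:
        # The flag is checked only between batches, so a batch that has
        # started always finishes and its documents stay in the index.
        if cancel_requested.is_set():
            return "canceled", lines_written
        write_batch(batch)
        lines_written += len(batch)
    return "completed", lines_written
```

The second element of the result corresponds to the N in the "Stopped at line N" detail, which tells you where a follow-up upload of the remaining lines should start.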

Retention

Import job records are retained for 90 days. After that, the job row is soft-deleted (deletedAt) and the per-line error payload is purged. The documents themselves remain in the index regardless of job retention.

Audit trail

Each job emits an audit event so you have a history beyond the 90-day job retention:

Audit action	When emitted
`sync_connector`	A connector full sync started or finished
`update_schema`	An import triggered an automatic schema change

See Audit Logs for filtering and export.

CLI and API alternatives

For repeatable imports prefer the v1 REST endpoint over the dashboard:

curl -X POST https://app.aacsearch.com/api/v1/indexes/{indexId}/documents:batch \
  -H "Authorization: Bearer $AACSEARCH_ADMIN_KEY" \
  -H "Content-Type: application/x-ndjson" \
  --data-binary @products.jsonl

This avoids browser-side file-size limits and integrates with CI. Each batch call still creates an Import Job row, so the dashboard view stays the system of record.
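
For scripts that already run Python, the same request can be built with the standard library. This is a sketch mirroring the curl call above: the URL path and headers are taken from that example, `build_batch_request` is a hypothetical helper, and the response format is not documented on this page, so the sketch does not parse it.

```python
import urllib.request


def build_batch_request(
    base_url: str, index_id: str, admin_key: str, jsonl_bytes: bytes
) -> urllib.request.Request:
    """Build (but do not send) the batch-ingest POST request."""
    url = f"{base_url}/api/v1/indexes/{index_id}/documents:batch"
    return urllib.request.Request(
        url,
        data=jsonl_bytes,  # raw JSONL body, like curl's --data-binary
        method="POST",
        headers={
            "Authorization": f"Bearer {admin_key}",
            "Content-Type": "application/x-ndjson",
        },
    )
```

To send it, read the file in binary mode, pass the bytes to `build_batch_request`, and hand the result to `urllib.request.urlopen`. Keeping the request construction separate from the network call makes the helper easy to test in CI without touching a live index.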

Related pages

Search Workspace: Indexes tab, Playground, API Keys, Widget.
Index Management: schema, reindex, sync history.
Plans & Limits: indexed-document caps and overage behavior.
Connectors → Overview: full-sync vs delta-sync, when each path is used.
Audit Logs: history beyond the 90-day job-row retention.
