
Import Jobs

Bulk import history, JSONL payload shape, error handling, and how import jobs relate to the ingest buffer and quota.

The Import Jobs page at /[orgSlug]/import-jobs lists every bulk import operation across all indexes in your organization. Use it to track in-flight imports, debug failures, and audit data movement.

Table showing the import job history, status, line counts, and duration

When import jobs are created

An import job is created any time you bulk-load documents into an index outside the normal Connector API delta path. Sources include:

  • Indexes tab → row actions → Import Documents — JSONL upload from the dashboard.
  • v1 REST POST /api/v1/indexes/{indexId}/documents:batch — server-side batch ingest.
  • Connector full sync — when a CMS module triggers a fresh full sync (incremental syncs do not create an import job; they go through the ingest buffer).

Single-document writes through the public ingest endpoint are not import jobs — they flow through SearchIngestBuffer and the worker.

JSONL payload format

The dashboard import accepts a JSONL file (one JSON object per line). Each object is a document in the shape your index schema expects.

{"external_id":"sku-1","title":"Linen shirt","price":4900,"in_stock":true}
{"external_id":"sku-2","title":"Cotton tee","price":1900,"in_stock":false}
{"external_id":"sku-3","title":"Wool jumper","price":7900,"in_stock":true}

Rules:

  • One JSON object per line. Blank lines are ignored.
  • Each line must include external_id (string). This is the stable upsert key.
  • All other fields must match the index schema. Unknown fields are rejected line-by-line.
  • Maximum file size: 100 MB. Split larger imports into multiple files.
  • Maximum lines per file: 1,000,000. Lines beyond the cap are not processed.
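A pre-upload check can catch some of these problems before a job is created. The following is a minimal sketch, not the service's validator: it covers only the rules that can be checked without the index schema (valid JSON, one object per line, a string external_id), and the function name and the problem labels other than invalid_json and missing_external_id are made up for illustration.

```python
import json


def check_jsonl_lines(lines):
    """Return (line_number, problem) pairs for lines that break the basic rules.

    Schema checks are out of scope here because they depend on the index settings.
    """
    problems = []
    for number, raw in enumerate(lines, start=1):
        text = raw.strip()
        if not text:
            continue  # blank lines are ignored by the importer
        try:
            doc = json.loads(text)
        except json.JSONDecodeError:
            problems.append((number, "invalid_json"))
            continue
        if not isinstance(doc, dict):
            # Hypothetical label: the line parses, but it is not a JSON object.
            problems.append((number, "not_an_object"))
        elif "external_id" not in doc:
            problems.append((number, "missing_external_id"))
        elif not isinstance(doc["external_id"], str):
            # Hypothetical label: external_id must be a string.
            problems.append((number, "external_id_not_string"))
    return problems
```

Because the function accepts any iterable of strings, an open file can be passed directly, for example `check_jsonl_lines(open("products.jsonl", encoding="utf-8"))`.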

Job lifecycle and statuses

| Status | Meaning |
| --- | --- |
| pending | Queued; the worker has not picked it up yet |
| running | Worker is processing batches |
| completed | All lines processed (some may have failed; see the failure count) |
| failed | The job aborted before completion (e.g. invalid file, quota blocked) |
| canceled | Operator canceled the job from the row actions menu |
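If you consume these statuses in your own tooling, it helps to encode them once. The sketch below is an assumption-laden illustration: the status strings come from the table above, but the class and helper names are hypothetical, and treating completed, failed, and canceled as terminal is inferred from their descriptions rather than stated by the product.

```python
from enum import Enum


class ImportJobStatus(str, Enum):
    PENDING = "pending"
    RUNNING = "running"
    COMPLETED = "completed"
    FAILED = "failed"
    CANCELED = "canceled"


# Assumption: these three states are final; pending and running can still change.
TERMINAL_STATUSES = {
    ImportJobStatus.COMPLETED,
    ImportJobStatus.FAILED,
    ImportJobStatus.CANCELED,
}


def is_terminal(status: str) -> bool:
    """True when the job is not expected to change state again."""
    return ImportJobStatus(status) in TERMINAL_STATUSES


def needs_error_review(status: str, failure_count: int) -> bool:
    """A completed job can still contain failed lines, so check the count too."""
    return ImportJobStatus(status) is ImportJobStatus.COMPLETED and failure_count > 0
```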

Each row in the list shows:

  • Job ID and source (UI / API / connector)
  • Index name and slug
  • Status with a colored badge
  • Total lines processed
  • Failure count (with View errors action)
  • Started at / finished at / duration
  • Initiator (user or connector token name)

Error handling

Failures are recorded per line, not per job. A job with 1,000 lines and 7 invalid rows shows completed status with 7 in the Failures column. Click View errors to see the first 100 failure rows with:

  • Line number
  • Raw line content (truncated to 200 characters)
  • Error code (validation_error, schema_mismatch, duplicate_external_id, etc.)
  • Human-readable detail

Failures beyond the first 100 are counted but not stored, to keep error payloads bounded.
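The bounded error payload can be pictured with a small model. This is only a sketch of the behavior described above (count every failure, store the first 100, truncate raw content to 200 characters); the class and field names are hypothetical and do not reflect the worker's actual code.

```python
from dataclasses import dataclass, field

MAX_STORED_FAILURES = 100  # failures kept for the View errors action
MAX_RAW_CHARS = 200        # raw line content is truncated to this length


@dataclass
class FailureLog:
    failure_count: int = 0
    stored: list = field(default_factory=list)

    def record(self, line_number, raw_line, code, detail):
        """Count every failure, but store details only for the first 100."""
        self.failure_count += 1
        if len(self.stored) < MAX_STORED_FAILURES:
            self.stored.append({
                "line": line_number,
                "raw": raw_line[:MAX_RAW_CHARS],
                "code": code,
                "detail": detail,
            })
```

With this model, a job with 150 bad lines reports 150 in the Failures column while View errors lists only the first 100.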

Common error codes

| Error code | Cause | How to fix |
| --- | --- | --- |
| invalid_json | Line is not valid JSON | Validate the file with jq -c . file.jsonl before upload |
| missing_external_id | The line lacks external_id | Add external_id to every row |
| validation_error | Field value fails type or format check (e.g. price as string) | Cast values to the schema types defined in the index settings |
| schema_mismatch | Field appears in the line but not in the index schema | Either add the field to the schema or strip it from the export |
| duplicate_external_id | Two lines share the same external_id within the same file | De-duplicate before upload; later lines silently overwrite earlier ones if both pass validation |
| quota_exceeded | The ingest count would exceed the plan's monthly indexed-document cap | Upgrade your plan or wait for the next billing period (see Plans & Limits) |
| index_not_found | The target index was deleted while the job was queued | Recreate the index and re-run the import |
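For duplicate_external_id, de-duplication can be done locally before upload. The sketch below keeps the last line for each external_id, mirroring the "later lines overwrite earlier ones" behavior noted in the table; the function name is hypothetical, and it assumes every non-blank line is already valid JSON with an external_id.

```python
import json


def dedupe_keep_last(lines):
    """Collapse lines that share an external_id, keeping the last occurrence.

    Assumes each non-blank line is valid JSON containing external_id;
    validate the file first if that is not guaranteed.
    """
    latest = {}
    for raw in lines:
        text = raw.strip()
        if not text:
            continue  # blank lines carry no document
        external_id = json.loads(text)["external_id"]
        # Re-assigning keeps the key's original position but stores the newer line.
        latest[external_id] = text
    return list(latest.values())
```

Keeping the last occurrence is a design choice; if the earliest record should win instead, skip the assignment when the key is already present.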

How an import interacts with quota

Every successfully processed line consumes one search unit of the indexed_documents budget. Failures do not consume units. The pre-flight check in the worker rejects the entire job with quota_exceeded if the line count would push you past your monthly cap; partial imports do not happen.

For exact unit definitions, see Plans & Limits and Billing → Usage units.
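The all-or-nothing quota rule is simple arithmetic. The following sketch illustrates it under stated assumptions: the function and parameter names are hypothetical, and "past your monthly cap" is read as strictly greater than the cap.

```python
def preflight_quota_check(line_count, used_this_month, monthly_cap):
    """Accept or reject the whole job up front; there is no partial import."""
    if used_this_month + line_count > monthly_cap:
        return "quota_exceeded"
    return "ok"


def units_consumed(total_lines, failed_lines):
    """Only successfully processed lines consume units."""
    return total_lines - failed_lines
```

For example, with 99,000 units already used against a 100,000 cap, a 1,000-line file passes the check and a 1,001-line file is rejected outright. A 1,000-line job with 7 failed lines consumes 993 units.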

How an import becomes searchable

For incremental imports (delta sync), processed lines flow through SearchIngestBuffer and are committed to the live alias by the worker as soon as each batch finishes. Latency is typically a few seconds.

For full reindex imports (job source = connector_full_sync or manual_reindex), the worker writes into a new versioned index ({orgShortId}_{slug}_v{n+1}) and atomically swaps the alias after verification. The previous version stays live until the swap completes — no downtime, no half-written state.

See Index Management → Reindex for the reindex flow.
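The versioned-index-plus-alias pattern for full reindex imports can be sketched with an in-memory stand-in. This is a toy model, not the worker's implementation: the AliasTable class, its method names, and the example org and slug values are hypothetical, and a single dictionary assignment stands in for the atomic swap.

```python
def versioned_index_name(org_short_id, slug, version):
    """Build a physical index name in the {orgShortId}_{slug}_v{n} form."""
    return f"{org_short_id}_{slug}_v{version}"


class AliasTable:
    """Toy alias table mapping an alias to the physical index it serves."""

    def __init__(self):
        self.aliases = {}

    def swap(self, alias, new_index, verified):
        """Point the alias at new_index only after verification succeeds."""
        if not verified:
            # The previous version stays live; nothing changes for searches.
            return self.aliases.get(alias)
        self.aliases[alias] = new_index  # stands in for the atomic alias swap
        return new_index
```

Usage, assuming an alias named products currently at version 3:

```python
table = AliasTable()
table.aliases["products"] = versioned_index_name("acme", "products", 3)
table.swap("products", versioned_index_name("acme", "products", 4), verified=True)
```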

Canceling a job

A running job can be canceled from the row actions menu. Cancellation is cooperative: the worker stops picking up new batches but lets in-flight batches finish. Documents already written remain in the index. The job ends with status canceled and a "Stopped at line N" detail.

You cannot resume a canceled job. Re-upload the remaining lines if needed.
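Extracting the remaining lines for a re-upload is a one-liner. The sketch below assumes that N in the job detail is the last line that was processed (1-based); that reading is an assumption, and the function name is hypothetical. Since external_id is the upsert key, starting a little earlier than N should only rewrite the same documents.

```python
from itertools import islice


def remaining_lines(lines, stopped_at):
    """Return every line after the 1-based line number stopped_at."""
    return list(islice(lines, stopped_at, None))
```

Like the file-based checks, this works on an open file handle as well as on a list of strings.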

Retention

Import job records are retained for 90 days. After that, the job row is soft-deleted (deletedAt) and the per-line error payload is purged. The documents themselves remain in the index regardless of job retention.

Audit trail

Each job emits an audit event so you have a history beyond the 90-day job retention:

| Audit action | When emitted |
| --- | --- |
| sync_connector | A connector full sync started or finished |
| update_schema | Imports that trigger automatic schema changes |

See Audit Logs for filtering and export.

CLI and API alternatives

For repeatable imports, prefer the v1 REST endpoint over the dashboard:

curl -X POST https://app.aacsearch.com/api/v1/indexes/{indexId}/documents:batch \
  -H "Authorization: Bearer $AACSEARCH_ADMIN_KEY" \
  -H "Content-Type: application/x-ndjson" \
  --data-binary @products.jsonl

This avoids browser-side file-size limits and integrates with CI. Each batch call still creates an Import Job row, so the dashboard view stays the system of record.
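For CI scripts written in Python, the same call can be built with the standard library. This is a sketch that mirrors the curl command above and nothing more: the function name and the base_url parameter are hypothetical, and the response format is not described on this page, so the example stops at constructing the request.

```python
import urllib.request


def build_batch_request(index_id, admin_key, body, base_url="https://app.aacsearch.com"):
    """Build (but do not send) the batch ingest request shown in the curl example."""
    url = f"{base_url}/api/v1/indexes/{index_id}/documents:batch"
    return urllib.request.Request(
        url,
        data=body,  # raw JSONL bytes, one document per line
        method="POST",
        headers={
            "Authorization": f"Bearer {admin_key}",
            "Content-Type": "application/x-ndjson",
        },
    )
```

Passing the returned object to `urllib.request.urlopen` sends it; read the key from an environment variable such as AACSEARCH_ADMIN_KEY rather than hard-coding it.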
