Import Jobs
Bulk import history, JSONL payload shape, error handling, and how import jobs relate to the ingest buffer and quota.
The Import Jobs page at /[orgSlug]/import-jobs lists every bulk import operation across all indexes in your organization. Use it to track in-flight imports, debug failures, and audit data movement.
When import jobs are created
An import job is created any time you bulk-load documents into an index outside the normal Connector API delta path. Sources include:
- Indexes tab → row actions → Import Documents (JSONL upload from the dashboard).
- v1 REST POST /api/v1/indexes/{indexId}/documents:batch (server-side batch ingest).
- Connector full sync, when a CMS module triggers a fresh full sync. Incremental syncs do not create an import job; they go through the ingest buffer.
Single-document writes through the public ingest endpoint are not import jobs — they flow through SearchIngestBuffer and the worker.
JSONL payload format
The dashboard import accepts a JSONL file (one JSON object per line). Each object is a document in the shape your index schema expects.
{"external_id":"sku-1","title":"Linen shirt","price":4900,"in_stock":true}
{"external_id":"sku-2","title":"Cotton tee","price":1900,"in_stock":false}
{"external_id":"sku-3","title":"Wool jumper","price":7900,"in_stock":true}
Rules:
- One JSON object per line. Blank lines are ignored.
- Each line must include external_id (string). This is the stable upsert key.
- All other fields must match the index schema. Unknown fields are rejected line-by-line.
- Maximum file size: 100 MB. Split larger imports into multiple files.
- Maximum lines per file: 1,000,000. Lines beyond the cap are not processed.
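The size and line caps above mean large exports have to be split before upload. A minimal sketch of one way to do that, assuming the export is already valid JSONL; `split_jsonl` is a hypothetical helper, not part of the product:

```python
from typing import Iterable, Iterator, List

MAX_BYTES = 100 * 1024 * 1024   # 100 MB file-size cap from the rules above
MAX_LINES = 1_000_000           # per-file line cap from the rules above


def split_jsonl(lines: Iterable[str],
                max_bytes: int = MAX_BYTES,
                max_lines: int = MAX_LINES) -> Iterator[List[str]]:
    """Yield groups of non-blank lines that each stay within both caps."""
    chunk: List[str] = []
    size = 0
    for raw in lines:
        line = raw.strip()
        if not line:
            # Blank lines are ignored by the importer, so drop them here.
            continue
        encoded = len(line.encode("utf-8")) + 1  # +1 for the trailing newline
        if chunk and (size + encoded > max_bytes or len(chunk) >= max_lines):
            yield chunk
            chunk, size = [], 0
        chunk.append(line)
        size += encoded
    if chunk:
        yield chunk
```

Each yielded group can be written to its own file and uploaded as a separate import. A single line larger than `max_bytes` still comes out as its own oversized group, so this sketch does not guard against that case.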
Job lifecycle and statuses
| Status | Meaning |
|---|---|
| pending | Queued, worker has not picked it up yet |
| running | Worker is processing batches |
| completed | All lines processed (some may have failed; see the failure count) |
| failed | The job aborted before completion (e.g. invalid file, quota blocked) |
| canceled | Operator canceled the job from the row actions menu |
Each row in the list shows:
- Job ID and source (UI / API / connector)
- Index name and slug
- Status with a colored badge
- Total lines processed
- Failure count (with View errors action)
- Started at / finished at / duration
- Initiator (user or connector token name)
Error handling
Failures are recorded per line, not per job. A job with 1,000 lines and 7 invalid rows shows completed status with 7 in the Failures column. Click View errors to see the first 100 failure rows with:
- Line number
- Raw line content (truncated to 200 characters)
- Error code (validation_error, schema_mismatch, duplicate_external_id, etc.)
- Human-readable detail
Failures beyond the first 100 are counted but not stored, to keep error payloads bounded.
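The bounded error payload can be pictured as a counter plus a capped list. The sketch below is illustrative only: `FailureRow` and `FailureLog` are hypothetical names, and the real worker's storage is not described in this page.

```python
from dataclasses import dataclass, field
from typing import List

MAX_STORED = 100      # only the first 100 failure rows are kept
MAX_RAW_CHARS = 200   # raw line content is truncated to 200 characters


@dataclass
class FailureRow:
    line_number: int
    raw: str
    code: str
    detail: str


@dataclass
class FailureLog:
    count: int = 0                              # every failure is counted
    rows: List[FailureRow] = field(default_factory=list)

    def record(self, line_number: int, raw: str, code: str, detail: str) -> None:
        self.count += 1
        if len(self.rows) < MAX_STORED:         # later failures are counted, not stored
            self.rows.append(
                FailureRow(line_number, raw[:MAX_RAW_CHARS], code, detail)
            )
```

With this shape, the Failures column corresponds to `count`, while View errors corresponds to `rows`.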
Common error codes
| Error code | Cause | How to fix |
|---|---|---|
| invalid_json | Line is not valid JSON | Validate the file with jq -c . file.jsonl before upload |
| missing_external_id | The line lacks external_id | Add external_id to every row |
| validation_error | Field value fails type or format check (e.g. price as string) | Cast values to the schema types defined in the index settings |
| schema_mismatch | Field appears in the line but not in the index schema | Either add the field to the schema or strip it from the export |
| duplicate_external_id | Two lines share the same external_id within the same file | De-duplicate before upload; later lines silently overwrite earlier ones if both pass validation |
| quota_exceeded | The ingest count would exceed the plan's monthly indexed-document cap | Upgrade your plan or wait for the next billing period; see Plans & Limits |
| index_not_found | The target index was deleted while the job was queued | Recreate the index and re-run the import |
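Three of the codes above can be caught locally before upload, without knowing the index schema. The following is a rough pre-check sketch. `precheck` is a hypothetical helper, and it assumes that a non-object line or a non-string external_id would be reported as missing_external_id, which the server may classify differently.

```python
import json
from typing import Dict, Iterable, List, Tuple


def precheck(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (line_number, error_code) pairs for problems detectable offline."""
    problems: List[Tuple[int, str]] = []
    seen: Dict[str, int] = {}          # external_id -> first line number
    for number, raw in enumerate(lines, start=1):
        line = raw.strip()
        if not line:                   # blank lines are ignored by the importer
            continue
        try:
            doc = json.loads(line)
        except json.JSONDecodeError:
            problems.append((number, "invalid_json"))
            continue
        ext_id = doc.get("external_id") if isinstance(doc, dict) else None
        if not isinstance(ext_id, str):
            # Assumption: absent and non-string IDs are treated alike.
            problems.append((number, "missing_external_id"))
            continue
        if ext_id in seen:
            problems.append((number, "duplicate_external_id"))
        else:
            seen[ext_id] = number
    return problems
```

Schema-dependent codes (validation_error, schema_mismatch) are left out because they need the index settings, which this sketch does not have.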
How an import interacts with quota
Every successfully processed line consumes one search unit of the indexed_documents budget. Failures do not consume units. The pre-flight check in the worker rejects the entire job with quota_exceeded if the line count would push you past your monthly cap; partial imports do not happen.
For exact unit definitions, see Plans & Limits and Billing → Usage units.
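The all-or-nothing quota rule comes down to simple arithmetic. This sketch uses hypothetical function names and assumes that landing exactly on the cap is allowed, since the text only says a job is rejected when it would push usage past the cap.

```python
def preflight_ok(used_this_period: int, monthly_cap: int, line_count: int) -> bool:
    """Return True when the whole file fits under the cap (all-or-nothing)."""
    # The check uses the file's line count, not the number of lines
    # that will eventually succeed.
    return used_this_period + line_count <= monthly_cap


def units_consumed(succeeded: int, failed: int) -> int:
    """Only successfully processed lines consume units; failures are free."""
    return succeeded
```

For example, with 9,500 of 10,000 units used, a 600-line file is rejected outright, while a 500-line file is accepted. If 7 of those 500 lines then fail validation, 493 units are consumed.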
How an import becomes searchable
For incremental imports (delta sync), processed lines flow through SearchIngestBuffer and are committed to the live alias by the worker as soon as each batch finishes. Latency is typically a few seconds.
For full reindex imports (job source = connector_full_sync or manual_reindex), the worker writes into a new versioned index ({orgShortId}_{slug}_v{n+1}) and atomically swaps the alias after verification. The previous version stays live until the swap completes — no downtime, no half-written state.
See Index Management → Reindex for the reindex flow.
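The versioned-index naming and alias swap described above can be modeled in memory as follows. This is a sketch under stated assumptions: plain dicts stand in for the search backend, `verify` is a caller-supplied check, and none of the names are the product's API.

```python
import re
from typing import Callable, Dict, List

_VERSIONED = re.compile(r"^(?P<base>.+)_v(?P<n>\d+)$")


def next_versioned_name(current: str) -> str:
    """Turn '<orgShortId>_<slug>_v<n>' into '<orgShortId>_<slug>_v<n+1>'."""
    m = _VERSIONED.match(current)
    if m is None:
        raise ValueError(f"not a versioned index name: {current!r}")
    return f"{m['base']}_v{int(m['n']) + 1}"


def full_reindex(aliases: Dict[str, str],
                 store: Dict[str, List[dict]],
                 alias: str,
                 docs: List[dict],
                 verify: Callable[[List[dict]], bool]) -> str:
    """Write into a new version, verify it, then repoint the alias."""
    target = next_versioned_name(aliases[alias])
    store[target] = list(docs)      # the alias still points at the old version
    if not verify(store[target]):
        del store[target]           # discard the half-built version
        raise RuntimeError("verification failed; alias left unchanged")
    aliases[alias] = target         # one assignment stands in for the atomic swap
    return target
```

Readers of the alias see either the old version or the new one, never a partially written index, which is the property the swap is meant to guarantee.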
Canceling a job
A running job can be canceled from the row actions menu. Cancellation is cooperative: the worker stops picking up new batches but lets in-flight batches finish. Documents already written remain in the index. The job ends with status canceled and a "Stopped at line N" detail.
You cannot resume a canceled job. Re-upload the remaining lines if needed.
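Cooperative cancellation usually means checking a flag between batches instead of interrupting a write in progress. A minimal sketch of that pattern, using `threading.Event` as a stand-in for whatever signal the real worker uses; `run_job` and `write_batch` are hypothetical:

```python
import threading
from typing import Callable, Sequence, Tuple


def run_job(batches: Sequence[Sequence[dict]],
            write_batch: Callable[[Sequence[dict]], None],
            cancel: threading.Event) -> Tuple[str, int]:
    """Process batches in order; stop before the next batch once canceled."""
    lines_done = 0
    for batch in batches:
        if cancel.is_set():
            # Nothing is rolled back: earlier batches stay written.
            return "canceled", lines_done
        write_batch(batch)          # an in-flight batch always runs to completion
        lines_done += len(batch)
    return "completed", lines_done
```

The returned line count corresponds to the N in the "Stopped at line N" detail, and it marks where a follow-up upload of the remaining lines would need to start.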
Retention
Import job records are retained for 90 days. After that, the job row is soft-deleted (deletedAt) and the per-line error payload is purged. The documents themselves remain in the index regardless of job retention.
Audit trail
Each job emits an audit event so you have a history beyond the 90-day job retention:
| Audit action | When emitted |
|---|---|
| sync_connector | A connector full sync started or finished |
| update_schema | Imports that trigger automatic schema changes |
See Audit Logs for filtering and export.
CLI and API alternatives
For repeatable imports prefer the v1 REST endpoint over the dashboard:
curl -X POST https://app.aacsearch.com/api/v1/indexes/{indexId}/documents:batch \
  -H "Authorization: Bearer $AACSEARCH_ADMIN_KEY" \
  -H "Content-Type: application/x-ndjson" \
  --data-binary @products.jsonl
This avoids browser-side file-size limits and integrates with CI. Each batch call still creates an Import Job row, so the dashboard view stays the system of record.
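For scripted pipelines, the same request can be built with Python's standard library. This sketch mirrors the curl call above. `build_batch_request` is a hypothetical helper, and the response format is not documented on this page, so the sketch stops at constructing the request.

```python
import urllib.request

BASE_URL = "https://app.aacsearch.com/api/v1"


def build_batch_request(index_id: str, payload: bytes, admin_key: str) -> urllib.request.Request:
    """Build the POST request for the documents:batch endpoint."""
    return urllib.request.Request(
        f"{BASE_URL}/indexes/{index_id}/documents:batch",
        data=payload,                                   # raw JSONL bytes, one object per line
        method="POST",
        headers={
            "Authorization": f"Bearer {admin_key}",
            "Content-Type": "application/x-ndjson",
        },
    )


# Sending it (not executed here; requires network access and a valid key):
#   with open("products.jsonl", "rb") as fh:
#       req = build_batch_request("your-index-id", fh.read(), admin_key)
#   with urllib.request.urlopen(req) as resp:
#       print(resp.status)
```

Reading the file in binary mode keeps the bytes unchanged, which matches what `--data-binary` does in the curl version.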
Related
- Search Workspace — Indexes tab, Playground, API Keys, Widget.
- Index Management — schema, reindex, sync history.
- Plans & Limits — indexed-document caps and overage behavior.
- Connectors → Overview — full-sync vs delta-sync, when each path is used.
- Audit Logs — history beyond the 90-day job-row retention.