
Scan

A scan is on-demand row selection over a namespace. It picks rows by one of four selectors and returns their IDs (mode: ids, an asynchronous job), their count (mode: count, synchronous), or the distinct values of one attribute field (mode: values, an asynchronous job). The selectors and the fan-out control are:

Input	Field	Meaning	Notes
Filter selector	`filters`	An attribute predicate, or all rows when omitted.	Exact
Full-text selector	`fts`	A BM25 predicate against a text field.	Exact
Hybrid-text selector	`hybrid_text`	The BM25 leg, the per-token fuzzy legs, and the per-token surfacing legs — a superset of the `hybrid_text` query route (see Hybrid text count).	Exact
Radius selector	`ann`	Rows within `radius` of a query vector.	Approximate (ANN recall)
Fan-out control	`threads`	Maximum concurrent upstream requests for origin scatter/gather.	Origin only; defaults from `Index.spec.scan.threads`, then `8`.

Origin scatter/gather is enabled only for namespaces whose shard backfill is complete. For adopted namespaces initialized through POST /v2/namespaces/{ns}/init, scans stay on the single-namespace origin path while layer.shard_lag_rows is greater than 0; this keeps count, ID, and values scans from missing rows that have not yet been stamped with _hevlayer_shard.

A request carries at most one ranked selector (fts, hybrid_text, or ann). filters is always optional and, when present alongside a ranked selector, is ANDed onto the match set as an extra constraint. A request with more than one ranked selector is rejected with 422.
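The selector rule above can be checked client-side before a request is sent. This is a minimal sketch, not the gateway's validator: the function name ranked_selector is hypothetical, and raising ValueError merely stands in for the 422 the gateway returns.

```python
from typing import Optional

# The three ranked selectors named in the text; filters is not one of them.
RANKED_SELECTORS = ("fts", "hybrid_text", "ann")


def ranked_selector(body: dict) -> Optional[str]:
    """Return the single ranked selector in a scan body, or None if filter-only.

    Raises ValueError when more than one ranked selector is present,
    mirroring the documented 422.
    """
    present = [name for name in RANKED_SELECTORS if body.get(name) is not None]
    if len(present) > 1:
        raise ValueError(f"422: more than one ranked selector: {present}")
    return present[0] if present else None
```

A filter-only body returns None, and a body with fts plus filters returns "fts", since filters only narrows the match set.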

At cutover, mode: ids is filter-only (ranked IDs are a defined fast-follow), while mode: count and mode: values support all four selectors. Use scans for bulk exports, manual inspection, UDF discovery debugging, cache/origin consistency checks, exact or approximate counts, and field value discovery.

Routes

Route	Method	Behavior
`/v2/namespaces/{ns}/scans`	POST	Create an ID or values scan job, or return a count.
`/v2/namespaces/{ns}/scans`	GET	List scan jobs for the namespace.
`/v2/namespaces/{ns}/scans/{id}`	GET	Read one scan job.
`/v2/namespaces/{ns}/scans/{id}/results`	GET	Read completed scan IDs or values.
`/v2/namespaces/{ns}/scans/{id}`	DELETE	Drop the in-memory scan job.

ID Mode

job = await client.create_scan("products", {
    "source": "auto",
    "mode": "ids",
    "filters": ["category", "Eq", "Electronics"],
    "threads": 8,
    "page_size": 1000,
})

job, err := client.CreateScan(ctx, "products", &hevlayer.CreateScanRequest{
    Source:   "auto",
    Mode:     "ids",
    Filters:  []interface{}{"category", "Eq", "Electronics"},
    Threads:  8,
    PageSize: 1000,
})

const job = await client.createScan("products", {
  source: "auto",
  mode: "ids",
  filters: ["category", "Eq", "Electronics"],
  threads: 8,
  page_size: 1000,
});

curl -X POST "$LAYER_GATEWAY_URL/v2/namespaces/products/scans" \
  -H "Authorization: Bearer $LAYER_GATEWAY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "source": "auto",
    "mode": "ids",
    "filters": ["category", "Eq", "Electronics"],
    "threads": 8,
    "page_size": 1000
  }'

mode defaults to ids. Valid ID-mode sources are auto, cache, and origin. The Python and TypeScript clients also ship scan(...) helpers that create the job and poll until it completes; in Go, poll GetScan until status is completed.
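For callers that cannot use the scan(...) helpers, the polling step can be sketched as below. This is an illustration under assumptions: wait_for_scan is a hypothetical name, get_scan is any zero-argument coroutine function that returns the job as a dict, and the interval and timeout values are arbitrary. Only the running and completed statuses appear in this page, so any other status simply keeps the loop waiting until the timeout.

```python
import asyncio
from typing import Awaitable, Callable


async def wait_for_scan(
    get_scan: Callable[[], Awaitable[dict]],
    interval_s: float = 0.5,
    timeout_s: float = 300.0,
) -> dict:
    """Poll a scan job until its status is "completed" or the timeout passes."""
    loop = asyncio.get_running_loop()
    deadline = loop.time() + timeout_s
    while True:
        job = await get_scan()
        if job["status"] == "completed":
            return job
        if loop.time() >= deadline:
            raise TimeoutError("scan did not complete before the timeout")
        # Yield to the event loop between polls instead of busy-waiting.
        await asyncio.sleep(interval_s)
```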

The create response is 202 Accepted:

{
  "id": "scan-uuid",
  "namespace": "products",
  "source": "auto",
  "effective_source": "origin",
  "status": "running",
  "progress": 0,
  "documents_scanned": 0,
  "threads": 8,
  "created_at": "2026-05-26T10:00:00Z"
}

Read IDs after status is completed:

results = await client.get_scan_results("products", job.id, limit=1000, offset=0)

results, err := client.GetScanResults(ctx, "products", scanID,
    &hevlayer.GetScanResultsParams{Limit: 1000, Offset: 0})

const results = await client.getScanResults("products", job.id, {
  limit: 1000,
  offset: 0,
});

curl "$LAYER_GATEWAY_URL/v2/namespaces/products/scans/scan-uuid/results?limit=1000&offset=0" \
  -H "Authorization: Bearer $LAYER_GATEWAY_API_KEY"

{
  "ids": ["doc-1", "doc-2"],
  "total": 2
}
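Reading a large result set means walking limit/offset pages. The sketch below assumes a hypothetical fetch_page(limit, offset) callable that returns the results body as a dict with an ids list. It stops on the first short or empty page, so it does not depend on whether total describes the page or the whole job.

```python
from typing import Callable, Iterator


def iter_scan_ids(
    fetch_page: Callable[[int, int], dict], limit: int = 1000
) -> Iterator[str]:
    """Yield every ID of a completed scan by paging with limit/offset."""
    offset = 0
    while True:
        ids = fetch_page(limit, offset)["ids"]
        yield from ids
        # A page shorter than the limit (including an empty one) is the last.
        if len(ids) < limit:
            break
        offset += len(ids)
```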

Count Mode

count = await client.create_scan("products", {
    "mode": "count",
    "source": "auto",
    "filters": ["category", "Eq", "Electronics"],
    "threads": 8,
    "timeout_seconds": 30,
})

count, err := client.CreateScan(ctx, "products", &hevlayer.CreateScanRequest{
    Mode:           "count",
    Source:         "auto",
    Filters:        []interface{}{"category", "Eq", "Electronics"},
    Threads:        8,
    TimeoutSeconds: 30,
})

const count = await client.createScan("products", {
  mode: "count",
  source: "auto",
  filters: ["category", "Eq", "Electronics"],
  threads: 8,
  timeout_seconds: 30,
});

curl -X POST "$LAYER_GATEWAY_URL/v2/namespaces/products/scans" \
  -H "Authorization: Bearer $LAYER_GATEWAY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "mode": "count",
    "source": "auto",
    "filters": ["category", "Eq", "Electronics"],
    "threads": 8,
    "timeout_seconds": 30
  }'

{
  "count": 4210,
  "served_by": "snapshot",
  "snapshot_sha": "3f9e8b21",
  "watermark_ms": 1747300000123,
  "elapsed_ms": 3
}

When watermark_ms is present, the response also carries an x-layer-stable-as-of header with the same epoch-ms value.

Count-mode sources are auto, snapshot, cache, and origin. Snapshot reads are eligible only for a single leaf Eq or In filter on a field present in the latest snapshot fields[]. And, Or, Not, range operators, fields absent from the snapshot, and skipped fields fall through under auto and fail with 412 precondition_failed under source: snapshot.
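The filter-shape half of the snapshot eligibility rule can be approximated client-side to predict whether source: snapshot will succeed. This is a sketch only: snapshot_leaf_eligible is a hypothetical name, snapshot_fields stands for the latest snapshot's fields[], and the check covers only a filter that is present (it does not model skipped fields or temporal selectors).

```python
def snapshot_leaf_eligible(filters, snapshot_fields) -> bool:
    """True when filters is one leaf Eq/In predicate on a snapshot field."""
    # A leaf predicate is a three-element [field, op, value] array;
    # combinators such as And, Or, and Not do not have that shape.
    if not isinstance(filters, (list, tuple)) or len(filters) != 3:
        return False
    field, op, _value = filters
    if not isinstance(field, str):
        return False
    return op in ("Eq", "In") and field in snapshot_fields
```

A range operator such as Gt, or a field missing from snapshot_fields, returns False, which corresponds to the fall-through under auto and the 412 under source: snapshot.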

All scan modes accept the same temporal selectors as query: as_of conjoins _hevlayer_upserted_at <= as_of; between: [lo, hi] conjoins lo < _hevlayer_upserted_at <= hi. The temporal predicate is ANDed with filters and with any ranked selector (fts, hybrid_text, or ann). Snapshot-served scans cannot evaluate temporal windows from a pre-aggregated body, so source: auto falls through to cache/origin when a temporal selector is present and source: snapshot fails with 412 precondition_failed.
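The two temporal windows differ at their lower edge: as_of has none, and between excludes lo while including hi. A minimal sketch of the predicate follows, assuming timestamps are plain comparable integers; the function name temporal_match is hypothetical, and whether a request may carry both selectors at once is not stated here, so the sketch simply applies whichever are given.

```python
def temporal_match(upserted_at: int, as_of=None, between=None) -> bool:
    """Evaluate the temporal predicate against one row's _hevlayer_upserted_at."""
    if as_of is not None and not (upserted_at <= as_of):
        return False
    if between is not None:
        lo, hi = between
        # Half-open on the left: lo itself is excluded, hi is included.
        if not (lo < upserted_at <= hi):
            return False
    return True
```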

Live count responses include:

{
  "count": 4210,
  "served_by": "origin",
  "bounded": false,
  "timed_out": false,
  "shards_saturated": 0,
  "shards_total": 1,
  "threads": 1,
  "elapsed_ms": 42
}

Values Mode

A values scan enumerates the distinct values of one attribute field over the rows the selector picks, each with its document count. Use it to discover a field’s value set — what product categories exist, what tags appear on rows matching a query — instead of confirming values you already know with counts.

field is required for mode: values and is rejected with 422 on other modes. It must name a scalar string or integer attribute, or an array of strings; each array element counts once per containing document. Vector fields are rejected with 422.

job = await client.create_scan("products", {
    "mode": "values",
    "field": "category",
    "source": "auto",
    "filters": ["in_stock", "Eq", True],
})

job, err := client.CreateScan(ctx, "products", &hevlayer.CreateScanRequest{
    Mode:    "values",
    Field:   "category",
    Source:  "auto",
    Filters: []interface{}{"in_stock", "Eq", true},
})

const job = await client.createScan("products", {
  mode: "values",
  field: "category",
  source: "auto",
  filters: ["in_stock", "Eq", true],
});

curl -X POST "$LAYER_GATEWAY_URL/v2/namespaces/products/scans" \
  -H "Authorization: Bearer $LAYER_GATEWAY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "mode": "values",
    "field": "category",
    "source": "auto",
    "filters": ["in_stock", "Eq", true]
  }'

Like ID mode, the create response is a 202 Accepted job, and the scan(...) SDK helpers poll it to completion:

{
  "id": "scan-uuid",
  "namespace": "products",
  "mode": "values",
  "field": "category",
  "source": "auto",
  "effective_source": "origin",
  "status": "running",
  "progress": 0,
  "documents_scanned": 0,
  "threads": 8,
  "created_at": "2026-05-26T10:00:00Z"
}

Read values from the same results route after status is completed, with the same limit/offset pagination as scan IDs:

{
  "values": [
    {"v": "electronics", "n": 4210},
    {"v": "books", "n": 1240}
  ],
  "total": 2,
  "truncated": false
}

v/n is the same vocabulary snapshot facet histograms use: v is the value, n its document count. Ordering is deterministic: n descending, then v ascending. Counts are exact for filter-selector scans; on a ranked scan with a saturated shard the job carries bounded: true and each n is a lower bound (the true count is >= n).
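The counting and ordering rules can be reproduced locally to check a listing against known rows. This sketch is not the gateway's code: values_listing is a hypothetical name, rows are plain dicts, rows without the field are skipped (an assumption), and "counts once per containing document" is read as de-duplicating array elements within a row.

```python
from collections import Counter


def values_listing(docs, field):
    """Build a v/n listing ordered by n descending, then v ascending."""
    counts = Counter()
    for doc in docs:
        value = doc.get(field)
        if value is None:
            continue
        # An array contributes each distinct element once for this document.
        elements = set(value) if isinstance(value, list) else {value}
        counts.update(elements)
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [{"v": v, "n": n} for v, n in ordered]
```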

Precomputed serving

An unfiltered values scan (no filters, no ranked selector) on a field present in the latest snapshot fields[] is answered straight from the snapshot’s facet histogram: the job completes during the create call — the 202 body already shows status: completed — and carries effective_source: snapshot with snapshot_sha and watermark_ms. Fields in fields_skipped[] or absent from the snapshot fall through to cache/origin under auto and fail with 412 precondition_failed under explicit source: snapshot, as do scans carrying any selector.

High cardinality

Snapshot facet histograms cap each field at 10,000 distinct values and skip fields beyond it; values scans are the enumeration path for exactly those fields. A values job accumulates its histogram in gateway memory and caps the listing at 1,000,000 distinct values. A scan that crosses the cap completes rather than failing:

The cap applies after the full pass, so every emitted n stays exact.
The listing truncates deterministically to the top 1,000,000 values by count (value-ascending tiebreak); the low-count tail is dropped.
The job and its results carry truncated: true, meaning the listing is incomplete.

truncated, bounded, and approximate are independent flags: truncated is a gateway memory bound on the listing, bounded is upstream top_k saturation on a ranked scan’s counts, and approximate is ANN recall fuzz on a radius ball’s membership.
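The cap behavior can be pictured as a sort followed by a slice, applied only after every count is final. The sketch below uses a hypothetical cap_listing helper over an already-counted v/n list; the real job accumulates in gateway memory, which this does not model.

```python
def cap_listing(listing, cap=1_000_000):
    """Keep the top `cap` values by count and flag whether any were dropped."""
    # Same ordering as the listing itself: n descending, then v ascending.
    ordered = sorted(listing, key=lambda entry: (-entry["n"], entry["v"]))
    return {"values": ordered[:cap], "truncated": len(ordered) > cap}
```

Because the slice happens after the full pass, each surviving n is unchanged; only the low-count tail disappears.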

Fan-out width

Origin scans fan out one upstream request per active shard. threads sets the maximum number of those upstream requests a single scan may have in flight at once. It means concurrent requests, not operating-system threads; the gateway is async.

Resolution order:

threads on the scan request.
spec.scan.threads on the namespace’s Index resource.
The gateway default, 8.

The effective value is clamped to the active shard count and the server cap, 32, then echoed as threads on origin responses and completed scan jobs. Snapshot and cache reads do not fan out, so they ignore this field and omit the echo.
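The resolution order and clamp can be written out as a short function. This is a sketch under the constants stated in this section (default 8, server cap 32); effective_threads and its parameter names are hypothetical, and the handling of zero or negative inputs is not specified here, so it is left out.

```python
GATEWAY_DEFAULT = 8
SERVER_CAP = 32


def effective_threads(request_threads, index_threads, active_shards):
    """Resolve threads: request, then Index spec, then the gateway default."""
    if request_threads is not None:
        requested = request_threads
    elif index_threads is not None:
        requested = index_threads
    else:
        requested = GATEWAY_DEFAULT
    # Clamp to the active shard count and to the server cap.
    return min(requested, active_shards, SERVER_CAP)


print(effective_threads(8, None, 1))      # 1: clamped to one active shard
print(effective_threads(None, 16, 64))    # 16: taken from the Index spec
print(effective_threads(100, None, 64))   # 32: clamped to the server cap
```

The first call matches the live count example earlier on this page, where a single-shard namespace echoes threads: 1.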

Full-text count

Count rows matching a BM25 query with the fts selector. Full-text counts are exact and always run origin scatter/gather, so source must be omitted, auto, or origin. A filters array, when present, is ANDed on as an extra constraint.

count = await client.create_scan("products", {
    "mode": "count",
    "fts": {"field": "title", "query": "wireless headphones"},
    "filters": ["category", "Eq", "Electronics"],
    "exhaustive": True,
})

count, err := client.CreateScan(ctx, "products", &hevlayer.CreateScanRequest{
    Mode:       "count",
    Fts:        &hevlayer.FtsScan{Field: "title", Query: "wireless headphones"},
    Filters:    []interface{}{"category", "Eq", "Electronics"},
    Exhaustive: true,
})

const count = await client.createScan("products", {
  mode: "count",
  fts: { field: "title", query: "wireless headphones" },
  filters: ["category", "Eq", "Electronics"],
  exhaustive: true,
});

curl -X POST "$LAYER_GATEWAY_URL/v2/namespaces/products/scans" \
  -H "Authorization: Bearer $LAYER_GATEWAY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "mode": "count",
    "fts": {"field": "title", "query": "wireless headphones"},
    "filters": ["category", "Eq", "Electronics"],
    "exhaustive": true
  }'

Hybrid text count

Count rows in the keyword/fuzzy neighborhood of a HybridText query with the hybrid_text selector. The scan tokenizes query with the HybridText policy, then evaluates the BM25 leg, one fuzzy leg per token, and one surfacing leg per token (the RFC 0057 empty-result fallback’s legs), and counts the de-duplicated union of returned row ids.

This count is a superset of the hybrid_text query route’s deduped rows: the scan always includes the surfacing legs, whereas the query route only adds them when its primary legs (BM25 + fuzzy) return nothing. On a partial-typo query whose primary legs do match, the scan can therefore count more rows than the route returns. Use this selector for a generous live count next to hybrid_text or auto results that routed to hybrid_text; plain fts counts exact BM25 only.
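The difference between the scan count and the query route comes down to when the surfacing legs join the union. The sketch below models each leg as a plain list of row IDs and ignores ranking and top_k; both function names are hypothetical.

```python
def scan_count(bm25, fuzzy_legs, surfacing_legs):
    """Scan: de-duplicated union of every leg, surfacing legs always included."""
    ids = set(bm25)
    for leg in [*fuzzy_legs, *surfacing_legs]:
        ids.update(leg)
    return len(ids)


def route_ids(bm25, fuzzy_legs, surfacing_legs):
    """Query route: surfacing legs are used only when the primary legs are empty."""
    primary = set(bm25).union(*fuzzy_legs)
    if primary:
        return primary
    return set().union(*surfacing_legs)
```

With bm25=["d1"], fuzzy legs [["d1", "d2"], []], and surfacing legs [["d3"], ["d2"]], the scan counts three rows while the route returns two, which is the partial-typo case described above.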

count = await client.create_scan("products", {
    "mode": "count",
    "hybrid_text": {"field": "title", "query": "wireles headphones"},
    "filters": ["category", "Eq", "Electronics"],
})

count, err := client.CreateScan(ctx, "products", &hevlayer.CreateScanRequest{
    Mode:       "count",
    HybridText: &hevlayer.HybridTextScan{Field: "title", Query: "wireles headphones"},
    Filters:    []interface{}{"category", "Eq", "Electronics"},
})

const count = await client.createScan("products", {
  mode: "count",
  hybrid_text: { field: "title", query: "wireles headphones" },
  filters: ["category", "Eq", "Electronics"],
});

curl -X POST "$LAYER_GATEWAY_URL/v2/namespaces/products/scans" \
  -H "Authorization: Bearer $LAYER_GATEWAY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "mode": "count",
    "hybrid_text": {"field": "title", "query": "wireles headphones"},
    "filters": ["category", "Eq", "Electronics"]
  }'

Radius count

Count rows within radius of a query vector with the ann selector, a distance-ball scan. radius is required and must be finite (without an upper bound, every row is in the ball); field defaults to vector. Like fts, radius counts always run origin scatter/gather.

The count is approximate: ANN recall means the index’s membership of the ball may differ from the true set, independent of saturation, so the response carries approximate: true.

The radius bound is applied by the gateway to the $dist returned by the ranked query. It is not sent upstream as a filter.

count = await client.create_scan("products", {
    "mode": "count",
    "ann": {"field": "vector", "vector": [0.12, -0.3, 0.88], "radius": 0.25},
})

count, err := client.CreateScan(ctx, "products", &hevlayer.CreateScanRequest{
    Mode: "count",
    Ann:  &hevlayer.AnnScan{Field: "vector", Vector: []float64{0.12, -0.3, 0.88}, Radius: 0.25},
})

const count = await client.createScan("products", {
  mode: "count",
  ann: { field: "vector", vector: [0.12, -0.3, 0.88], radius: 0.25 },
});

curl -X POST "$LAYER_GATEWAY_URL/v2/namespaces/products/scans" \
  -H "Authorization: Bearer $LAYER_GATEWAY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "mode": "count",
    "ann": {"field": "vector", "vector": [0.12, -0.3, 0.88], "radius": 0.25}
  }'

{
  "count": 980,
  "served_by": "origin",
  "approximate": true,
  "bounded": false,
  "timed_out": false,
  "shards_saturated": 0,
  "shards_total": 1,
  "threads": 1,
  "elapsed_ms": 51
}

Bounding ranked scans

Ranked selectors fan out one turbopuffer query per shard, each capped at top_k = 10_000. threads bounds fan-out width: how many shard requests can run at once. exhaustive and timeout_seconds bound depth: what happens when a shard hits that cap and how long recursion can run.

exhaustive: false (default) — one scatter/gather. A saturated shard contributes its cap as a lower bound; the response carries bounded: true with shards_saturated > 0.
exhaustive: true — for BM25, recurse on each saturated shard via score-band pagination ($score < last with an id tiebreak) until every page is short or timeout_seconds elapses. ANN radius scans do not push $dist filters upstream; the gateway counts returned rows whose $dist <= radius and marks the shard exhausted when the first over-radius row appears. If the full page is still inside the radius, the shard remains bounded.

The same threads value applies to the initial round and every exhaustive round over the remaining saturated shards.

bounded and approximate are independent. bounded means at least one shard saturated, so the count is a lower bound (the true count is >= count) on the rows the index returns; approximate means the distance ball’s membership is itself fuzzy. An ann count can be bounded: false yet still approximate: true.
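For a single non-exhaustive round of a radius count, the gather step can be sketched as follows. This is an illustration, not the gateway's implementation: radius_count is a hypothetical name, each shard's page is modeled as a list of $dist values, and a shard is treated as saturated when its page is full and every row is still inside the radius.

```python
TOP_K = 10_000


def radius_count(shard_pages, radius, top_k=TOP_K):
    """Count rows with $dist <= radius across shards and report saturation."""
    count = 0
    saturated = 0
    for dists in shard_pages:
        inside = sum(1 for d in dists if d <= radius)
        count += inside
        # A full page with no over-radius row may hide more rows in the ball.
        if len(dists) == top_k and inside == len(dists):
            saturated += 1
    return {
        "count": count,
        "approximate": True,  # ANN recall, independent of saturation
        "bounded": saturated > 0,
        "shards_saturated": saturated,
        "shards_total": len(shard_pages),
    }
```

A shard whose page contains even one over-radius row is exhausted, so it never contributes to shards_saturated.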

Sources

Source	ID mode	Count mode	Values mode
`auto`	Cache when fresh enough, otherwise origin.	Snapshot first, then cache/origin.	Snapshot when eligible, then cache/origin.
`snapshot`	Not supported.	Latest snapshot only; requires an eligible `Eq` or `In` filter.	Latest snapshot facet listing; requires an unfiltered scan on a field in `fields[]`.
`cache`	Aerospike document cache only.	Aerospike document cache only.	Aerospike document cache only.
`origin`	turbopuffer paginated scan.	turbopuffer paginated scan.	turbopuffer paginated scan with gateway-side dedupe.

This table covers the filter selector. The fts, hybrid_text, and ann selectors have no snapshot or cache evaluator, so they always run origin scatter/gather: an omitted source, auto, and origin all resolve to origin, while an explicit snapshot or cache returns 422.

Filters

Scans accept the same turbopuffer filter array as query. On origin scans, the filter is pushed to turbopuffer. On cache scans, the gateway evaluates it against cached document attributes.

Supported cache operators are Eq, NotEq, Gt, Gte, Lt, Lte, In, NotIn, And, Or, and Not. If auto sees a filter the cache cannot evaluate, it uses origin. Explicit source: cache with an unsupported filter fails rather than returning partial results.
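To make the cache-side evaluation concrete, here is a small evaluator over the listed operators. It is a sketch under assumptions: matches is a hypothetical name, combinators are assumed to take the turbopuffer shapes ["And", [f1, f2]], ["Or", [f1, f2]], and ["Not", f], a missing attribute is treated as None, and ordering comparisons against None are assumed to be false. An unknown operator raises instead of silently matching, in the spirit of failing rather than returning partial results.

```python
import operator

_LEAF_OPS = {
    "Eq": operator.eq,
    "NotEq": operator.ne,
    "Gt": operator.gt,
    "Gte": operator.ge,
    "Lt": operator.lt,
    "Lte": operator.le,
    "In": lambda actual, expected: actual in expected,
    "NotIn": lambda actual, expected: actual not in expected,
}
_ORDERING_OPS = ("Gt", "Gte", "Lt", "Lte")


def matches(doc: dict, flt) -> bool:
    """Evaluate a filter array against one cached document's attributes."""
    head = flt[0]
    if len(flt) == 2 and head == "And":
        return all(matches(doc, sub) for sub in flt[1])
    if len(flt) == 2 and head == "Or":
        return any(matches(doc, sub) for sub in flt[1])
    if len(flt) == 2 and head == "Not":
        return not matches(doc, flt[1])
    field, op, expected = flt
    if op not in _LEAF_OPS:
        raise ValueError(f"unsupported cache operator: {op}")
    actual = doc.get(field)
    if actual is None and op in _ORDERING_OPS:
        return False  # assumption: a missing value never satisfies a range
    return _LEAF_OPS[op](actual, expected)
```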

Auto-Mode Policy

Auto ties cache freshness to the same consistency watermark used by stable reads. The gateway tracks per-namespace cache_warmed_through, the watermark observed at the end of the last successful origin warm.

Cache state	Watermark state	Action
Empty	any	Run origin and stamp `cache_warmed_through`.
Populated, `cache_warmed_through >= watermark`	observed	Serve cache.
Populated, `cache_warmed_through < watermark`	observed	Serve cache and start a background origin warm.
Populated, no `cache_warmed_through` yet	observed	Serve cache and start a background origin warm.
Populated	not yet observed	Serve cache.

When cache is used, _hevlayer_upserted_at <= cache_warmed_through is added before the user filter so the scan is a stable warmed view.
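The policy table reduces to a short decision function. In this sketch auto_action is a hypothetical name, a watermark of None stands for "not yet observed", and the return value pairs the serving source with whether a background origin warm should start.

```python
def auto_action(cache_populated, cache_warmed_through, watermark):
    """Return (source, start_background_warm) for an auto-mode scan."""
    if not cache_populated:
        # Empty cache: run origin (the caller then stamps cache_warmed_through).
        return ("origin", False)
    if watermark is None:
        # No watermark observed yet: nothing to compare against, serve cache.
        return ("cache", False)
    if cache_warmed_through is None or cache_warmed_through < watermark:
        # Cache is behind, or was never stamped: serve it and refresh.
        return ("cache", True)
    return ("cache", False)
```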

Operational notes

ID and values scan state is in-memory and ephemeral; it resets on gateway restart.
Count scans have a deadline, default 30s and maximum 300s.
Values jobs cap at 1,000,000 distinct values per scan and set truncated: true when crossed; the listing keeps the top values by count, each with an exact count.
Origin scan fan-out defaults to 8 concurrent upstream requests per scan unless the request or Index.spec.scan.threads sets a different value.
Snapshot-served count scans are exact at the snapshot watermark_ms.