Scan
A scan is on-demand row selection over a namespace. It picks rows by one
of three selectors and returns their IDs (mode: ids, an
asynchronous job), their count (mode: count, synchronous), or the
distinct values of one attribute field (mode: values, an asynchronous
job):
| Input | Field | Meaning | Notes |
|---|---|---|---|
| Filter selector | filters | An attribute predicate, or all rows when omitted. | Exact |
| Full-text selector | fts | A BM25 predicate against a text field. | Exact |
| Radius selector | ann | Rows within radius of a query vector. | Approximate (ANN recall) |
| Fan-out control | threads | Maximum concurrent upstream requests for origin scatter/gather. | Origin only; defaults from Index.spec.scan.threads, then 8. |
A request carries at most one ranked selector (fts or ann).
filters is always optional and, when present alongside a ranked
selector, is ANDed onto the match set as an extra constraint. A request
with both fts and ann is a 422.
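The selector rules above can be sketched as a small client-side pre-flight check. This is an illustration only: `ranked_selector` is a hypothetical helper, not part of any SDK, and the gateway remains the authority on what it rejects with 422.

```python
from typing import Optional

RANKED_SELECTORS = ("fts", "ann")


def ranked_selector(request: dict) -> Optional[str]:
    """Return the ranked selector a request carries, or None for filter-only.

    Hypothetical helper: raises ValueError for the one combination the text
    says the gateway rejects with 422 (both fts and ann present).
    """
    present = [name for name in RANKED_SELECTORS if request.get(name) is not None]
    if len(present) > 1:
        raise ValueError("fts and ann are mutually exclusive (gateway returns 422)")
    return present[0] if present else None
```

A request with only `filters` yields `None`; adding `filters` next to `fts` or `ann` does not change the result, because the filter is an extra constraint and not a selector of its own.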
At cutover, mode: ids is filter-only (ranked IDs are a defined
fast-follow), while mode: count and mode: values support all three
selectors. Use scans for bulk exports, manual inspection, UDF discovery
debugging, cache/origin consistency checks, exact or approximate
counts, and field value discovery.
Routes
| Route | Method | Behavior |
|---|---|---|
| POST /v2/namespaces/{ns}/scans | POST | Create an ID or values scan job, or return a count. |
| GET /v2/namespaces/{ns}/scans | GET | List scan jobs for the namespace. |
| GET /v2/namespaces/{ns}/scans/{id} | GET | Read one scan job. |
| GET /v2/namespaces/{ns}/scans/{id}/results | GET | Read completed scan IDs or values. |
| DELETE /v2/namespaces/{ns}/scans/{id} | DELETE | Drop the in-memory scan job. |
ID Mode
job = await client.create_scan("products", {
"source": "auto",
"mode": "ids",
"filters": ["category", "Eq", "Electronics"],
"threads": 8,
"page_size": 1000,
})

job, err := client.CreateScan(ctx, "products", &hevlayer.CreateScanRequest{
Source: "auto",
Mode: "ids",
Filters: []interface{}{"category", "Eq", "Electronics"},
Threads: 8,
PageSize: 1000,
})

const job = await client.createScan("products", {
source: "auto",
mode: "ids",
filters: ["category", "Eq", "Electronics"],
threads: 8,
page_size: 1000,
});

curl -X POST "$LAYER_GATEWAY_URL/v2/namespaces/products/scans" \
-H "Authorization: Bearer $LAYER_GATEWAY_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"source": "auto",
"mode": "ids",
"filters": ["category", "Eq", "Electronics"],
"threads": 8,
"page_size": 1000
}'

mode defaults to ids. Valid ID-mode sources are auto, cache, and
origin. The Python and TypeScript clients also ship scan(...)
helpers that create the job and poll until it completes; in Go, poll
GetScan until status is completed.
The create response is 202 Accepted:
{
"id": "scan-uuid",
"namespace": "products",
"source": "auto",
"effective_source": "origin",
"status": "running",
"progress": 0,
"documents_scanned": 0,
"threads": 8,
"created_at": "2026-05-26T10:00:00Z"
}
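For clients without a `scan(...)` helper, the polling step can be sketched as below. `wait_for_scan` and its `get_scan` callable are hypothetical: `get_scan` stands in for whatever reads `GET /v2/namespaces/{ns}/scans/{id}` and returns the job body as a dict. Only the `running` and `completed` statuses shown in this document are assumed, so the loop keeps polling on anything that is not `completed`.

```python
import time
from typing import Callable


def wait_for_scan(
    get_scan: Callable[[], dict],
    interval_s: float = 0.5,
    max_polls: int = 120,
) -> dict:
    """Poll a scan job until its status is "completed".

    get_scan is a caller-supplied function (an assumption of this sketch)
    that returns the current job body. Raises TimeoutError when the job has
    not completed after max_polls reads.
    """
    for _ in range(max_polls):
        job = get_scan()
        if job["status"] == "completed":
            return job
        time.sleep(interval_s)
    raise TimeoutError("scan job did not complete within the polling budget")
```

Because scan state is in-memory on the gateway, a caller may also want to handle the job disappearing mid-poll; that case is left out here.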
Read IDs after status is completed:
results = await client.get_scan_results("products", job.id, limit=1000, offset=0)

results, err := client.GetScanResults(ctx, "products", scanID,
&hevlayer.GetScanResultsParams{Limit: 1000, Offset: 0})

const results = await client.getScanResults("products", job.id, {
limit: 1000,
offset: 0,
});

curl "$LAYER_GATEWAY_URL/v2/namespaces/products/scans/scan-uuid/results?limit=1000&offset=0" \
-H "Authorization: Bearer $LAYER_GATEWAY_API_KEY"

{
"ids": ["doc-1", "doc-2"],
"total": 2
}
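Reading a large result set means walking `limit`/`offset` pages. A minimal sketch follows, assuming a caller-supplied `fetch_page(limit, offset)` function (hypothetical) that returns the results body shown above. It stops on the first short page, so it does not depend on whether `total` describes the page or the whole job.

```python
from typing import Callable, Iterator


def iter_scan_ids(
    fetch_page: Callable[[int, int], dict],
    limit: int = 1000,
) -> Iterator[str]:
    """Yield every ID of a completed scan by paging with limit/offset.

    fetch_page is an assumption of this sketch: it wraps the results route
    and returns a dict with an "ids" list.
    """
    offset = 0
    while True:
        ids = fetch_page(limit, offset)["ids"]
        yield from ids
        if len(ids) < limit:  # a short (or empty) page is the last one
            return
        offset += len(ids)
```

When the result count is an exact multiple of `limit`, this costs one extra request that returns an empty page.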
Count Mode
count = await client.create_scan("products", {
"mode": "count",
"source": "auto",
"filters": ["category", "Eq", "Electronics"],
"threads": 8,
"timeout_seconds": 30,
})

count, err := client.CreateScan(ctx, "products", &hevlayer.CreateScanRequest{
Mode: "count",
Source: "auto",
Filters: []interface{}{"category", "Eq", "Electronics"},
Threads: 8,
TimeoutSeconds: 30,
})

const count = await client.createScan("products", {
mode: "count",
source: "auto",
filters: ["category", "Eq", "Electronics"],
threads: 8,
timeout_seconds: 30,
});

curl -X POST "$LAYER_GATEWAY_URL/v2/namespaces/products/scans" \
-H "Authorization: Bearer $LAYER_GATEWAY_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"mode": "count",
"source": "auto",
"filters": ["category", "Eq", "Electronics"],
"threads": 8,
"timeout_seconds": 30
}'

{
"count": 4210,
"served_by": "snapshot",
"snapshot_sha": "3f9e8b21",
"watermark_ms": 1747300000123,
"elapsed_ms": 3
}
When watermark_ms is present, the response also carries an
x-layer-stable-as-of header with the same epoch-ms value.
Count-mode sources are auto, snapshot, cache, and origin.
Snapshot reads are eligible only for a single leaf Eq or In filter
on a field present in the latest snapshot fields[]. Any other filter
(And, Or, Not, a range operator, a field absent from the snapshot, or
a skipped field) falls through to cache/origin under auto and fails
with 412 precondition_failed under source: snapshot.
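The eligibility rule can be expressed as a predicate over the filter array. This is a sketch of the rule as stated, not the gateway's implementation: `snapshot_eligible` is a hypothetical name, and it treats a missing filter as ineligible because the text only describes the single-leaf case.

```python
from typing import Collection


def snapshot_eligible(
    filters,
    snapshot_fields: Collection[str],
    skipped_fields: Collection[str] = (),
) -> bool:
    """True when a count filter is a single leaf Eq or In on a snapshot field."""
    # A leaf filter is a three-element array: [field, operator, value].
    if not isinstance(filters, (list, tuple)) or len(filters) != 3:
        return False
    field, op, _value = filters
    if not isinstance(field, str) or op not in ("Eq", "In"):
        return False
    return field in snapshot_fields and field not in skipped_fields
```

Under `auto`, a `False` result corresponds to falling through to cache/origin; under `source: snapshot`, it corresponds to the 412 response.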
Live count responses include:
{
"count": 4210,
"served_by": "origin",
"bounded": false,
"timed_out": false,
"shards_saturated": 0,
"shards_total": 1,
"threads": 1,
"elapsed_ms": 42
}
Values Mode
A values scan enumerates the distinct values of one attribute field
over the rows the selector picks, each with its document count. Use it
to discover a field’s value set — what product categories exist, what
tags appear on rows matching a query — instead of confirming values you
already know with counts.
field is required for mode: values (and rejected on other modes
with 422). It must name a scalar string or integer attribute, or an
array of strings — each array element counts once per containing
document. Vector fields are a 422.
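The counting rule for array fields is easiest to see on a few rows. The sketch below builds a value histogram locally; `value_histogram` is a hypothetical helper, and it reads "counts once per containing document" as: a value repeated inside one document's array still adds 1 to that value's count.

```python
from collections import Counter
from typing import Iterable


def value_histogram(docs: Iterable[dict], field: str) -> Counter:
    """Count, per distinct value, how many documents contain it."""
    counts: Counter = Counter()
    for doc in docs:
        value = doc.get(field)
        if value is None:
            continue  # documents without the field contribute nothing
        if isinstance(value, list):
            counts.update(set(value))  # dedupe within the document
        else:
            counts[value] += 1
    return counts
```

For example, two documents tagged `["a", "b", "a"]` and `["a"]` give `a` a count of 2 and `b` a count of 1.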
job = await client.create_scan("products", {
"mode": "values",
"field": "category",
"source": "auto",
"filters": ["in_stock", "Eq", True],
})

job, err := client.CreateScan(ctx, "products", &hevlayer.CreateScanRequest{
Mode: "values",
Field: "category",
Source: "auto",
Filters: []interface{}{"in_stock", "Eq", true},
})

const job = await client.createScan("products", {
mode: "values",
field: "category",
source: "auto",
filters: ["in_stock", "Eq", true],
});

curl -X POST "$LAYER_GATEWAY_URL/v2/namespaces/products/scans" \
-H "Authorization: Bearer $LAYER_GATEWAY_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"mode": "values",
"field": "category",
"source": "auto",
"filters": ["in_stock", "Eq", true]
}'

Like ID mode, the create response is a 202 Accepted job, and the
scan(...) SDK helpers poll it to completion:
{
"id": "scan-uuid",
"namespace": "products",
"mode": "values",
"field": "category",
"source": "auto",
"effective_source": "origin",
"status": "running",
"progress": 0,
"documents_scanned": 0,
"threads": 8,
"created_at": "2026-05-26T10:00:00Z"
}
Read values from the same results route after status is completed,
with the same limit/offset pagination as scan IDs:
{
"values": [
{"v": "electronics", "n": 4210},
{"v": "books", "n": 1240}
],
"total": 2,
"truncated": false
}
v/n is the same vocabulary snapshot facet
histograms use: v is the value, n its document count. Ordering is
deterministic — n descending, then v ascending. Counts are exact
for filter-selector scans; on a ranked scan with a saturated shard the
job carries bounded: true and each n is a >= lower bound.
Precomputed serving
An unfiltered values scan (no filters, no ranked selector) on a field
present in the latest snapshot fields[] is answered straight from the
snapshot’s facet histogram: the job completes during the create call —
the 202 body already shows status: completed — and carries
effective_source: snapshot with snapshot_sha and watermark_ms. Fields in
fields_skipped[] or absent from the snapshot fall through to
cache/origin under auto and fail with 412 precondition_failed under
explicit source: snapshot, as do scans carrying any selector.
High cardinality
Snapshot facet histograms cap each field at 10,000 distinct values and skip fields beyond it; values scans are the enumeration path for exactly those fields. A values job accumulates its histogram in gateway memory and caps the listing at 1,000,000 distinct values. A scan that crosses the cap completes rather than failing:
- The cap applies after the full pass, so every emitted n stays exact.
- The listing truncates deterministically to the top 1,000,000 values by count (value-ascending tiebreak); the low-count tail is dropped.
- The job and its results carry truncated: true, meaning the listing is incomplete.
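The truncation behavior above can be sketched on a finished histogram. `cap_listing` is a hypothetical function that applies the documented ordering (n descending, then v ascending) and then the cap; the real gateway's data structures are not described here.

```python
from typing import Dict, Union

Value = Union[str, int]


def cap_listing(counts: Dict[Value, int], cap: int = 1_000_000) -> dict:
    """Order a complete histogram and keep at most `cap` entries.

    Counts are computed before the cap is applied, so every kept n is exact;
    only the low-count tail is dropped.
    """
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return {
        "values": [{"v": v, "n": n} for v, n in ordered[:cap]],
        "truncated": len(ordered) > cap,
    }
```

With a cap of 3, the histogram `{"a": 3, "b": 5, "c": 3, "d": 1}` keeps `b`, `a`, `c` in that order and reports `truncated: true`.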
truncated, bounded, and approximate are independent flags:
truncated is a gateway memory bound on the listing, bounded is
upstream top_k saturation on a ranked scan’s counts, and approximate
is ANN recall fuzz on a radius ball’s membership.
Fan-out width
Origin scans fan out one upstream request per active shard. threads
sets the maximum number of those upstream requests a single scan may have
in flight at once. It means concurrent requests, not operating-system
threads; the gateway is async.
Resolution order:
- threads on the scan request.
- spec.scan.threads on the namespace’s Index resource.
- The gateway default, 8.
The effective value is clamped to the active shard count and the server
cap, 32, then echoed as threads on origin responses and completed
scan jobs. Snapshot and cache reads do not fan out, so they ignore this
field and omit the echo.
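The resolution order and clamp can be written as one function. This is a sketch under the rules stated above; `effective_threads` and its parameter names are illustrative, not gateway code.

```python
from typing import Optional

GATEWAY_DEFAULT_THREADS = 8
SERVER_CAP = 32


def effective_threads(
    request_threads: Optional[int],
    index_threads: Optional[int],
    active_shards: int,
) -> int:
    """Resolve fan-out width: request, then Index spec, then gateway default."""
    if request_threads is not None:
        requested = request_threads
    elif index_threads is not None:
        requested = index_threads
    else:
        requested = GATEWAY_DEFAULT_THREADS
    # Clamp to the active shard count and the server cap.
    return min(requested, active_shards, SERVER_CAP)
```

This matches the sample responses in this document: a request with `threads: 8` against a namespace with one shard echoes `threads: 1`.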
Full-text count
Count rows matching a BM25 query with the fts selector. Full-text
counts are exact and always run origin scatter/gather, so source must be
omitted, auto, or origin. A filters array, when present, is ANDed on
as an extra constraint.
count = await client.create_scan("products", {
"mode": "count",
"fts": {"field": "title", "query": "wireless headphones"},
"filters": ["category", "Eq", "Electronics"],
"exhaustive": True,
})

count, err := client.CreateScan(ctx, "products", &hevlayer.CreateScanRequest{
Mode: "count",
Fts: &hevlayer.FtsScan{Field: "title", Query: "wireless headphones"},
Filters: []interface{}{"category", "Eq", "Electronics"},
Exhaustive: true,
})

const count = await client.createScan("products", {
mode: "count",
fts: { field: "title", query: "wireless headphones" },
filters: ["category", "Eq", "Electronics"],
exhaustive: true,
});

curl -X POST "$LAYER_GATEWAY_URL/v2/namespaces/products/scans" \
-H "Authorization: Bearer $LAYER_GATEWAY_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"mode": "count",
"fts": {"field": "title", "query": "wireless headphones"},
"filters": ["category", "Eq", "Electronics"],
"exhaustive": true
}'

Radius count
Count rows within radius of a query vector with the ann selector — a
distance-ball scan. radius is required and finite (without an upper
bound every row is in the ball); field defaults to vector. Like
fts, radius counts always run origin scatter/gather.
The count is approximate: ANN recall means the index’s membership of
the ball may differ from the true set, independent of saturation, so the
response carries approximate: true.
count = await client.create_scan("products", {
"mode": "count",
"ann": {"field": "vector", "vector": [0.12, -0.3, 0.88], "radius": 0.25},
})

count, err := client.CreateScan(ctx, "products", &hevlayer.CreateScanRequest{
Mode: "count",
Ann: &hevlayer.AnnScan{Field: "vector", Vector: []float64{0.12, -0.3, 0.88}, Radius: 0.25},
})

const count = await client.createScan("products", {
mode: "count",
ann: { field: "vector", vector: [0.12, -0.3, 0.88], radius: 0.25 },
});

curl -X POST "$LAYER_GATEWAY_URL/v2/namespaces/products/scans" \
-H "Authorization: Bearer $LAYER_GATEWAY_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"mode": "count",
"ann": {"field": "vector", "vector": [0.12, -0.3, 0.88], "radius": 0.25}
}'

{
"count": 980,
"served_by": "origin",
"approximate": true,
"bounded": false,
"timed_out": false,
"shards_saturated": 0,
"shards_total": 1,
"threads": 1,
"elapsed_ms": 51
}
Bounding ranked scans
Ranked selectors fan out one Turbopuffer query per shard, each capped at
top_k = 10_000. threads bounds fan-out width: how many shard
requests can run at once. exhaustive and timeout_seconds bound depth:
what happens when a shard hits that cap and how long recursion can run.
- exhaustive: false (default) runs one scatter/gather. A saturated shard contributes its cap as a lower bound; the response carries bounded: true with shards_saturated > 0.
- exhaustive: true recurses on each saturated shard via score-band pagination (BM25: $score < last with an id tiebreak; ANN: $dist > last) until every page is short or timeout_seconds elapses.
The same threads value applies to the initial round and every
exhaustive round over the remaining saturated shards.
bounded and approximate are independent. bounded means a shard
saturated and the count is a >= lower bound for the rows the index
returned; approximate means the distance ball’s membership is itself
fuzzy. An ann count can be bounded: false yet still approximate: true.
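A small simulation makes the difference between bounded and exhaustive counts concrete. It is a toy model, not gateway code: `ranked_count` takes the true match count per shard as input, treats a full page of `top_k` rows as saturation, ignores `timeout_seconds`, and assumes an exhaustive run that finishes reports no saturated shards.

```python
from typing import List

TOP_K = 10_000


def ranked_count(shard_matches: List[int], exhaustive: bool = False, top_k: int = TOP_K) -> dict:
    """Simulate a ranked count over shards with a per-query top_k cap."""
    count = 0
    saturated = 0
    for matches in shard_matches:
        page = min(matches, top_k)  # first scatter/gather round
        count += page
        if page < top_k:
            continue  # short page: this shard's count is complete
        if not exhaustive:
            saturated += 1  # the cap is only a lower bound for this shard
            continue
        remaining = matches - page
        while page == top_k:  # keep paging until a page comes back short
            page = min(remaining, top_k)
            count += page
            remaining -= page
    return {
        "count": count,
        "bounded": saturated > 0,
        "shards_saturated": saturated,
        "shards_total": len(shard_matches),
    }
```

With shards holding 500 and 25,000 matches, the default run reports 10,500 with `bounded: true`, while the exhaustive run reports 25,500 with `bounded: false`.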
Sources
| Source | ID mode | Count mode | Values mode |
|---|---|---|---|
| auto | Cache when fresh enough, otherwise origin. | Snapshot first, then cache/origin. | Snapshot when eligible, then cache/origin. |
| snapshot | Not supported. | Latest snapshot only; requires eligible Eq or In. | Latest snapshot facet listing; requires an unfiltered scan on a field in fields[]. |
| cache | Aerospike document cache only. | Aerospike document cache only. | Aerospike document cache only. |
| origin | Turbopuffer paginated scan. | Turbopuffer paginated scan. | Turbopuffer paginated scan with gateway-side dedupe. |
This table covers the filter selector. The fts and ann selectors have
no snapshot or cache evaluator, so they always run origin scatter/gather:
omitted, auto, and origin all resolve to origin, and snapshot or
cache returns 422.
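The source rules for each mode and selector can be collected into one lookup. The sketch below is a hypothetical client-side check (`check_source` is not an SDK function). It assumes an omitted source behaves like `auto` for filter scans, which this document states only for ranked selectors.

```python
from typing import Optional

# Sources accepted by filter-selector scans, per mode (from the table above).
FILTER_SOURCES = {
    "ids": {"auto", "cache", "origin"},
    "count": {"auto", "snapshot", "cache", "origin"},
    "values": {"auto", "snapshot", "cache", "origin"},
}
# fts and ann have no snapshot or cache evaluator.
RANKED_SOURCES = {"auto", "origin"}


def check_source(mode: str, source: Optional[str] = None, ranked: bool = False) -> str:
    """Return the source a request resolves to, or raise for a rejected one."""
    source = source or "auto"
    allowed = RANKED_SOURCES if ranked else FILTER_SOURCES[mode]
    if source not in allowed:
        raise ValueError(f"source {source!r} is not valid for this {mode} scan")
    return "origin" if ranked else source
```

For ranked scans the function always resolves to `origin`; for filter scans it returns the requested source and leaves the `auto` decision to the gateway.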
Filters
Scans accept the same Turbopuffer filter array as query. On origin scans, the filter is pushed to Turbopuffer. On cache scans, the gateway evaluates it against cached document attributes.
Supported cache operators are Eq, NotEq, Gt, Gte, Lt, Lte,
In, NotIn, And, Or, and Not. If auto sees a filter the cache
cannot evaluate, it uses origin. Explicit source: cache with an
unsupported filter fails rather than returning partial results.
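To illustrate what evaluating a filter against cached attributes involves, here is a minimal evaluator for the operators listed above. It is a sketch, not the gateway's evaluator: it assumes the nested shapes `["And", [f1, f2]]`, `["Or", [f1, f2]]`, and `["Not", f]` for the logical operators, and it assumes a missing attribute never satisfies a range comparison.

```python
import operator

LEAF_OPS = {
    "Eq": operator.eq,
    "NotEq": operator.ne,
    "Gt": operator.gt,
    "Gte": operator.ge,
    "Lt": operator.lt,
    "Lte": operator.le,
    "In": lambda actual, allowed: actual in allowed,
    "NotIn": lambda actual, allowed: actual not in allowed,
}
RANGE_OPS = {"Gt", "Gte", "Lt", "Lte"}


def matches(doc: dict, flt: list) -> bool:
    """Evaluate a filter array against one document's attributes."""
    if len(flt) == 2 and flt[0] == "And":
        return all(matches(doc, sub) for sub in flt[1])
    if len(flt) == 2 and flt[0] == "Or":
        return any(matches(doc, sub) for sub in flt[1])
    if len(flt) == 2 and flt[0] == "Not":
        return not matches(doc, flt[1])
    field, op, value = flt
    if op not in LEAF_OPS:
        # Mirrors the documented behavior: fail, never return partial results.
        raise ValueError(f"operator {op!r} is not supported on cache scans")
    actual = doc.get(field)
    if actual is None and op in RANGE_OPS:
        return False  # assumption: missing attributes fail range checks
    return LEAF_OPS[op](actual, value)
```

Raising on an unknown operator corresponds to the explicit `source: cache` failure; an `auto` caller would route the scan to origin in that case.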
Auto-Mode Policy
Auto ties cache freshness to the same consistency watermark used by
stable reads. The gateway tracks per-namespace
cache_warmed_through, the watermark observed at the end of the last
successful origin warm.
| Cache state | Watermark state | Action |
|---|---|---|
| Empty | any | Run origin and stamp cache_warmed_through. |
| Populated, cache_warmed_through >= watermark | observed | Serve cache. |
| Populated, cache_warmed_through < watermark | observed | Serve cache and start a background origin warm. |
| Populated, no cache_warmed_through yet | observed | Serve cache and start a background origin warm. |
| Populated | not yet observed | Serve cache. |
When cache is used, _hevlayer_upserted_at <= cache_warmed_through is added
before the user filter so the scan is a stable warmed view.
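The policy table reads naturally as a decision function. The sketch below is illustrative: `auto_action` and `warmed_filter` are hypothetical names, `None` stands for "not yet observed" or "not yet stamped", and the `["And", [...]]` shape used to combine filters is an assumption about the filter array syntax.

```python
from typing import Optional


def auto_action(
    cache_populated: bool,
    cache_warmed_through: Optional[int],
    watermark: Optional[int],
) -> dict:
    """Map cache and watermark state to the action in the policy table."""
    if not cache_populated:
        return {"serve": "origin", "background_warm": False}
    if watermark is None:  # watermark not yet observed
        return {"serve": "cache", "background_warm": False}
    if cache_warmed_through is not None and cache_warmed_through >= watermark:
        return {"serve": "cache", "background_warm": False}
    # Cache is behind the watermark, or was never stamped.
    return {"serve": "cache", "background_warm": True}


def warmed_filter(user_filter: Optional[list], cache_warmed_through: int) -> list:
    """Put the warmed-view bound ahead of the user filter."""
    bound = ["_hevlayer_upserted_at", "Lte", cache_warmed_through]
    return bound if user_filter is None else ["And", [bound, user_filter]]
```

The empty-cache branch also stamps `cache_warmed_through` once the origin run succeeds; that side effect is outside this pure function.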
Operational notes
- ID and values scan state is in-memory and ephemeral; it resets on gateway restart.
- Count scans have a deadline, default 30s and maximum 300s.
- Values jobs cap at 1,000,000 distinct values per scan and set truncated: true when crossed; the listing keeps the top values by count, each with an exact count.
- Origin scan fan-out defaults to 8 concurrent upstream requests per scan unless the request or Index.spec.scan.threads sets a different value.
- Snapshot-served count scans are exact at the snapshot watermark_ms.