API
Warm cache
Layer exposes two warm endpoints. hint_cache_warm is the
Turbopuffer-compatible hint; warm is the Layer-only shortcut that
creates a gateway warm job.
Hint-cache warm
With no query parameters, the call is a raw passthrough: the gateway forwards it to Turbopuffer unchanged and returns the upstream response verbatim. Existing Turbopuffer clients keep their exact wire behavior.
curl "$LAYER_GATEWAY_URL/v1/namespaces/products/hint_cache_warm" \
-H "Authorization: Bearer $LAYER_GATEWAY_API_KEY"
Supplying any warm option (turbopuffer, documents, snapshots,
page_size) switches the call into Layer orchestration. Steps then
default on; each is independently toggleable:
| Step | What it does |
|---|---|
| turbopuffer=true | Forwards the warm hint upstream. |
| documents=true | Starts an origin warm job to backfill the NVMe cache. |
| snapshots=true | Mirrors the latest S3 snapshot body into NVMe. |
result = await client.hint_cache_warm(
    "products",
    turbopuffer=False,
    documents=False,
    snapshots=True,
)

const result = await client.hintCacheWarm("products", {
  turbopuffer: false,
  documents: false,
  snapshots: true,
});

curl "$LAYER_GATEWAY_URL/v1/namespaces/products/hint_cache_warm?turbopuffer=false&documents=false&snapshots=true" \
  -H "Authorization: Bearer $LAYER_GATEWAY_API_KEY"

The generated Go client omits false query parameters, so it cannot
turn steps off. Disable steps over REST (or the Python client) instead.
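When a client drops false values, it can help to build the REST request by hand. The following is a minimal sketch using only Python's standard library. The helper name hint_cache_warm_url and the example base URL are illustrative and not part of any Layer client, and the sketch assumes booleans are serialized as lowercase true/false, as in the curl example above.

```python
from urllib.parse import quote, urlencode

# Step toggles named in the table above; page_size is handled separately.
STEP_OPTIONS = ("turbopuffer", "documents", "snapshots")


def hint_cache_warm_url(base_url, namespace, page_size=None, **steps):
    """Build a hint_cache_warm URL, keeping explicit false values in the query.

    Hypothetical helper: the name and signature are assumptions for illustration.
    """
    unknown = sorted(set(steps) - set(STEP_OPTIONS))
    if unknown:
        raise ValueError(f"unknown warm option(s): {unknown}")

    # Emit only the options the caller actually passed, false ones included.
    pairs = [
        (name, "true" if steps[name] else "false")
        for name in STEP_OPTIONS
        if name in steps
    ]
    if page_size is not None:
        pairs.append(("page_size", str(page_size)))

    url = f"{base_url.rstrip('/')}/v1/namespaces/{quote(namespace, safe='')}/hint_cache_warm"
    # With no options the URL stays bare, which is the raw passthrough call.
    return f"{url}?{urlencode(pairs)}" if pairs else url


if __name__ == "__main__":
    print(hint_cache_warm_url(
        "https://gateway.example", "products",
        turbopuffer=False, documents=False, snapshots=True,
    ))
```

Passing no options leaves the query string empty, so the same helper also produces the passthrough form of the call.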
The orchestrated response reports per-step status:
{
"namespace": "products",
"turbopuffer": { "enabled": true, "status": "completed" },
"documents": {
"enabled": true,
"status": "started",
"job": { "id": "warm-job-uuid", "status": "running" }
},
"snapshots": {
"enabled": true,
"status": "completed",
"key": "snapshots/products/...",
"watermark_ms": 1715600400000,
"sha": "..."
}
}
If documents is enabled, the response includes a warm job; poll it
through /v2/namespaces/{ns}/warm-jobs/{id}.
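The sketch below shows one way a caller might pull the job id out of the orchestrated response and poll until the job stops running. It is illustrative only: fetch_job is a caller-supplied function (for instance a thin wrapper around the client's get_warm_job), the helper names are hypothetical, and treating any status other than "running" as final is an assumption, since only the "running" job status appears in this page.

```python
import time


def warm_job_id(response):
    """Return the warm job id from an orchestrated response, or None if absent."""
    documents = response.get("documents") or {}
    job = documents.get("job") or {}
    return job.get("id")


def wait_for_warm_job(fetch_job, job_id, interval_s=2.0, timeout_s=300.0,
                      sleep=time.sleep, clock=time.monotonic):
    """Poll fetch_job(job_id) until the job is no longer "running".

    Assumption: any status other than "running" is treated as final.
    sleep and clock are injectable so the loop can be tested without waiting.
    """
    deadline = clock() + timeout_s
    while True:
        job = fetch_job(job_id)
        if job.get("status") != "running":
            return job
        if clock() >= deadline:
            raise TimeoutError(f"warm job {job_id} still running after {timeout_s}s")
        sleep(interval_s)


if __name__ == "__main__":
    response = {
        "namespace": "products",
        "documents": {
            "enabled": True,
            "status": "started",
            "job": {"id": "warm-job-uuid", "status": "running"},
        },
    }
    print(warm_job_id(response))
```

Keeping the fetch function injectable means the same loop works with any of the clients or with plain REST calls.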
Layer warm
POST /v2/namespaces/{ns}/warm creates an asynchronous job that pages
through Turbopuffer, backfills Aerospike, and refreshes
cache_warmed_through. Use it when bootstrapping a namespace whose data
was written outside the gateway.
job = await client.warm_cache("products", page_size=1000)

job, err := client.WarmCache(ctx, "products", &hevlayer.WarmCacheParams{
	PageSize: 1000,
})

const job = await client.warmCache("products", { pageSize: 1000 });

curl -X POST "$LAYER_GATEWAY_URL/v2/namespaces/products/warm?page_size=1000" \
  -H "Authorization: Bearer $LAYER_GATEWAY_API_KEY"

The response is 202 Accepted with the warm job:
{
"id": "warm-job-uuid",
"namespace": "products",
"status": "running",
"progress": 0,
"documents_scanned": 0,
"created_at": "2026-05-26T10:00:00Z"
}
Poll it through:
job = await client.get_warm_job("products", job.id)

job, err := client.GetWarmJob(ctx, "products", jobID)

const job = await client.getWarmJob("products", jobId);

curl "$LAYER_GATEWAY_URL/v2/namespaces/products/warm-jobs/warm-job-uuid" \
  -H "Authorization: Bearer $LAYER_GATEWAY_API_KEY"

Cache-cold behavior
The split is deliberate. Fetch is correctness-first: a cache outage must not turn into a missing document. Warm is throughput-first: warming on a cold cache would be wasted work, so the gateway reports the cold state to the caller instead of silently doing nothing.
A bare hint_cache_warm passthrough never touches the gateway cache, so
it succeeds even while the cache is cold. The orchestrated form returns
503 cache_cold only when documents or snapshots is requested.
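The rule above can be written as a small decision function, which may be handy when deciding whether a caller needs a 503 handler at all. This is a sketch under stated assumptions, not gateway code: the function names are hypothetical, and it assumes that a step left at its default counts as requested once any warm option switches the call into orchestration.

```python
# Options that switch hint_cache_warm from passthrough into Layer orchestration.
STEP_OPTIONS = ("turbopuffer", "documents", "snapshots")
WARM_OPTIONS = STEP_OPTIONS + ("page_size",)


def is_passthrough(options):
    """True when no warm option is supplied, i.e. the raw passthrough call."""
    return not any(name in options for name in WARM_OPTIONS)


def may_return_cache_cold(options):
    """Whether a request with these query options can get 503 cache_cold.

    Assumption: in orchestration mode, documents and snapshots default on
    unless explicitly set to False.
    """
    if is_passthrough(options):
        # The bare passthrough never touches the gateway cache.
        return False
    return any(options.get(step, True) for step in ("documents", "snapshots"))


if __name__ == "__main__":
    print(may_return_cache_cold({}))
    print(may_return_cache_cold({"turbopuffer": True, "documents": False, "snapshots": False}))
    print(may_return_cache_cold({"snapshots": True, "documents": False}))
```

Under these assumptions, only a request that explicitly disables both documents and snapshots, or one with no warm options at all, is safe to send while the cache is cold.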
For how the cache recovers from an outage and the signals to watch, see the failure-mode runbook.