# hev layer: full docs

> Concatenated docs surface. Index at https://hevlayer.com/llms.txt.

---

## Search knowledge graph

- Version: 2
- Generated: 2026-06-14T22:41:19.143Z
- Content hash: 06c39943e6e6466069dcce5f660b09f2fffd305244ecf9447f9a6c39afc73b42

Context:

## Layer (hev layer)

Layer is a **gateway and function runtime for retrieval systems**: a Rust proxy (the *gateway*) that fronts **Turbopuffer**, plus a Kubernetes *operator*, both running in your own cluster. The gateway is wire-compatible with the Turbopuffer client API (existing clients keep working when pointed at it), and Layer documents only what it *adds* on top of upstream routes, exposing Layer-only features under `/v2/`.

### Core building blocks

- **Gateway**: transparent Turbopuffer proxy adding fetch, query by id, scans, result count, facet snapshots, a document cache (pull-through reads), write-path stamping, stable reads, query/clickstream history, warm jobs, pipelines, and a UDF runtime.
- **Operator**: reconciles four CRDs (`Index`, `InfraRules`, `Pipeline`, `Function`). Decoupled from the gateway, which only ever *reads* CRD status.
- **Backing services** (all open source): **Aerospike** (NVMe document cache, ephemeral), **PostgreSQL** (pipeline/indexing-state queue only), **VictoriaMetrics** (metrics), **Karpenter** (node autoscaling), **KEDA** (pod autoscaling to zero). Durable state lives only in **S3**; Layer processes are stateless and elastic.

### Key concepts users ask about

- **Stable reads / stable watermark**: a background watcher records an epoch-ms watermark when a namespace's Turbopuffer index status is up-to-date; while updating, queries filter to fully-indexed rows so reads never see partial writes. Surfaced via `stable_as_of`; configured per Index CRD via `consistency`.
- **Reserved `hevlayer` attributes**: server-stamped write watermark, shard key, and Function completion/invalidation markers; users must not write `_hevlayer_*` attributes.
- **Document cache**: pull-through reads. Aerospike is checked first; misses fall through to Turbopuffer/S3 and backfill. Cache failures are soft (never block reads); upstream failures are hard. Hit/miss is reported per response.
- **Snapshots & facets**: content-addressed S3 facet histograms written when a namespace is stable.
- **Scans**: on-demand row selection by filter, full-text (`fts`), or radius (`ann`), returning IDs or counts. Origin scans use `threads` / `spec.scan.threads` to cap concurrent upstream shard fan-out.
- **Pipelines vs Functions/UDFs**: pipelines ingest external data into rows; Functions run over rows that already exist. Workers own writes and can patch attributes, fan out with deterministic IDs, or re-upsert rows. Both scale via KEDA off queue depth, pinned to compute pools in `InfraRules`.
- **Dashboard**: read-mostly operator GUI reading the same gateway API.

### How users talk about it

Users say "the gateway," "drop-in Turbopuffer client," "warm the cache," "stable read," "strongly consistent query," "snapshot," "facet counts," "scan a filter," "stage/claim/embed," "UDF/function," "compute pool," and "scale to zero." Install is two-stage: **Terraform** (AWS resources) then **Helm** (gateway/operator/cache).

Glossary:

Raw JSON:

```json
{ "version": 2, "generatedAt": "2026-06-14T22:41:19.143Z", "contentHash": "06c39943e6e6466069dcce5f660b09f2fffd305244ecf9447f9a6c39afc73b42", "context": "## Layer (hev layer)\n\nLayer is a **gateway and function runtime for retrieval systems**: a Rust proxy (the *gateway*) that fronts **Turbopuffer**, plus a Kubernetes *operator*, both running in your own cluster. 
The gateway is wire-compatible with the Turbopuffer client API — existing clients keep working when pointed at it — and Layer documents only what it *adds* on top of upstream routes, exposing Layer-only features under `/v2/`.\n\n### Core building blocks\n- **Gateway** — transparent Turbopuffer proxy adding fetch, query by id, scans, result count, facet snapshots, a document cache (pull-through reads), write-path stamping, stable reads, query/clickstream history, warm jobs, pipelines, and a UDF runtime.\n- **Operator** — reconciles four CRDs (`Index`, `InfraRules`, `Pipeline`, `Function`). Decoupled from the gateway, which only ever *reads* CRD status.\n- **Backing services** (all open source): **Aerospike** (NVMe document cache, ephemeral), **PostgreSQL** (pipeline/indexing-state queue only), **VictoriaMetrics** (metrics), **Karpenter** (node autoscaling), **KEDA** (pod autoscaling to zero). Durable state lives only in **S3** — Layer processes are stateless and elastic.\n\n### Key concepts users ask about\n- **Stable reads / stable watermark** — a background watcher records an epoch-ms watermark when a namespace's Turbopuffer index status is up-to-date; while updating, queries filter to fully-indexed rows so reads never see partial writes. Surfaced via `stable_as_of`; configured per Index CRD via `consistency`.\n- **Reserved `hevlayer` attributes** — server-stamped write watermark, shard key, and Function completion/invalidation markers; users must not write `_hevlayer_*` attributes.\n- **Document cache** — pull-through reads: Aerospike checked first; misses fall through to Turbopuffer/S3 and backfill. Cache failures are soft (never block reads); upstream failures are hard. Hit/miss reported per response.\n- **Snapshots & facets** — content-addressed S3 facet histograms written when a namespace is stable.\n- **Scans** — on-demand row selection by filter, full-text (`fts`), or radius (`ann`), returning IDs or counts. 
Origin scans use `threads` / `spec.scan.threads` to cap concurrent upstream shard fan-out.\n- **Pipelines vs Functions/UDFs** — pipelines ingest external data into rows; Functions run over rows that already exist. Workers own writes and can patch attributes, fan out with deterministic IDs, or re-upsert rows. Both scale via KEDA off queue depth, pinned to compute pools in `InfraRules`.\n- **Dashboard** — read-mostly operator GUI reading the same gateway API.\n\n### How users talk about it\nUsers say \"the gateway,\" \"drop-in Turbopuffer client,\" \"warm the cache,\" \"stable read,\" \"strongly consistent query,\" \"snapshot,\" \"facet counts,\" \"scan a filter,\" \"stage/claim/embed,\" \"UDF/function,\" \"compute pool,\" and \"scale to zero.\" Install is two-stage: **Terraform** (AWS resources) then **Helm** (gateway/operator/cache).", "glossary": [], "overview": "## API\n- Introduction — `api/introduction`\n- Authentication — `api/introduction#authentication`\n- Cache warm hint — GET /v1/namespaces/{ns}/hint_cache_warm — `api/introduction#cache-warm-hint--get-v1namespacesnshint_cache_warm`\n- Client fall-through — `api/introduction#client-fall-through`\n- Compatibility posture — `api/introduction#compatibility-posture`\n- Cross-cutting conventions — `api/introduction#cross-cutting-conventions`\n- Enhancements to upstream routes — `api/introduction#enhancements-to-upstream-routes`\n- Install — `api/introduction#install`\n- Metadata — GET /v2/namespaces/{ns}/metadata — `api/introduction#metadata--get-v2namespacesnsmetadata`\n- Query — POST /v2/namespaces/{ns}/query — `api/introduction#query--post-v2namespacesnsquery`\n- Write — POST /v2/namespaces/{ns} — `api/introduction#write--post-v2namespacesns`\n- API keys — `api/keys`\n- Authenticate — `api/keys#authenticate`\n- CLI — `api/keys#cli`\n- Key model — `api/keys#key-model`\n- kubectl — `api/keys#kubectl`\n- List and get — `api/keys#list-and-get`\n- Mint — `api/keys#mint`\n- Revoke and delete — 
`api/keys#revoke-and-delete`\n- Routes — `api/keys#routes`\n- Using a minted key — `api/keys#using-a-minted-key`\n- Namespace metadata — `api/namespace-metadata`\n- List namespaces — `api/namespace-metadata#list-namespaces`\n- Request — `api/namespace-metadata#request`\n- The layer block — `api/namespace-metadata#the-layer-block`\n- Pipelines — `api/pipelines`\n- Deploy — `api/pipelines#deploy`\n- Document lifecycle — `api/pipelines#document-lifecycle`\n- Embed — `api/pipelines#embed`\n- Extract and chunk — `api/pipelines#extract-and-chunk`\n- Failure model — `api/pipelines#failure-model`\n- File tree — `api/pipelines#file-tree`\n- Trigger a run — `api/pipelines#trigger-a-run`\n- Wait for completion — `api/pipelines#wait-for-completion`\n- Query & Fetch — `api/query`\n- Batch fetch — `api/query#batch-fetch`\n- Behavior matrix — `api/query#behavior-matrix`\n- Counting matches — `api/query#counting-matches`\n- Fetch — `api/query#fetch`\n- Hybrid text fusion — `api/query#hybrid-text-fusion`\n- Multi-query — `api/query#multi-query`\n- Options — `api/query#options`\n- Query by id — `api/query#query-by-id`\n- Query routing — `api/query#query-routing`\n- Response — `api/query#response`\n- Response — `api/query#response-1`\n- Routing policy — `api/query#routing-policy`\n- Semantics — `api/query#semantics`\n- Single fetch — `api/query#single-fetch`\n- Stable reads — `api/query#stable-reads`\n- Tokenization — `api/query#tokenization`\n- Validation — `api/query#validation`\n- Validation — `api/query#validation-1`\n- Response Headers — `api/response-headers`\n- Scan — `api/scans`\n- Auto-Mode Policy — `api/scans#auto-mode-policy`\n- Bounding ranked scans — `api/scans#bounding-ranked-scans`\n- Count Mode — `api/scans#count-mode`\n- Fan-out width — `api/scans#fan-out-width`\n- Filters — `api/scans#filters`\n- Full-text count — `api/scans#full-text-count`\n- High cardinality — `api/scans#high-cardinality`\n- ID Mode — `api/scans#id-mode`\n- Operational notes — 
`api/scans#operational-notes`\n- Precomputed serving — `api/scans#precomputed-serving`\n- Radius count — `api/scans#radius-count`\n- Routes — `api/scans#routes`\n- Sources — `api/scans#sources`\n- Values Mode — `api/scans#values-mode`\n- Query History — `api/search-history`\n- Clickstream entry — `api/search-history#clickstream-entry`\n- Query parameters — `api/search-history#query-parameters`\n- Routes — `api/search-history#routes`\n- Search history entry — `api/search-history#search-history-entry`\n- Storage — `api/search-history#storage`\n- Tag contract — `api/search-history#tag-contract`\n- Writing metadata — `api/search-history#writing-metadata`\n- Snapshot History — `api/snapshots`\n- Activity — `api/snapshots#activity`\n- History — `api/snapshots#history`\n- Manual snapshot — `api/snapshots#manual-snapshot`\n- Routes — `api/snapshots#routes`\n- Snapshot body — `api/snapshots#snapshot-body`\n- Snapshot policy — `api/snapshots#snapshot-policy`\n- Warm cache — `api/warm-cache`\n- Cache-cold behavior — `api/warm-cache#cache-cold-behavior`\n- Hint-cache warm — `api/warm-cache#hint-cache-warm`\n- Layer warm — `api/warm-cache#layer-warm`\n- Write & Stage — `api/write`\n- Stage — `api/write#stage`\n- Status — `api/write#status`\n## Operations\n- Layer CLI — `cli`\n- Ask The Docs — `cli#ask-the-docs`\n- Configuration — `cli#configuration`\n- Environments — `cli#environments`\n- Inspect An Index — `cli#inspect-an-index`\n- Install — `cli#install`\n- Keys — `cli#keys`\n- Pipelines — `cli#pipelines`\n- Run A Function — `cli#run-a-function`\n- TUI — `cli#tui`\n- Dashboard — `dashboard`\n- Access it needs — `dashboard#access-it-needs`\n- Basic auth — `dashboard#basic-auth`\n- Disabling the dashboard — `dashboard#disabling-the-dashboard`\n- Networking — `dashboard#networking`\n- Operational notes — `dashboard#operational-notes`\n- Failure Modes — `failure-modes`\n- Client fall-through — `failure-modes#client-fall-through`\n- Pipeline stop-writes — 
`failure-modes#pipeline-stop-writes`\n- Read — `failure-modes#read`\n- Write — `failure-modes#write`\n- Install — `install`\n- Cluster: recommended — `install#cluster-recommended`\n- Cost notes — `install#cost-notes`\n- Default InfraRules — `install#default-infrarules`\n- Gateway auth modes — `install#gateway-auth-modes`\n- Helm — `install#helm`\n- Install shape — `install#install-shape`\n- Outputs — `install#outputs`\n- Required values — `install#required-values`\n- Run the install — `install#run-the-install`\n- Terraform — `install#terraform`\n- What gets installed — `install#what-gets-installed`\n- What it sets up — `install#what-it-sets-up`\n- ApiKey CRD — `kubernetes/apikey-crd`\n- Bootstrapping — `kubernetes/apikey-crd#bootstrapping`\n- Entitlements — `kubernetes/apikey-crd#entitlements`\n- Kubernetes RBAC — `kubernetes/apikey-crd#kubernetes-rbac`\n- Minting — `kubernetes/apikey-crd#minting`\n- Spec — `kubernetes/apikey-crd#spec`\n- Verification — `kubernetes/apikey-crd#verification`\n- Function CRD — `kubernetes/function-crd`\n- GPU classifier — `kubernetes/function-crd#gpu-classifier`\n- Lifecycle — `kubernetes/function-crd#lifecycle`\n- Scaling — `kubernetes/function-crd#scaling`\n- Selection — `kubernetes/function-crd#selection`\n- Simple classifier — `kubernetes/function-crd#simple-classifier`\n- Tuning knobs — `kubernetes/function-crd#tuning-knobs`\n- Version markers — `kubernetes/function-crd#version-markers`\n- Worker — `kubernetes/function-crd#worker`\n- Writeback — `kubernetes/function-crd#writeback`\n- Index CRD — `kubernetes/index-crd`\n- Backend — `kubernetes/index-crd#backend`\n- Cache policy — `kubernetes/index-crd#cache-policy`\n- Scan policy — `kubernetes/index-crd#scan-policy`\n- Snapshot policy — `kubernetes/index-crd#snapshot-policy`\n- Status — `kubernetes/index-crd#status`\n- Operator Overview — `kubernetes/operator`\n- CRDs — `kubernetes/operator#crds`\n- Relationship to the gateway — `kubernetes/operator#relationship-to-the-gateway`\n- 
Scheduling and node pools — `kubernetes/operator#scheduling-and-node-pools`\n- Pipeline CRD — `kubernetes/pipeline-crd`\n- Pipeline id — `kubernetes/pipeline-crd#pipeline-id`\n- Scaling — `kubernetes/pipeline-crd#scaling`\n- Source — `kubernetes/pipeline-crd#source`\n- Status — `kubernetes/pipeline-crd#status`\n- Target — `kubernetes/pipeline-crd#target`\n- Worker — `kubernetes/pipeline-crd#worker`\n- InfraRules CRD — `kubernetes/scaling-crd`\n- Compute pools — `kubernetes/scaling-crd#compute-pools`\n- Document cache rules — `kubernetes/scaling-crd#document-cache-rules`\n- InfraRules — `kubernetes/scaling-crd#infrarules`\n- Workload scaling — `kubernetes/scaling-crd#workload-scaling`\n- VectorStore CRD — `kubernetes/vectorstore-crd`\n- Connection — `kubernetes/vectorstore-crd#connection`\n- Inbound auth — `kubernetes/vectorstore-crd#inbound-auth`\n- Routing — `kubernetes/vectorstore-crd#routing`\n- Status — `kubernetes/vectorstore-crd#status`\n- Warehouse CRD — `kubernetes/warehouse-crd`\n- Connection — `kubernetes/warehouse-crd#connection`\n- Deletion — `kubernetes/warehouse-crd#deletion`\n- Keys — `kubernetes/warehouse-crd#keys`\n- Pipeline source — `kubernetes/warehouse-crd#pipeline-source`\n- Rotation — `kubernetes/warehouse-crd#rotation`\n- Status — `kubernetes/warehouse-crd#status`\n- Verification — `kubernetes/warehouse-crd#verification`\n## Overview\n- Agents — `agents`\n- 1. Install the CLIs — `agents#1-install-the-clis`\n- 2. Add the docs skill — `agents#2-add-the-docs-skill`\n- 3. Add the layer CLI skill — `agents#3-add-the-layer-cli-skill`\n- 4. 
Ask — `agents#4-ask`\n- The verbs — `agents#the-verbs`\n- Why answers stay grounded — `agents#why-answers-stay-grounded`\n- Concepts — `concepts`\n- Control loops — `concepts#control-loops`\n- Document cache — `concepts#document-cache`\n- Gateway enhancements — `concepts#gateway-enhancements`\n- Glossary — `concepts#glossary`\n- Kubernetes autoscaling — `concepts#kubernetes-autoscaling`\n- Scatter/gather — `concepts#scattergather`\n- Document model — `document-model`\n- FAQ — `faq`\n- How do I get started? — `faq#how-do-i-get-started`\n- How much will it cost? — `faq#how-much-will-it-cost`\n- Is hev layer a hosted service? — `faq#is-hev-layer-a-hosted-service`\n- What is the licensing for hev layer? — `faq#what-is-the-licensing-for-hev-layer`\n- Who built hev layer? — `faq#who-built-hev-layer`\n- Will any of it be open source? — `faq#will-any-of-it-be-open-source`\n- Will it be a paid product? — `faq#will-it-be-a-paid-product`\n- No Guarantees — `guarantees`\n- Commitments — `guarantees#commitments`\n- Introduction — `index`\n- Limits — `limits`\n- No limits — `limits#no-limits`\n- Roadmap & Changelog — `roadmap`\n- 0.1 Blockers — `roadmap#01-blockers`\n- 0.1 Release (UAT) — `roadmap#01-release-uat`\n- API hardening — `roadmap#api-hardening`\n- Later — `roadmap#later`\n- Lifecycle and operability — `roadmap#lifecycle-and-operability`\n- Search — `roadmap#search`\n- Surfaces — `roadmap#surfaces`\n- Up Next — `roadmap#up-next`\n- Tradeoffs — `tradeoffs`", "suggestions": [ "How do I limit scan fan-out against Turbopuffer?", "How do I get stable reads after a write?", "What's the difference between a pipeline and a UDF?", "What happens when the document cache is down?", "How do I query for documents similar to one I already have?" 
], "nodes": [ { "id": "agents", "kind": "section", "title": "Agents", "heading": null, "group": "Overview", "url": "/docs/agents", "summary": "Use the Layer docs and layer CLI from your coding agent with one-file skills that work across agent harnesses. These docs are queryable from the command line. The same engine behind the ⌘K search on this site ships as a…", "facts": [ { "kind": "code", "literal": "⌘K", "chunkId": "agents" }, { "kind": "code", "literal": "layer", "chunkId": "agents" }, { "kind": "code", "literal": "SKILL.md", "chunkId": "agents" }, { "kind": "code", "literal": "AGENTS.md", "chunkId": "agents" }, { "kind": "value", "literal": "Callout.astro", "chunkId": "agents" } ], "sources": [ { "chunkId": "agents", "url": "/docs/agents", "anchor": null } ], "mode": "agent-primary", "terms": [ "layer", "docs", "coding", "agent", "file", "skills", "work", "across", "harnesses", "these", "queryable", "command", "line", "same", "engine", "behind", "search", "site", "ships", "skill", "agents", "callout", "astro", "read", "cite", "directly", "scraping", "server", "also", "lets", "operate", "environments", "indexes", "pipelines", "udfs", "function", "runs", "bodies", "below", "plain" ] }, { "id": "agents#1-install-the-clis", "kind": "section", "title": "Agents", "heading": "1. Install the CLIs", "group": "Overview", "url": "/docs/agents#1-install-the-clis", "summary": "1. Install the CLIs go install github.com/hev/ask/cmd/ask@latest The ask binary is self-contained; any agent harness that can run a shell command can use it. 
From a Layer checkout, build the layer CLI when the agent shou…", "facts": [ { "kind": "code", "literal": "go install github.com/hev/ask/cmd/ask@latest", "chunkId": "agents#1-install-the-clis" }, { "kind": "code", "literal": "go build -o layer ./apps/layer-cli", "chunkId": "agents#1-install-the-clis" }, { "kind": "code", "literal": "ask", "chunkId": "agents#1-install-the-clis" }, { "kind": "code", "literal": "layer", "chunkId": "agents#1-install-the-clis" } ], "sources": [ { "chunkId": "agents#1-install-the-clis", "url": "/docs/agents#1-install-the-clis", "anchor": "1-install-the-clis" } ], "mode": "agent-primary", "terms": [ "install", "clis", "github", "latest", "binary", "self", "contained", "agent", "harness", "shell", "command", "layer", "checkout", "build", "shou", "apps", "should", "operate", "environments", "instead", "only", "searching", "docs" ] }, { "id": "agents#2-add-the-docs-skill", "kind": "section", "title": "Agents", "heading": "2. Add the docs skill", "group": "Overview", "url": "/docs/agents#2-add-the-docs-skill", "summary": "2. Add the docs skill Set AGENTSKILLHOME to your harness's skill directory, such as /.codex/skills for Codex or /.claude/skills for Claude Code. 
AGENTSKILLHOME=\"${AGENTSKILLHOME:-${CODEXHOME:-$HOME/.codex}/skills}\" mkdir…", "facts": [ { "kind": "code", "literal": "AGENT_SKILL_HOME", "chunkId": "agents#2-add-the-docs-skill" }, { "kind": "code", "literal": "~/.codex/skills", "chunkId": "agents#2-add-the-docs-skill" }, { "kind": "code", "literal": "~/.claude/skills", "chunkId": "agents#2-add-the-docs-skill" } ], "sources": [ { "chunkId": "agents#2-add-the-docs-skill", "url": "/docs/agents#2-add-the-docs-skill", "anchor": "2-add-the-docs-skill" } ], "mode": "agent-primary", "terms": [ "docs", "skill", "agentskillhome", "harness", "directory", "such", "codex", "skills", "claude", "code", "codexhome", "home", "mkdir", "agent", "hevlayer", "name", "description", "query", "layer", "user", "asks", "about", "turbopuffer", "gateway", "stable", "reads", "watermark", "document", "cache", "warm", "jobs", "scans", "filter", "full", "text", "radius", "snapshots", "pipelines", "udfs", "index" ] }, { "id": "agents#3-add-the-layer-cli-skill", "kind": "section", "title": "Agents", "heading": "3. Add the layer CLI skill", "group": "Overview", "url": "/docs/agents#3-add-the-layer-cli-skill", "summary": "3. Add the layer CLI skill Use this skill when an agent should inspect or operate Layer through the layer CLI. The skill keeps read-only inspection, docs lookup, and mutating operations separate. 
AGENTSKILLHOME=\"${AGENTS…", "facts": [ { "kind": "code", "literal": "layer", "chunkId": "agents#3-add-the-layer-cli-skill" } ], "sources": [ { "chunkId": "agents#3-add-the-layer-cli-skill", "url": "/docs/agents#3-add-the-layer-cli-skill", "anchor": "3-add-the-layer-cli-skill" } ], "mode": "agent-primary", "terms": [ "layer", "skill", "agent", "should", "inspect", "operate", "through", "keeps", "read", "only", "inspection", "docs", "lookup", "mutating", "operations", "separate", "agentskillhome", "agents", "codexhome", "home", "codex", "skills", "mkdir", "hevlayer", "name", "description", "user", "asks", "environments", "query", "list", "indexes", "pipelines", "udfs", "open", "delete", "function", "manifests", "terminal", "checkout" ] }, { "id": "agents#4-ask", "kind": "section", "title": "Agents", "heading": "4. Ask", "group": "Overview", "url": "/docs/agents#4-ask", "summary": "4. Ask ask --endpoint https://hevlayer.com/api/ask search \"cache is down\" { \"results\": [ { \"title\": \"Concepts\", \"heading\": \"Document cache\", \"url\": \"/docs/concepts#document-cache\", \"group\": \"Overview\", \"snippet\": \"The do…", "facts": [ { "kind": "code", "literal": "ask --endpoint https://hevlayer.com/api/ask search \"cache is down\"", "chunkId": "agents#4-ask" }, { "kind": "code", "literal": "{\n \"results\": [\n {\n \"title\": \"Concepts\",\n \"heading\": \"Document cache\",\n \"url\": \"/docs/concepts#document-cache\",\n \"group\": \"Overview\",\n \"snippet\": \"The document cache does two jobs: pull-through document reads...\"\n }\n ]\n}", "chunkId": "agents#4-ask" }, { "kind": "code", "literal": "section get", "chunkId": "agents#4-ask" } ], "sources": [ { "chunkId": "agents#4-ask", "url": "/docs/agents#4-ask", "anchor": "4-ask" } ], "mode": "agent-primary", "terms": [ "endpoint", "https", "hevlayer", "search", "cache", "down", "results", "title", "concepts", "heading", "document", "docs", "group", "overview", "snippet", "does", "jobs", "pull", "through", "reads", 
"section", "here", "agent", "typically", "runs", "winning", "answers", "citation" ] }, { "id": "agents#the-verbs", "kind": "section", "title": "Agents", "heading": "The verbs", "group": "Overview", "url": "/docs/agents#the-verbs", "summary": "The verbs Verb Returns overview Orientation context plus the full section map with stable ids search \" \" Ranked sections with snippets and deep links section get \" \" One section: summary, exact identifiers, source URL gl…", "facts": [ { "kind": "code", "literal": "overview", "chunkId": "agents#the-verbs" }, { "kind": "code", "literal": "search \"\"", "chunkId": "agents#the-verbs" }, { "kind": "code", "literal": "section get \"\"", "chunkId": "agents#the-verbs" }, { "kind": "code", "literal": "glossary get \"\"", "chunkId": "agents#the-verbs" }, { "kind": "code", "literal": "watermark", "chunkId": "agents#the-verbs" } ], "sources": [ { "chunkId": "agents#the-verbs", "url": "/docs/agents#the-verbs", "anchor": "the-verbs" } ], "mode": "agent-primary", "terms": [ "verbs", "verb", "returns", "overview", "orientation", "context", "plus", "full", "section", "stable", "search", "ranked", "sections", "snippets", "deep", "links", "summary", "exact", "identifiers", "source", "query", "glossary", "term", "watermark", "product", "resolved", "through", "aliases" ] }, { "id": "agents#why-answers-stay-grounded", "kind": "section", "title": "Agents", "heading": "Why answers stay grounded", "group": "Overview", "url": "/docs/agents#why-answers-stay-grounded", "summary": "Why answers stay grounded Search runs over a committed, reviewable digest of these docs — the same corpus, heading by heading, that renders on this site. 
Every anchor in it is verified against the rendered pages in CI, s…", "facts": [ { "kind": "value", "literal": "llms.txt", "chunkId": "agents#why-answers-stay-grounded" }, { "kind": "value", "literal": "llms-full.txt", "chunkId": "agents#why-answers-stay-grounded" } ], "sources": [ { "chunkId": "agents#why-answers-stay-grounded", "url": "/docs/agents#why-answers-stay-grounded", "anchor": "why-answers-stay-grounded" } ], "mode": "agent-primary", "terms": [ "answers", "stay", "grounded", "search", "runs", "committed", "reviewable", "digest", "these", "docs", "same", "corpus", "heading", "renders", "site", "every", "anchor", "verified", "against", "rendered", "pages", "llms", "full", "cited", "deep", "link", "like", "query", "stable", "reads", "always", "resolves", "change", "rebuilt", "recommitted", "verb", "above", "read", "public", "nothing" ] }, { "id": "api/introduction", "kind": "section", "title": "Introduction", "heading": null, "group": "API", "url": "/docs/api/introduction", "summary": "What Layer adds on top of the Turbopuffer wire contract, and how to point a client at the gateway. Layer matches the Turbopuffer wire contract so existing clients keep working when you point them at the gateway. 
Where a…", "facts": [ { "kind": "value", "literal": "CodeTabs.astro", "chunkId": "api/introduction" }, { "kind": "value", "literal": "Upstream.astro", "chunkId": "api/introduction" } ], "sources": [ { "chunkId": "api/introduction", "url": "/docs/api/introduction", "anchor": null } ], "mode": "source-primary", "terms": [ "layer", "adds", "turbopuffer", "wire", "contract", "point", "client", "gateway", "matches", "existing", "clients", "keep", "working", "codetabs", "astro", "upstream", "route", "equivalent", "site", "documents", "behavior", "itself", "follow", "docs", "link", "page", "underlying", "request", "response", "shape" ] }, { "id": "api/introduction#authentication", "kind": "section", "title": "Introduction", "heading": "Authentication", "group": "API", "url": "/docs/api/introduction#authentication", "summary": "Authentication Every request carries Authorization: Bearer . The gateway accepts two kinds of bearer: The store key. The default VectorStore credential (the Turbopuffer key you already own) is accepted as an admin bearer…", "facts": [ { "kind": "code", "literal": "Authorization: Bearer ", "chunkId": "api/introduction#authentication" }, { "kind": "code", "literal": "VectorStore", "chunkId": "api/introduction#authentication" }, { "kind": "code", "literal": "read", "chunkId": "api/introduction#authentication" }, { "kind": "code", "literal": "write", "chunkId": "api/introduction#authentication" }, { "kind": "code", "literal": "admin", "chunkId": "api/introduction#authentication" }, { "kind": "code", "literal": "LAYER_GATEWAY_URL", "chunkId": "api/introduction#authentication" }, { "kind": "code", "literal": "LAYER_GATEWAY_API_KEY", "chunkId": "api/introduction#authentication" }, { "kind": "code", "literal": "deriveFromStore", "chunkId": "api/introduction#authentication" }, { "kind": "code", "literal": "keys", "chunkId": "api/introduction#authentication" }, { "kind": "code", "literal": "TURBOPUFFER_API_KEY", "chunkId": "api/introduction#authentication" }, 
{ "kind": "code", "literal": "TURBOPUFFER_API_URL", "chunkId": "api/introduction#authentication" }, { "kind": "code", "literal": "https://aws-us-east-1.turbopuffer.com", "chunkId": "api/introduction#authentication" } ], "sources": [ { "chunkId": "api/introduction#authentication", "url": "/docs/api/introduction#authentication", "anchor": "authentication" } ], "mode": "source-primary", "terms": [ "authentication", "every", "request", "carries", "authorization", "bearer", "gateway", "accepts", "kinds", "store", "default", "vectorstore", "credential", "turbopuffer", "already", "accepted", "admin", "read", "write", "layer", "derivefromstore", "keys", "https", "east", "drop", "point", "existing", "client", "keep", "setup", "full", "access", "minted", "mint", "scoped", "namespaces", "crossed", "hand", "team", "service" ] }, { "id": "api/introduction#cache-warm-hint--get-v1namespacesnshint_cache_warm", "kind": "section", "title": "Introduction", "heading": "Cache warm hint — GET /v1/namespaces/{ns}/hint_cache_warm", "group": "API", "url": "/docs/api/introduction#cache-warm-hint--get-v1namespacesnshint_cache_warm", "summary": "Cache warm hint — GET /v1/namespaces/{ns}/hintcachewarm Upstream contract for the cache warm hint. With no query parameters: a raw upstream passthrough, response returned verbatim. 
With any warm option supplied: forwards…", "facts": [ { "kind": "code", "literal": "GET /v1/namespaces/{ns}/hint_cache_warm", "chunkId": "api/introduction#cache-warm-hint--get-v1namespacesnshint_cache_warm" }, { "kind": "value", "literal": "turbopuffer.com", "chunkId": "api/introduction#cache-warm-hint--get-v1namespacesnshint_cache_warm" } ], "sources": [ { "chunkId": "api/introduction#cache-warm-hint--get-v1namespacesnshint_cache_warm", "url": "/docs/api/introduction#cache-warm-hint--get-v1namespacesnshint_cache_warm", "anchor": "cache-warm-hint--get-v1namespacesnshint_cache_warm" } ], "mode": "source-primary", "terms": [ "cache", "warm", "hint", "namespaces", "hintcachewarm", "upstream", "contract", "query", "parameters", "passthrough", "response", "returned", "verbatim", "option", "supplied", "forwards", "turbopuffer", "runs", "layer", "side", "steps", "backfill", "aerospike", "document", "origin", "plus", "mirror", "latest", "snapshot", "body", "step", "independently", "toggleable", "request", "page" ] }, { "id": "api/introduction#client-fall-through", "kind": "section", "title": "Introduction", "heading": "Client fall-through", "group": "API", "url": "/docs/api/introduction#client-fall-through", "summary": "Client fall-through The Python, Go, and TypeScript SDKs can fall through to Turbopuffer direct when the gateway is unreachable. 
The fallback is limited to calls that can be satisfied without Layer state: simple vector qu…", "facts": [ { "kind": "code", "literal": "write_namespace", "chunkId": "api/introduction#client-fall-through" }, { "kind": "code", "literal": "WriteNamespace", "chunkId": "api/introduction#client-fall-through" }, { "kind": "code", "literal": "writeNamespace", "chunkId": "api/introduction#client-fall-through" }, { "kind": "code", "literal": "query_turbopuffer_namespace", "chunkId": "api/introduction#client-fall-through" }, { "kind": "code", "literal": "QueryTurbopufferNamespace", "chunkId": "api/introduction#client-fall-through" }, { "kind": "code", "literal": "queryTurbopufferNamespace", "chunkId": "api/introduction#client-fall-through" }, { "kind": "code", "literal": "turbopuffer_direct", "chunkId": "api/introduction#client-fall-through" }, { "kind": "code", "literal": "nearest_to_id", "chunkId": "api/introduction#client-fall-through" }, { "kind": "code", "literal": "fallback_to_turbopuffer=False", "chunkId": "api/introduction#client-fall-through" }, { "kind": "code", "literal": "AsyncHevlayer", "chunkId": "api/introduction#client-fall-through" }, { "kind": "code", "literal": "WithFallbackToTurbopuffer(false)", "chunkId": "api/introduction#client-fall-through" }, { "kind": "code", "literal": "fallbackToTurbopuffer: false", "chunkId": "api/introduction#client-fall-through" } ], "sources": [ { "chunkId": "api/introduction#client-fall-through", "url": "/docs/api/introduction#client-fall-through", "anchor": "client-fall-through" } ], "mode": "source-primary", "terms": [ "client", "fall", "through", "python", "typescript", "sdks", "turbopuffer", "direct", "gateway", "unreachable", "fallback", "limited", "calls", "satisfied", "without", "layer", "state", "simple", "vector", "write", "namespace", "writenamespace", "query", "queryturbopuffernamespace", "nearest", "false", "asynchevlayer", "withfallbacktoturbopuffer", "fallbacktoturbopuffer", "queries", "compatible", "methods", 
"such", "schema", "listing", "clients", "emit", "warning", "perf", "field" ] }, { "id": "api/introduction#compatibility-posture", "kind": "section", "title": "Introduction", "heading": "Compatibility posture", "group": "API", "url": "/docs/api/introduction#compatibility-posture", "summary": "Compatibility posture Layer aims to be a drop-in for existing Turbopuffer clients. Routes that the upstream does not implement are namespaced under /v2/ and do not shadow upstream behavior. If a Turbopuffer client sends…", "facts": [ { "kind": "code", "literal": "/v2/", "chunkId": "api/introduction#compatibility-posture" } ], "sources": [ { "chunkId": "api/introduction#compatibility-posture", "url": "/docs/api/introduction#compatibility-posture", "anchor": "compatibility-posture" } ], "mode": "source-primary", "terms": [ "compatibility", "posture", "layer", "aims", "drop", "existing", "turbopuffer", "clients", "routes", "upstream", "does", "implement", "namespaced", "under", "shadow", "behavior", "client", "sends", "request", "route", "doesn", "proxy", "gateway", "returns", "silently", "might", "handle", "differently" ] }, { "id": "api/introduction#cross-cutting-conventions", "kind": "section", "title": "Introduction", "heading": "Cross-cutting conventions", "group": "API", "url": "/docs/api/introduction#cross-cutting-conventions", "summary": "Cross-cutting conventions These apply to every endpoint Layer proxies, whether the route is upstream-compatible or Layer-only. hevlayer reserved. 
Document attributes prefixed with hevlayer are reserved for the proxy laye…", "facts": [ { "kind": "code", "literal": "_hevlayer_*", "chunkId": "api/introduction#cross-cutting-conventions" }, { "kind": "code", "literal": "_hevlayer_", "chunkId": "api/introduction#cross-cutting-conventions" }, { "kind": "code", "literal": "_hevlayer_upserted_at", "chunkId": "api/introduction#cross-cutting-conventions" }, { "kind": "code", "literal": "x-layer-cache", "chunkId": "api/introduction#cross-cutting-conventions" }, { "kind": "code", "literal": "hit", "chunkId": "api/introduction#cross-cutting-conventions" }, { "kind": "code", "literal": "miss", "chunkId": "api/introduction#cross-cutting-conventions" }, { "kind": "code", "literal": "miss-on-error", "chunkId": "api/introduction#cross-cutting-conventions" }, { "kind": "code", "literal": "x-layer-stable-as-of", "chunkId": "api/introduction#cross-cutting-conventions" }, { "kind": "code", "literal": "x-layer-next-cursor", "chunkId": "api/introduction#cross-cutting-conventions" } ], "sources": [ { "chunkId": "api/introduction#cross-cutting-conventions", "url": "/docs/api/introduction#cross-cutting-conventions", "anchor": "cross-cutting-conventions" } ], "mode": "source-primary", "terms": [ "cross", "cutting", "conventions", "these", "apply", "every", "endpoint", "layer", "proxies", "whether", "route", "upstream", "compatible", "only", "hevlayer", "reserved", "document", "attributes", "prefixed", "proxy", "laye", "upserted", "cache", "miss", "error", "stable", "next", "cursor", "writing", "validation", "reading", "fine", "explicitly", "requested", "gateway", "stamps", "hevlayerupsertedat", "itself", "upsert", "patch" ] }, { "id": "api/introduction#enhancements-to-upstream-routes", "kind": "section", "title": "Introduction", "heading": "Enhancements to upstream routes", "group": "API", "url": "/docs/api/introduction#enhancements-to-upstream-routes", "summary": "Enhancements to upstream routes Each of the routes below is 
wire-compatible with Turbopuffer. The body of each section describes only what Layer overlays on top.", "facts": [], "sources": [ { "chunkId": "api/introduction#enhancements-to-upstream-routes", "url": "/docs/api/introduction#enhancements-to-upstream-routes", "anchor": "enhancements-to-upstream-routes" } ], "mode": "source-primary", "terms": [ "enhancements", "upstream", "routes", "below", "wire", "compatible", "turbopuffer", "body", "section", "describes", "only", "layer", "overlays" ] }, { "id": "api/introduction#install", "kind": "section", "title": "Introduction", "heading": "Install", "group": "API", "url": "/docs/api/introduction#install", "summary": "Install There are four ways to call Layer: the Python client, the Go client, the TypeScript client, and the REST API itself. The clients are generated from apps/layer-gateway/openapi.yaml, so all four expose the same ope…", "facts": [ { "kind": "code", "literal": "pip install hevlayer # Python 3.11+\ngo get github.com/hev/layer/clients/go # Go 1.22+\nnpm install hevlayer # Node 18+", "chunkId": "api/introduction#install" }, { "kind": "code", "literal": "import os\n\nfrom hevlayer import AsyncHevlayer\n\nclient = AsyncHevlayer(\n base_url=os.environ[\"LAYER_GATEWAY_URL\"],\n api_key=os.environ[\"LAYER_GATEWAY_API_KEY\"],\n)", "chunkId": "api/introduction#install" }, { "kind": "code", "literal": "import (\n \"os\"\n\n hevlayer \"github.com/hev/layer/clients/go\"\n)\n\nclient := hevlayer.NewClient(\n hevlayer.WithBaseURL(os.Getenv(\"LAYER_GATEWAY_URL\")),\n hevlayer.WithAPIKey(os.Getenv(\"LAYER_GATEWAY_API_KEY\")),\n)", "chunkId": "api/introduction#install" }, { "kind": "code", "literal": "import { Hevlayer } from \"hevlayer\";\n\nconst client = new Hevlayer({\n baseUrl: process.env.LAYER_GATEWAY_URL,\n apiKey: process.env.LAYER_GATEWAY_API_KEY,\n});", "chunkId": "api/introduction#install" }, { "kind": "code", "literal": "curl \"$LAYER_GATEWAY_URL/v2/namespaces\" \\\n -H \"Authorization: Bearer 
$LAYER_GATEWAY_API_KEY\"", "chunkId": "api/introduction#install" }, { "kind": "code", "literal": "apps/layer-gateway/openapi.yaml", "chunkId": "api/introduction#install" }, { "kind": "code", "literal": "client", "chunkId": "api/introduction#install" }, { "kind": "code", "literal": "ctx context.Context", "chunkId": "api/introduction#install" } ], "sources": [ { "chunkId": "api/introduction#install", "url": "/docs/api/introduction#install", "anchor": "install" } ], "mode": "source-primary", "terms": [ "install", "there", "four", "ways", "call", "layer", "python", "client", "typescript", "rest", "itself", "clients", "generated", "apps", "gateway", "openapi", "yaml", "expose", "same", "hevlayer", "github", "node", "import", "asynchevlayer", "base", "environ", "newclient", "withbaseurl", "getenv", "withapikey", "const", "baseurl", "process", "apikey", "curl", "namespaces", "authorization", "bearer", "context", "operations" ] }, { "id": "api/introduction#metadata--get-v2namespacesnsmetadata", "kind": "section", "title": "Introduction", "heading": "Metadata — GET /v2/namespaces/{ns}/metadata", "group": "API", "url": "/docs/api/introduction#metadata--get-v2namespacesnsmetadata", "summary": "Metadata — GET /v2/namespaces/{ns}/metadata Upstream contract for namespace metadata — schema, row count, index status, timestamps. 
Proxied upstream verbatim, then enriched with a layer block containing stableasof and is…", "facts": [ { "kind": "code", "literal": "GET /v2/namespaces/{ns}/metadata", "chunkId": "api/introduction#metadata--get-v2namespacesnsmetadata" }, { "kind": "code", "literal": "layer", "chunkId": "api/introduction#metadata--get-v2namespacesnsmetadata" }, { "kind": "code", "literal": "stable_as_of", "chunkId": "api/introduction#metadata--get-v2namespacesnsmetadata" }, { "kind": "code", "literal": "is_stable", "chunkId": "api/introduction#metadata--get-v2namespacesnsmetadata" }, { "kind": "value", "literal": "turbopuffer.com", "chunkId": "api/introduction#metadata--get-v2namespacesnsmetadata" } ], "sources": [ { "chunkId": "api/introduction#metadata--get-v2namespacesnsmetadata", "url": "/docs/api/introduction#metadata--get-v2namespacesnsmetadata", "anchor": "metadata--get-v2namespacesnsmetadata" } ], "mode": "source-primary", "terms": [ "metadata", "namespaces", "upstream", "contract", "namespace", "schema", "count", "index", "status", "timestamps", "proxied", "verbatim", "enriched", "layer", "block", "containing", "stableasof", "stable", "turbopuffer", "isstable", "page" ] }, { "id": "api/introduction#query--post-v2namespacesnsquery", "kind": "section", "title": "Introduction", "heading": "Query — POST /v2/namespaces/{ns}/query", "group": "API", "url": "/docs/api/introduction#query--post-v2namespacesnsquery", "summary": "Query — POST /v2/namespaces/{ns}/query Upstream contract for vector and FTS queries — request shape, ranking, filters, attribute selection. 
Stable reads via an injected hevlayerupsertedat <= watermark predicate while the…", "facts": [ { "kind": "code", "literal": "POST /v2/namespaces/{ns}/query", "chunkId": "api/introduction#query--post-v2namespacesnsquery" }, { "kind": "code", "literal": "_hevlayer_upserted_at <= watermark", "chunkId": "api/introduction#query--post-v2namespacesnsquery" }, { "kind": "code", "literal": "updating", "chunkId": "api/introduction#query--post-v2namespacesnsquery" }, { "kind": "code", "literal": "x-layer-stable-as-of", "chunkId": "api/introduction#query--post-v2namespacesnsquery" }, { "kind": "value", "literal": "turbopuffer.com", "chunkId": "api/introduction#query--post-v2namespacesnsquery" } ], "sources": [ { "chunkId": "api/introduction#query--post-v2namespacesnsquery", "url": "/docs/api/introduction#query--post-v2namespacesnsquery", "anchor": "query--post-v2namespacesnsquery" } ], "mode": "source-primary", "terms": [ "query", "post", "namespaces", "upstream", "contract", "vector", "queries", "request", "shape", "ranking", "filters", "attribute", "selection", "stable", "reads", "injected", "hevlayerupsertedat", "watermark", "predicate", "while", "hevlayer", "upserted", "updating", "layer", "turbopuffer", "index", "shot", "retry", "filter", "forced", "race", "write", "storm", "returned", "read", "responses", "callers", "correlate", "freshness", "across" ] }, { "id": "api/introduction#write--post-v2namespacesns", "kind": "section", "title": "Introduction", "heading": "Write — POST /v2/namespaces/{ns}", "group": "API", "url": "/docs/api/introduction#write--post-v2namespacesns", "summary": "Write — POST /v2/namespaces/{ns} Upstream contract for upsert, delete, and patchrows. Best-effort Aerospike document-cache mirror before explicit-id upstream writes. 
Server-stamped hevlayerupsertedat on every upsert and…", "facts": [ { "kind": "code", "literal": "POST /v2/namespaces/{ns}", "chunkId": "api/introduction#write--post-v2namespacesns" }, { "kind": "code", "literal": "patch_rows", "chunkId": "api/introduction#write--post-v2namespacesns" }, { "kind": "code", "literal": "_hevlayer_upserted_at", "chunkId": "api/introduction#write--post-v2namespacesns" }, { "kind": "code", "literal": "_hevlayer_*", "chunkId": "api/introduction#write--post-v2namespacesns" }, { "kind": "value", "literal": "turbopuffer.com", "chunkId": "api/introduction#write--post-v2namespacesns" } ], "sources": [ { "chunkId": "api/introduction#write--post-v2namespacesns", "url": "/docs/api/introduction#write--post-v2namespacesns", "anchor": "write--post-v2namespacesns" } ], "mode": "source-primary", "terms": [ "write", "post", "namespaces", "upstream", "contract", "upsert", "delete", "patchrows", "best", "effort", "aerospike", "document", "cache", "mirror", "before", "explicit", "writes", "server", "stamped", "hevlayerupsertedat", "every", "patch", "rows", "hevlayer", "upserted", "turbopuffer", "powers", "consistency", "watermark", "query", "path", "attributes", "reserved", "rejected", "page" ] }, { "id": "api/keys", "kind": "section", "title": "API keys", "heading": null, "group": "API", "url": "/docs/api/keys", "summary": "Mint, verify, and revoke keys over REST; entitlements open stores, warehouses, or Layer itself. Layer mints its own API keys. 
What a key opens is declared per resource: each entitlement names a VectorStore, a Warehouse,…", "facts": [ { "kind": "code", "literal": "VectorStore", "chunkId": "api/keys" }, { "kind": "code", "literal": "Warehouse", "chunkId": "api/keys" }, { "kind": "code", "literal": "ApiKey", "chunkId": "api/keys" }, { "kind": "value", "literal": "CodeTabs.astro", "chunkId": "api/keys" } ], "sources": [ { "chunkId": "api/keys", "url": "/docs/api/keys", "anchor": null } ], "mode": "source-primary", "terms": [ "mint", "verify", "revoke", "keys", "rest", "entitlements", "open", "stores", "warehouses", "layer", "itself", "mints", "opens", "declared", "resource", "entitlement", "names", "vectorstore", "warehouse", "apikey", "codetabs", "astro", "carries", "scopes", "claims", "target", "page", "covers", "model", "surface", "anything", "default", "credential", "store", "already", "accepted", "admin", "bearer", "minting", "starts" ] }, { "id": "api/keys#authenticate", "kind": "section", "title": "API keys", "heading": "Authenticate", "group": "API", "url": "/docs/api/keys#authenticate", "summary": "Authenticate External systems present a raw token and get back keyId — a stable actor id — plus the full entitlements map, then make their own authorization decisions from the claims. 
This is the verb that makes Layer a…", "facts": [ { "kind": "code", "literal": "identity = await client.authenticate_key({\"token\": presented})\nclaims = identity.entitlements[\"warehouse.prod-snowflake\"].claims", "chunkId": "api/keys#authenticate" }, { "kind": "code", "literal": "identity, err := client.AuthenticateKey(ctx, &hevlayer.AuthenticateKeyRequest{\n\tToken: presented,\n})\nclaims := identity.Entitlements[\"warehouse.prod-snowflake\"].Claims", "chunkId": "api/keys#authenticate" }, { "kind": "code", "literal": "const identity = await client.authenticateKey({ token: presented });\nconst claims = identity.entitlements[\"warehouse.prod-snowflake\"].claims;", "chunkId": "api/keys#authenticate" }, { "kind": "code", "literal": "curl -X POST \"$LAYER_GATEWAY_URL/v2/keys/authenticate\" \\\n -H \"Content-Type: application/json\" \\\n -d '{\"token\": \"hvl_iqGFsDD2PNkyhCqr59jjvKuKL47vqXMz\"}'", "chunkId": "api/keys#authenticate" }, { "kind": "code", "literal": "keyId", "chunkId": "api/keys#authenticate" }, { "kind": "code", "literal": "200", "chunkId": "api/keys#authenticate" }, { "kind": "code", "literal": "{keyId, name, owner, entitlements, expiresAt}", "chunkId": "api/keys#authenticate" }, { "kind": "code", "literal": "401", "chunkId": "api/keys#authenticate" } ], "sources": [ { "chunkId": "api/keys#authenticate", "url": "/docs/api/keys#authenticate", "anchor": "authenticate" } ], "mode": "source-primary", "terms": [ "authenticate", "external", "systems", "present", "token", "back", "keyid", "stable", "actor", "plus", "full", "entitlements", "make", "their", "authorization", "decisions", "claims", "verb", "makes", "layer", "identity", "await", "client", "presented", "warehouse", "prod", "snowflake", "authenticatekey", "hevlayer", "authenticatekeyrequest", "const", "curl", "post", "gateway", "keys", "content", "type", "application", "json", "iqgfsdd2pnkyhcqr59jjvkukl47vqxmz" ] }, { "id": "api/keys#cli", "kind": "section", "title": "API keys", "heading": "CLI", 
"group": "API", "url": "/docs/api/keys#cli", "summary": "CLI The same operations from the CLI: layer keys mint cohort-reader --owner acme \\ --entitle vectorstore.prod-turbopuffer=read \\ --namespaces \"cohort-\" \\ --claim warehouse.prod-snowflake=\"notes:cohort::read\" layer keys l…", "facts": [ { "kind": "code", "literal": "layer keys mint cohort-reader --owner acme \\\n --entitle vectorstore.prod-turbopuffer=read \\\n --namespaces \"cohort-*\" \\\n --claim warehouse.prod-snowflake=\"notes:cohort:*:read\"\nlayer keys ls\nlayer keys revoke cohort-reader", "chunkId": "api/keys#cli" }, { "kind": "code", "literal": "layer keys mint", "chunkId": "api/keys#cli" } ], "sources": [ { "chunkId": "api/keys#cli", "url": "/docs/api/keys#cli", "anchor": "cli" } ], "mode": "source-primary", "terms": [ "same", "operations", "layer", "keys", "mint", "cohort", "reader", "owner", "acme", "entitle", "vectorstore", "prod", "turbopuffer", "read", "namespaces", "claim", "warehouse", "snowflake", "notes", "revoke", "prints", "token", "once", "alone", "stdout", "piping", "metadata", "table", "goes", "stderr" ] }, { "id": "api/keys#key-model", "kind": "section", "title": "API keys", "heading": "Key model", "group": "API", "url": "/docs/api/keys#key-model", "summary": "Key model Tokens look like hvliqGFsDD2PNkyhCqr59jjvKuKL47vqXMz. The raw token is returned once, in the mint response. Layer stores only one-way hashes; a lost token is revoked and re-minted, never recovered. 
Every key is…", "facts": [ { "kind": "code", "literal": "hvl_iqGFsDD2PNkyhCqr59jjvKuKL47vqXMz", "chunkId": "api/keys#key-model" }, { "kind": "code", "literal": "ApiKey", "chunkId": "api/keys#key-model" }, { "kind": "code", "literal": "is the audit trail, and", "chunkId": "api/keys#key-model" } ], "sources": [ { "chunkId": "api/keys#key-model", "url": "/docs/api/keys#key-model", "anchor": "key-model" } ], "mode": "source-primary", "terms": [ "model", "tokens", "look", "like", "hvliqgfsdd2pnkyhcqr59jjvkukl47vqxmz", "token", "returned", "once", "mint", "response", "layer", "stores", "only", "hashes", "lost", "revoked", "minted", "never", "recovered", "every", "iqgfsdd2pnkyhcqr59jjvkukl47vqxmz", "apikey", "audit", "trail", "resource", "cluster", "kubectl", "apikeys", "apply", "equal", "authoring", "surface", "operator", "mints", "delivers", "secret", "revoking", "keeps", "record", "deleting" ] }, { "id": "api/keys#kubectl", "kind": "section", "title": "API keys", "heading": "kubectl", "group": "API", "url": "/docs/api/keys#kubectl", "summary": "kubectl The CRD is the other authoring surface — apply an ApiKey with no credential and the operator mints the token into a Secret named by status.secretRef: kubectl apply -f key.yaml kubectl get apikeys -n layer kubectl…", "facts": [ { "kind": "code", "literal": "kubectl apply -f key.yaml\nkubectl get apikeys -n layer\nkubectl get secret apikey-cohort-reader -n layer -o jsonpath='{.data.token}' | base64 -d", "chunkId": "api/keys#kubectl" }, { "kind": "code", "literal": "ApiKey", "chunkId": "api/keys#kubectl" }, { "kind": "code", "literal": "status.secretRef", "chunkId": "api/keys#kubectl" }, { "kind": "code", "literal": "and", "chunkId": "api/keys#kubectl" }, { "kind": "flag", "literal": "-o", "chunkId": "api/keys#kubectl" } ], "sources": [ { "chunkId": "api/keys#kubectl", "url": "/docs/api/keys#kubectl", "anchor": "kubectl" } ], "mode": "source-primary", "terms": [ "kubectl", "other", "authoring", "surface", "apply", "apikey", 
"credential", "operator", "mints", "token", "secret", "named", "status", "secretref", "yaml", "apikeys", "layer", "cohort", "reader", "jsonpath", "data", "base64", "both", "surfaces", "round", "trip", "through", "schema", "keys", "keyid", "spellings", "same", "object", "page", "full", "resource", "model", "kubernetes", "rbac", "chart" ] }, { "id": "api/keys#list-and-get", "kind": "section", "title": "API keys", "heading": "List and get", "group": "API", "url": "/docs/api/keys#list-and-get", "summary": "List and get keys = await client.listkeys() key = await client.getkey(\"0a1b2c3d-4e5f-6071-8293-a4b5c6d7e8f9\") keys, err := client.ListKeys(ctx, nil) key, err := client.GetKey(ctx, \"0a1b2c3d-4e5f-6071-8293-a4b5c6d7e8f9\")…", "facts": [ { "kind": "code", "literal": "keys = await client.list_keys()\nkey = await client.get_key(\"0a1b2c3d-4e5f-6071-8293-a4b5c6d7e8f9\")", "chunkId": "api/keys#list-and-get" }, { "kind": "code", "literal": "keys, err := client.ListKeys(ctx, nil)\nkey, err := client.GetKey(ctx, \"0a1b2c3d-4e5f-6071-8293-a4b5c6d7e8f9\")", "chunkId": "api/keys#list-and-get" }, { "kind": "code", "literal": "const keys = await client.listKeys();\nconst key = await client.getKey(\"0a1b2c3d-4e5f-6071-8293-a4b5c6d7e8f9\");", "chunkId": "api/keys#list-and-get" }, { "kind": "code", "literal": "curl \"$LAYER_GATEWAY_URL/v2/keys\" \\\n -H \"Authorization: Bearer $LAYER_GATEWAY_API_KEY\"", "chunkId": "api/keys#list-and-get" }, { "kind": "code", "literal": "lastSeenAt", "chunkId": "api/keys#list-and-get" } ], "sources": [ { "chunkId": "api/keys#list-and-get", "url": "/docs/api/keys#list-and-get", "anchor": "list-and-get" } ], "mode": "source-primary", "terms": [ "list", "keys", "await", "client", "listkeys", "getkey", "0a1b2c3d", "4e5f", "6071", "8293", "a4b5c6d7e8f9", "const", "curl", "layer", "gateway", "authorization", "bearer", "lastseenat", "layergatewayurl", "layergatewayapikey", "keyid", "name", "cohort", "reader", "owner", "acme", "entitlements", "vectorstore", 
"prod", "turbopuffer", "scopes", "read", "namespaces", "warehouse", "snowflake", "claims", "notes", "phase", "active", "createdat" ] }, { "id": "api/keys#mint", "kind": "section", "title": "API keys", "heading": "Mint", "group": "API", "url": "/docs/api/keys#mint", "summary": "Mint key = await client.mintkey({ \"name\": \"cohort-reader\", \"owner\": \"acme\", \"entitlements\": { \"vectorstore.prod-turbopuffer\": { \"scopes\": [\"read\"], \"namespaces\": [\"cohort-\"], }, \"warehouse.prod-snowflake\": { \"claims\": [\"…", "facts": [ { "kind": "code", "literal": "const key = await client.mintKey({\n name: \"cohort-reader\",\n owner: \"acme\",\n entitlements: {\n \"vectorstore.prod-turbopuffer\": {\n scopes: [\"read\"],\n namespaces: [\"cohort-*\"],\n },\n \"warehouse.prod-snowflake\": {\n claims: [\"notes:cohort:*:read\"],\n },\n },\n expiresAfter: \"365d\",\n});\nconsole.log(key.token); // shown once, never again", "chunkId": "api/keys#mint" }, { "kind": "code", "literal": "201 Created", "chunkId": "api/keys#mint" }, { "kind": "code", "literal": "name", "chunkId": "api/keys#mint" }, { "kind": "code", "literal": "ApiKey", "chunkId": "api/keys#mint" }, { "kind": "code", "literal": "owner", "chunkId": "api/keys#mint" }, { "kind": "code", "literal": "description", "chunkId": "api/keys#mint" }, { "kind": "code", "literal": "entitlements", "chunkId": "api/keys#mint" }, { "kind": "code", "literal": "vectorstore.", "chunkId": "api/keys#mint" }, { "kind": "code", "literal": "warehouse.", "chunkId": "api/keys#mint" }, { "kind": "code", "literal": "layer", "chunkId": "api/keys#mint" }, { "kind": "code", "literal": "expiresAfter", "chunkId": "api/keys#mint" }, { "kind": "code", "literal": "never", "chunkId": "api/keys#mint" }, { "kind": "code", "literal": "365d", "chunkId": "api/keys#mint" }, { "kind": "code", "literal": "expiresAt", "chunkId": "api/keys#mint" }, { "kind": "code", "literal": "EntitlementTargetMissing", "chunkId": "api/keys#mint" } ], "sources": [ { "chunkId": 
"api/keys#mint", "url": "/docs/api/keys#mint", "anchor": "mint" } ], "mode": "source-primary", "terms": [ "mint", "await", "client", "mintkey", "name", "cohort", "reader", "owner", "acme", "entitlements", "vectorstore", "prod", "turbopuffer", "scopes", "read", "namespaces", "warehouse", "snowflake", "claims", "const", "notes", "expiresafter", "365d", "console", "token", "shown", "once", "never", "again", "created", "apikey", "description", "layer", "expiresat", "entitlementtargetmissing", "print", "hevlayer", "mintkeyrequest", "apikeyentitlements", "string" ] }, { "id": "api/keys#revoke-and-delete", "kind": "section", "title": "API keys", "heading": "Revoke and delete", "group": "API", "url": "/docs/api/keys#revoke-and-delete", "summary": "Revoke and delete await client.revokekey(\"0a1b2c3d-4e5f-6071-8293-a4b5c6d7e8f9\") await client.deletekey(\"0a1b2c3d-4e5f-6071-8293-a4b5c6d7e8f9\") , err := client.RevokeKey(ctx, \"0a1b2c3d-4e5f-6071-8293-a4b5c6d7e8f9\") , err…", "facts": [ { "kind": "code", "literal": "await client.revoke_key(\"0a1b2c3d-4e5f-6071-8293-a4b5c6d7e8f9\")\nawait client.delete_key(\"0a1b2c3d-4e5f-6071-8293-a4b5c6d7e8f9\")", "chunkId": "api/keys#revoke-and-delete" }, { "kind": "code", "literal": "_, err := client.RevokeKey(ctx, \"0a1b2c3d-4e5f-6071-8293-a4b5c6d7e8f9\")\n_, err = client.DeleteKey(ctx, \"0a1b2c3d-4e5f-6071-8293-a4b5c6d7e8f9\")", "chunkId": "api/keys#revoke-and-delete" }, { "kind": "code", "literal": "await client.revokeKey(\"0a1b2c3d-4e5f-6071-8293-a4b5c6d7e8f9\");\nawait client.deleteKey(\"0a1b2c3d-4e5f-6071-8293-a4b5c6d7e8f9\");", "chunkId": "api/keys#revoke-and-delete" }, { "kind": "code", "literal": "curl -X POST \"$LAYER_GATEWAY_URL/v2/keys/$KEY_ID/revoke\" \\\n -H \"Authorization: Bearer $LAYER_GATEWAY_API_KEY\"", "chunkId": "api/keys#revoke-and-delete" }, { "kind": "code", "literal": "?includeRevoked", "chunkId": "api/keys#revoke-and-delete" } ], "sources": [ { "chunkId": "api/keys#revoke-and-delete", "url": 
"/docs/api/keys#revoke-and-delete", "anchor": "revoke-and-delete" } ], "mode": "source-primary", "terms": [ "revoke", "delete", "await", "client", "revokekey", "0a1b2c3d", "4e5f", "6071", "8293", "a4b5c6d7e8f9", "deletekey", "curl", "post", "layer", "gateway", "keys", "authorization", "bearer", "includerevoked", "layergatewayurl", "keyid", "layergatewayapikey", "stops", "accepting", "revoked", "within", "seconds", "rotation", "mint", "deploy", "there", "place", "stay", "listings", "audit", "record", "itself", "should" ] }, { "id": "api/keys#routes", "kind": "section", "title": "API keys", "heading": "Routes", "group": "API", "url": "/docs/api/keys#routes", "summary": "Routes Route Method Auth Behavior /v2/keys POST admin Mint a key. The only response that contains the raw token. /v2/keys GET admin List keys — metadata, never material. ?includeRevoked adds revoked and expired keys. /v2…", "facts": [ { "kind": "code", "literal": "/v2/keys", "chunkId": "api/keys#routes" }, { "kind": "code", "literal": "?includeRevoked", "chunkId": "api/keys#routes" }, { "kind": "code", "literal": "/v2/keys/{keyId}", "chunkId": "api/keys#routes" }, { "kind": "code", "literal": "/v2/keys/{keyId}/revoke", "chunkId": "api/keys#routes" }, { "kind": "code", "literal": "/v2/keys/authenticate", "chunkId": "api/keys#routes" }, { "kind": "code", "literal": "layer", "chunkId": "api/keys#routes" }, { "kind": "code", "literal": "admin", "chunkId": "api/keys#routes" }, { "kind": "code", "literal": "authenticate", "chunkId": "api/keys#routes" } ], "sources": [ { "chunkId": "api/keys#routes", "url": "/docs/api/keys#routes", "anchor": "routes" } ], "mode": "source-primary", "terms": [ "routes", "route", "method", "auth", "behavior", "keys", "post", "admin", "mint", "only", "response", "contains", "token", "list", "metadata", "never", "material", "includerevoked", "adds", "revoked", "expired", "keyid", "revoke", "authenticate", "layer", "idempotent", "record", "stays", "delete", "hard", "none", 
"exchange", "identity", "entitlements", "here", "means", "entitlement", "scope", "bootstrap", "gateway" ] }, { "id": "api/keys#using-a-minted-key", "kind": "section", "title": "API keys", "heading": "Using a minted key", "group": "API", "url": "/docs/api/keys#using-a-minted-key", "summary": "Using a minted key A minted key works anywhere its entitlements reach. A vectorstore. entitlement opens data-plane routes against that store, inside its namespace globs: curl \"$LAYERGATEWAYURL/v2/namespaces/cohort-7/quer…", "facts": [ { "kind": "code", "literal": "curl \"$LAYER_GATEWAY_URL/v2/namespaces/cohort-7/query\" \\\n -X POST \\\n -H \"Authorization: Bearer hvl_iqGFsDD2PNkyhCqr59jjvKuKL47vqXMz\" \\\n -H \"Content-Type: application/json\" \\\n -d '{\"rank_by\": [\"text\", \"BM25\", \"acme\"], \"top_k\": 10}'", "chunkId": "api/keys#using-a-minted-key" }, { "kind": "code", "literal": "{\"error\": \"namespace not in key grant\", \"namespace\": \"orders\"}", "chunkId": "api/keys#using-a-minted-key" }, { "kind": "code", "literal": "{\"error\": \"insufficient API key scope\", \"required_scope\": \"admin\"}", "chunkId": "api/keys#using-a-minted-key" }, { "kind": "code", "literal": "vectorstore.", "chunkId": "api/keys#using-a-minted-key" } ], "sources": [ { "chunkId": "api/keys#using-a-minted-key", "url": "/docs/api/keys#using-a-minted-key", "anchor": "using-a-minted-key" } ], "mode": "source-primary", "terms": [ "minted", "works", "anywhere", "entitlements", "reach", "vectorstore", "entitlement", "opens", "data", "plane", "routes", "against", "store", "inside", "namespace", "globs", "curl", "layergatewayurl", "namespaces", "cohort", "quer", "layer", "gateway", "query", "post", "authorization", "bearer", "iqgfsdd2pnkyhcqr59jjvkukl47vqxmz", "content", "type", "application", "json", "rank", "text", "bm25", "acme", "error", "grant", "orders", "insufficient" ] }, { "id": "api/namespace-metadata", "kind": "section", "title": "Namespace metadata", "heading": null, "group": "API", 
"url": "/docs/api/namespace-metadata", "summary": "Read namespace metadata enriched with Layer freshness signals. The metadata payload is proxied verbatim from the upstream /v2/namespaces/{ns}/metadata endpoint. Schema, row counts, index status, and timestamps follow the…", "facts": [ { "kind": "code", "literal": "/v2/namespaces/{ns}/metadata", "chunkId": "api/namespace-metadata" }, { "kind": "value", "literal": "Upstream.astro", "chunkId": "api/namespace-metadata" }, { "kind": "value", "literal": "CodeTabs.astro", "chunkId": "api/namespace-metadata" }, { "kind": "value", "literal": "turbopuffer.com", "chunkId": "api/namespace-metadata" } ], "sources": [ { "chunkId": "api/namespace-metadata", "url": "/docs/api/namespace-metadata", "anchor": null } ], "mode": "source-primary", "terms": [ "read", "namespace", "metadata", "enriched", "layer", "freshness", "signals", "payload", "proxied", "verbatim", "upstream", "namespaces", "endpoint", "schema", "counts", "index", "status", "timestamps", "follow", "astro", "codetabs", "turbopuffer", "contract", "adds", "single", "object" ] }, { "id": "api/namespace-metadata#list-namespaces", "kind": "section", "title": "Namespace metadata", "heading": "List namespaces", "group": "API", "url": "/docs/api/namespace-metadata#list-namespaces", "summary": "List namespaces GET /v2/namespaces is a Layer-only augmented listing. It pages the upstream namespace list and enriches each row with stability and cache signals. 
It is the endpoint the dashboard's inventory view reads.…", "facts": [ { "kind": "code", "literal": "namespaces = await client.list_namespaces(prefix=\"prod\", page_size=100)", "chunkId": "api/namespace-metadata#list-namespaces" }, { "kind": "code", "literal": "namespaces, err := client.ListNamespaces(ctx, &hevlayer.ListNamespacesParams{\n Prefix: \"prod\",\n PageSize: 100,\n})", "chunkId": "api/namespace-metadata#list-namespaces" }, { "kind": "code", "literal": "const namespaces = await client.listNamespaces({\n prefix: \"prod\",\n pageSize: 100,\n});", "chunkId": "api/namespace-metadata#list-namespaces" }, { "kind": "code", "literal": "curl \"$LAYER_GATEWAY_URL/v2/namespaces?prefix=prod&page_size=100\" \\\n -H \"Authorization: Bearer $LAYER_GATEWAY_API_KEY\"", "chunkId": "api/namespace-metadata#list-namespaces" }, { "kind": "code", "literal": "{\n \"namespaces\": [\n {\n \"name\": \"products\",\n \"row_count\": 12500,\n \"size_bytes\": 48800000,\n \"stable_as_of_ms\": 1715600400000,\n \"is_stable\": true,\n \"index\": { \"status\": \"up-to-date\" },\n \"cache_state\": {\"state\": \"warm\", \"warm_inflight\": false},\n \"last_write_ms\": 1715600399000,\n \"shadow\": false,\n \"labels\": {}\n }\n ],\n \"next_cursor\": \"...\"\n}", "chunkId": "api/namespace-metadata#list-namespaces" }, { "kind": "code", "literal": "GET /v2/namespaces", "chunkId": "api/namespace-metadata#list-namespaces" }, { "kind": "code", "literal": "is_stable", "chunkId": "api/namespace-metadata#list-namespaces" }, { "kind": "code", "literal": "index.status", "chunkId": "api/namespace-metadata#list-namespaces" }, { "kind": "code", "literal": "\"up-to-date\"", "chunkId": "api/namespace-metadata#list-namespaces" }, { "kind": "code", "literal": "\"updating\"", "chunkId": "api/namespace-metadata#list-namespaces" }, { "kind": "code", "literal": "stable_as_of_ms", "chunkId": "api/namespace-metadata#list-namespaces" }, { "kind": "code", "literal": "indexed", "chunkId": 
"api/namespace-metadata#list-namespaces" }, { "kind": "code", "literal": "index_lag_rows", "chunkId": "api/namespace-metadata#list-namespaces" }, { "kind": "code", "literal": "GET /v2/namespaces/{namespace}/metadata", "chunkId": "api/namespace-metadata#list-namespaces" }, { "kind": "code", "literal": "index", "chunkId": "api/namespace-metadata#list-namespaces" }, { "kind": "code", "literal": "index.unindexed_bytes", "chunkId": "api/namespace-metadata#list-namespaces" }, { "kind": "code", "literal": "updating", "chunkId": "api/namespace-metadata#list-namespaces" }, { "kind": "code", "literal": "prefix", "chunkId": "api/namespace-metadata#list-namespaces" }, { "kind": "code", "literal": "cursor", "chunkId": "api/namespace-metadata#list-namespaces" }, { "kind": "code", "literal": "next_cursor", "chunkId": "api/namespace-metadata#list-namespaces" }, { "kind": "code", "literal": "page_size", "chunkId": "api/namespace-metadata#list-namespaces" }, { "kind": "code", "literal": "metadata_error", "chunkId": "api/namespace-metadata#list-namespaces" }, { "kind": "code", "literal": "NAMESPACE_LIST_CACHE_TTL_MS", "chunkId": "api/namespace-metadata#list-namespaces" }, { "kind": "code", "literal": "10000", "chunkId": "api/namespace-metadata#list-namespaces" } ], "sources": [ { "chunkId": "api/namespace-metadata#list-namespaces", "url": "/docs/api/namespace-metadata#list-namespaces", "anchor": "list-namespaces" } ], "mode": "source-primary", "terms": [ "list", "namespaces", "layer", "only", "augmented", "listing", "pages", "upstream", "namespace", "enriches", "stability", "cache", "signals", "endpoint", "dashboard", "inventory", "view", "reads", "await", "client", "prefix", "prod", "page", "size", "listnamespaces", "hevlayer", "listnamespacesparams", "pagesize", "const", "curl", "gateway", "authorization", "bearer", "name", "products", "count", "12500", "bytes", "48800000", "stable" ] }, { "id": "api/namespace-metadata#request", "kind": "section", "title": "Namespace metadata", 
"heading": "Request", "group": "API", "url": "/docs/api/namespace-metadata#request", "summary": "Request metadata = await client.getnamespacemetadata(\"products\") metadata, err := client.GetNamespaceMetadata(ctx, \"products\") const metadata = await client.getNamespaceMetadata(\"products\"); curl \"$LAYERGATEWAYURL/v2/nam…", "facts": [ { "kind": "code", "literal": "metadata = await client.get_namespace_metadata(\"products\")", "chunkId": "api/namespace-metadata#request" }, { "kind": "code", "literal": "metadata, err := client.GetNamespaceMetadata(ctx, \"products\")", "chunkId": "api/namespace-metadata#request" }, { "kind": "code", "literal": "const metadata = await client.getNamespaceMetadata(\"products\");", "chunkId": "api/namespace-metadata#request" }, { "kind": "code", "literal": "curl \"$LAYER_GATEWAY_URL/v2/namespaces/products/metadata\" \\\n -H \"Authorization: Bearer $LAYER_GATEWAY_API_KEY\"", "chunkId": "api/namespace-metadata#request" } ], "sources": [ { "chunkId": "api/namespace-metadata#request", "url": "/docs/api/namespace-metadata#request", "anchor": "request" } ], "mode": "source-primary", "terms": [ "request", "metadata", "await", "client", "getnamespacemetadata", "products", "const", "curl", "layergatewayurl", "namespace", "layer", "gateway", "namespaces", "authorization", "bearer", "layergatewayapikey", "proxied", "turbopuffer", "verbatim", "schema", "approxrowcount", "12500", "approxlogicalbytes", "48800000", "createdat", "2026", "15t10", "updatedat", "12t18", "lastwriteat", "index", "status", "date", "enhancement", "stableasof", "1715600400000", "isstable", "true", "indexed", "indexlagrows" ] }, { "id": "api/namespace-metadata#the-layer-block", "kind": "section", "title": "Namespace metadata", "heading": "The layer block", "group": "API", "url": "/docs/api/namespace-metadata#the-layer-block", "summary": "The layer block Field Meaning stableasof Epoch-ms watermark from the most recent stable poll. 
Null on cold start before the watcher has observed a stable namespace. is_stable Whether the most recent poll observed index.st…", "facts": [ { "kind": "code", "literal": "layer", "chunkId": "api/namespace-metadata#the-layer-block" }, { "kind": "code", "literal": "stable_as_of", "chunkId": "api/namespace-metadata#the-layer-block" }, { "kind": "code", "literal": "is_stable", "chunkId": "api/namespace-metadata#the-layer-block" }, { "kind": "code", "literal": "index.status == \"up-to-date\"", "chunkId": "api/namespace-metadata#the-layer-block" }, { "kind": "code", "literal": "indexed", "chunkId": "api/namespace-metadata#the-layer-block" }, { "kind": "code", "literal": "index_lag_rows", "chunkId": "api/namespace-metadata#the-layer-block" }, { "kind": "code", "literal": "CONSISTENCY_STABLE_POLL_INTERVAL_MS", "chunkId": "api/namespace-metadata#the-layer-block" }, { "kind": "code", "literal": "CONSISTENCY_POLL_INTERVAL_MS", "chunkId": "api/namespace-metadata#the-layer-block" }, { "kind": "code", "literal": "is_stable: true", "chunkId": "api/namespace-metadata#the-layer-block" }, { "kind": "code", "literal": "indexed: false", "chunkId": "api/namespace-metadata#the-layer-block" } ], "sources": [ { "chunkId": "api/namespace-metadata#the-layer-block", "url": "/docs/api/namespace-metadata#the-layer-block", "anchor": "the-layer-block" } ], "mode": "source-primary", "terms": [ "layer", "block", "field", "meaning", "stableasof", "epoch", "watermark", "most", "recent", "stable", "poll", "null", "cold", "start", "before", "watcher", "observed", "namespace", "isstable", "whether", "index", "status", "date", "indexed", "rows", "consistency", "interval", "true", "false", "once", "catches", "every", "carries", "vector", "snapshot", "count", "caught", "while", "still", "awaiting" ] }, { "id": "api/pipelines", "kind": "section", "title": "Pipelines", "heading": null, "group": "API", "url": "/docs/api/pipelines", "summary": "Organize a two-stage indexing pipeline: extract + chunk on CPU,
embed on GPU, trigger runs and wait for completion. The pipeline API keeps the code you need to index data simple and organized. A typical pipeline has two…", "facts": [ { "kind": "value", "literal": "CodeTabs.astro", "chunkId": "api/pipelines" } ], "sources": [ { "chunkId": "api/pipelines", "url": "/docs/api/pipelines", "anchor": null } ], "mode": "source-primary", "terms": [ "organize", "stage", "indexing", "pipeline", "extract", "chunk", "embed", "trigger", "runs", "wait", "completion", "keeps", "code", "need", "index", "data", "simple", "organized", "typical", "codetabs", "astro", "stages", "extraction", "chunking", "followed", "embedding", "guide", "walks", "through", "best", "practice", "layout", "concepts", "expand" ] }, { "id": "api/pipelines#deploy", "kind": "section", "title": "Pipelines", "heading": "Deploy", "group": "API", "url": "/docs/api/pipelines#deploy", "summary": "Deploy Build the two workers into the images your YAML references and push them to a registry your cluster can pull — Layer does not build images. 
Then apply the resources: kubectl apply -f pipelines/ The operator create…", "facts": [ { "kind": "code", "literal": "kubectl apply -f pipelines/", "chunkId": "api/pipelines#deploy" } ], "sources": [ { "chunkId": "api/pipelines#deploy", "url": "/docs/api/pipelines#deploy", "anchor": "deploy" } ], "mode": "source-primary", "terms": [ "deploy", "build", "workers", "images", "yaml", "references", "push", "registry", "cluster", "pull", "layer", "does", "apply", "resources", "kubectl", "pipelines", "operator", "create", "creates", "deployment", "resource", "embed", "pool", "keda", "object", "order", "doesn", "matter", "here", "gateway", "pipeline", "before", "enqueues", "batch", "staging", "exist", "returns", "never", "missing", "nothing" ] }, { "id": "api/pipelines#document-lifecycle", "kind": "section", "title": "Pipelines", "heading": "Document lifecycle", "group": "API", "url": "/docs/api/pipelines#document-lifecycle", "summary": "Document lifecycle put chunks put vectors (new doc) ──────────► pending ──────────────► indexed ▲ │ re-stage (idempotent) pending — chunks stored, waiting for embedding. indexed — vectors written to Turbopuffer. 
embeddin…", "facts": [ { "kind": "code", "literal": "put chunks put vectors\n (new doc) ──────────► pending ──────────────► indexed\n ▲\n │ re-stage (idempotent)", "chunkId": "api/pipelines#document-lifecycle" }, { "kind": "code", "literal": "embedding", "chunkId": "api/pipelines#document-lifecycle" }, { "kind": "code", "literal": "pending", "chunkId": "api/pipelines#document-lifecycle" } ], "sources": [ { "chunkId": "api/pipelines#document-lifecycle", "url": "/docs/api/pipelines#document-lifecycle", "anchor": "document-lifecycle" } ], "mode": "source-primary", "terms": [ "document", "lifecycle", "chunks", "vectors", "pending", "indexed", "stage", "idempotent", "stored", "waiting", "embedding", "written", "turbopuffer", "embeddin", "claim", "documents", "only", "while", "leased", "worker", "recover", "lease", "expires", "staging", "resets", "reprocess", "after", "source", "data", "changes" ] }, { "id": "api/pipelines#embed", "kind": "section", "title": "Pipelines", "heading": "Embed", "group": "API", "url": "/docs/api/pipelines#embed", "summary": "Embed The GPU worker claims pending documents, reads their chunks back, and writes vectors. Writing vectors upserts to Turbopuffer and marks the document indexed. 
Claims are leased, so a worker that crashes loses nothing…", "facts": [ { "kind": "code", "literal": "indexed", "chunkId": "api/pipelines#embed" } ], "sources": [ { "chunkId": "api/pipelines#embed", "url": "/docs/api/pipelines#embed", "anchor": "embed" } ], "mode": "source-primary", "terms": [ "embed", "worker", "claims", "pending", "documents", "reads", "their", "chunks", "back", "writes", "vectors", "writing", "upserts", "turbopuffer", "marks", "document", "indexed", "leased", "crashes", "loses", "nothing", "hevlayer", "import", "asynchevlayer", "sentencetransformers", "sentencetransformer", "pipeline", "environ", "hevlayerpipelineid", "model", "minilm", "async", "main", "none", "baseurl", "hevlayerbaseurl", "apikey", "layergatewayapikey", "layer", "while" ] }, { "id": "api/pipelines#extract-and-chunk", "kind": "section", "title": "Pipelines", "heading": "Extract and chunk", "group": "API", "url": "/docs/api/pipelines#extract-and-chunk", "summary": "Extract and chunk The CPU worker reads the source, splits text into chunks, and stages them. Staging chunks stores them durably (S3, cached in the document cache) and marks the document pending. 
The worker hardcodes noth…", "facts": [ { "kind": "code", "literal": "pending", "chunkId": "api/pipelines#extract-and-chunk" }, { "kind": "code", "literal": "spec.sourceRef", "chunkId": "api/pipelines#extract-and-chunk" }, { "kind": "code", "literal": "sourceRef", "chunkId": "api/pipelines#extract-and-chunk" }, { "kind": "code", "literal": "pipelines/extract-chunk.yaml", "chunkId": "api/pipelines#extract-and-chunk" } ], "sources": [ { "chunkId": "api/pipelines#extract-and-chunk", "url": "/docs/api/pipelines#extract-and-chunk", "anchor": "extract-and-chunk" } ], "mode": "source-primary", "terms": [ "extract", "chunk", "worker", "reads", "source", "splits", "text", "chunks", "stages", "staging", "stores", "durably", "cached", "document", "cache", "marks", "pending", "hardcodes", "noth", "spec", "sourceref", "pipelines", "yaml", "nothing", "operator", "injects", "pipeline", "gateway", "environment", "variables", "page", "queue", "below", "comes", "declared", "extractchunk", "hevlayer", "import", "asynchevlayer", "environ" ] }, { "id": "api/pipelines#failure-model", "kind": "section", "title": "Pipelines", "heading": "Failure model", "group": "API", "url": "/docs/api/pipelines#failure-model", "summary": "Failure model Turbopuffer write failures are hard: the vectors route returns 502 and the document stays in embedding for re-claim. 
Aerospike cache failures do not block chunk reads when S3 backing is present; PostgreSQL…", "facts": [ { "kind": "code", "literal": "embedding", "chunkId": "api/pipelines#failure-model" } ], "sources": [ { "chunkId": "api/pipelines#failure-model", "url": "/docs/api/pipelines#failure-model", "anchor": "failure-model" } ], "mode": "source-primary", "terms": [ "failure", "model", "turbopuffer", "write", "failures", "hard", "vectors", "route", "returns", "document", "stays", "embedding", "claim", "aerospike", "cache", "block", "chunk", "reads", "backing", "present", "postgresql", "connectivity", "return", "should", "retried", "backoff", "stop", "writes", "recovery", "path", "metrics", "watch", "live", "mode", "runbook", "lease", "expiry", "handled", "server", "side" ] }, { "id": "api/pipelines#file-tree", "kind": "section", "title": "Pipelines", "heading": "File tree", "group": "API", "url": "/docs/api/pipelines#file-tree", "summary": "File tree indexer/ ├── pipelines/ │ ├── extract-chunk.yaml # CPU stage — Pipeline resource │ └── embed.yaml # GPU stage — Pipeline resource ├── extract_chunk.py # read the source, stage chunks ├── embed.py # claim pending…", "facts": [ { "kind": "code", "literal": "indexer/\n├── pipelines/\n│ ├── extract-chunk.yaml # CPU stage — Pipeline resource\n│ └── embed.yaml # GPU stage — Pipeline resource\n├── extract_chunk.py # read the source, stage chunks\n├── embed.py # claim pending docs, write vectors\n└── app.py # REST API: trigger a run, wait for completion", "chunkId": "api/pipelines#file-tree" }, { "kind": "code", "literal": "pipelineId: products", "chunkId": "api/pipelines#file-tree" } ], "sources": [ { "chunkId": "api/pipelines#file-tree", "url": "/docs/api/pipelines#file-tree", "anchor": "file-tree" } ], "mode": "source-primary", "terms": [ "file", "tree", "indexer", "pipelines", "extract", "chunk", "yaml", "stage", "pipeline", "resource", "embed", "extractchunk", "read", "source", "chunks", "claim", "pending", "docs", "write",
"vectors", "rest", "trigger", "wait", "completion", "pipelineid", "products", "files", "declare", "worker", "images", "pools", "scaling", "fields", "both", "workers", "share", "queue", "page", "code", "shown" ] }, { "id": "api/pipelines#trigger-a-run", "kind": "section", "title": "Pipelines", "heading": "Trigger a run", "group": "API", "url": "/docs/api/pipelines#trigger-a-run", "summary": "Trigger a run The app exposes the pipeline to the rest of your system as one endpoint: POST /index-runs sends a batch to the source queue, then waits for the run to complete and returns the snapshot it produced. The pipe…", "facts": [ { "kind": "code", "literal": "POST /index-runs", "chunkId": "api/pipelines#trigger-a-run" } ], "sources": [ { "chunkId": "api/pipelines#trigger-a-run", "url": "/docs/api/pipelines#trigger-a-run", "anchor": "trigger-a-run" } ], "mode": "source-primary", "terms": [ "trigger", "exposes", "pipeline", "rest", "system", "endpoint", "post", "index", "runs", "sends", "batch", "source", "queue", "waits", "complete", "returns", "snapshot", "produced", "pipe", "created", "first", "target", "namespace", "code", "fastapi", "import", "hevlayer", "asynchevlayer", "hevlayererror", "https", "east", "amazonaws", "123456789", "product", "updates", "boto3", "client", "layer", "baseurl", "environ" ] }, { "id": "api/pipelines#wait-for-completion", "kind": "section", "title": "Pipelines", "heading": "Wait for completion", "group": "API", "url": "/docs/api/pipelines#wait-for-completion", "summary": "Wait for completion A run is complete in two steps: the queue drains, then the consistency watcher observes the namespace stable and writes a snapshot past the run's watermark. 
pending_count is the same signal KEDA scales…", "facts": [ { "kind": "code", "literal": "pending_count", "chunkId": "api/pipelines#wait-for-completion" } ], "sources": [ { "chunkId": "api/pipelines#wait-for-completion", "url": "/docs/api/pipelines#wait-for-completion", "anchor": "wait-for-completion" } ], "mode": "source-primary", "terms": [ "wait", "completion", "complete", "steps", "queue", "drains", "consistency", "watcher", "observes", "namespace", "stable", "writes", "snapshot", "past", "watermark", "pendingcount", "same", "signal", "keda", "scales", "pending", "count", "reaches", "zero", "embed", "pool", "back", "addresses", "facet", "listings", "counts", "exact", "flip", "application", "async", "drain", "none", "while", "true", "status" ] }, { "id": "api/query", "kind": "section", "title": "Query & Fetch", "heading": null, "group": "API", "url": "/docs/api/query", "summary": "Vector similarity search with stable reads, query by id, and cached document fetch. Query response bodies are wire-compatible with the upstream POST /v2/namespaces/{ns}/query endpoint.
Layer metadata is reported in x-lay…", "facts": [ { "kind": "code", "literal": "POST /v2/namespaces/{ns}/query", "chunkId": "api/query" }, { "kind": "code", "literal": "x-layer-*", "chunkId": "api/query" }, { "kind": "value", "literal": "Upstream.astro", "chunkId": "api/query" }, { "kind": "value", "literal": "CodeTabs.astro", "chunkId": "api/query" }, { "kind": "value", "literal": "turbopuffer.com", "chunkId": "api/query" } ], "sources": [ { "chunkId": "api/query", "url": "/docs/api/query", "anchor": null } ], "mode": "source-primary", "terms": [ "vector", "similarity", "search", "stable", "reads", "query", "cached", "document", "fetch", "response", "bodies", "wire", "compatible", "upstream", "post", "namespaces", "endpoint", "layer", "metadata", "reported", "astro", "codetabs", "turbopuffer", "headers" ] }, { "id": "api/query#batch-fetch", "kind": "section", "title": "Query & Fetch", "heading": "Batch fetch", "group": "API", "url": "/docs/api/query#batch-fetch", "summary": "Batch fetch batch = await client.fetch_documents(\"products\", { \"ids\": [\"asin-1\", \"asin-2\", \"asin-3\"], \"include_attributes\": [\"title\"], }) batch, err := client.FetchDocuments(ctx, \"products\", &hevlayer.FetchDocumentsRequest…", "facts": [ { "kind": "code", "literal": "batch = await client.fetch_documents(\"products\", {\n \"ids\": [\"asin-1\", \"asin-2\", \"asin-3\"],\n \"include_attributes\": [\"title\"],\n})", "chunkId": "api/query#batch-fetch" }, { "kind": "code", "literal": "batch, err := client.FetchDocuments(ctx, \"products\", &hevlayer.FetchDocumentsRequest{\n Ids: []string{\"asin-1\", \"asin-2\", \"asin-3\"},\n IncludeAttributes: []string{\"title\"},\n})", "chunkId": "api/query#batch-fetch" }, { "kind": "code", "literal": "const batch = await client.fetchDocuments(\"products\", {\n ids: [\"asin-1\", \"asin-2\", \"asin-3\"],\n include_attributes: [\"title\"],\n});", "chunkId": "api/query#batch-fetch" }, { "kind": "code", "literal": "curl -X POST
\"$LAYER_GATEWAY_URL/v2/namespaces/products/documents\" \\\n -H \"Authorization: Bearer $LAYER_GATEWAY_API_KEY\" \\\n -H \"Content-Type: application/json\" \\\n -d '{\n \"ids\": [\"asin-1\", \"asin-2\", \"asin-3\"],\n \"include_attributes\": [\"title\"]\n }'", "chunkId": "api/query#batch-fetch" }, { "kind": "code", "literal": "{\n \"documents\": [\n {\"id\": \"asin-1\", \"attributes\": {\"title\": \"...\"}},\n {\"id\": \"asin-3\", \"attributes\": {\"title\": \"...\"}}\n ],\n \"missing\": [\"asin-2\"]\n}", "chunkId": "api/query#batch-fetch" }, { "kind": "code", "literal": "documents", "chunkId": "api/query#batch-fetch" }, { "kind": "code", "literal": "missing", "chunkId": "api/query#batch-fetch" } ], "sources": [ { "chunkId": "api/query#batch-fetch", "url": "/docs/api/query#batch-fetch", "anchor": "batch-fetch" } ], "mode": "source-primary", "terms": [ "batch", "fetch", "await", "client", "fetchdocuments", "products", "asin", "includeattributes", "title", "hevlayer", "fetchdocumentsrequest", "documents", "include", "attributes", "string", "const", "curl", "post", "layer", "gateway", "namespaces", "authorization", "bearer", "content", "type", "application", "json", "missing", "layergatewayurl", "layergatewayapikey", "returns", "found", "inline", "instead", "partial", "preserves", "request", "order", "could", "find" ] }, { "id": "api/query#behavior-matrix", "kind": "section", "title": "Query & Fetch", "heading": "Behavior matrix", "group": "API", "url": "/docs/api/query#behavior-matrix", "summary": "Behavior matrix Cache state Single fetch Batch fetch Hit cache cache Miss, upstream present upstream + backfill upstream + backfill Miss, upstream absent 404 inline missing Cache unavailable upstream, miss-on-error upstr…", "facts": [ { "kind": "code", "literal": "missing", "chunkId": "api/query#behavior-matrix" }, { "kind": "code", "literal": "miss-on-error", "chunkId": "api/query#behavior-matrix" } ], "sources": [ { "chunkId": "api/query#behavior-matrix", "url": 
"/docs/api/query#behavior-matrix", "anchor": "behavior-matrix" } ], "mode": "source-primary", "terms": [ "behavior", "matrix", "cache", "state", "single", "fetch", "batch", "miss", "upstream", "present", "backfill", "absent", "inline", "missing", "unavailable", "error", "upstr" ] }, { "id": "api/query#counting-matches", "kind": "section", "title": "Query & Fetch", "heading": "Counting matches", "group": "API", "url": "/docs/api/query#counting-matches", "summary": "Counting matches To count how many rows match a full-text or vector query, use scan count mode with the fts or ann selector. Ranked counts share the single /scans endpoint with filter counts — fts is exact, ann is a radi…", "facts": [ { "kind": "code", "literal": "fts", "chunkId": "api/query#counting-matches" }, { "kind": "code", "literal": "ann", "chunkId": "api/query#counting-matches" }, { "kind": "code", "literal": "/scans", "chunkId": "api/query#counting-matches" }, { "kind": "code", "literal": "approximate", "chunkId": "api/query#counting-matches" }, { "kind": "code", "literal": "exhaustive", "chunkId": "api/query#counting-matches" } ], "sources": [ { "chunkId": "api/query#counting-matches", "url": "/docs/api/query#counting-matches", "anchor": "counting-matches" } ], "mode": "source-primary", "terms": [ "counting", "matches", "count", "many", "rows", "match", "full", "text", "vector", "query", "scan", "mode", "selector", "ranked", "counts", "share", "single", "scans", "endpoint", "filter", "exact", "radi", "approximate", "exhaustive", "radius", "flagged", "both", "honor", "flag", "deadline" ] }, { "id": "api/query#fetch", "kind": "section", "title": "Query & Fetch", "heading": "Fetch", "group": "API", "url": "/docs/api/query#fetch", "summary": "Fetch Fetch is a Layer-only endpoint with no upstream equivalent. 
The NVMe cache is checked first; on miss or error the gateway falls through to Turbopuffer and backfills the cache best-effort.", "facts": [], "sources": [ { "chunkId": "api/query#fetch", "url": "/docs/api/query#fetch", "anchor": "fetch" } ], "mode": "source-primary", "terms": [ "fetch", "layer", "only", "endpoint", "upstream", "equivalent", "nvme", "cache", "checked", "first", "miss", "error", "gateway", "falls", "through", "turbopuffer", "backfills", "best", "effort" ] }, { "id": "api/query#hybrid-text-fusion", "kind": "section", "title": "Query & Fetch", "heading": "Hybrid text fusion", "group": "API", "url": "/docs/api/query#hybrid-text-fusion", "summary": "Hybrid text fusion BM25 misses typos and morphological variants; fuzzy matching alone loses the relevance signal BM25 provides. HybridText runs both in one request: the gateway tokenizes your input string, expands it int…", "facts": [ { "kind": "code", "literal": "response = await client.query_namespace(\"support-tickets\", {\n \"rank_by\": [\"content\", \"HybridText\", \"conection timout kubernets\"],\n \"top_k\": 10,\n \"filters\": [\"tenant\", \"Eq\", \"t-42\"],\n \"include_attributes\": [\"content\", \"title\"],\n})", "chunkId": "api/query#hybrid-text-fusion" }, { "kind": "code", "literal": "response, err := client.QueryNamespace(ctx, \"support-tickets\", &hevlayer.QueryRequest{\n RankBy: []any{\"content\", \"HybridText\", \"conection timout kubernets\"},\n TopK: 10,\n Filters: []any{\"tenant\", \"Eq\", \"t-42\"},\n IncludeAttributes: []string{\"content\", \"title\"},\n})", "chunkId": "api/query#hybrid-text-fusion" }, { "kind": "code", "literal": "const response = await client.queryNamespace(\"support-tickets\", {\n rank_by: [\"content\", \"HybridText\", \"conection timout kubernets\"],\n top_k: 10,\n filters: [\"tenant\", \"Eq\", \"t-42\"],\n include_attributes: [\"content\", \"title\"],\n});", "chunkId": "api/query#hybrid-text-fusion" }, { "kind": "code", "literal": "curl -X POST 
\"$LAYER_GATEWAY_URL/v2/namespaces/support-tickets/query\" \\\n -H \"Authorization: Bearer $LAYER_GATEWAY_API_KEY\" \\\n -H \"Content-Type: application/json\" \\\n -d '{\n \"rank_by\": [\"content\", \"HybridText\", \"conection timout kubernets\"],\n \"top_k\": 10,\n \"filters\": [\"tenant\", \"Eq\", \"t-42\"],\n \"include_attributes\": [\"content\", \"title\"]\n }'", "chunkId": "api/query#hybrid-text-fusion" }, { "kind": "code", "literal": "[\"content\", \"HybridText\", \"conection timout kubernets\", {\n \"fuzziness\": \"auto\",\n \"rank_constant\": 60,\n \"per_leg_limit\": null\n}]", "chunkId": "api/query#hybrid-text-fusion" }, { "kind": "code", "literal": "HybridText", "chunkId": "api/query#hybrid-text-fusion" }, { "kind": "code", "literal": "rank_by", "chunkId": "api/query#hybrid-text-fusion" }, { "kind": "code", "literal": "alyze", "chunkId": "api/query#hybrid-text-fusion" }, { "kind": "code", "literal": "fuzziness", "chunkId": "api/query#hybrid-text-fusion" }, { "kind": "code", "literal": "\"auto\"", "chunkId": "api/query#hybrid-text-fusion" }, { "kind": "code", "literal": "rank_constant", "chunkId": "api/query#hybrid-text-fusion" }, { "kind": "code", "literal": "60", "chunkId": "api/query#hybrid-text-fusion" }, { "kind": "code", "literal": "per_leg_limit", "chunkId": "api/query#hybrid-text-fusion" }, { "kind": "code", "literal": "clamp(5 × top_k, 50, 200)", "chunkId": "api/query#hybrid-text-fusion" }, { "kind": "code", "literal": "threads", "chunkId": "api/query#hybrid-text-fusion" }, { "kind": "code", "literal": "Index.spec.scan.threads", "chunkId": "api/query#hybrid-text-fusion" }, { "kind": "value", "literal": "github.com", "chunkId": "api/query#hybrid-text-fusion" } ], "sources": [ { "chunkId": "api/query#hybrid-text-fusion", "url": "/docs/api/query#hybrid-text-fusion", "anchor": "hybrid-text-fusion" } ], "mode": "source-primary", "terms": [ "hybrid", "text", "fusion", "bm25", "misses", "typos", "morphological", "variants", "fuzzy", "matching", "alone", 
"loses", "relevance", "signal", "provides", "hybridtext", "runs", "both", "request", "gateway", "tokenizes", "input", "string", "expands", "response", "await", "client", "query", "namespace", "support", "tickets", "rank", "content", "conection", "timout", "kubernets", "filters", "tenant", "include", "attributes" ] }, { "id": "api/query#multi-query", "kind": "section", "title": "Query & Fetch", "heading": "Multi-query", "group": "API", "url": "/docs/api/query#multi-query", "summary": "Multi-query nearesttoid fuses several seeds into a single ranking. To run several independent queries in one round trip, each with its own ranking, post a queries array. The response is a parallel results array: one rank…", "facts": [ { "kind": "code", "literal": "batch = await client.multi_query_turbopuffer_namespace(\"products\", {\n \"queries\": [\n {\"rank_by\": [\"vector\", \"ANN\", [0.1, 0.2, 0.3]], \"top_k\": 10},\n {\"rank_by\": [\"title\", \"BM25\", \"wireless earbuds\"], \"top_k\": 10},\n ],\n})\n# batch.results[0].rows ranked by vector; batch.results[1].rows by text", "chunkId": "api/query#multi-query" }, { "kind": "code", "literal": "batch, err := client.MultiQueryTurbopufferNamespace(ctx, \"products\",\n &hevlayer.TurbopufferMultiQueryRequest{\n Queries: []hevlayer.TurbopufferQueryRequest{\n {\"rank_by\": []any{\"vector\", \"ANN\", []float64{0.1, 0.2, 0.3}}, \"top_k\": 10},\n {\"rank_by\": []any{\"title\", \"BM25\", \"wireless earbuds\"}, \"top_k\": 10},\n },\n })", "chunkId": "api/query#multi-query" }, { "kind": "code", "literal": "const batch = await client.multiQueryTurbopufferNamespace(\"products\", {\n queries: [\n { rank_by: [\"vector\", \"ANN\", [0.1, 0.2, 0.3]], top_k: 10 },\n { rank_by: [\"title\", \"BM25\", \"wireless earbuds\"], top_k: 10 },\n ],\n});\n// batch.results[0].rows ranked by vector; batch.results[1].rows by text", "chunkId": "api/query#multi-query" }, { "kind": "code", "literal": "curl -X POST 
\"$LAYER_GATEWAY_URL/v2/namespaces/products/query?stainless_overload=multiQuery\" \\\n -H \"Authorization: Bearer $LAYER_GATEWAY_API_KEY\" \\\n -H \"Content-Type: application/json\" \\\n -d '{\n \"queries\": [\n {\"rank_by\": [\"vector\", \"ANN\", [0.1, 0.2, 0.3]], \"top_k\": 10},\n {\"rank_by\": [\"title\", \"BM25\", \"wireless earbuds\"], \"top_k\": 10}\n ]\n }'", "chunkId": "api/query#multi-query" }, { "kind": "code", "literal": "nearest_to_id", "chunkId": "api/query#multi-query" }, { "kind": "code", "literal": "queries", "chunkId": "api/query#multi-query" }, { "kind": "code", "literal": "results", "chunkId": "api/query#multi-query" }, { "kind": "code", "literal": "rank_by", "chunkId": "api/query#multi-query" }, { "kind": "code", "literal": "{ \"results\": [{ \"rows\": ... }] }", "chunkId": "api/query#multi-query" }, { "kind": "code", "literal": "x-layer-stable-as-of", "chunkId": "api/query#multi-query" }, { "kind": "code", "literal": "vector", "chunkId": "api/query#multi-query" }, { "kind": "code", "literal": "cursor", "chunkId": "api/query#multi-query" }, { "kind": "code", "literal": "rerank_by", "chunkId": "api/query#multi-query" }, { "kind": "value", "literal": "turbopuffer.com", "chunkId": "api/query#multi-query" } ], "sources": [ { "chunkId": "api/query#multi-query", "url": "/docs/api/query#multi-query", "anchor": "multi-query" } ], "mode": "source-primary", "terms": [ "multi", "query", "nearesttoid", "fuses", "several", "seeds", "single", "ranking", "independent", "queries", "round", "trip", "post", "array", "response", "parallel", "results", "rank", "batch", "await", "client", "turbopuffer", "namespace", "products", "vector", "title", "bm25", "wireless", "earbuds", "rows", "ranked", "text", "multiqueryturbopuffernamespace", "hevlayer", "turbopuffermultiqueryrequest", "turbopufferqueryrequest", "float64", "const", "curl", "layer" ] }, { "id": "api/query#options", "kind": "section", "title": "Query & Fetch", "heading": "Options", "group": "API", "url": 
"/docs/api/query#options", "summary": "Options The optional fourth tuple element: Option Default Meaning route \"auto\" Force \"hybridtext\", \"semantic\", or \"fused\" instead of applying the policy. Used on re-issue after a deferral, and for A/B comparison of strat…", "facts": [ { "kind": "code", "literal": "route", "chunkId": "api/query#options" }, { "kind": "code", "literal": "\"auto\"", "chunkId": "api/query#options" }, { "kind": "code", "literal": "\"hybrid_text\"", "chunkId": "api/query#options" }, { "kind": "code", "literal": "\"semantic\"", "chunkId": "api/query#options" }, { "kind": "code", "literal": "\"fused\"", "chunkId": "api/query#options" }, { "kind": "code", "literal": "vector", "chunkId": "api/query#options" }, { "kind": "code", "literal": "hybrid", "chunkId": "api/query#options" }, { "kind": "code", "literal": "routing", "chunkId": "api/query#options" } ], "sources": [ { "chunkId": "api/query#options", "url": "/docs/api/query#options", "anchor": "options" } ], "mode": "source-primary", "terms": [ "options", "optional", "fourth", "tuple", "element", "option", "default", "meaning", "route", "auto", "force", "hybridtext", "semantic", "fused", "instead", "applying", "policy", "issue", "after", "deferral", "comparison", "strat", "hybrid", "text", "vector", "routing", "strategies", "same", "input", "query", "dimensionality", "must", "match", "namespace", "chosen", "expands", "legs", "defaults", "apply", "echo" ] }, { "id": "api/query#query-by-id", "kind": "section", "title": "Query & Fetch", "heading": "Query by id", "group": "API", "url": "/docs/api/query#query-by-id", "summary": "Query by id Pass nearesttoid in place of vector to rank by stored document vectors instead of a raw query vector — exactly one of the two is required. 
nearest_to_id takes an array of document ids: the gateway resolves each…", "facts": [ { "kind": "code", "literal": "response = await client.query_namespace(\"products\", {\n \"nearest_to_id\": [\"asin-B08N5WRWNW\", \"asin-B07PXGQC1Q\"],\n \"top_k\": 10,\n \"include_attributes\": [\"title\", \"category\"],\n})", "chunkId": "api/query#query-by-id" }, { "kind": "code", "literal": "response, err := client.QueryNamespace(ctx, \"products\", &hevlayer.QueryRequest{\n NearestToID: []string{\"asin-B08N5WRWNW\", \"asin-B07PXGQC1Q\"},\n TopK: 10,\n IncludeAttributes: []string{\"title\", \"category\"},\n})", "chunkId": "api/query#query-by-id" }, { "kind": "code", "literal": "const response = await client.queryNamespace(\"products\", {\n nearest_to_id: [\"asin-B08N5WRWNW\", \"asin-B07PXGQC1Q\"],\n top_k: 10,\n include_attributes: [\"title\", \"category\"],\n});", "chunkId": "api/query#query-by-id" }, { "kind": "code", "literal": "curl -X POST \"$LAYER_GATEWAY_URL/v2/namespaces/products/query\" \\\n -H \"Authorization: Bearer $LAYER_GATEWAY_API_KEY\" \\\n -H \"Content-Type: application/json\" \\\n -d '{\n \"nearest_to_id\": [\"asin-B08N5WRWNW\", \"asin-B07PXGQC1Q\"],\n \"top_k\": 10,\n \"include_attributes\": [\"title\", \"category\"]\n }'", "chunkId": "api/query#query-by-id" }, { "kind": "code", "literal": "nearest_to_id", "chunkId": "api/query#query-by-id" }, { "kind": "code", "literal": "vector", "chunkId": "api/query#query-by-id" } ], "sources": [ { "chunkId": "api/query#query-by-id", "url": "/docs/api/query#query-by-id", "anchor": "query-by-id" } ], "mode": "source-primary", "terms": [ "query", "pass", "nearesttoid", "place", "vector", "rank", "stored", "document", "vectors", "instead", "exactly", "required", "takes", "array", "gateway", "resolves", "response", "await", "client", "namespace", "products", "nearest", "asin", "b08n5wrwnw", "b07pxgqc1q", "include", "attributes", "title", "category", "querynamespace", "hevlayer", "queryrequest", "string", "topk",
"includeattributes", "const", "curl", "post", "layer", "namespaces" ] }, { "id": "api/query#query-routing", "kind": "section", "title": "Query & Fetch", "heading": "Query routing", "group": "API", "url": "/docs/api/query#query-routing", "summary": "Query routing Real search boxes receive both \"timout\" and \"why do pods lose their connection during deploys\". The first wants hybrid text fusion; the second wants semantic retrieval — lexical legs add noise on long conve…", "facts": [ { "kind": "code", "literal": "\"timout\"", "chunkId": "api/query#query-routing" }, { "kind": "code", "literal": "Auto", "chunkId": "api/query#query-routing" }, { "kind": "code", "literal": "rank_by", "chunkId": "api/query#query-routing" } ], "sources": [ { "chunkId": "api/query#query-routing", "url": "/docs/api/query#query-routing", "anchor": "query-routing" } ], "mode": "source-primary", "terms": [ "query", "routing", "real", "search", "boxes", "receive", "both", "timout", "pods", "lose", "their", "connection", "during", "deploys", "first", "wants", "hybrid", "text", "fusion", "second", "semantic", "retrieval", "lexical", "legs", "noise", "long", "conve", "auto", "rank", "conversational", "input", "underperforms", "short", "identifier", "shaped", "tokens", "layer", "only", "rankby", "spelling" ] }, { "id": "api/query#response", "kind": "section", "title": "Query & Fetch", "heading": "Response", "group": "API", "url": "/docs/api/query#response", "summary": "Response Results are the upstream RRF-fused list. 
A hybrid block echoes the effective expansion so defaults are never invisible: { \"rows\": [ { \"id\": \"ticket-4117\", \"$score\": 0.0639, \"content\": \"...\", \"title\": \"Connection…", "facts": [ { "kind": "code", "literal": "{\n \"rows\": [\n {\n \"id\": \"ticket-4117\",\n \"$score\": 0.0639,\n \"content\": \"...\",\n \"title\": \"Connection timeout on Kubernetes ingress\"\n }\n ],\n \"hybrid\": {\n \"tokens\": [\"conection\", \"timout\", \"kubernets\"],\n \"tokens_dropped\": 0,\n \"fuzziness\": \"auto\",\n \"rank_constant\": 60,\n \"legs\": 4,\n \"per_leg_limit\": 50\n }\n}", "chunkId": "api/query#response" }, { "kind": "code", "literal": "hybrid", "chunkId": "api/query#response" }, { "kind": "code", "literal": "$score", "chunkId": "api/query#response" }, { "kind": "code", "literal": "tokens", "chunkId": "api/query#response" }, { "kind": "code", "literal": "tokens_dropped", "chunkId": "api/query#response" }, { "kind": "code", "literal": "legs", "chunkId": "api/query#response" }, { "kind": "code", "literal": "HybridText", "chunkId": "api/query#response" }, { "kind": "code", "literal": "threads", "chunkId": "api/query#response" }, { "kind": "code", "literal": "rerank_by", "chunkId": "api/query#response" } ], "sources": [ { "chunkId": "api/query#response", "url": "/docs/api/query#response", "anchor": "response" } ], "mode": "source-primary", "terms": [ "response", "results", "upstream", "fused", "list", "hybrid", "block", "echoes", "effective", "expansion", "defaults", "never", "invisible", "rows", "ticket", "4117", "score", "0639", "content", "title", "connection", "timeout", "kubernetes", "ingress", "tokens", "conection", "timout", "kubernets", "dropped", "fuzziness", "auto", "rank", "constant", "legs", "limit", "hybridtext", "threads", "rerank", "tokensdropped", "rankconstant" ] }, { "id": "api/query#response-1", "kind": "section", "title": "Query & Fetch", "heading": "Response", "group": "API", "url": "/docs/api/query#response-1", "summary": "Response Every 
Auto response carries a routing block: { \"rows\": [{\"id\": \"ticket-4117\", \"$score\": 0.0639, \"title\": \"...\"}], \"routing\": { \"route\": \"hybrid_text\", \"policy\": \"v1\", \"tokens\": 1, \"executed\": true }, \"hybrid\": {\"…", "facts": [ { "kind": "code", "literal": "{\n \"rows\": [{\"id\": \"ticket-4117\", \"$score\": 0.0639, \"title\": \"...\"}],\n \"routing\": {\n \"route\": \"hybrid_text\",\n \"policy\": \"v1\",\n \"tokens\": 1,\n \"executed\": true\n },\n \"hybrid\": {\"tokens\": [\"timout\"], \"tokens_dropped\": 0, \"fuzziness\": \"auto\", \"rank_constant\": 60, \"legs\": 2, \"per_leg_limit\": 50}\n}", "chunkId": "api/query#response-1" }, { "kind": "code", "literal": "Auto", "chunkId": "api/query#response-1" }, { "kind": "code", "literal": "routing", "chunkId": "api/query#response-1" }, { "kind": "code", "literal": "route", "chunkId": "api/query#response-1" }, { "kind": "code", "literal": "policy", "chunkId": "api/query#response-1" }, { "kind": "code", "literal": "\"forced\"", "chunkId": "api/query#response-1" }, { "kind": "code", "literal": "tokens", "chunkId": "api/query#response-1" }, { "kind": "code", "literal": "executed", "chunkId": "api/query#response-1" }, { "kind": "code", "literal": "false", "chunkId": "api/query#response-1" }, { "kind": "code", "literal": "rows", "chunkId": "api/query#response-1" } ], "sources": [ { "chunkId": "api/query#response-1", "url": "/docs/api/query#response-1", "anchor": "response-1" } ], "mode": "source-primary", "terms": [ "response", "every", "auto", "carries", "routing", "block", "rows", "ticket", "4117", "score", "0639", "title", "route", "hybridtext", "policy", "tokens", "executed", "true", "hybrid", "text", "timout", "dropped", "fuzziness", "rank", "constant", "legs", "limit", "forced", "false", "tokensdropped", "rankconstant", "perleglimit", "field", "meaning", "strategy", "chosen", "version", "made", "decision", "supplied" ] }, { "id": "api/query#routing-policy", "kind": "section", "title": "Query & 
Fetch", "heading": "Routing policy", "group": "API", "url": "/docs/api/query#routing-policy", "summary": "Routing policy The v1 policy reads the token count of the input under the same tokenizer policy as hybrid text fusion: Tokens Route Runs ≤ 2 hybrid_text The hybrid text fusion expansion. ≥ 8 semantic ANN over the supplied…", "facts": [ { "kind": "code", "literal": "hybrid_text", "chunkId": "api/query#routing-policy" }, { "kind": "code", "literal": "semantic", "chunkId": "api/query#routing-policy" }, { "kind": "code", "literal": "fused", "chunkId": "api/query#routing-policy" }, { "kind": "code", "literal": "vector", "chunkId": "api/query#routing-policy" }, { "kind": "code", "literal": "\"policy\": \"v1\"", "chunkId": "api/query#routing-policy" } ], "sources": [ { "chunkId": "api/query#routing-policy", "url": "/docs/api/query#routing-policy", "anchor": "routing-policy" } ], "mode": "source-primary", "terms": [ "routing", "policy", "reads", "token", "count", "input", "under", "same", "tokenizer", "hybrid", "text", "fusion", "tokens", "route", "runs", "hybridtext", "expansion", "semantic", "supplied", "fused", "vector", "query", "both", "merged", "upstream", "availability", "never", "changes", "chosen", "only", "whether", "executes", "request", "always", "execute", "supplies", "defer", "otherwise", "versioned", "threshold" ] }, { "id": "api/query#semantics", "kind": "section", "title": "Query & Fetch", "heading": "Semantics", "group": "API", "url": "/docs/api/query#semantics", "summary": "Semantics One round trip. The expansion is a single upstream multi-query fused by rerank_by: [\"RRF\", ...]. Layer implements no fusion math and does not reorder results. One consistency cut. 
Request-level filters are repli…", "facts": [ { "kind": "code", "literal": "rerank_by: [\"RRF\", ...]", "chunkId": "api/query#semantics" }, { "kind": "code", "literal": "filters", "chunkId": "api/query#semantics" }, { "kind": "code", "literal": "x-layer-stable-as-of", "chunkId": "api/query#semantics" }, { "kind": "code", "literal": "HybridText", "chunkId": "api/query#semantics" } ], "sources": [ { "chunkId": "api/query#semantics", "url": "/docs/api/query#semantics", "anchor": "semantics" } ], "mode": "source-primary", "terms": [ "semantics", "round", "trip", "expansion", "single", "upstream", "multi", "query", "fused", "rerankby", "layer", "implements", "fusion", "math", "does", "reorder", "results", "consistency", "request", "level", "filters", "repli", "rerank", "stable", "hybridtext", "replicated", "every", "read", "watermark", "predicate", "injected", "legs", "same", "responses", "carry", "usual", "nothing", "partial", "failing", "fails" ] }, { "id": "api/query#single-fetch", "kind": "section", "title": "Query & Fetch", "heading": "Single fetch", "group": "API", "url": "/docs/api/query#single-fetch", "summary": "Single fetch doc = await client.fetch_document( \"products\", \"asin-B08N5WRWNW\", include_attributes=[\"title\", \"category\"], ) doc, err := client.FetchDocument(ctx, \"products\", \"asin-B08N5WRWNW\", &hevlayer.FetchDocumentParams{…", "facts": [ { "kind": "code", "literal": "doc = await client.fetch_document(\n \"products\",\n \"asin-B08N5WRWNW\",\n include_attributes=[\"title\", \"category\"],\n)", "chunkId": "api/query#single-fetch" }, { "kind": "code", "literal": "doc, err := client.FetchDocument(ctx, \"products\", \"asin-B08N5WRWNW\",\n &hevlayer.FetchDocumentParams{\n IncludeAttributes: []string{\"title\", \"category\"},\n })", "chunkId": "api/query#single-fetch" }, { "kind": "code", "literal": "const doc = await client.fetchDocument(\"products\", \"asin-B08N5WRWNW\", {\n includeAttributes: [\"title\", \"category\"],\n});", "chunkId": 
"api/query#single-fetch" }, { "kind": "code", "literal": "curl \"$LAYER_GATEWAY_URL/v2/namespaces/products/documents/asin-B08N5WRWNW?include_attributes=title,category\" \\\n -H \"Authorization: Bearer $LAYER_GATEWAY_API_KEY\"", "chunkId": "api/query#single-fetch" }, { "kind": "code", "literal": "x-layer-cache: hit", "chunkId": "api/query#single-fetch" }, { "kind": "code", "literal": "x-layer-cache: miss", "chunkId": "api/query#single-fetch" }, { "kind": "code", "literal": "x-layer-cache: miss-on-error", "chunkId": "api/query#single-fetch" } ], "sources": [ { "chunkId": "api/query#single-fetch", "url": "/docs/api/query#single-fetch", "anchor": "single-fetch" } ], "mode": "source-primary", "terms": [ "single", "fetch", "await", "client", "fetchdocument", "products", "asin", "b08n5wrwnw", "includeattributes", "title", "category", "hevlayer", "fetchdocumentparams", "document", "include", "attributes", "string", "const", "curl", "layer", "gateway", "namespaces", "documents", "authorization", "bearer", "cache", "miss", "error", "layergatewayurl", "layergatewayapikey", "outcome", "status", "header", "cached", "upstream", "backfilled", "unavailable", "missing", "both", "layers" ] }, { "id": "api/query#stable-reads", "kind": "section", "title": "Query & Fetch", "heading": "Stable reads", "group": "API", "url": "/docs/api/query#stable-reads", "summary": "Stable reads Layer uses the same query syntax as upstream but defaults to stable reads. Every response carries an x-layer-stable-as-of watermark: the point the upstream index is known to be caught up to. 
A query issued r…", "facts": [ { "kind": "code", "literal": "HTTP/1.1 200 OK\nx-layer-stable-as-of: 1715600400000\n\n{\"rows\":[{\"id\":\"asin-B08N5WRWNW\",\"$dist\":0.42,\"title\":\"...\"}]}", "chunkId": "api/query#stable-reads" }, { "kind": "code", "literal": "x-layer-stable-as-of", "chunkId": "api/query#stable-reads" }, { "kind": "code", "literal": "consistency=eventual", "chunkId": "api/query#stable-reads" }, { "kind": "code", "literal": "index.status", "chunkId": "api/query#stable-reads" }, { "kind": "code", "literal": "poll_start - safety_margin", "chunkId": "api/query#stable-reads" }, { "kind": "code", "literal": "Updating", "chunkId": "api/query#stable-reads" }, { "kind": "code", "literal": "_hevlayer_upserted_at <= watermark", "chunkId": "api/query#stable-reads" }, { "kind": "code", "literal": "Stable", "chunkId": "api/query#stable-reads" }, { "kind": "code", "literal": "Unknown", "chunkId": "api/query#stable-reads" }, { "kind": "code", "literal": "x-layer-next-cursor", "chunkId": "api/query#stable-reads" }, { "kind": "code", "literal": "cursor", "chunkId": "api/query#stable-reads" }, { "kind": "code", "literal": "consistency", "chunkId": "api/query#stable-reads" }, { "kind": "code", "literal": "CONSISTENCY_POLL_INTERVAL_MS", "chunkId": "api/query#stable-reads" }, { "kind": "code", "literal": "CONSISTENCY_STABLE_POLL_INTERVAL_MS", "chunkId": "api/query#stable-reads" }, { "kind": "code", "literal": "CONSISTENCY_SAFETY_MARGIN_MS", "chunkId": "api/query#stable-reads" } ], "sources": [ { "chunkId": "api/query#stable-reads", "url": "/docs/api/query#stable-reads", "anchor": "stable-reads" } ], "mode": "source-primary", "terms": [ "stable", "reads", "layer", "uses", "same", "query", "syntax", "upstream", "defaults", "every", "response", "carries", "watermark", "point", "index", "known", "caught", "issued", "http", "1715600400000", "rows", "asin", "b08n5wrwnw", "dist", "title", "consistency", "eventual", "status", "poll", "start", "safety", "margin", "updating", 
"hevlayer", "upserted", "unknown", "next", "cursor", "interval", "right" ] }, { "id": "api/query#tokenization", "kind": "section", "title": "Query & Fetch", "heading": "Tokenization", "group": "API", "url": "/docs/api/query#tokenization", "summary": "Tokenization The input string becomes tokens under a fixed, documented policy: 1. Split on Unicode (UAX #29) word boundaries and lowercase, using alyze — the code behind Turbopuffer's production word_v4 tokenizer. Punctua…", "facts": [ { "kind": "code", "literal": "alyze", "chunkId": "api/query#tokenization" }, { "kind": "code", "literal": "word_v4", "chunkId": "api/query#tokenization" }, { "kind": "code", "literal": "tokens_dropped", "chunkId": "api/query#tokenization" } ], "sources": [ { "chunkId": "api/query#tokenization", "url": "/docs/api/query#tokenization", "anchor": "tokenization" } ], "mode": "source-primary", "terms": [ "tokenization", "input", "string", "becomes", "tokens", "under", "fixed", "documented", "policy", "split", "unicode", "word", "boundaries", "lowercase", "alyze", "code", "behind", "turbopuffer", "production", "wordv4", "tokenizer", "punctua", "dropped", "punctuation", "only", "never", "survive", "drop", "shorter", "characters", "dedupe", "fuzzy", "legs", "bm25", "upstream", "subquery", "limit", "counted", "tokensdropped", "stemming" ] }, { "id": "api/query#validation", "kind": "section", "title": "Query & Fetch", "heading": "Validation", "group": "API", "url": "/docs/api/query#validation", "summary": "Validation All return 422: Condition Why Input yields zero tokens under the policy Nothing to expand. cursor present Fused scores do not form the monotone bands pagination relies on. 
HybridText inside a queries array The…", "facts": [ { "kind": "code", "literal": "422", "chunkId": "api/query#validation" }, { "kind": "code", "literal": "cursor", "chunkId": "api/query#validation" }, { "kind": "code", "literal": "HybridText", "chunkId": "api/query#validation" }, { "kind": "code", "literal": "queries", "chunkId": "api/query#validation" }, { "kind": "code", "literal": "fuzziness", "chunkId": "api/query#validation" }, { "kind": "code", "literal": "\"auto\" \\| 0 \\| 1 \\| 2", "chunkId": "api/query#validation" }, { "kind": "code", "literal": "rank_constant", "chunkId": "api/query#validation" }, { "kind": "code", "literal": "per_leg_limit", "chunkId": "api/query#validation" }, { "kind": "code", "literal": "threads", "chunkId": "api/query#validation" } ], "sources": [ { "chunkId": "api/query#validation", "url": "/docs/api/query#validation", "anchor": "validation" } ], "mode": "source-primary", "terms": [ "validation", "return", "condition", "input", "yields", "zero", "tokens", "under", "policy", "nothing", "expand", "cursor", "present", "fused", "scores", "form", "monotone", "bands", "pagination", "relies", "hybridtext", "inside", "queries", "array", "fuzziness", "auto", "rank", "constant", "limit", "threads", "expansion", "multi", "query", "deep", "construction", "rankconstant", "perleglimit", "range", "gateway", "pick" ] }, { "id": "api/query#validation-1", "kind": "section", "title": "Query & Fetch", "heading": "Validation", "group": "API", "url": "/docs/api/query#validation-1", "summary": "Validation All return 422: Condition Why Forced \"semantic\" or \"fused\" without vector Forcing asserts you have the vector; only auto-routing defers. Input yields zero tokens under the policy Nothing to route. 
vector dimen…", "facts": [ { "kind": "code", "literal": "422", "chunkId": "api/query#validation-1" }, { "kind": "code", "literal": "\"semantic\"", "chunkId": "api/query#validation-1" }, { "kind": "code", "literal": "\"fused\"", "chunkId": "api/query#validation-1" }, { "kind": "code", "literal": "vector", "chunkId": "api/query#validation-1" }, { "kind": "code", "literal": "cursor", "chunkId": "api/query#validation-1" }, { "kind": "code", "literal": "Auto", "chunkId": "api/query#validation-1" }, { "kind": "code", "literal": "queries", "chunkId": "api/query#validation-1" } ], "sources": [ { "chunkId": "api/query#validation-1", "url": "/docs/api/query#validation-1", "anchor": "validation-1" } ], "mode": "source-primary", "terms": [ "validation", "return", "condition", "forced", "semantic", "fused", "without", "vector", "forcing", "asserts", "only", "auto", "routing", "defers", "input", "yields", "zero", "tokens", "under", "policy", "nothing", "route", "dimen", "cursor", "queries", "dimensionality", "mismatch", "same", "check", "plain", "query", "present", "inside", "array", "inherited", "hybrid", "text", "fusion" ] }, { "id": "api/response-headers", "kind": "section", "title": "Response Headers", "heading": null, "group": "API", "url": "/docs/api/response-headers", "summary": "Layer metadata returned alongside wire-compatible response bodies. Layer keeps Turbopuffer-compatible read bodies in the upstream shape and returns Layer-specific metadata in response headers. 
Header Values Returned by x…", "facts": [ { "kind": "code", "literal": "x-layer-stable-as-of", "chunkId": "api/response-headers" }, { "kind": "code", "literal": "x-layer-next-cursor", "chunkId": "api/response-headers" }, { "kind": "code", "literal": "x-layer-cache", "chunkId": "api/response-headers" }, { "kind": "code", "literal": "hit", "chunkId": "api/response-headers" }, { "kind": "code", "literal": "miss", "chunkId": "api/response-headers" }, { "kind": "code", "literal": "miss-on-error", "chunkId": "api/response-headers" }, { "kind": "code", "literal": "x-layer-warning", "chunkId": "api/response-headers" }, { "kind": "code", "literal": "vector_attribute_dropped", "chunkId": "api/response-headers" }, { "kind": "code", "literal": "traceparent", "chunkId": "api/response-headers" }, { "kind": "code", "literal": "query_namespace", "chunkId": "api/response-headers" }, { "kind": "code", "literal": "rows", "chunkId": "api/response-headers" }, { "kind": "code", "literal": "stable_as_of", "chunkId": "api/response-headers" }, { "kind": "code", "literal": "next_cursor", "chunkId": "api/response-headers" } ], "sources": [ { "chunkId": "api/response-headers", "url": "/docs/api/response-headers", "anchor": null } ], "mode": "source-primary", "terms": [ "layer", "metadata", "returned", "alongside", "wire", "compatible", "response", "bodies", "keeps", "turbopuffer", "read", "upstream", "shape", "returns", "specific", "headers", "header", "values", "stable", "next", "cursor", "cache", "miss", "error", "warning", "vector", "attribute", "dropped", "traceparent", "query", "namespace", "rows", "epoch", "milliseconds", "multi", "scan", "counts", "opaque", "token", "single" ] }, { "id": "api/scans", "kind": "section", "title": "Scan", "heading": null, "group": "API", "url": "/docs/api/scans", "summary": "On-demand row selection by filter, full-text, or radius — IDs, count, or values. A scan is on-demand row selection over a namespace. 
It picks rows by one of three selectors and returns their IDs (mode: ids, an asynchrono…", "facts": [ { "kind": "code", "literal": "mode: ids", "chunkId": "api/scans" }, { "kind": "code", "literal": "mode: count", "chunkId": "api/scans" }, { "kind": "code", "literal": "mode: values", "chunkId": "api/scans" }, { "kind": "code", "literal": "filters", "chunkId": "api/scans" }, { "kind": "code", "literal": "fts", "chunkId": "api/scans" }, { "kind": "code", "literal": "ann", "chunkId": "api/scans" }, { "kind": "code", "literal": "radius", "chunkId": "api/scans" }, { "kind": "code", "literal": "threads", "chunkId": "api/scans" }, { "kind": "code", "literal": "Index.spec.scan.threads", "chunkId": "api/scans" }, { "kind": "code", "literal": "422", "chunkId": "api/scans" }, { "kind": "value", "literal": "CodeTabs.astro", "chunkId": "api/scans" } ], "sources": [ { "chunkId": "api/scans", "url": "/docs/api/scans", "anchor": null } ], "mode": "source-primary", "terms": [ "demand", "selection", "filter", "full", "text", "radius", "count", "values", "scan", "namespace", "picks", "rows", "three", "selectors", "returns", "their", "mode", "asynchrono", "filters", "threads", "index", "spec", "codetabs", "astro", "asynchronous", "synchronous", "distinct", "attribute", "field", "input", "meaning", "notes", "selector", "predicate", "omitted", "exact", "bm25", "against", "within", "query" ] }, { "id": "api/scans#auto-mode-policy", "kind": "section", "title": "Scan", "heading": "Auto-Mode Policy", "group": "API", "url": "/docs/api/scans#auto-mode-policy", "summary": "Auto-Mode Policy Auto ties cache freshness to the same consistency watermark used by stable reads. 
The gateway tracks per-namespace cache_warmed_through, the watermark observed at the end of the last successful origin warm…", "facts": [ { "kind": "code", "literal": "cache_warmed_through", "chunkId": "api/scans#auto-mode-policy" }, { "kind": "code", "literal": "cache_warmed_through >= watermark", "chunkId": "api/scans#auto-mode-policy" }, { "kind": "code", "literal": "cache_warmed_through < watermark", "chunkId": "api/scans#auto-mode-policy" }, { "kind": "code", "literal": "_hevlayer_upserted_at <= cache_warmed_through", "chunkId": "api/scans#auto-mode-policy" } ], "sources": [ { "chunkId": "api/scans#auto-mode-policy", "url": "/docs/api/scans#auto-mode-policy", "anchor": "auto-mode-policy" } ], "mode": "source-primary", "terms": [ "auto", "mode", "policy", "ties", "cache", "freshness", "same", "consistency", "watermark", "stable", "reads", "gateway", "tracks", "namespace", "cachewarmedthrough", "observed", "last", "successful", "origin", "warm", "warmed", "through", "hevlayer", "upserted", "state", "action", "empty", "stamp", "populated", "serve", "start", "background", "hevlayerupsertedat", "added", "before", "user", "filter", "scan", "view" ] }, { "id": "api/scans#bounding-ranked-scans", "kind": "section", "title": "Scan", "heading": "Bounding ranked scans", "group": "API", "url": "/docs/api/scans#bounding-ranked-scans", "summary": "Bounding ranked scans Ranked selectors fan out one Turbopuffer query per shard, each capped at top_k = 10_000. threads bounds fan-out width: how many shard requests can run at once. 
exhaustive and timeout_seconds bound dept…", "facts": [ { "kind": "code", "literal": "top_k = 10_000", "chunkId": "api/scans#bounding-ranked-scans" }, { "kind": "code", "literal": "threads", "chunkId": "api/scans#bounding-ranked-scans" }, { "kind": "code", "literal": "exhaustive", "chunkId": "api/scans#bounding-ranked-scans" }, { "kind": "code", "literal": "timeout_seconds", "chunkId": "api/scans#bounding-ranked-scans" }, { "kind": "code", "literal": "exhaustive: false", "chunkId": "api/scans#bounding-ranked-scans" }, { "kind": "code", "literal": "bounded: true", "chunkId": "api/scans#bounding-ranked-scans" }, { "kind": "code", "literal": "shards_saturated > 0", "chunkId": "api/scans#bounding-ranked-scans" }, { "kind": "code", "literal": "exhaustive: true", "chunkId": "api/scans#bounding-ranked-scans" }, { "kind": "code", "literal": "$score < last", "chunkId": "api/scans#bounding-ranked-scans" }, { "kind": "code", "literal": "id", "chunkId": "api/scans#bounding-ranked-scans" }, { "kind": "code", "literal": "$dist > last", "chunkId": "api/scans#bounding-ranked-scans" }, { "kind": "code", "literal": "bounded", "chunkId": "api/scans#bounding-ranked-scans" }, { "kind": "code", "literal": "approximate", "chunkId": "api/scans#bounding-ranked-scans" }, { "kind": "code", "literal": ">=", "chunkId": "api/scans#bounding-ranked-scans" }, { "kind": "code", "literal": "ann", "chunkId": "api/scans#bounding-ranked-scans" }, { "kind": "code", "literal": "bounded: false", "chunkId": "api/scans#bounding-ranked-scans" }, { "kind": "code", "literal": "approximate: true", "chunkId": "api/scans#bounding-ranked-scans" } ], "sources": [ { "chunkId": "api/scans#bounding-ranked-scans", "url": "/docs/api/scans#bounding-ranked-scans", "anchor": "bounding-ranked-scans" } ], "mode": "source-primary", "terms": [ "bounding", "ranked", "scans", "selectors", "turbopuffer", "query", "shard", "capped", "topk", "10000", "threads", "bounds", "width", "many", "requests", "once", "exhaustive", 
"timeoutseconds", "bound", "dept", "timeout", "seconds", "false", "bounded", "true", "shards", "saturated", "score", "last", "dist", "approximate", "depth", "happens", "hits", "long", "recursion", "default", "scatter", "gather", "contributes" ] }, { "id": "api/scans#count-mode", "kind": "section", "title": "Scan", "heading": "Count Mode", "group": "API", "url": "/docs/api/scans#count-mode", "summary": "Count Mode count = await client.create_scan(\"products\", { \"mode\": \"count\", \"source\": \"auto\", \"filters\": [\"category\", \"Eq\", \"Electronics\"], \"threads\": 8, \"timeout_seconds\": 30, }) count, err := client.CreateScan(ctx, \"produ…", "facts": [ { "kind": "code", "literal": "count = await client.create_scan(\"products\", {\n \"mode\": \"count\",\n \"source\": \"auto\",\n \"filters\": [\"category\", \"Eq\", \"Electronics\"],\n \"threads\": 8,\n \"timeout_seconds\": 30,\n})", "chunkId": "api/scans#count-mode" }, { "kind": "code", "literal": "count, err := client.CreateScan(ctx, \"products\", &hevlayer.CreateScanRequest{\n Mode: \"count\",\n Source: \"auto\",\n Filters: []interface{}{\"category\", \"Eq\", \"Electronics\"},\n Threads: 8,\n TimeoutSeconds: 30,\n})", "chunkId": "api/scans#count-mode" }, { "kind": "code", "literal": "const count = await client.createScan(\"products\", {\n mode: \"count\",\n source: \"auto\",\n filters: [\"category\", \"Eq\", \"Electronics\"],\n threads: 8,\n timeout_seconds: 30,\n});", "chunkId": "api/scans#count-mode" }, { "kind": "code", "literal": "curl -X POST \"$LAYER_GATEWAY_URL/v2/namespaces/products/scans\" \\\n -H \"Authorization: Bearer $LAYER_GATEWAY_API_KEY\" \\\n -H \"Content-Type: application/json\" \\\n -d '{\n \"mode\": \"count\",\n \"source\": \"auto\",\n \"filters\": [\"category\", \"Eq\", \"Electronics\"],\n \"threads\": 8,\n \"timeout_seconds\": 30\n }'", "chunkId": "api/scans#count-mode" }, { "kind": "code", "literal": "{\n \"count\": 4210,\n \"served_by\": \"snapshot\",\n \"snapshot_sha\": \"3f9e8b21\",\n 
\"watermark_ms\": 1747300000123,\n \"elapsed_ms\": 3\n}", "chunkId": "api/scans#count-mode" }, { "kind": "code", "literal": "{\n \"count\": 4210,\n \"served_by\": \"origin\",\n \"bounded\": false,\n \"timed_out\": false,\n \"shards_saturated\": 0,\n \"shards_total\": 1,\n \"threads\": 1,\n \"elapsed_ms\": 42\n}", "chunkId": "api/scans#count-mode" }, { "kind": "code", "literal": "watermark_ms", "chunkId": "api/scans#count-mode" }, { "kind": "code", "literal": "x-layer-stable-as-of", "chunkId": "api/scans#count-mode" }, { "kind": "code", "literal": "auto", "chunkId": "api/scans#count-mode" }, { "kind": "code", "literal": "snapshot", "chunkId": "api/scans#count-mode" }, { "kind": "code", "literal": "cache", "chunkId": "api/scans#count-mode" }, { "kind": "code", "literal": "origin", "chunkId": "api/scans#count-mode" }, { "kind": "code", "literal": "Eq", "chunkId": "api/scans#count-mode" }, { "kind": "code", "literal": "In", "chunkId": "api/scans#count-mode" }, { "kind": "code", "literal": "fields[]", "chunkId": "api/scans#count-mode" }, { "kind": "code", "literal": "And", "chunkId": "api/scans#count-mode" }, { "kind": "code", "literal": "Or", "chunkId": "api/scans#count-mode" }, { "kind": "code", "literal": "Not", "chunkId": "api/scans#count-mode" }, { "kind": "code", "literal": "412 precondition_failed", "chunkId": "api/scans#count-mode" }, { "kind": "code", "literal": "source: snapshot", "chunkId": "api/scans#count-mode" } ], "sources": [ { "chunkId": "api/scans#count-mode", "url": "/docs/api/scans#count-mode", "anchor": "count-mode" } ], "mode": "source-primary", "terms": [ "count", "mode", "await", "client", "createscan", "products", "source", "auto", "filters", "category", "electronics", "threads", "timeoutseconds", "produ", "create", "scan", "timeout", "seconds", "hevlayer", "createscanrequest", "interface", "const", "curl", "post", "layer", "gateway", "namespaces", "scans", "authorization", "bearer", "content", "type", "application", "json", "4210", "served", 
"snapshot", "3f9e8b21", "watermark", "1747300000123" ] }, { "id": "api/scans#fan-out-width", "kind": "section", "title": "Scan", "heading": "Fan-out width", "group": "API", "url": "/docs/api/scans#fan-out-width", "summary": "Fan-out width Origin scans fan out one upstream request per active shard. threads sets the maximum number of those upstream requests a single scan may have in flight at once. It means concurrent requests, not operating-s…", "facts": [ { "kind": "code", "literal": "threads", "chunkId": "api/scans#fan-out-width" }, { "kind": "code", "literal": "spec.scan.threads", "chunkId": "api/scans#fan-out-width" }, { "kind": "code", "literal": "Index", "chunkId": "api/scans#fan-out-width" }, { "kind": "code", "literal": "32", "chunkId": "api/scans#fan-out-width" } ], "sources": [ { "chunkId": "api/scans#fan-out-width", "url": "/docs/api/scans#fan-out-width", "anchor": "fan-out-width" } ], "mode": "source-primary", "terms": [ "width", "origin", "scans", "upstream", "request", "active", "shard", "threads", "sets", "maximum", "number", "those", "requests", "single", "scan", "flight", "once", "means", "concurrent", "operating", "spec", "index", "system", "gateway", "async", "resolution", "order", "namespace", "resource", "default", "effective", "value", "clamped", "count", "server", "echoed", "responses", "completed", "jobs", "snapshot" ] }, { "id": "api/scans#filters", "kind": "section", "title": "Scan", "heading": "Filters", "group": "API", "url": "/docs/api/scans#filters", "summary": "Filters Scans accept the same Turbopuffer filter array as query. On origin scans, the filter is pushed to Turbopuffer. On cache scans, the gateway evaluates it against cached document attributes. 
Supported cache operator…", "facts": [ { "kind": "code", "literal": "Eq", "chunkId": "api/scans#filters" }, { "kind": "code", "literal": "NotEq", "chunkId": "api/scans#filters" }, { "kind": "code", "literal": "Gt", "chunkId": "api/scans#filters" }, { "kind": "code", "literal": "Gte", "chunkId": "api/scans#filters" }, { "kind": "code", "literal": "Lt", "chunkId": "api/scans#filters" }, { "kind": "code", "literal": "Lte", "chunkId": "api/scans#filters" }, { "kind": "code", "literal": "In", "chunkId": "api/scans#filters" }, { "kind": "code", "literal": "NotIn", "chunkId": "api/scans#filters" }, { "kind": "code", "literal": "And", "chunkId": "api/scans#filters" }, { "kind": "code", "literal": "Or", "chunkId": "api/scans#filters" }, { "kind": "code", "literal": "Not", "chunkId": "api/scans#filters" }, { "kind": "code", "literal": "auto", "chunkId": "api/scans#filters" }, { "kind": "code", "literal": "source: cache", "chunkId": "api/scans#filters" } ], "sources": [ { "chunkId": "api/scans#filters", "url": "/docs/api/scans#filters", "anchor": "filters" } ], "mode": "source-primary", "terms": [ "filters", "scans", "accept", "same", "turbopuffer", "filter", "array", "query", "origin", "pushed", "cache", "gateway", "evaluates", "against", "cached", "document", "attributes", "supported", "operator", "noteq", "notin", "auto", "source", "operators", "sees", "cannot", "evaluate", "uses", "explicit", "unsupported", "fails", "rather", "returning", "partial", "results" ] }, { "id": "api/scans#full-text-count", "kind": "section", "title": "Scan", "heading": "Full-text count", "group": "API", "url": "/docs/api/scans#full-text-count", "summary": "Full-text count Count rows matching a BM25 query with the fts selector. Full-text counts are exact and always run origin scatter/gather, so source must be omitted, auto, or origin. 
A filters array, when present, is ANDed…", "facts": [ { "kind": "code", "literal": "count = await client.create_scan(\"products\", {\n \"mode\": \"count\",\n \"fts\": {\"field\": \"title\", \"query\": \"wireless headphones\"},\n \"filters\": [\"category\", \"Eq\", \"Electronics\"],\n \"exhaustive\": True,\n})", "chunkId": "api/scans#full-text-count" }, { "kind": "code", "literal": "count, err := client.CreateScan(ctx, \"products\", &hevlayer.CreateScanRequest{\n Mode: \"count\",\n Fts: &hevlayer.FtsScan{Field: \"title\", Query: \"wireless headphones\"},\n Filters: []interface{}{\"category\", \"Eq\", \"Electronics\"},\n Exhaustive: true,\n})", "chunkId": "api/scans#full-text-count" }, { "kind": "code", "literal": "const count = await client.createScan(\"products\", {\n mode: \"count\",\n fts: { field: \"title\", query: \"wireless headphones\" },\n filters: [\"category\", \"Eq\", \"Electronics\"],\n exhaustive: true,\n});", "chunkId": "api/scans#full-text-count" }, { "kind": "code", "literal": "curl -X POST \"$LAYER_GATEWAY_URL/v2/namespaces/products/scans\" \\\n -H \"Authorization: Bearer $LAYER_GATEWAY_API_KEY\" \\\n -H \"Content-Type: application/json\" \\\n -d '{\n \"mode\": \"count\",\n \"fts\": {\"field\": \"title\", \"query\": \"wireless headphones\"},\n \"filters\": [\"category\", \"Eq\", \"Electronics\"],\n \"exhaustive\": true\n }'", "chunkId": "api/scans#full-text-count" }, { "kind": "code", "literal": "fts", "chunkId": "api/scans#full-text-count" }, { "kind": "code", "literal": "source", "chunkId": "api/scans#full-text-count" }, { "kind": "code", "literal": "auto", "chunkId": "api/scans#full-text-count" }, { "kind": "code", "literal": "origin", "chunkId": "api/scans#full-text-count" }, { "kind": "code", "literal": "filters", "chunkId": "api/scans#full-text-count" } ], "sources": [ { "chunkId": "api/scans#full-text-count", "url": "/docs/api/scans#full-text-count", "anchor": "full-text-count" } ], "mode": "source-primary", "terms": [ "full", "text", 
"count", "rows", "matching", "bm25", "query", "selector", "counts", "exact", "always", "origin", "scatter", "gather", "source", "must", "omitted", "auto", "filters", "array", "present", "anded", "await", "client", "create", "scan", "products", "mode", "field", "title", "wireless", "headphones", "category", "electronics", "exhaustive", "true", "createscan", "hevlayer", "createscanrequest", "ftsscan" ] }, { "id": "api/scans#high-cardinality", "kind": "section", "title": "Scan", "heading": "High cardinality", "group": "API", "url": "/docs/api/scans#high-cardinality", "summary": "High cardinality Snapshot facet histograms cap each field at 10,000 distinct values and skip fields beyond it; values scans are the enumeration path for exactly those fields. A values job accumulates its histogram in gat…", "facts": [ { "kind": "code", "literal": "truncated: true", "chunkId": "api/scans#high-cardinality" }, { "kind": "code", "literal": "truncated", "chunkId": "api/scans#high-cardinality" }, { "kind": "code", "literal": "bounded", "chunkId": "api/scans#high-cardinality" }, { "kind": "code", "literal": "approximate", "chunkId": "api/scans#high-cardinality" }, { "kind": "code", "literal": "top_k", "chunkId": "api/scans#high-cardinality" } ], "sources": [ { "chunkId": "api/scans#high-cardinality", "url": "/docs/api/scans#high-cardinality", "anchor": "high-cardinality" } ], "mode": "source-primary", "terms": [ "high", "cardinality", "snapshot", "facet", "histograms", "field", "distinct", "values", "skip", "fields", "beyond", "scans", "enumeration", "path", "exactly", "those", "accumulates", "histogram", "truncated", "true", "bounded", "approximate", "gateway", "memory", "caps", "listing", "scan", "crosses", "completes", "rather", "failing", "applies", "after", "full", "pass", "every", "emitted", "stays", "exact", "truncates" ] }, { "id": "api/scans#id-mode", "kind": "section", "title": "Scan", "heading": "ID Mode", "group": "API", "url": "/docs/api/scans#id-mode", "summary": "ID 
Mode job = await client.createscan(\"products\", { \"source\": \"auto\", \"mode\": \"ids\", \"filters\": [\"category\", \"Eq\", \"Electronics\"], \"threads\": 8, \"pagesize\": 1000, }) job, err := client.CreateScan(ctx, \"products\", &hevlay…", "facts": [ { "kind": "code", "literal": "job = await client.create_scan(\"products\", {\n \"source\": \"auto\",\n \"mode\": \"ids\",\n \"filters\": [\"category\", \"Eq\", \"Electronics\"],\n \"threads\": 8,\n \"page_size\": 1000,\n})", "chunkId": "api/scans#id-mode" }, { "kind": "code", "literal": "job, err := client.CreateScan(ctx, \"products\", &hevlayer.CreateScanRequest{\n Source: \"auto\",\n Mode: \"ids\",\n Filters: []interface{}{\"category\", \"Eq\", \"Electronics\"},\n Threads: 8,\n PageSize: 1000,\n})", "chunkId": "api/scans#id-mode" }, { "kind": "code", "literal": "const job = await client.createScan(\"products\", {\n source: \"auto\",\n mode: \"ids\",\n filters: [\"category\", \"Eq\", \"Electronics\"],\n threads: 8,\n page_size: 1000,\n});", "chunkId": "api/scans#id-mode" }, { "kind": "code", "literal": "curl -X POST \"$LAYER_GATEWAY_URL/v2/namespaces/products/scans\" \\\n -H \"Authorization: Bearer $LAYER_GATEWAY_API_KEY\" \\\n -H \"Content-Type: application/json\" \\\n -d '{\n \"source\": \"auto\",\n \"mode\": \"ids\",\n \"filters\": [\"category\", \"Eq\", \"Electronics\"],\n \"threads\": 8,\n \"page_size\": 1000\n }'", "chunkId": "api/scans#id-mode" }, { "kind": "code", "literal": "{\n \"id\": \"scan-uuid\",\n \"namespace\": \"products\",\n \"source\": \"auto\",\n \"effective_source\": \"origin\",\n \"status\": \"running\",\n \"progress\": 0,\n \"documents_scanned\": 0,\n \"threads\": 8,\n \"created_at\": \"2026-05-26T10:00:00Z\"\n}", "chunkId": "api/scans#id-mode" }, { "kind": "code", "literal": "results = await client.get_scan_results(\"products\", job.id, limit=1000, offset=0)", "chunkId": "api/scans#id-mode" }, { "kind": "code", "literal": "results, err := client.GetScanResults(ctx, \"products\", scanID,\n 
&hevlayer.GetScanResultsParams{Limit: 1000, Offset: 0})", "chunkId": "api/scans#id-mode" }, { "kind": "code", "literal": "const results = await client.getScanResults(\"products\", job.id, {\n limit: 1000,\n offset: 0,\n});", "chunkId": "api/scans#id-mode" }, { "kind": "code", "literal": "curl \"$LAYER_GATEWAY_URL/v2/namespaces/products/scans/scan-uuid/results?limit=1000&offset=0\" \\\n -H \"Authorization: Bearer $LAYER_GATEWAY_API_KEY\"", "chunkId": "api/scans#id-mode" }, { "kind": "code", "literal": "{\n \"ids\": [\"doc-1\", \"doc-2\"],\n \"total\": 2\n}", "chunkId": "api/scans#id-mode" }, { "kind": "code", "literal": "mode", "chunkId": "api/scans#id-mode" }, { "kind": "code", "literal": "ids", "chunkId": "api/scans#id-mode" }, { "kind": "code", "literal": "auto", "chunkId": "api/scans#id-mode" }, { "kind": "code", "literal": "cache", "chunkId": "api/scans#id-mode" }, { "kind": "code", "literal": "origin", "chunkId": "api/scans#id-mode" }, { "kind": "code", "literal": "scan(...)", "chunkId": "api/scans#id-mode" }, { "kind": "code", "literal": "GetScan", "chunkId": "api/scans#id-mode" }, { "kind": "code", "literal": "status", "chunkId": "api/scans#id-mode" }, { "kind": "code", "literal": "completed", "chunkId": "api/scans#id-mode" }, { "kind": "code", "literal": "202 Accepted", "chunkId": "api/scans#id-mode" } ], "sources": [ { "chunkId": "api/scans#id-mode", "url": "/docs/api/scans#id-mode", "anchor": "id-mode" } ], "mode": "source-primary", "terms": [ "mode", "await", "client", "createscan", "products", "source", "auto", "filters", "category", "electronics", "threads", "pagesize", "1000", "hevlay", "create", "scan", "page", "size", "hevlayer", "createscanrequest", "interface", "const", "curl", "post", "layer", "gateway", "namespaces", "scans", "authorization", "bearer", "content", "type", "application", "json", "uuid", "namespace", "effective", "origin", "status", "running" ] }, { "id": "api/scans#operational-notes", "kind": "section", "title": "Scan", "heading": 
"Operational notes", "group": "API", "url": "/docs/api/scans#operational-notes", "summary": "Operational notes ID and values scan state is in-memory and ephemeral; it resets on gateway restart. Count scans have a deadline, default 30s and maximum 300s. Values jobs cap at 1,000,000 distinct values per scan and se…", "facts": [ { "kind": "code", "literal": "truncated: true", "chunkId": "api/scans#operational-notes" }, { "kind": "code", "literal": "Index.spec.scan.threads", "chunkId": "api/scans#operational-notes" }, { "kind": "code", "literal": "watermark_ms", "chunkId": "api/scans#operational-notes" } ], "sources": [ { "chunkId": "api/scans#operational-notes", "url": "/docs/api/scans#operational-notes", "anchor": "operational-notes" } ], "mode": "source-primary", "terms": [ "operational", "notes", "values", "scan", "state", "memory", "ephemeral", "resets", "gateway", "restart", "count", "scans", "deadline", "default", "maximum", "300s", "jobs", "distinct", "truncated", "true", "index", "spec", "threads", "watermark", "crossed", "listing", "keeps", "exact", "origin", "defaults", "concurrent", "upstream", "requests", "unless", "request", "sets", "different", "value", "snapshot", "served" ] }, { "id": "api/scans#precomputed-serving", "kind": "section", "title": "Scan", "heading": "Precomputed serving", "group": "API", "url": "/docs/api/scans#precomputed-serving", "summary": "Precomputed serving An unfiltered values scan (no filters, no ranked selector) on a field present in the latest snapshot fields[] is answered straight from the snapshot's facet histogram: the job completes during the cre…", "facts": [ { "kind": "code", "literal": "filters", "chunkId": "api/scans#precomputed-serving" }, { "kind": "code", "literal": "fields[]", "chunkId": "api/scans#precomputed-serving" }, { "kind": "code", "literal": "202", "chunkId": "api/scans#precomputed-serving" }, { "kind": "code", "literal": "status: completed", "chunkId": "api/scans#precomputed-serving" }, { "kind": "code", 
"literal": "effective_source: snapshot", "chunkId": "api/scans#precomputed-serving" }, { "kind": "code", "literal": "snapshot_sha", "chunkId": "api/scans#precomputed-serving" }, { "kind": "code", "literal": "watermark_ms", "chunkId": "api/scans#precomputed-serving" }, { "kind": "code", "literal": "fields_skipped[]", "chunkId": "api/scans#precomputed-serving" }, { "kind": "code", "literal": "auto", "chunkId": "api/scans#precomputed-serving" }, { "kind": "code", "literal": "412 precondition_failed", "chunkId": "api/scans#precomputed-serving" }, { "kind": "code", "literal": "source: snapshot", "chunkId": "api/scans#precomputed-serving" } ], "sources": [ { "chunkId": "api/scans#precomputed-serving", "url": "/docs/api/scans#precomputed-serving", "anchor": "precomputed-serving" } ], "mode": "source-primary", "terms": [ "precomputed", "serving", "unfiltered", "values", "scan", "filters", "ranked", "selector", "field", "present", "latest", "snapshot", "fields", "answered", "straight", "facet", "histogram", "completes", "during", "status", "completed", "effective", "source", "watermark", "skipped", "auto", "precondition", "failed", "create", "call", "body", "already", "shows", "carries", "effectivesource", "snapshotsha", "watermarkms", "fieldsskipped", "absent", "fall" ] }, { "id": "api/scans#radius-count", "kind": "section", "title": "Scan", "heading": "Radius count", "group": "API", "url": "/docs/api/scans#radius-count", "summary": "Radius count Count rows within radius of a query vector with the ann selector — a distance-ball scan. radius is required and finite (without an upper bound every row is in the ball); field defaults to vector. 
Like fts, r…", "facts": [ { "kind": "code", "literal": "count = await client.create_scan(\"products\", {\n \"mode\": \"count\",\n \"ann\": {\"field\": \"vector\", \"vector\": [0.12, -0.3, 0.88], \"radius\": 0.25},\n})", "chunkId": "api/scans#radius-count" }, { "kind": "code", "literal": "count, err := client.CreateScan(ctx, \"products\", &hevlayer.CreateScanRequest{\n Mode: \"count\",\n Ann: &hevlayer.AnnScan{Field: \"vector\", Vector: []float64{0.12, -0.3, 0.88}, Radius: 0.25},\n})", "chunkId": "api/scans#radius-count" }, { "kind": "code", "literal": "const count = await client.createScan(\"products\", {\n mode: \"count\",\n ann: { field: \"vector\", vector: [0.12, -0.3, 0.88], radius: 0.25 },\n});", "chunkId": "api/scans#radius-count" }, { "kind": "code", "literal": "curl -X POST \"$LAYER_GATEWAY_URL/v2/namespaces/products/scans\" \\\n -H \"Authorization: Bearer $LAYER_GATEWAY_API_KEY\" \\\n -H \"Content-Type: application/json\" \\\n -d '{\n \"mode\": \"count\",\n \"ann\": {\"field\": \"vector\", \"vector\": [0.12, -0.3, 0.88], \"radius\": 0.25}\n }'", "chunkId": "api/scans#radius-count" }, { "kind": "code", "literal": "{\n \"count\": 980,\n \"served_by\": \"origin\",\n \"approximate\": true,\n \"bounded\": false,\n \"timed_out\": false,\n \"shards_saturated\": 0,\n \"shards_total\": 1,\n \"threads\": 1,\n \"elapsed_ms\": 51\n}", "chunkId": "api/scans#radius-count" }, { "kind": "code", "literal": "radius", "chunkId": "api/scans#radius-count" }, { "kind": "code", "literal": "ann", "chunkId": "api/scans#radius-count" }, { "kind": "code", "literal": "field", "chunkId": "api/scans#radius-count" }, { "kind": "code", "literal": "vector", "chunkId": "api/scans#radius-count" }, { "kind": "code", "literal": "fts", "chunkId": "api/scans#radius-count" }, { "kind": "code", "literal": "approximate: true", "chunkId": "api/scans#radius-count" } ], "sources": [ { "chunkId": "api/scans#radius-count", "url": "/docs/api/scans#radius-count", "anchor": "radius-count" } ], "mode": 
"source-primary", "terms": [ "radius", "count", "rows", "within", "query", "vector", "selector", "distance", "ball", "scan", "required", "finite", "without", "upper", "bound", "every", "field", "defaults", "like", "await", "client", "create", "products", "mode", "createscan", "hevlayer", "createscanrequest", "annscan", "float64", "const", "curl", "post", "layer", "gateway", "namespaces", "scans", "authorization", "bearer", "content", "type" ] }, { "id": "api/scans#routes", "kind": "section", "title": "Scan", "heading": "Routes", "group": "API", "url": "/docs/api/scans#routes", "summary": "Routes Route Method Behavior POST /v2/namespaces/{ns}/scans POST Create an ID or values scan job, or return a count. GET /v2/namespaces/{ns}/scans GET List scan jobs for the namespace. GET /v2/namespaces/{ns}/scans/{id}…", "facts": [ { "kind": "code", "literal": "POST /v2/namespaces/{ns}/scans", "chunkId": "api/scans#routes" }, { "kind": "code", "literal": "GET /v2/namespaces/{ns}/scans", "chunkId": "api/scans#routes" }, { "kind": "code", "literal": "GET /v2/namespaces/{ns}/scans/{id}", "chunkId": "api/scans#routes" }, { "kind": "code", "literal": "GET /v2/namespaces/{ns}/scans/{id}/results", "chunkId": "api/scans#routes" }, { "kind": "code", "literal": "DELETE /v2/namespaces/{ns}/scans/{id}", "chunkId": "api/scans#routes" } ], "sources": [ { "chunkId": "api/scans#routes", "url": "/docs/api/scans#routes", "anchor": "routes" } ], "mode": "source-primary", "terms": [ "routes", "route", "method", "behavior", "post", "namespaces", "scans", "create", "values", "scan", "return", "count", "list", "jobs", "namespace", "results", "delete", "read", "completed", "drop", "memory" ] }, { "id": "api/scans#sources", "kind": "section", "title": "Scan", "heading": "Sources", "group": "API", "url": "/docs/api/scans#sources", "summary": "Sources Source ID mode Count mode Values mode auto Cache when fresh enough, otherwise origin Snapshot first, then cache/origin. 
Snapshot when eligible, then cache/origin. snapshot Not supported Latest snapshot only; requ…", "facts": [ { "kind": "code", "literal": "auto", "chunkId": "api/scans#sources" }, { "kind": "code", "literal": "snapshot", "chunkId": "api/scans#sources" }, { "kind": "code", "literal": "Eq", "chunkId": "api/scans#sources" }, { "kind": "code", "literal": "In", "chunkId": "api/scans#sources" }, { "kind": "code", "literal": "fields[]", "chunkId": "api/scans#sources" }, { "kind": "code", "literal": "cache", "chunkId": "api/scans#sources" }, { "kind": "code", "literal": "origin", "chunkId": "api/scans#sources" }, { "kind": "code", "literal": "fts", "chunkId": "api/scans#sources" }, { "kind": "code", "literal": "ann", "chunkId": "api/scans#sources" }, { "kind": "code", "literal": "422", "chunkId": "api/scans#sources" } ], "sources": [ { "chunkId": "api/scans#sources", "url": "/docs/api/scans#sources", "anchor": "sources" } ], "mode": "source-primary", "terms": [ "sources", "source", "mode", "count", "values", "auto", "cache", "fresh", "enough", "otherwise", "origin", "snapshot", "first", "eligible", "supported", "latest", "only", "requ", "fields", "requires", "facet", "listing", "unfiltered", "scan", "field", "aerospike", "document", "turbopuffer", "paginated", "gateway", "side", "dedupe", "table", "covers", "filter", "selector", "selectors", "evaluator", "always", "scatter" ] }, { "id": "api/scans#values-mode", "kind": "section", "title": "Scan", "heading": "Values Mode", "group": "API", "url": "/docs/api/scans#values-mode", "summary": "Values Mode A values scan enumerates the distinct values of one attribute field over the rows the selector picks, each with its document count. 
Use it to discover a field's value set — what product categories exist, what…", "facts": [ { "kind": "code", "literal": "job = await client.create_scan(\"products\", {\n \"mode\": \"values\",\n \"field\": \"category\",\n \"source\": \"auto\",\n \"filters\": [\"in_stock\", \"Eq\", True],\n})", "chunkId": "api/scans#values-mode" }, { "kind": "code", "literal": "job, err := client.CreateScan(ctx, \"products\", &hevlayer.CreateScanRequest{\n Mode: \"values\",\n Field: \"category\",\n Source: \"auto\",\n Filters: []interface{}{\"in_stock\", \"Eq\", true},\n})", "chunkId": "api/scans#values-mode" }, { "kind": "code", "literal": "const job = await client.createScan(\"products\", {\n mode: \"values\",\n field: \"category\",\n source: \"auto\",\n filters: [\"in_stock\", \"Eq\", true],\n});", "chunkId": "api/scans#values-mode" }, { "kind": "code", "literal": "curl -X POST \"$LAYER_GATEWAY_URL/v2/namespaces/products/scans\" \\\n -H \"Authorization: Bearer $LAYER_GATEWAY_API_KEY\" \\\n -H \"Content-Type: application/json\" \\\n -d '{\n \"mode\": \"values\",\n \"field\": \"category\",\n \"source\": \"auto\",\n \"filters\": [\"in_stock\", \"Eq\", true]\n }'", "chunkId": "api/scans#values-mode" }, { "kind": "code", "literal": "{\n \"id\": \"scan-uuid\",\n \"namespace\": \"products\",\n \"mode\": \"values\",\n \"field\": \"category\",\n \"source\": \"auto\",\n \"effective_source\": \"origin\",\n \"status\": \"running\",\n \"progress\": 0,\n \"documents_scanned\": 0,\n \"threads\": 8,\n \"created_at\": \"2026-05-26T10:00:00Z\"\n}", "chunkId": "api/scans#values-mode" }, { "kind": "code", "literal": "{\n \"values\": [\n {\"v\": \"electronics\", \"n\": 4210},\n {\"v\": \"books\", \"n\": 1240}\n ],\n \"total\": 2,\n \"truncated\": false\n}", "chunkId": "api/scans#values-mode" }, { "kind": "code", "literal": "field", "chunkId": "api/scans#values-mode" }, { "kind": "code", "literal": "mode: values", "chunkId": "api/scans#values-mode" }, { "kind": "code", "literal": "422", "chunkId": 
"api/scans#values-mode" }, { "kind": "code", "literal": "202 Accepted", "chunkId": "api/scans#values-mode" }, { "kind": "code", "literal": "scan(...)", "chunkId": "api/scans#values-mode" }, { "kind": "code", "literal": "status", "chunkId": "api/scans#values-mode" }, { "kind": "code", "literal": "completed", "chunkId": "api/scans#values-mode" }, { "kind": "code", "literal": "limit", "chunkId": "api/scans#values-mode" }, { "kind": "code", "literal": "offset", "chunkId": "api/scans#values-mode" }, { "kind": "code", "literal": "bounded: true", "chunkId": "api/scans#values-mode" }, { "kind": "code", "literal": ">=", "chunkId": "api/scans#values-mode" } ], "sources": [ { "chunkId": "api/scans#values-mode", "url": "/docs/api/scans#values-mode", "anchor": "values-mode" } ], "mode": "source-primary", "terms": [ "values", "mode", "scan", "enumerates", "distinct", "attribute", "field", "rows", "selector", "picks", "document", "count", "discover", "value", "product", "categories", "exist", "await", "client", "create", "products", "category", "source", "auto", "filters", "stock", "true", "createscan", "hevlayer", "createscanrequest", "interface", "const", "curl", "post", "layer", "gateway", "namespaces", "scans", "authorization", "bearer" ] }, { "id": "api/search-history", "kind": "section", "title": "Query History", "heading": null, "group": "API", "url": "/docs/api/search-history", "summary": "Per-namespace query and clickstream history backed by JSONL in S3. Layer logs every query the gateway serves into a durable JSONL trail in S3, mirrored into the NVMe cache for fast recent reads. 
Fetch events that downstr…", "facts": [ { "kind": "value", "literal": "CodeTabs.astro", "chunkId": "api/search-history" } ], "sources": [ { "chunkId": "api/search-history", "url": "/docs/api/search-history", "anchor": null } ], "mode": "source-primary", "terms": [ "namespace", "query", "clickstream", "history", "backed", "jsonl", "layer", "logs", "every", "gateway", "serves", "durable", "trail", "mirrored", "nvme", "cache", "fast", "recent", "reads", "fetch", "events", "downstr", "codetabs", "astro", "downstream", "consumers", "back", "land", "sibling", "feed", "together", "make", "search", "session", "reconstructable", "after", "fact", "relevance", "tuning", "comparison" ] }, { "id": "api/search-history#clickstream-entry", "kind": "section", "title": "Query History", "heading": "Clickstream entry", "group": "API", "url": "/docs/api/search-history#clickstream-entry", "summary": "Clickstream entry { \"events\": [ { \"timestamp\": \"2026-05-22T08:00:02.143Z\", \"timestampnanos\": 1747900802143000000, \"traceid\": \"f81d4fae-7dec-11d0-a765-00a0c91e6bf6\", \"namespace\": \"products\", \"docid\": \"asin-B08N5WRWNW\", \"t…", "facts": [ { "kind": "code", "literal": "{\n \"events\": [\n {\n \"timestamp\": \"2026-05-22T08:00:02.143Z\",\n \"timestamp_nanos\": 1747900802143000000,\n \"trace_id\": \"f81d4fae-7dec-11d0-a765-00a0c91e6bf6\",\n \"namespace\": \"products\",\n \"doc_id\": \"asin-B08N5WRWNW\",\n \"tags\": [\"session:abc123\"],\n \"source\": \"fetch\",\n \"served_from\": \"cache\"\n }\n ],\n \"next_cursor\": \"1747900802142000000\"\n}", "chunkId": "api/search-history#clickstream-entry" }, { "kind": "code", "literal": "events = await client.list_clickstream(\n \"products\",\n trace_id=\"f81d4fae-7dec-11d0-a765-00a0c91e6bf6\",\n)", "chunkId": "api/search-history#clickstream-entry" }, { "kind": "code", "literal": "events, err := client.ListClickstream(ctx, \"products\",\n &hevlayer.ListClickstreamParams{\n TraceID: \"f81d4fae-7dec-11d0-a765-00a0c91e6bf6\",\n })", 
"chunkId": "api/search-history#clickstream-entry" }, { "kind": "code", "literal": "const events = await client.listClickstream(\"products\", {\n traceId: \"f81d4fae-7dec-11d0-a765-00a0c91e6bf6\",\n});", "chunkId": "api/search-history#clickstream-entry" }, { "kind": "code", "literal": "curl \"$LAYER_GATEWAY_URL/v2/namespaces/products/clickstream?trace_id=f81d4fae-7dec-11d0-a765-00a0c91e6bf6\" \\\n -H \"Authorization: Bearer $LAYER_GATEWAY_API_KEY\"", "chunkId": "api/search-history#clickstream-entry" }, { "kind": "code", "literal": "trace_id", "chunkId": "api/search-history#clickstream-entry" }, { "kind": "code", "literal": "served_from", "chunkId": "api/search-history#clickstream-entry" } ], "sources": [ { "chunkId": "api/search-history#clickstream-entry", "url": "/docs/api/search-history#clickstream-entry", "anchor": "clickstream-entry" } ], "mode": "source-primary", "terms": [ "clickstream", "entry", "events", "timestamp", "2026", "22t08", "143z", "timestampnanos", "1747900802143000000", "traceid", "f81d4fae", "7dec", "11d0", "a765", "00a0c91e6bf6", "namespace", "products", "docid", "asin", "b08n5wrwnw", "nanos", "trace", "tags", "session", "abc123", "source", "fetch", "served", "cache", "next", "cursor", "1747900802142000000", "await", "client", "list", "listclickstream", "hevlayer", "listclickstreamparams", "const", "curl" ] }, { "id": "api/search-history#query-parameters", "kind": "section", "title": "Query History", "heading": "Query parameters", "group": "API", "url": "/docs/api/search-history#query-parameters", "summary": "Query parameters Param Purpose tag Comma-separated tag filter. AND semantics — every tag must match. from / to RFC3339 time bounds. before Pagination cursor; return entries strictly older than the given timestampnanos. 
l…", "facts": [ { "kind": "code", "literal": "tag", "chunkId": "api/search-history#query-parameters" }, { "kind": "code", "literal": "from", "chunkId": "api/search-history#query-parameters" }, { "kind": "code", "literal": "to", "chunkId": "api/search-history#query-parameters" }, { "kind": "code", "literal": "before", "chunkId": "api/search-history#query-parameters" }, { "kind": "code", "literal": "timestamp_nanos", "chunkId": "api/search-history#query-parameters" }, { "kind": "code", "literal": "limit", "chunkId": "api/search-history#query-parameters" } ], "sources": [ { "chunkId": "api/search-history#query-parameters", "url": "/docs/api/search-history#query-parameters", "anchor": "query-parameters" } ], "mode": "source-primary", "terms": [ "query", "parameters", "param", "purpose", "comma", "separated", "filter", "semantics", "every", "must", "match", "rfc3339", "time", "bounds", "before", "pagination", "cursor", "return", "entries", "strictly", "older", "given", "timestampnanos", "timestamp", "nanos", "limit", "default" ] }, { "id": "api/search-history#routes", "kind": "section", "title": "Query History", "heading": "Routes", "group": "API", "url": "/docs/api/search-history#routes", "summary": "Routes Route Behavior GET /v2/namespaces/{ns}/search-history Per-namespace query log, newest first. GET /v2/namespaces/{ns}/clickstream Fetch events correlated to a search, newest first. 
The /v1/ versions of both routes…", "facts": [ { "kind": "code", "literal": "GET /v2/namespaces/{ns}/search-history", "chunkId": "api/search-history#routes" }, { "kind": "code", "literal": "GET /v2/namespaces/{ns}/clickstream", "chunkId": "api/search-history#routes" }, { "kind": "code", "literal": "/v1/", "chunkId": "api/search-history#routes" } ], "sources": [ { "chunkId": "api/search-history#routes", "url": "/docs/api/search-history#routes", "anchor": "routes" } ], "mode": "source-primary", "terms": [ "routes", "route", "behavior", "namespaces", "search", "history", "namespace", "query", "newest", "first", "clickstream", "fetch", "events", "correlated", "versions", "both", "identical", "aliases", "held", "client", "compatibility" ] }, { "id": "api/search-history#search-history-entry", "kind": "section", "title": "Query History", "heading": "Search history entry", "group": "API", "url": "/docs/api/search-history#search-history-entry", "summary": "Search history entry { \"entries\": [ { \"timestamp\": \"2026-05-22T08:00:00.000Z\", \"timestampnanos\": 1747900800000000000, \"namespace\": \"products\", \"traceid\": \"f81d4fae-7dec-11d0-a765-00a0c91e6bf6\", \"rawquery\": \"wireless head…", "facts": [ { "kind": "code", "literal": "timestamp", "chunkId": "api/search-history#search-history-entry" }, { "kind": "code", "literal": "timestamp_nanos", "chunkId": "api/search-history#search-history-entry" }, { "kind": "code", "literal": "trace_id", "chunkId": "api/search-history#search-history-entry" }, { "kind": "code", "literal": "raw_query", "chunkId": "api/search-history#search-history-entry" }, { "kind": "code", "literal": "x-hevlayer-search-query", "chunkId": "api/search-history#search-history-entry" }, { "kind": "code", "literal": "stable_as_of", "chunkId": "api/search-history#search-history-entry" }, { "kind": "code", "literal": "query", "chunkId": "api/search-history#search-history-entry" }, { "kind": "code", "literal": "top_result_ids", "chunkId": 
"api/search-history#search-history-entry" }, { "kind": "code", "literal": "tags", "chunkId": "api/search-history#search-history-entry" }, { "kind": "code", "literal": "HybridText", "chunkId": "api/search-history#search-history-entry" }, { "kind": "value", "literal": "e.g", "chunkId": "api/search-history#search-history-entry" } ], "sources": [ { "chunkId": "api/search-history#search-history-entry", "url": "/docs/api/search-history#search-history-entry", "anchor": "search-history-entry" } ], "mode": "source-primary", "terms": [ "search", "history", "entry", "entries", "timestamp", "2026", "22t08", "000z", "timestampnanos", "1747900800000000000", "namespace", "products", "traceid", "f81d4fae", "7dec", "11d0", "a765", "00a0c91e6bf6", "rawquery", "wireless", "head", "nanos", "trace", "query", "hevlayer", "stable", "result", "tags", "hybridtext", "headphones", "stableasof", "1747900700000", "vector", "topk", "filters", "topresultids", "asin", "b08n5wrwnw", "b07pxgqc1q", "shop" ] }, { "id": "api/search-history#storage", "kind": "section", "title": "Query History", "heading": "Storage", "group": "API", "url": "/docs/api/search-history#storage", "summary": "Storage search-history/{namespace}/{YYYY-MM-DD}/{timestampnanos}.jsonl Writes are best-effort and never block the query response. Aerospike holds a recent window for fast reads; S3 is the durable store. 
A cache outage de…", "facts": [ { "kind": "code", "literal": "search-history/{namespace}/{YYYY-MM-DD}/{timestamp_nanos}.jsonl", "chunkId": "api/search-history#storage" } ], "sources": [ { "chunkId": "api/search-history#storage", "url": "/docs/api/search-history#storage", "anchor": "storage" } ], "mode": "source-primary", "terms": [ "storage", "search", "history", "namespace", "yyyy", "timestampnanos", "jsonl", "writes", "best", "effort", "never", "block", "query", "response", "aerospike", "holds", "recent", "window", "fast", "reads", "durable", "store", "cache", "outage", "timestamp", "nanos", "degrades", "read", "latency", "durability", "list", "calls", "walk", "prefix", "merge", "inline" ] }, { "id": "api/search-history#tag-contract", "kind": "section", "title": "Query History", "heading": "Tag contract", "group": "API", "url": "/docs/api/search-history#tag-contract", "summary": "Tag contract Layer splits x-hevlayer-tags and ?tag= on commas, trims whitespace, drops empty values, then sorts and dedupes tags before storing or matching them. Commas are separators and cannot be escaped. 
Limits: Limit…", "facts": [ { "kind": "code", "literal": "x-hevlayer-tags", "chunkId": "api/search-history#tag-contract" }, { "kind": "code", "literal": "?tag=", "chunkId": "api/search-history#tag-contract" }, { "kind": "code", "literal": "?tag=a,b", "chunkId": "api/search-history#tag-contract" } ], "sources": [ { "chunkId": "api/search-history#tag-contract", "url": "/docs/api/search-history#tag-contract", "anchor": "tag-contract" } ], "mode": "source-primary", "terms": [ "contract", "layer", "splits", "hevlayer", "tags", "commas", "trims", "whitespace", "drops", "empty", "values", "sorts", "dedupes", "before", "storing", "matching", "separators", "cannot", "escaped", "limits", "limit", "value", "unique", "request", "filter", "length", "bytes", "allowed", "characters", "ascii", "letters", "digits", "list", "uses", "semantics", "returns", "only", "entries", "carry", "both" ] }, { "id": "api/search-history#writing-metadata", "kind": "section", "title": "Query History", "heading": "Writing metadata", "group": "API", "url": "/docs/api/search-history#writing-metadata", "summary": "Writing metadata Set x-hevlayer-search-query on query requests to capture the human input, and set x-hevlayer-tags to a comma-separated list of segmentation tags. 
The Python client exposes these as the rawquery and tags…", "facts": [ { "kind": "code", "literal": "query = await client.query_namespace(\n \"products\",\n {\"vector\": embedding, \"top_k\": 10, \"include_attributes\": [\"title\"]},\n raw_query=\"wireless headphones\",\n tags=[\"app:hev-shop\", \"surface:storefront\", \"route:search\", \"page:first\"],\n)\n\nhistory = await client.list_search_history(\n \"products\",\n tags=[\"app:hev-shop\", \"route:search\", \"page:first\"],\n limit=20,\n)", "chunkId": "api/search-history#writing-metadata" }, { "kind": "code", "literal": "const query = await client.queryNamespace(\n \"products\",\n { vector: embedding, top_k: 10, include_attributes: [\"title\"] },\n {\n searchQuery: \"wireless headphones\",\n tags: [\"app:hev-shop\", \"surface:storefront\", \"route:search\", \"page:first\"],\n },\n);\n\nconst history = await client.listSearchHistory(\"products\", {\n tags: [\"app:hev-shop\", \"route:search\", \"page:first\"],\n limit: 20,\n});", "chunkId": "api/search-history#writing-metadata" }, { "kind": "code", "literal": "x-hevlayer-search-query", "chunkId": "api/search-history#writing-metadata" }, { "kind": "code", "literal": "x-hevlayer-tags", "chunkId": "api/search-history#writing-metadata" }, { "kind": "code", "literal": "raw_query", "chunkId": "api/search-history#writing-metadata" }, { "kind": "code", "literal": "tags", "chunkId": "api/search-history#writing-metadata" }, { "kind": "code", "literal": "WithSearchQuery", "chunkId": "api/search-history#writing-metadata" }, { "kind": "code", "literal": "WithSearchTags", "chunkId": "api/search-history#writing-metadata" } ], "sources": [ { "chunkId": "api/search-history#writing-metadata", "url": "/docs/api/search-history#writing-metadata", "anchor": "writing-metadata" } ], "mode": "source-primary", "terms": [ "writing", "metadata", "hevlayer", "search", "query", "requests", "capture", "human", "input", "tags", "comma", "separated", "list", "segmentation", "python", "client", 
"exposes", "these", "rawquery", "await", "namespace", "products", "vector", "embedding", "include", "attributes", "title", "wireless", "headphones", "shop", "surface", "storefront", "route", "page", "first", "history", "limit", "const", "querynamespace", "searchquery" ] }, { "id": "api/snapshots", "kind": "section", "title": "Snapshot History", "heading": null, "group": "API", "url": "/docs/api/snapshots", "summary": "Facet snapshot jobs, history, bodies, and activity streams. Snapshots are materialized facet histograms for a namespace. They carry facet listings in values[].v and facet counts in values[].n, stored durably in S3 and mi…", "facts": [ { "kind": "code", "literal": "values[].v", "chunkId": "api/snapshots" }, { "kind": "code", "literal": "values[].n", "chunkId": "api/snapshots" }, { "kind": "code", "literal": "POST /snapshots", "chunkId": "api/snapshots" }, { "kind": "value", "literal": "CodeTabs.astro", "chunkId": "api/snapshots" } ], "sources": [ { "chunkId": "api/snapshots", "url": "/docs/api/snapshots", "anchor": null } ], "mode": "source-primary", "terms": [ "facet", "snapshot", "jobs", "history", "bodies", "activity", "streams", "snapshots", "materialized", "histograms", "namespace", "carry", "listings", "values", "counts", "stored", "durably", "post", "codetabs", "astro", "mirrored", "aerospike", "latest", "body", "materialize", "field", "routes", "read", "durable", "chronology", "written", "consistency", "watcher" ] }, { "id": "api/snapshots#activity", "kind": "section", "title": "Snapshot History", "heading": "Activity", "group": "API", "url": "/docs/api/snapshots#activity", "summary": "Activity activity = await client.listsnapshotactivity(since=1747200000000, limit=50) activity, err := client.ListSnapshotActivity(ctx, &hevlayer.ListSnapshotActivityParams{Since: 1747200000000, Limit: 50}) const activity…", "facts": [ { "kind": "code", "literal": "activity = await client.list_snapshot_activity(since=1747200000000, limit=50)", "chunkId": 
"api/snapshots#activity" }, { "kind": "code", "literal": "activity, err := client.ListSnapshotActivity(ctx,\n &hevlayer.ListSnapshotActivityParams{Since: 1747200000000, Limit: 50})", "chunkId": "api/snapshots#activity" }, { "kind": "code", "literal": "const activity = await client.listSnapshotActivity({\n since: 1747200000000,\n limit: 50,\n});", "chunkId": "api/snapshots#activity" }, { "kind": "code", "literal": "curl \"$LAYER_GATEWAY_URL/v2/activity/snapshots?since=1747200000000&limit=50\" \\\n -H \"Authorization: Bearer $LAYER_GATEWAY_API_KEY\"", "chunkId": "api/snapshots#activity" }, { "kind": "code", "literal": "since", "chunkId": "api/snapshots#activity" }, { "kind": "code", "literal": "ts_ms", "chunkId": "api/snapshots#activity" }, { "kind": "code", "literal": "limit", "chunkId": "api/snapshots#activity" }, { "kind": "code", "literal": "namespace", "chunkId": "api/snapshots#activity" }, { "kind": "code", "literal": "cursor", "chunkId": "api/snapshots#activity" }, { "kind": "code", "literal": "next_cursor", "chunkId": "api/snapshots#activity" } ], "sources": [ { "chunkId": "api/snapshots#activity", "url": "/docs/api/snapshots#activity", "anchor": "activity" } ], "mode": "source-primary", "terms": [ "activity", "await", "client", "listsnapshotactivity", "since", "1747200000000", "limit", "hevlayer", "listsnapshotactivityparams", "const", "list", "snapshot", "curl", "layer", "gateway", "snapshots", "authorization", "bearer", "namespace", "cursor", "next", "layergatewayurl", "layergatewayapikey", "query", "param", "required", "purpose", "epoch", "lower", "bound", "tsms", "default", "exact", "filter", "pagination", "nextcursor", "lifecycle", "only", "search", "history" ] }, { "id": "api/snapshots#history", "kind": "section", "title": "Snapshot History", "heading": "History", "group": "API", "url": "/docs/api/snapshots#history", "summary": "History history = await client.listnamespacehistory(\"products\", limit=20) history, err := client.ListNamespaceHistory(ctx, 
\"products\", &hevlayer.ListNamespaceHistoryParams{Limit: 20}) const history = await client.listNam…", "facts": [ { "kind": "code", "literal": "history = await client.list_namespace_history(\"products\", limit=20)", "chunkId": "api/snapshots#history" }, { "kind": "code", "literal": "history, err := client.ListNamespaceHistory(ctx, \"products\",\n &hevlayer.ListNamespaceHistoryParams{Limit: 20})", "chunkId": "api/snapshots#history" }, { "kind": "code", "literal": "const history = await client.listNamespaceHistory(\"products\", { limit: 20 });", "chunkId": "api/snapshots#history" }, { "kind": "code", "literal": "curl \"$LAYER_GATEWAY_URL/v2/namespaces/products/history?limit=20\" \\\n -H \"Authorization: Bearer $LAYER_GATEWAY_API_KEY\"", "chunkId": "api/snapshots#history" }, { "kind": "code", "literal": "[\n {\"watermark_ms\": 1747300000123, \"sha\": \"3f9e8b21...\"},\n {\"watermark_ms\": 1747299600045, \"sha\": \"a1c5b09f...\"}\n]", "chunkId": "api/snapshots#history" }, { "kind": "code", "literal": "limit", "chunkId": "api/snapshots#history" }, { "kind": "code", "literal": "before", "chunkId": "api/snapshots#history" } ], "sources": [ { "chunkId": "api/snapshots#history", "url": "/docs/api/snapshots#history", "anchor": "history" } ], "mode": "source-primary", "terms": [ "history", "await", "client", "listnamespacehistory", "products", "limit", "hevlayer", "listnamespacehistoryparams", "const", "listnam", "list", "namespace", "curl", "layer", "gateway", "namespaces", "authorization", "bearer", "watermark", "1747300000123", "3f9e8b21", "1747299600045", "a1c5b09f", "before", "layergatewayurl", "layergatewayapikey", "watermarkms", "query", "param", "default", "purpose", "maximum", "entries", "returned", "capped", "none", "return", "older", "char", "prefixes" ] }, { "id": "api/snapshots#manual-snapshot", "kind": "section", "title": "Snapshot History", "heading": "Manual snapshot", "group": "API", "url": "/docs/api/snapshots#manual-snapshot", "summary": "Manual snapshot 
job = await client.createsnapshot(\"products\", { \"field\": \"category\", \"source\": \"auto\", \"filters\": [\"brand\", \"Eq\", \"Acme\"], \"pagesize\": 1000, }) job, err := client.CreateSnapshot(ctx, \"products\", &hevlayer…", "facts": [ { "kind": "code", "literal": "job = await client.create_snapshot(\"products\", {\n \"field\": \"category\",\n \"source\": \"auto\",\n \"filters\": [\"brand\", \"Eq\", \"Acme\"],\n \"page_size\": 1000,\n})", "chunkId": "api/snapshots#manual-snapshot" }, { "kind": "code", "literal": "job, err := client.CreateSnapshot(ctx, \"products\", &hevlayer.CreateSnapshotRequest{\n Field: \"category\",\n Source: \"auto\",\n Filters: []interface{}{\"brand\", \"Eq\", \"Acme\"},\n PageSize: 1000,\n})", "chunkId": "api/snapshots#manual-snapshot" }, { "kind": "code", "literal": "const job = await client.createSnapshot(\"products\", {\n field: \"category\",\n source: \"auto\",\n filters: [\"brand\", \"Eq\", \"Acme\"],\n page_size: 1000,\n});", "chunkId": "api/snapshots#manual-snapshot" }, { "kind": "code", "literal": "curl -X POST \"$LAYER_GATEWAY_URL/v2/namespaces/products/snapshots\" \\\n -H \"Authorization: Bearer $LAYER_GATEWAY_API_KEY\" \\\n -H \"Content-Type: application/json\" \\\n -d '{\n \"field\": \"category\",\n \"source\": \"auto\",\n \"filters\": [\"brand\", \"Eq\", \"Acme\"],\n \"page_size\": 1000\n }'", "chunkId": "api/snapshots#manual-snapshot" }, { "kind": "code", "literal": "{\n \"id\": \"snapshot-job-uuid\",\n \"namespace\": \"products\",\n \"field\": \"category\",\n \"source\": \"auto\",\n \"status\": \"running\",\n \"progress\": 0,\n \"documents_scanned\": 0,\n \"created_at\": \"2026-05-26T10:00:00Z\"\n}", "chunkId": "api/snapshots#manual-snapshot" }, { "kind": "code", "literal": "job = await client.get_snapshot_job(\"products\", job.id)", "chunkId": "api/snapshots#manual-snapshot" }, { "kind": "code", "literal": "job, err := client.GetSnapshotJob(ctx, \"products\", jobID)", "chunkId": "api/snapshots#manual-snapshot" }, { 
"kind": "code", "literal": "const job = await client.getSnapshotJob(\"products\", jobId);", "chunkId": "api/snapshots#manual-snapshot" }, { "kind": "code", "literal": "curl \"$LAYER_GATEWAY_URL/v2/namespaces/products/snapshot-jobs/snapshot-job-uuid\" \\\n -H \"Authorization: Bearer $LAYER_GATEWAY_API_KEY\"", "chunkId": "api/snapshots#manual-snapshot" }, { "kind": "code", "literal": "{\n \"id\": \"snapshot-job-uuid\",\n \"namespace\": \"products\",\n \"field\": \"category\",\n \"source\": \"origin\",\n \"status\": \"completed\",\n \"documents_scanned\": 12844,\n \"sha\": \"3f9e8b21\",\n \"stable_as_of\": 1747300000123\n}", "chunkId": "api/snapshots#manual-snapshot" }, { "kind": "code", "literal": "auto", "chunkId": "api/snapshots#manual-snapshot" }, { "kind": "code", "literal": "stored", "chunkId": "api/snapshots#manual-snapshot" }, { "kind": "code", "literal": "cache", "chunkId": "api/snapshots#manual-snapshot" }, { "kind": "code", "literal": "origin", "chunkId": "api/snapshots#manual-snapshot" }, { "kind": "code", "literal": "202 Accepted", "chunkId": "api/snapshots#manual-snapshot" }, { "kind": "code", "literal": "sha", "chunkId": "api/snapshots#manual-snapshot" } ], "sources": [ { "chunkId": "api/snapshots#manual-snapshot", "url": "/docs/api/snapshots#manual-snapshot", "anchor": "manual-snapshot" } ], "mode": "source-primary", "terms": [ "manual", "snapshot", "await", "client", "createsnapshot", "products", "field", "category", "source", "auto", "filters", "brand", "acme", "pagesize", "1000", "hevlayer", "create", "page", "size", "createsnapshotrequest", "interface", "const", "curl", "post", "layer", "gateway", "namespaces", "snapshots", "authorization", "bearer", "content", "type", "application", "json", "uuid", "namespace", "status", "running", "progress", "documents" ] }, { "id": "api/snapshots#routes", "kind": "section", "title": "Snapshot History", "heading": "Routes", "group": "API", "url": "/docs/api/snapshots#routes", "summary": "Routes Route Method 
Behavior POST /v2/namespaces/{ns}/snapshots POST Create an on-demand snapshot job for one field. GET /v2/namespaces/{ns}/snapshot-jobs GET List in-memory snapshot jobs. GET /v2/namespaces/{ns}/snapsho…", "facts": [ { "kind": "code", "literal": "POST /v2/namespaces/{ns}/snapshots", "chunkId": "api/snapshots#routes" }, { "kind": "code", "literal": "GET /v2/namespaces/{ns}/snapshot-jobs", "chunkId": "api/snapshots#routes" }, { "kind": "code", "literal": "GET /v2/namespaces/{ns}/snapshot-jobs/{id}", "chunkId": "api/snapshots#routes" }, { "kind": "code", "literal": "GET /v2/namespaces/{ns}/history", "chunkId": "api/snapshots#routes" }, { "kind": "code", "literal": "GET /v2/namespaces/{ns}/snapshots/{sha}", "chunkId": "api/snapshots#routes" }, { "kind": "code", "literal": "GET /v2/activity/snapshots", "chunkId": "api/snapshots#routes" } ], "sources": [ { "chunkId": "api/snapshots#routes", "url": "/docs/api/snapshots#routes", "anchor": "routes" } ], "mode": "source-primary", "terms": [ "routes", "route", "method", "behavior", "post", "namespaces", "snapshots", "create", "demand", "snapshot", "field", "jobs", "list", "memory", "snapsho", "history", "activity", "read", "newest", "first", "durable", "full", "body", "char", "prefix", "cross", "namespace", "write", "stream" ] }, { "id": "api/snapshots#snapshot-body", "kind": "section", "title": "Snapshot History", "heading": "Snapshot body", "group": "API", "url": "/docs/api/snapshots#snapshot-body", "summary": "Snapshot body body = await client.getnamespacesnapshot(\"products\", \"3f9e8b2\") body, err := client.GetNamespaceSnapshot(ctx, \"products\", \"3f9e8b2\") const body = await client.getNamespaceSnapshot(\"products\", \"3f9e8b2\"); cu…", "facts": [ { "kind": "code", "literal": "body = await client.get_namespace_snapshot(\"products\", \"3f9e8b2\")", "chunkId": "api/snapshots#snapshot-body" }, { "kind": "code", "literal": "body, err := client.GetNamespaceSnapshot(ctx, \"products\", \"3f9e8b2\")", "chunkId": 
"api/snapshots#snapshot-body" }, { "kind": "code", "literal": "const body = await client.getNamespaceSnapshot(\"products\", \"3f9e8b2\");", "chunkId": "api/snapshots#snapshot-body" }, { "kind": "code", "literal": "curl \"$LAYER_GATEWAY_URL/v2/namespaces/products/snapshots/3f9e8b2\" \\\n -H \"Authorization: Bearer $LAYER_GATEWAY_API_KEY\"", "chunkId": "api/snapshots#snapshot-body" }, { "kind": "code", "literal": "fields[].values[].v", "chunkId": "api/snapshots#snapshot-body" }, { "kind": "code", "literal": "fields[].values[].n", "chunkId": "api/snapshots#snapshot-body" }, { "kind": "code", "literal": "row_count", "chunkId": "api/snapshots#snapshot-body" }, { "kind": "code", "literal": "indexed", "chunkId": "api/snapshots#snapshot-body" }, { "kind": "code", "literal": "index_lag_rows", "chunkId": "api/snapshots#snapshot-body" }, { "kind": "code", "literal": "fields[]", "chunkId": "api/snapshots#snapshot-body" }, { "kind": "code", "literal": "fields_skipped[]", "chunkId": "api/snapshots#snapshot-body" } ], "sources": [ { "chunkId": "api/snapshots#snapshot-body", "url": "/docs/api/snapshots#snapshot-body", "anchor": "snapshot-body" } ], "mode": "source-primary", "terms": [ "snapshot", "body", "await", "client", "getnamespacesnapshot", "products", "3f9e8b2", "const", "namespace", "curl", "layer", "gateway", "namespaces", "snapshots", "authorization", "bearer", "fields", "values", "count", "indexed", "index", "rows", "skipped", "layergatewayurl", "layergatewayapikey", "watermarkms", "1747300000123", "3f9e8b21", "rowcount", "12500", "name", "category", "books", "1240", "electronics", "fieldsskipped", "tags", "reason", "exceededcap", "distinctobserved" ] }, { "id": "api/snapshots#snapshot-policy", "kind": "section", "title": "Snapshot History", "heading": "Snapshot policy", "group": "API", "url": "/docs/api/snapshots#snapshot-policy", "summary": "Snapshot policy Configure automatic snapshot writes on the namespace's Index CR: apiVersion: hevlayer.com/v1 kind: Index 
metadata: name: products spec: backend: namespace: products snapshot: interval: 5m retention: 30d f…", "facts": [ { "kind": "code", "literal": "apiVersion: hevlayer.com/v1\nkind: Index\nmetadata:\n name: products\nspec:\n backend:\n namespace: products\n snapshot:\n interval: 5m\n retention: 30d\n facetFields:\n - category\n - brand", "chunkId": "api/snapshots#snapshot-policy" }, { "kind": "code", "literal": "Index", "chunkId": "api/snapshots#snapshot-policy" }, { "kind": "code", "literal": "facetFields", "chunkId": "api/snapshots#snapshot-policy" }, { "kind": "code", "literal": "[]", "chunkId": "api/snapshots#snapshot-policy" }, { "kind": "code", "literal": "interval", "chunkId": "api/snapshots#snapshot-policy" }, { "kind": "code", "literal": "5m", "chunkId": "api/snapshots#snapshot-policy" }, { "kind": "code", "literal": "LAYER_SNAPSHOT_MIN_INTERVAL_MS", "chunkId": "api/snapshots#snapshot-policy" }, { "kind": "code", "literal": "retention", "chunkId": "api/snapshots#snapshot-policy" }, { "kind": "code", "literal": "never", "chunkId": "api/snapshots#snapshot-policy" }, { "kind": "code", "literal": "30d", "chunkId": "api/snapshots#snapshot-policy" }, { "kind": "code", "literal": "POST /snapshots", "chunkId": "api/snapshots#snapshot-policy" }, { "kind": "code", "literal": "source: origin", "chunkId": "api/snapshots#snapshot-policy" }, { "kind": "code", "literal": "spec.scan.threads", "chunkId": "api/snapshots#snapshot-policy" } ], "sources": [ { "chunkId": "api/snapshots#snapshot-policy", "url": "/docs/api/snapshots#snapshot-policy", "anchor": "snapshot-policy" } ], "mode": "source-primary", "terms": [ "snapshot", "policy", "configure", "automatic", "writes", "namespace", "index", "apiversion", "hevlayer", "kind", "metadata", "name", "products", "spec", "backend", "interval", "retention", "facetfields", "category", "brand", "layer", "never", "post", "snapshots", "source", "origin", "scan", "threads", "field", "default", "behavior", "facet", "fields", "histogram", 
"empty", "unset", "disables", "writer", "history", "activity" ] }, { "id": "api/warm-cache", "kind": "section", "title": "Warm cache", "heading": null, "group": "API", "url": "/docs/api/warm-cache", "summary": "Warm a namespace's NVMe cache and snapshot mirror. Layer exposes two warm endpoints. hintcachewarm is the Turbopuffer-compatible hint; warm is the Layer-only shortcut that creates a gateway warm job. GET /v1/namespaces/{…", "facts": [ { "kind": "code", "literal": "hint_cache_warm", "chunkId": "api/warm-cache" }, { "kind": "code", "literal": "warm", "chunkId": "api/warm-cache" }, { "kind": "code", "literal": "GET /v1/namespaces/{ns}/hint_cache_warm", "chunkId": "api/warm-cache" }, { "kind": "value", "literal": "Upstream.astro", "chunkId": "api/warm-cache" }, { "kind": "value", "literal": "Callout.astro", "chunkId": "api/warm-cache" }, { "kind": "value", "literal": "CodeTabs.astro", "chunkId": "api/warm-cache" }, { "kind": "value", "literal": "turbopuffer.com", "chunkId": "api/warm-cache" } ], "sources": [ { "chunkId": "api/warm-cache", "url": "/docs/api/warm-cache", "anchor": null } ], "mode": "source-primary", "terms": [ "warm", "namespace", "nvme", "cache", "snapshot", "mirror", "layer", "exposes", "endpoints", "hintcachewarm", "turbopuffer", "compatible", "hint", "only", "shortcut", "creates", "gateway", "namespaces", "upstream", "astro", "callout", "codetabs", "matches", "call", "advises", "index", "load", "additionally", "runs", "steps", "side" ] }, { "id": "api/warm-cache#cache-cold-behavior", "kind": "section", "title": "Warm cache", "heading": "Cache-cold behavior", "group": "API", "url": "/docs/api/warm-cache#cache-cold-behavior", "summary": "Cache-cold behavior Warm jobs, cache scans, cache snapshot jobs, and pipeline chunk reads return 503 cachecold when the NVMe cache is unavailable. 
Fetch and fetch-many fall through to Turbopuffer with x-layer-cache: miss…", "facts": [ { "kind": "code", "literal": "cache_cold", "chunkId": "api/warm-cache#cache-cold-behavior" }, { "kind": "code", "literal": "x-layer-cache: miss-on-error", "chunkId": "api/warm-cache#cache-cold-behavior" }, { "kind": "code", "literal": "hint_cache_warm", "chunkId": "api/warm-cache#cache-cold-behavior" }, { "kind": "code", "literal": "documents", "chunkId": "api/warm-cache#cache-cold-behavior" }, { "kind": "code", "literal": "snapshots", "chunkId": "api/warm-cache#cache-cold-behavior" } ], "sources": [ { "chunkId": "api/warm-cache#cache-cold-behavior", "url": "/docs/api/warm-cache#cache-cold-behavior", "anchor": "cache-cold-behavior" } ], "mode": "source-primary", "terms": [ "cache", "cold", "behavior", "warm", "jobs", "scans", "snapshot", "pipeline", "chunk", "reads", "return", "cachecold", "nvme", "unavailable", "fetch", "many", "fall", "through", "turbopuffer", "layer", "miss", "error", "hint", "documents", "snapshots", "instead", "split", "deliberate", "correctness", "first", "outage", "must", "turn", "missing", "document", "throughput", "warming", "would", "wasted", "work" ] }, { "id": "api/warm-cache#hint-cache-warm", "kind": "section", "title": "Warm cache", "heading": "Hint-cache warm", "group": "API", "url": "/docs/api/warm-cache#hint-cache-warm", "summary": "Hint-cache warm With no query parameters, the call is a raw passthrough: the gateway forwards it to Turbopuffer unchanged and returns the upstream response verbatim. 
Existing Turbopuffer clients keep their exact wire beh…", "facts": [ { "kind": "code", "literal": "curl \"$LAYER_GATEWAY_URL/v1/namespaces/products/hint_cache_warm\" \\\n -H \"Authorization: Bearer $LAYER_GATEWAY_API_KEY\"", "chunkId": "api/warm-cache#hint-cache-warm" }, { "kind": "code", "literal": "result = await client.hint_cache_warm(\n \"products\",\n turbopuffer=False,\n documents=False,\n snapshots=True,\n)", "chunkId": "api/warm-cache#hint-cache-warm" }, { "kind": "code", "literal": "const result = await client.hintCacheWarm(\"products\", {\n turbopuffer: false,\n documents: false,\n snapshots: true,\n});", "chunkId": "api/warm-cache#hint-cache-warm" }, { "kind": "code", "literal": "curl \"$LAYER_GATEWAY_URL/v1/namespaces/products/hint_cache_warm?turbopuffer=false&documents=false&snapshots=true\" \\\n -H \"Authorization: Bearer $LAYER_GATEWAY_API_KEY\"", "chunkId": "api/warm-cache#hint-cache-warm" }, { "kind": "code", "literal": "{\n \"namespace\": \"products\",\n \"turbopuffer\": { \"enabled\": true, \"status\": \"completed\" },\n \"documents\": {\n \"enabled\": true,\n \"status\": \"started\",\n \"job\": { \"id\": \"warm-job-uuid\", \"status\": \"running\" }\n },\n \"snapshots\": {\n \"enabled\": true,\n \"status\": \"completed\",\n \"key\": \"snapshots/products/...\",\n \"watermark_ms\": 1715600400000,\n \"sha\": \"...\"\n }\n}", "chunkId": "api/warm-cache#hint-cache-warm" }, { "kind": "code", "literal": "turbopuffer", "chunkId": "api/warm-cache#hint-cache-warm" }, { "kind": "code", "literal": "documents", "chunkId": "api/warm-cache#hint-cache-warm" }, { "kind": "code", "literal": "snapshots", "chunkId": "api/warm-cache#hint-cache-warm" }, { "kind": "code", "literal": "page_size", "chunkId": "api/warm-cache#hint-cache-warm" }, { "kind": "code", "literal": "turbopuffer=true", "chunkId": "api/warm-cache#hint-cache-warm" }, { "kind": "code", "literal": "documents=true", "chunkId": "api/warm-cache#hint-cache-warm" }, { "kind": "code", "literal": 
"snapshots=true", "chunkId": "api/warm-cache#hint-cache-warm" }, { "kind": "code", "literal": "false", "chunkId": "api/warm-cache#hint-cache-warm" }, { "kind": "code", "literal": "/warm-jobs/{id}", "chunkId": "api/warm-cache#hint-cache-warm" } ], "sources": [ { "chunkId": "api/warm-cache#hint-cache-warm", "url": "/docs/api/warm-cache#hint-cache-warm", "anchor": "hint-cache-warm" } ], "mode": "source-primary", "terms": [ "hint", "cache", "warm", "query", "parameters", "call", "passthrough", "gateway", "forwards", "turbopuffer", "unchanged", "returns", "upstream", "response", "verbatim", "existing", "clients", "keep", "their", "exact", "wire", "curl", "layer", "namespaces", "products", "authorization", "bearer", "result", "await", "client", "false", "documents", "snapshots", "true", "const", "hintcachewarm", "namespace", "enabled", "status", "completed" ] }, { "id": "api/warm-cache#layer-warm", "kind": "section", "title": "Warm cache", "heading": "Layer warm", "group": "API", "url": "/docs/api/warm-cache#layer-warm", "summary": "Layer warm POST /v2/namespaces/{ns}/warm creates an asynchronous job that pages through Turbopuffer, backfills Aerospike, and refreshes cachewarmedthrough. 
Use it when bootstrapping a namespace whose data was written out…", "facts": [ { "kind": "code", "literal": "job = await client.warm_cache(\"products\", page_size=1000)", "chunkId": "api/warm-cache#layer-warm" }, { "kind": "code", "literal": "job, err := client.WarmCache(ctx, \"products\", &hevlayer.WarmCacheParams{\n PageSize: 1000,\n})", "chunkId": "api/warm-cache#layer-warm" }, { "kind": "code", "literal": "const job = await client.warmCache(\"products\", { pageSize: 1000 });", "chunkId": "api/warm-cache#layer-warm" }, { "kind": "code", "literal": "curl -X POST \"$LAYER_GATEWAY_URL/v2/namespaces/products/warm?page_size=1000\" \\\n -H \"Authorization: Bearer $LAYER_GATEWAY_API_KEY\"", "chunkId": "api/warm-cache#layer-warm" }, { "kind": "code", "literal": "{\n \"id\": \"warm-job-uuid\",\n \"namespace\": \"products\",\n \"status\": \"running\",\n \"progress\": 0,\n \"documents_scanned\": 0,\n \"created_at\": \"2026-05-26T10:00:00Z\"\n}", "chunkId": "api/warm-cache#layer-warm" }, { "kind": "code", "literal": "job = await client.get_warm_job(\"products\", job.id)", "chunkId": "api/warm-cache#layer-warm" }, { "kind": "code", "literal": "job, err := client.GetWarmJob(ctx, \"products\", jobID)", "chunkId": "api/warm-cache#layer-warm" }, { "kind": "code", "literal": "const job = await client.getWarmJob(\"products\", jobId);", "chunkId": "api/warm-cache#layer-warm" }, { "kind": "code", "literal": "curl \"$LAYER_GATEWAY_URL/v2/namespaces/products/warm-jobs/warm-job-uuid\" \\\n -H \"Authorization: Bearer $LAYER_GATEWAY_API_KEY\"", "chunkId": "api/warm-cache#layer-warm" }, { "kind": "code", "literal": "POST /v2/namespaces/{ns}/warm", "chunkId": "api/warm-cache#layer-warm" }, { "kind": "code", "literal": "cache_warmed_through", "chunkId": "api/warm-cache#layer-warm" }, { "kind": "code", "literal": "202 Accepted", "chunkId": "api/warm-cache#layer-warm" } ], "sources": [ { "chunkId": "api/warm-cache#layer-warm", "url": "/docs/api/warm-cache#layer-warm", "anchor": "layer-warm" } 
], "mode": "source-primary", "terms": [ "layer", "warm", "post", "namespaces", "creates", "asynchronous", "pages", "through", "turbopuffer", "backfills", "aerospike", "refreshes", "cachewarmedthrough", "bootstrapping", "namespace", "whose", "data", "written", "await", "client", "cache", "products", "page", "size", "1000", "warmcache", "hevlayer", "warmcacheparams", "pagesize", "const", "curl", "gateway", "authorization", "bearer", "uuid", "status", "running", "progress", "documents", "scanned" ] }, { "id": "api/write", "kind": "section", "title": "Write & Stage", "heading": null, "group": "API", "url": "/docs/api/write", "summary": "Write rows to a namespace and stage documents in the cache. Writes are wire-compatible with the upstream POST /v2/namespaces/{ns} endpoint. The request body (upserts, deletes, patches, and filter writes, combined in one…", "facts": [ { "kind": "code", "literal": "POST /v2/namespaces/{ns}", "chunkId": "api/write" }, { "kind": "code", "literal": "write_namespace", "chunkId": "api/write" }, { "kind": "code", "literal": "_hevlayer_upserted_at", "chunkId": "api/write" }, { "kind": "value", "literal": "Upstream.astro", "chunkId": "api/write" }, { "kind": "value", "literal": "CodeTabs.astro", "chunkId": "api/write" }, { "kind": "value", "literal": "turbopuffer.com", "chunkId": "api/write" } ], "sources": [ { "chunkId": "api/write", "url": "/docs/api/write", "anchor": null } ], "mode": "source-primary", "terms": [ "write", "rows", "namespace", "stage", "documents", "cache", "writes", "wire", "compatible", "upstream", "post", "namespaces", "endpoint", "request", "body", "upserts", "deletes", "patches", "filter", "combined", "hevlayer", "upserted", "astro", "codetabs", "turbopuffer", "documented", "sections", "below", "layer", "adds", "send", "native", "bodies", "writenamespace", "stamps", "every", "producing", "hevlayerupsertedat", "mirrors", "document" ] }, { "id": "api/write#stage", "kind": "section", "title": "Write & Stage", "heading": 
"Stage", "group": "API", "url": "/docs/api/write#stage", "summary": "Stage Stage caches a document before it's upserted upstream into your vector store. That O(1) read/write is especially useful for queuing chunks in a two-stage pipeline, where a CPU worker stages chunks and a GPU worker…", "facts": [ { "kind": "code", "literal": "await client.put_pipeline_document_chunks(\"product-images\", \"asin-B08N5WRWNW\", {\n \"chunks\": [\n {\"id\": \"asin-B08N5WRWNW-0\", \"text\": \"Wireless noise-cancelling headphones\"},\n {\"id\": \"asin-B08N5WRWNW-1\", \"text\": \"40-hour battery life\", \"metadata\": {\"page\": 2}},\n ],\n})", "chunkId": "api/write#stage" }, { "kind": "code", "literal": "client.PutPipelineDocumentChunks(ctx, \"product-images\", \"asin-B08N5WRWNW\", &hevlayer.PutChunksRequest{\n Chunks: []hevlayer.Chunk{\n {ID: \"asin-B08N5WRWNW-0\", Text: \"Wireless noise-cancelling headphones\"},\n {ID: \"asin-B08N5WRWNW-1\", Text: \"40-hour battery life\", Metadata: map[string]interface{}{\"page\": 2}},\n },\n})", "chunkId": "api/write#stage" }, { "kind": "code", "literal": "await client.putPipelineDocumentChunks(\"product-images\", \"asin-B08N5WRWNW\", {\n chunks: [\n { id: \"asin-B08N5WRWNW-0\", text: \"Wireless noise-cancelling headphones\" },\n { id: \"asin-B08N5WRWNW-1\", text: \"40-hour battery life\", metadata: { page: 2 } },\n ],\n});", "chunkId": "api/write#stage" }, { "kind": "code", "literal": "curl -X PUT \"$LAYER_GATEWAY_URL/v2/pipelines/product-images/documents/asin-B08N5WRWNW\" \\\n -H \"Authorization: Bearer $LAYER_GATEWAY_API_KEY\" \\\n -H \"Content-Type: application/json\" \\\n -d '{\n \"chunks\": [\n {\"id\": \"asin-B08N5WRWNW-0\", \"text\": \"Wireless noise-cancelling headphones\"},\n {\"id\": \"asin-B08N5WRWNW-1\", \"text\": \"40-hour battery life\", \"metadata\": {\"page\": 2}}\n ]\n }'", "chunkId": "api/write#stage" }, { "kind": "code", "literal": "pending", "chunkId": "api/write#stage" } ], "sources": [ { "chunkId": "api/write#stage", "url": 
"/docs/api/write#stage", "anchor": "stage" } ], "mode": "source-primary", "terms": [ "stage", "caches", "document", "before", "upserted", "upstream", "vector", "store", "read", "write", "especially", "useful", "queuing", "chunks", "pipeline", "worker", "stages", "await", "client", "product", "images", "asin", "b08n5wrwnw", "text", "wireless", "noise", "cancelling", "headphones", "hour", "battery", "life", "metadata", "page", "putpipelinedocumentchunks", "hevlayer", "putchunksrequest", "chunk", "string", "interface", "curl" ] }, { "id": "api/write#status", "kind": "section", "title": "Write & Stage", "heading": "Status", "group": "API", "url": "/docs/api/write#status", "summary": "Status Layer validates the body before forwarding and can fail independently of Turbopuffer, so the write path carries a few statuses a plain proxy wouldn't: 200 OK — applied upstream and stamped. 422 Unprocessable Entit…", "facts": [ { "kind": "code", "literal": "_hevlayer_*", "chunkId": "api/write#status" }, { "kind": "code", "literal": "{ \"error\": \"validation_error\", … }", "chunkId": "api/write#status" }, { "kind": "code", "literal": "upsert_condition", "chunkId": "api/write#status" }, { "kind": "code", "literal": "patch_condition", "chunkId": "api/write#status" }, { "kind": "code", "literal": "delete_condition", "chunkId": "api/write#status" }, { "kind": "code", "literal": "{ \"error\": \"upstream_error\", … }", "chunkId": "api/write#status" }, { "kind": "value", "literal": "non-2xx", "chunkId": "api/write#status" } ], "sources": [ { "chunkId": "api/write#status", "url": "/docs/api/write#status", "anchor": "status" } ], "mode": "source-primary", "terms": [ "status", "layer", "validates", "body", "before", "forwarding", "fail", "independently", "turbopuffer", "write", "path", "carries", "statuses", "plain", "proxy", "wouldn", "applied", "upstream", "stamped", "unprocessable", "entit", "hevlayer", "error", "validation", "upsert", "condition", "patch", "delete", "entity", "rejected", 
"recognized", "native", "operation", "reserved", "attribute", "name", "removed", "custom", "validationerror", "passthrough" ] }, { "id": "cli", "kind": "section", "title": "Layer CLI", "heading": null, "group": "Operations", "url": "/docs/cli", "summary": "The layer CLI manages environments, observes indexes, pipelines, and UDFs, mints API keys, and runs Function manifests. The layer CLI operates hevlayer from the terminal. It manages named environments, observes index, pi…", "facts": [ { "kind": "code", "literal": "layer", "chunkId": "cli" }, { "kind": "code", "literal": "run", "chunkId": "cli" }, { "kind": "code", "literal": "--kube-context", "chunkId": "cli" }, { "kind": "code", "literal": "--kube-namespace", "chunkId": "cli" }, { "kind": "code", "literal": "--context", "chunkId": "cli" } ], "sources": [ { "chunkId": "cli", "url": "/docs/cli", "anchor": null } ], "mode": "agent-primary", "terms": [ "layer", "manages", "environments", "observes", "indexes", "pipelines", "udfs", "mints", "keys", "runs", "function", "manifests", "operates", "hevlayer", "terminal", "named", "index", "kube", "context", "namespace", "pipeline", "state", "gateway", "revokes", "every", "read", "goes", "through", "only", "touches", "kubernetes", "applies", "registers", "spec", "triggers", "discovery", "optionally", "watches", "until", "queue" ] }, { "id": "cli#ask-the-docs", "kind": "section", "title": "Layer CLI", "heading": "Ask The Docs", "group": "Operations", "url": "/docs/cli#ask-the-docs", "summary": "Ask The Docs layer ask queries the committed docs digest with the ask CLI. 
It is keyless and local by default: from a checkout, it finds site/.hev-ask, prefers a sibling ../ask source checkout, and falls back to the docs…", "facts": [ { "kind": "code", "literal": "layer ask tree\nlayer ask grep \"warm cache\"\nlayer ask cat api/query\nlayer ask glossary get watermark\nlayer -o json ask tree", "chunkId": "cli#ask-the-docs" }, { "kind": "code", "literal": "layer ask --endpoint https://hevlayer.com/api/ask tree", "chunkId": "cli#ask-the-docs" }, { "kind": "code", "literal": "layer ask", "chunkId": "cli#ask-the-docs" }, { "kind": "code", "literal": "ask", "chunkId": "cli#ask-the-docs" }, { "kind": "code", "literal": "site/.hev-ask", "chunkId": "cli#ask-the-docs" }, { "kind": "code", "literal": "../ask", "chunkId": "cli#ask-the-docs" }, { "kind": "code", "literal": "@hevmind/ask", "chunkId": "cli#ask-the-docs" }, { "kind": "code", "literal": "PATH", "chunkId": "cli#ask-the-docs" }, { "kind": "code", "literal": "--endpoint", "chunkId": "cli#ask-the-docs" } ], "sources": [ { "chunkId": "cli#ask-the-docs", "url": "/docs/cli#ask-the-docs", "anchor": "ask-the-docs" } ], "mode": "agent-primary", "terms": [ "docs", "layer", "queries", "committed", "digest", "keyless", "local", "default", "checkout", "finds", "site", "prefers", "sibling", "source", "falls", "back", "tree", "grep", "warm", "cache", "query", "glossary", "watermark", "json", "endpoint", "https", "hevlayer", "hevmind", "path", "installed", "package", "binary", "deployed", "instead" ] }, { "id": "cli#configuration", "kind": "section", "title": "Layer CLI", "heading": "Configuration", "group": "Operations", "url": "/docs/cli#configuration", "summary": "Configuration layer reads named environments from ~/.hevlayer/config.toml. The directory is created with mode 0700; the config file is written with mode 0600. 
active = \"partner\" [envs.partner] baseurl = \"https://aws-us-ea…", "facts": [ { "kind": "code", "literal": "active = \"partner\"\n\n[envs.partner]\nbase_url = \"https://aws-us-east-1.hevlayer.com\"\napi_key = \"...\"\nkube_context = \"partner-cluster\"\nkube_namespace = \"hevlayer\"\n\n[envs.local]\nbase_url = \"http://localhost:8080\"\napi_key = \"dev\"\nkube_context = \"kind-hevlayer\"", "chunkId": "cli#configuration" }, { "kind": "code", "literal": "layer", "chunkId": "cli#configuration" }, { "kind": "code", "literal": "~/.hevlayer/config.toml", "chunkId": "cli#configuration" }, { "kind": "code", "literal": "0700", "chunkId": "cli#configuration" }, { "kind": "code", "literal": "0600", "chunkId": "cli#configuration" }, { "kind": "code", "literal": "--base-url", "chunkId": "cli#configuration" }, { "kind": "code", "literal": "--api-key", "chunkId": "cli#configuration" }, { "kind": "code", "literal": "--context", "chunkId": "cli#configuration" }, { "kind": "code", "literal": "--kube-namespace", "chunkId": "cli#configuration" }, { "kind": "code", "literal": "LAYER_BASE_URL", "chunkId": "cli#configuration" }, { "kind": "code", "literal": "LAYER_API_KEY", "chunkId": "cli#configuration" }, { "kind": "code", "literal": "HEVLAYER_", "chunkId": "cli#configuration" }, { "kind": "code", "literal": "--env", "chunkId": "cli#configuration" }, { "kind": "code", "literal": "LAYER_ENV", "chunkId": "cli#configuration" }, { "kind": "code", "literal": "HEVLAYER_BASE_URL", "chunkId": "cli#configuration" }, { "kind": "code", "literal": "https://aws-us-east-1.hevlayer.com", "chunkId": "cli#configuration" }, { "kind": "code", "literal": "HEVLAYER_API_KEY", "chunkId": "cli#configuration" }, { "kind": "code", "literal": "-o", "chunkId": "cli#configuration" }, { "kind": "code", "literal": "--output", "chunkId": "cli#configuration" }, { "kind": "code", "literal": "table", "chunkId": "cli#configuration" }, { "kind": "code", "literal": "json", "chunkId": "cli#configuration" }, { "kind": 
"code", "literal": "names", "chunkId": "cli#configuration" } ], "sources": [ { "chunkId": "cli#configuration", "url": "/docs/cli#configuration", "anchor": "configuration" } ], "mode": "agent-primary", "terms": [ "configuration", "layer", "reads", "named", "environments", "hevlayer", "config", "toml", "directory", "created", "mode", "0700", "file", "written", "0600", "active", "partner", "envs", "baseurl", "https", "base", "east", "kube", "context", "cluster", "namespace", "local", "http", "localhost", "8080", "kind", "output", "table", "json", "names", "apikey", "kubecontext", "kubenamespace", "resolution", "order" ] }, { "id": "cli#environments", "kind": "section", "title": "Layer CLI", "heading": "Environments", "group": "Operations", "url": "/docs/cli#environments", "summary": "Environments layer env add partner --base-url https://aws-us-east-1.hevlayer.com \\ --api-key \"$LAYERAPIKEY\" --kube-context partner-cluster \\ --kube-namespace hevlayer layer env use partner layer env ls layer env show par…", "facts": [ { "kind": "code", "literal": "layer env add partner --base-url https://aws-us-east-1.hevlayer.com \\\n --api-key \"$LAYER_API_KEY\" --kube-context partner-cluster \\\n --kube-namespace hevlayer\nlayer env use partner\nlayer env ls\nlayer env show partner -o json\nlayer env rm partner", "chunkId": "cli#environments" }, { "kind": "code", "literal": "env add", "chunkId": "cli#environments" }, { "kind": "code", "literal": "env ls", "chunkId": "cli#environments" }, { "kind": "code", "literal": "env show", "chunkId": "cli#environments" } ], "sources": [ { "chunkId": "cli#environments", "url": "/docs/cli#environments", "anchor": "environments" } ], "mode": "agent-primary", "terms": [ "environments", "layer", "partner", "base", "https", "east", "hevlayer", "layerapikey", "kube", "context", "cluster", "namespace", "show", "json", "prompts", "missing", "values", "required", "must", "supplied", "flags", "keys", "masked" ] }, { "id": "cli#inspect-an-index", "kind": 
"section", "title": "Layer CLI", "heading": "Inspect An Index", "group": "Operations", "url": "/docs/cli#inspect-an-index", "summary": "Inspect An Index layer index get shop-products layer index get shop-products -o json index get reports row count, size, schema summary, last write, stable watermark and lag, index (WAL) status, cache state, and snapshot…", "facts": [ { "kind": "code", "literal": "layer index get shop-products\nlayer index get shop-products -o json", "chunkId": "cli#inspect-an-index" }, { "kind": "code", "literal": "index get", "chunkId": "cli#inspect-an-index" }, { "kind": "code", "literal": "-o json", "chunkId": "cli#inspect-an-index" } ], "sources": [ { "chunkId": "cli#inspect-an-index", "url": "/docs/cli#inspect-an-index", "anchor": "inspect-an-index" } ], "mode": "agent-primary", "terms": [ "inspect", "index", "layer", "shop", "products", "json", "reports", "count", "size", "schema", "summary", "last", "write", "stable", "watermark", "status", "cache", "state", "snapshot", "history", "timestamps", "sizes", "epoch", "bytes", "carries", "full", "list" ] }, { "id": "cli#install", "kind": "section", "title": "Layer CLI", "heading": "Install", "group": "Operations", "url": "/docs/cli#install", "summary": "Install From the repository root: go build -o layer ./apps/layer-cli", "facts": [ { "kind": "code", "literal": "go build -o layer ./apps/layer-cli", "chunkId": "cli#install" } ], "sources": [ { "chunkId": "cli#install", "url": "/docs/cli#install", "anchor": "install" } ], "mode": "agent-primary", "terms": [ "install", "repository", "root", "build", "layer", "apps" ] }, { "id": "cli#keys", "kind": "section", "title": "Layer CLI", "heading": "Keys", "group": "Operations", "url": "/docs/cli#keys", "summary": "Keys layer keys mint cohort-reader --owner acme \\ --entitle vectorstore.prod-turbopuffer=read \\ --namespaces \"cohort-\" \\ --claim warehouse.prod-snowflake=\"notes:cohort::read\" layer keys ls layer keys get cohort-reader la…", "facts": [ { 
"kind": "code", "literal": "layer keys mint cohort-reader --owner acme \\\n --entitle vectorstore.prod-turbopuffer=read \\\n --namespaces \"cohort-*\" \\\n --claim warehouse.prod-snowflake=\"notes:cohort:*:read\"\nlayer keys ls\nlayer keys get cohort-reader\nlayer keys revoke cohort-reader\nlayer keys rm cohort-reader", "chunkId": "cli#keys" }, { "kind": "code", "literal": "keys mint", "chunkId": "cli#keys" }, { "kind": "code", "literal": "layer keys mint … | pbcopy", "chunkId": "cli#keys" }, { "kind": "code", "literal": "--entitle", "chunkId": "cli#keys" }, { "kind": "code", "literal": "TARGET[=SCOPE[+SCOPE]]", "chunkId": "cli#keys" }, { "kind": "code", "literal": "vectorstore.", "chunkId": "cli#keys" }, { "kind": "code", "literal": "warehouse.", "chunkId": "cli#keys" }, { "kind": "code", "literal": "layer", "chunkId": "cli#keys" }, { "kind": "code", "literal": "--namespaces", "chunkId": "cli#keys" }, { "kind": "code", "literal": "--claim", "chunkId": "cli#keys" }, { "kind": "code", "literal": "TARGET=STRING", "chunkId": "cli#keys" }, { "kind": "code", "literal": "--expires-after", "chunkId": "cli#keys" }, { "kind": "code", "literal": "never", "chunkId": "cli#keys" }, { "kind": "code", "literal": "365d", "chunkId": "cli#keys" }, { "kind": "code", "literal": "--entitle layer=admin", "chunkId": "cli#keys" }, { "kind": "code", "literal": "layer keys mint -f key.yaml", "chunkId": "cli#keys" }, { "kind": "code", "literal": "ApiKey", "chunkId": "cli#keys" }, { "kind": "code", "literal": "kubectl apply", "chunkId": "cli#keys" }, { "kind": "code", "literal": "keys ls", "chunkId": "cli#keys" }, { "kind": "code", "literal": "keys get", "chunkId": "cli#keys" }, { "kind": "code", "literal": "revoke", "chunkId": "cli#keys" }, { "kind": "code", "literal": "rm", "chunkId": "cli#keys" }, { "kind": "code", "literal": "keys", "chunkId": "cli#keys" }, { "kind": "code", "literal": "admin", "chunkId": "cli#keys" } ], "sources": [ { "chunkId": "cli#keys", "url": "/docs/cli#keys", 
"anchor": "keys" } ], "mode": "agent-primary", "terms": [ "keys", "layer", "mint", "cohort", "reader", "owner", "acme", "entitle", "vectorstore", "prod", "turbopuffer", "read", "namespaces", "claim", "warehouse", "snowflake", "notes", "revoke", "pbcopy", "target", "scope", "name", "string", "expires", "after", "never", "365d", "admin", "yaml", "apikey", "kubectl", "apply", "creates", "through", "gateway", "prints", "token", "once", "alone", "stdout" ] }, { "id": "cli#pipelines", "kind": "section", "title": "Layer CLI", "heading": "Pipelines", "group": "Operations", "url": "/docs/cli#pipelines", "summary": "Pipelines layer pipeline list layer pipeline get product-images pipeline list reads registered pipelines and fans out to the pipeline status API for each one's live queue depth (pending, processing, failed, rate/min); pi…", "facts": [ { "kind": "code", "literal": "layer pipeline list\nlayer pipeline get product-images", "chunkId": "cli#pipelines" }, { "kind": "code", "literal": "pipeline list", "chunkId": "cli#pipelines" }, { "kind": "code", "literal": "pending", "chunkId": "cli#pipelines" }, { "kind": "code", "literal": "processing", "chunkId": "cli#pipelines" }, { "kind": "code", "literal": "failed", "chunkId": "cli#pipelines" }, { "kind": "code", "literal": "rate/min", "chunkId": "cli#pipelines" }, { "kind": "code", "literal": "layer push", "chunkId": "cli#pipelines" } ], "sources": [ { "chunkId": "cli#pipelines", "url": "/docs/cli#pipelines", "anchor": "pipelines" } ], "mode": "agent-primary", "terms": [ "pipelines", "layer", "pipeline", "list", "product", "images", "reads", "registered", "fans", "status", "live", "queue", "depth", "pending", "processing", "failed", "rate", "push", "adds", "target", "namespace", "distance", "metric", "created", "worker", "staged", "renders", "without", "counts", "rather", "erroring", "need", "only", "kube", "access", "deferred", "managed", "build", "loop", "milestone" ] }, { "id": "cli#run-a-function", "kind": "section", 
"title": "Layer CLI", "heading": "Run A Function", "group": "Operations", "url": "/docs/cli#run-a-function", "summary": "Run A Function layer run -f tag-products.yaml layer run -f tag-products.yaml --index amazon-products-staging layer run -f tag-products.yaml --detach layer run -f tag-products.yaml --rm The input is a Kubernetes Function…", "facts": [ { "kind": "code", "literal": "layer run -f tag-products.yaml\nlayer run -f tag-products.yaml --index amazon-products-staging\nlayer run -f tag-products.yaml --detach\nlayer run -f tag-products.yaml --rm", "chunkId": "cli#run-a-function" }, { "kind": "code", "literal": "layer udf list\nlayer udf get product-tags --watch", "chunkId": "cli#run-a-function" }, { "kind": "code", "literal": "Function", "chunkId": "cli#run-a-function" }, { "kind": "code", "literal": "--index", "chunkId": "cli#run-a-function" }, { "kind": "code", "literal": "spec.targetNamespaces", "chunkId": "cli#run-a-function" }, { "kind": "code", "literal": "--context", "chunkId": "cli#run-a-function" }, { "kind": "code", "literal": "--kube-namespace", "chunkId": "cli#run-a-function" }, { "kind": "code", "literal": "--no-apply", "chunkId": "cli#run-a-function" }, { "kind": "code", "literal": "spec.version", "chunkId": "cli#run-a-function" }, { "kind": "code", "literal": "--detach", "chunkId": "cli#run-a-function" }, { "kind": "code", "literal": "pending_count", "chunkId": "cli#run-a-function" }, { "kind": "code", "literal": "processing_count", "chunkId": "cli#run-a-function" }, { "kind": "code", "literal": "--rm", "chunkId": "cli#run-a-function" }, { "kind": "code", "literal": "udf list", "chunkId": "cli#run-a-function" }, { "kind": "code", "literal": "udf get", "chunkId": "cli#run-a-function" }, { "kind": "code", "literal": "--watch", "chunkId": "cli#run-a-function" } ], "sources": [ { "chunkId": "cli#run-a-function", "url": "/docs/cli#run-a-function", "anchor": "run-a-function" } ], "mode": "agent-primary", "terms": [ "function", "layer", "products", 
"yaml", "index", "amazon", "staging", "detach", "input", "kubernetes", "list", "product", "tags", "watch", "spec", "targetnamespaces", "context", "kube", "namespace", "apply", "version", "pending", "count", "processing", "manifest", "overrides", "target", "selects", "kubeconfig", "skips", "step", "workers", "managed", "outside", "operator", "registered", "gateway", "completion", "marker", "bump" ] }, { "id": "cli#tui", "kind": "section", "title": "Layer CLI", "heading": "TUI", "group": "Operations", "url": "/docs/cli#tui", "summary": "TUI Bare layer on a TTY opens the read-only operations TUI (layer browse is the explicit spelling); on a non-TTY it prints usage and exits 2. Press i/f/p/k/e to switch between indexes, functions, pipelines, keys, and env…", "facts": [ { "kind": "code", "literal": "layer", "chunkId": "cli#tui" }, { "kind": "code", "literal": "layer browse", "chunkId": "cli#tui" }, { "kind": "code", "literal": "enter", "chunkId": "cli#tui" }, { "kind": "code", "literal": "esc", "chunkId": "cli#tui" }, { "kind": "code", "literal": "layer env ls", "chunkId": "cli#tui" }, { "kind": "code", "literal": "layer udf list", "chunkId": "cli#tui" }, { "kind": "code", "literal": "layer udf get UDF_ID [--watch]", "chunkId": "cli#tui" }, { "kind": "code", "literal": "layer index list", "chunkId": "cli#tui" }, { "kind": "code", "literal": "layer index get NAME", "chunkId": "cli#tui" }, { "kind": "code", "literal": "layer pipeline list", "chunkId": "cli#tui" }, { "kind": "code", "literal": "layer pipeline get ID", "chunkId": "cli#tui" }, { "kind": "code", "literal": "layer keys ls", "chunkId": "cli#tui" }, { "kind": "code", "literal": "layer keys get KEY_ID", "chunkId": "cli#tui" } ], "sources": [ { "chunkId": "cli#tui", "url": "/docs/cli#tui", "anchor": "tui" } ], "mode": "agent-primary", "terms": [ "bare", "layer", "opens", "read", "only", "operations", "browse", "explicit", "spelling", "prints", "usage", "exits", "press", "switch", "between", "indexes", "functions", 
"pipelines", "keys", "enter", "list", "watch", "index", "name", "pipeline", "environments", "view", "open", "detail", "back", "every", "interactive", "command", "twin", "same", "data", "humanizes", "timestamps", "sizes", "commands" ] }, { "id": "concepts", "kind": "section", "title": "Concepts", "heading": null, "group": "Overview", "url": "/docs/concepts", "summary": "How the gateway composes Turbopuffer, NVMe cache, PostgreSQL, S3, and metrics — and the core nouns you'll work with.", "facts": [], "sources": [ { "chunkId": "concepts", "url": "/docs/concepts", "anchor": null } ], "mode": "agent-primary", "terms": [ "gateway", "composes", "turbopuffer", "nvme", "cache", "postgresql", "metrics", "core", "nouns", "work" ] }, { "id": "concepts#control-loops", "kind": "section", "title": "Concepts", "heading": "Control loops", "group": "Overview", "url": "/docs/concepts#control-loops", "summary": "Control loops Layer uses a control loop as a core primitive for managing your indexes. It reconciles index state against metrics emitted by the search system, which is how Layer applies row-level transformations (UDFs) a…", "facts": [], "sources": [ { "chunkId": "concepts#control-loops", "url": "/docs/concepts#control-loops", "anchor": "control-loops" } ], "mode": "agent-primary", "terms": [ "control", "loops", "layer", "uses", "loop", "core", "primitive", "managing", "indexes", "reconciles", "index", "state", "against", "metrics", "emitted", "search", "system", "applies", "level", "transformations", "udfs", "keeps", "stable", "view", "current", "related", "snapshots", "watermark" ] }, { "id": "concepts#document-cache", "kind": "section", "title": "Concepts", "heading": "Document cache", "group": "Overview", "url": "/docs/concepts#document-cache", "summary": "Document cache The document cache (NVMe-backed Aerospike) does two jobs. 
Document reads are served pull-through: the gateway checks the cache first, and on a miss reads through to Turbopuffer (or S3 for snapshots), retur…", "facts": [ { "kind": "code", "literal": "set", "chunkId": "concepts#document-cache" } ], "sources": [ { "chunkId": "concepts#document-cache", "url": "/docs/concepts#document-cache", "anchor": "document-cache" } ], "mode": "agent-primary", "terms": [ "document", "cache", "nvme", "backed", "aerospike", "does", "jobs", "reads", "served", "pull", "through", "gateway", "checks", "first", "miss", "turbopuffer", "snapshots", "retur", "returns", "backfills", "best", "effort", "pipeline", "chunk", "handoff", "uses", "same", "store", "queue", "between", "workers", "neither", "makes", "hard", "dependency", "fall", "origin", "unavailable", "back", "backing" ] }, { "id": "concepts#gateway-enhancements", "kind": "section", "title": "Concepts", "heading": "Gateway enhancements", "group": "Overview", "url": "/docs/concepts#gateway-enhancements", "summary": "Gateway enhancements Where helpful, the gateway extends your search system with common query patterns and filtering primitives. 
Layer's enhancements use reserved _hevlayer_* attributes; changing the schema on those attribut…", "facts": [ { "kind": "code", "literal": "_hevlayer_*", "chunkId": "concepts#gateway-enhancements" } ], "sources": [ { "chunkId": "concepts#gateway-enhancements", "url": "/docs/concepts#gateway-enhancements", "anchor": "gateway-enhancements" } ], "mode": "agent-primary", "terms": [ "gateway", "enhancements", "helpful", "extends", "search", "system", "common", "query", "patterns", "filtering", "primitives", "layer", "reserved", "hevlayer", "attributes", "changing", "schema", "those", "attribut", "breaks", "guarantees", "should", "degrade", "gracefully", "functionality", "exposed", "through", "surface", "python", "typescript", "client", "plain", "rest", "applications", "route", "every", "call", "works", "best", "traffic" ] }, { "id": "concepts#glossary", "kind": "section", "title": "Concepts", "heading": "Glossary", "group": "Overview", "url": "/docs/concepts#glossary", "summary": "Glossary Concept Current meaning Namespace A Turbopuffer namespace addressed through /v2/namespaces/{namespace}. Document A row id plus attributes, and optionally a vector when writing/searching. 
Document cache NVMe-back…", "facts": [ { "kind": "code", "literal": "/v2/namespaces/{namespace}", "chunkId": "concepts#glossary" }, { "kind": "code", "literal": "indexed", "chunkId": "concepts#glossary" }, { "kind": "code", "literal": "index_lag_rows", "chunkId": "concepts#glossary" }, { "kind": "code", "literal": "fields[].values[].v", "chunkId": "concepts#glossary" }, { "kind": "code", "literal": "fields[].values[].n", "chunkId": "concepts#glossary" }, { "kind": "code", "literal": "values[].n", "chunkId": "concepts#glossary" }, { "kind": "code", "literal": "fts", "chunkId": "concepts#glossary" }, { "kind": "code", "literal": "ann", "chunkId": "concepts#glossary" }, { "kind": "code", "literal": "_hevlayer_shard", "chunkId": "concepts#glossary" }, { "kind": "code", "literal": "rerank_by: [\"RRF\", ...]", "chunkId": "concepts#glossary" }, { "kind": "code", "literal": "HybridText", "chunkId": "concepts#glossary" }, { "kind": "code", "literal": "alyze", "chunkId": "concepts#glossary" }, { "kind": "code", "literal": "word_v4", "chunkId": "concepts#glossary" }, { "kind": "code", "literal": "Auto", "chunkId": "concepts#glossary" }, { "kind": "code", "literal": "hybrid_text", "chunkId": "concepts#glossary" }, { "kind": "code", "literal": "semantic", "chunkId": "concepts#glossary" }, { "kind": "code", "literal": "fused", "chunkId": "concepts#glossary" }, { "kind": "code", "literal": "routing", "chunkId": "concepts#glossary" }, { "kind": "code", "literal": "executed: false", "chunkId": "concepts#glossary" } ], "sources": [ { "chunkId": "concepts#glossary", "url": "/docs/concepts#glossary", "anchor": "glossary" } ], "mode": "agent-primary", "terms": [ "glossary", "concept", "current", "meaning", "namespace", "turbopuffer", "addressed", "through", "namespaces", "document", "plus", "attributes", "optionally", "vector", "writing", "searching", "cache", "nvme", "back", "indexed", "index", "rows", "fields", "values", "hevlayer", "shard", "rerank", "hybridtext", "alyze", "word", 
"auto", "hybrid", "text", "semantic", "fused", "routing", "executed", "false", "backed", "records" ] }, { "id": "concepts#kubernetes-autoscaling", "kind": "section", "title": "Concepts", "heading": "Kubernetes autoscaling", "group": "Overview", "url": "/docs/concepts#kubernetes-autoscaling", "summary": "Kubernetes autoscaling Because Layer is stateless, you can autoscale every tier independently. Karpenter handles node-level scaling, and KEDA scales pods against signals from an embedded PostgreSQL queue. The data in tha…", "facts": [], "sources": [ { "chunkId": "concepts#kubernetes-autoscaling", "url": "/docs/concepts#kubernetes-autoscaling", "anchor": "kubernetes-autoscaling" } ], "mode": "agent-primary", "terms": [ "kubernetes", "autoscaling", "because", "layer", "stateless", "autoscale", "every", "tier", "independently", "karpenter", "handles", "node", "level", "scaling", "keda", "scales", "pods", "against", "signals", "embedded", "postgresql", "queue", "data", "decisions", "only", "carries", "recoverable", "system", "state" ] }, { "id": "concepts#scattergather", "kind": "section", "title": "Concepts", "heading": "Scatter/gather", "group": "Overview", "url": "/docs/concepts#scattergather", "summary": "Scatter/gather Layer can partition a single namespace into hash buckets, called shards, by assigning each row a reserved hevlayershard attribute (xxh64 of its id, modulo the shard count). 
The gateway then scatters a quer…", "facts": [ { "kind": "code", "literal": "_hevlayer_shard", "chunkId": "concepts#scattergather" }, { "kind": "code", "literal": "top_k", "chunkId": "concepts#scattergather" }, { "kind": "flag", "literal": "-filtered", "chunkId": "concepts#scattergather" } ], "sources": [ { "chunkId": "concepts#scattergather", "url": "/docs/concepts#scattergather", "anchor": "scattergather" } ], "mode": "agent-primary", "terms": [ "scatter", "gather", "layer", "partition", "single", "namespace", "hash", "buckets", "called", "shards", "assigning", "reserved", "hevlayershard", "attribute", "xxh64", "modulo", "shard", "count", "gateway", "scatters", "quer", "hevlayer", "filtered", "query", "every", "bucket", "parallel", "gathers", "results", "merges", "ranks", "combined", "rows", "down", "requested", "topk", "before", "returning", "sharding", "stays" ] }, { "id": "dashboard", "kind": "section", "title": "Dashboard", "heading": null, "group": "Operations", "url": "/docs/dashboard", "summary": "Running the in-cluster operations dashboard: the access it needs, networking, auth, and turning it off. 
The Layer dashboard is the operator UI that ships in-cluster alongside the gateway, as the layer-dashboard Deploymen…", "facts": [ { "kind": "code", "literal": "layer-dashboard", "chunkId": "dashboard" }, { "kind": "value", "literal": "Callout.astro", "chunkId": "dashboard" } ], "sources": [ { "chunkId": "dashboard", "url": "/docs/dashboard", "anchor": null } ], "mode": "agent-primary", "terms": [ "running", "cluster", "operations", "dashboard", "access", "needs", "networking", "auth", "turning", "layer", "operator", "ships", "alongside", "gateway", "deploymen", "callout", "astro", "deployment", "service", "page", "covers", "reach", "gate", "turn" ] }, { "id": "dashboard#access-it-needs", "kind": "section", "title": "Dashboard", "heading": "Access it needs", "group": "Operations", "url": "/docs/dashboard#access-it-needs", "summary": "Access it needs The dashboard is read-mostly and backed by three sources, each with its own grant: The gateway API — the same endpoints customers use, plus the Prometheus-compatible metrics proxy at /v2/metrics. 
Authenti…", "facts": [ { "kind": "code", "literal": "/v2/metrics", "chunkId": "dashboard#access-it-needs" }, { "kind": "code", "literal": "LAYER_GATEWAY_API_KEY", "chunkId": "dashboard#access-it-needs" }, { "kind": "code", "literal": "deriveFromStore", "chunkId": "dashboard#access-it-needs" }, { "kind": "code", "literal": "VectorStore", "chunkId": "dashboard#access-it-needs" }, { "kind": "code", "literal": "keys", "chunkId": "dashboard#access-it-needs" }, { "kind": "code", "literal": "hevlayer.com", "chunkId": "dashboard#access-it-needs" }, { "kind": "code", "literal": "dashboard.kubeAccess.enabled", "chunkId": "dashboard#access-it-needs" }, { "kind": "code", "literal": "dashboard.writeAccess.enabled", "chunkId": "dashboard#access-it-needs" }, { "kind": "code", "literal": "false", "chunkId": "dashboard#access-it-needs" }, { "kind": "code", "literal": "dashboard.serviceAccount.roleArn", "chunkId": "dashboard#access-it-needs" } ], "sources": [ { "chunkId": "dashboard#access-it-needs", "url": "/docs/dashboard#access-it-needs", "anchor": "access-it-needs" } ], "mode": "agent-primary", "terms": [ "access", "needs", "dashboard", "read", "mostly", "backed", "three", "sources", "grant", "gateway", "same", "endpoints", "customers", "plus", "prometheus", "compatible", "metrics", "proxy", "authenti", "layer", "derivefromstore", "vectorstore", "keys", "hevlayer", "kubeaccess", "enabled", "writeaccess", "false", "serviceaccount", "rolearn", "authenticated", "bearer", "layergatewayapikey", "mode", "default", "credential", "configured", "inbound", "worker", "does" ] }, { "id": "dashboard#basic-auth", "kind": "section", "title": "Dashboard", "heading": "Basic auth", "group": "Operations", "url": "/docs/dashboard#basic-auth", "summary": "Basic auth HTTP Basic auth sits in front of every dashboard route and is required — the dashboard refuses to start without it. 
Set credentials through the chart: dashboard: basicAuth: user: ops password: The chart render…", "facts": [ { "kind": "code", "literal": "dashboard:\n basicAuth:\n user: ops\n password: ", "chunkId": "dashboard#basic-auth" } ], "sources": [ { "chunkId": "dashboard#basic-auth", "url": "/docs/dashboard#basic-auth", "anchor": "basic-auth" } ], "mode": "agent-primary", "terms": [ "basic", "auth", "http", "sits", "front", "every", "dashboard", "route", "required", "refuses", "start", "without", "credentials", "through", "chart", "basicauth", "user", "password", "render", "strong", "fails", "either", "field", "blank", "while", "enabled" ] }, { "id": "dashboard#disabling-the-dashboard", "kind": "section", "title": "Dashboard", "heading": "Disabling the dashboard", "group": "Operations", "url": "/docs/dashboard#disabling-the-dashboard", "summary": "Disabling the dashboard The dashboard is optional. Disable it and the Deployment, Service, RBAC, and ingress all skip rendering: dashboard: enabled: false The gateway and transform runtime run unchanged without it; you l…", "facts": [ { "kind": "code", "literal": "dashboard:\n enabled: false", "chunkId": "dashboard#disabling-the-dashboard" } ], "sources": [ { "chunkId": "dashboard#disabling-the-dashboard", "url": "/docs/dashboard#disabling-the-dashboard", "anchor": "disabling-the-dashboard" } ], "mode": "agent-primary", "terms": [ "disabling", "dashboard", "optional", "disable", "deployment", "service", "rbac", "ingress", "skip", "rendering", "enabled", "false", "gateway", "transform", "runtime", "unchanged", "without", "lose", "only", "operator" ] }, { "id": "dashboard#networking", "kind": "section", "title": "Dashboard", "heading": "Networking", "group": "Operations", "url": "/docs/dashboard#networking", "summary": "Networking The dashboard is an operator tool. Reach it over a port-forward rather than exposing it publicly: kubectl port-forward -n svc/layer-dashboard 8081:8081 Then open http://localhost:8081. 
Customer workloads only…", "facts": [ { "kind": "code", "literal": "kubectl port-forward -n svc/layer-dashboard 8081:8081", "chunkId": "dashboard#networking" }, { "kind": "code", "literal": "http://localhost:8081", "chunkId": "dashboard#networking" } ], "sources": [ { "chunkId": "dashboard#networking", "url": "/docs/dashboard#networking", "anchor": "networking" } ], "mode": "agent-primary", "terms": [ "networking", "dashboard", "operator", "tool", "reach", "port", "forward", "rather", "exposing", "publicly", "kubectl", "layer", "8081", "open", "http", "localhost", "customer", "workloads", "only", "release", "namespace", "ever", "receive", "gateway", "base", "credentials", "never" ] }, { "id": "dashboard#operational-notes", "kind": "section", "title": "Dashboard", "heading": "Operational notes", "group": "Operations", "url": "/docs/dashboard#operational-notes", "summary": "Operational notes The dashboard is intentionally read-mostly. Mutating actions (UDF pause, InfraRules or scaling edits) are gated through CRD apply or explicit confirm dialogs, and write access is governed separately by…", "facts": [ { "kind": "code", "literal": "dashboard.writeAccess.enabled", "chunkId": "dashboard#operational-notes" } ], "sources": [ { "chunkId": "dashboard#operational-notes", "url": "/docs/dashboard#operational-notes", "anchor": "operational-notes" } ], "mode": "agent-primary", "terms": [ "operational", "notes", "dashboard", "intentionally", "read", "mostly", "mutating", "actions", "pause", "infrarules", "scaling", "edits", "gated", "through", "apply", "explicit", "confirm", "dialogs", "write", "access", "governed", "separately", "writeaccess", "enabled" ] }, { "id": "document-model", "kind": "section", "title": "Document model", "heading": null, "group": "Overview", "url": "/docs/document-model", "summary": "A Layer document and the reserved attributes the gateway manages on every Turbopuffer row. Layer reserves the _hevlayer_* attribute prefix for its own bookkeeping. 
The gateway manages these attributes: your writes and UDF c…", "facts": [ { "kind": "code", "literal": "_hevlayer_*", "chunkId": "document-model" }, { "kind": "code", "literal": "_hevlayer_upserted_at", "chunkId": "document-model" }, { "kind": "code", "literal": "_hevlayer_upserted_at <= watermark", "chunkId": "document-model" }, { "kind": "code", "literal": "_hevlayer_shard", "chunkId": "document-model" }, { "kind": "code", "literal": "xxh64(id) % shard_count", "chunkId": "document-model" }, { "kind": "code", "literal": "_hevlayer_udf__v", "chunkId": "document-model" }, { "kind": "code", "literal": "spec.version", "chunkId": "document-model" }, { "kind": "code", "literal": "_hevlayer_udf__stale_after", "chunkId": "document-model" }, { "kind": "code", "literal": "_hevlayer_", "chunkId": "document-model" } ], "sources": [ { "chunkId": "document-model", "url": "/docs/document-model", "anchor": null } ], "mode": "agent-primary", "terms": [ "layer", "document", "reserved", "attributes", "gateway", "manages", "every", "turbopuffer", "reserves", "hevlayer", "attribute", "prefix", "bookkeeping", "these", "writes", "upserted", "watermark", "shard", "xxh64", "count", "spec", "version", "stale", "after", "completion", "patches", "must", "editing", "directly", "breaks", "guarantees", "type", "purpose", "hevlayerupsertedat", "integer", "epoch", "server", "stamped", "write", "filters" ] }, { "id": "failure-modes", "kind": "section", "title": "Failure Modes", "heading": null, "group": "Operations", "url": "/docs/failure-modes", "summary": "How reads and writes degrade when the gateway, cache, or pipeline runs into trouble. 
Layer strives to degrade gracefully: queries and document fetch served from Turbopuffer keep functioning when components around them fa…", "facts": [ { "kind": "value", "literal": "Callout.astro", "chunkId": "failure-modes" } ], "sources": [ { "chunkId": "failure-modes", "url": "/docs/failure-modes", "anchor": null } ], "mode": "agent-primary", "terms": [ "reads", "writes", "degrade", "gateway", "cache", "pipeline", "runs", "trouble", "layer", "strives", "gracefully", "queries", "document", "fetch", "served", "turbopuffer", "keep", "functioning", "components", "around", "callout", "astro", "fail", "page", "details", "scenarios", "does", "apply" ] }, { "id": "failure-modes#client-fall-through", "kind": "section", "title": "Failure Modes", "heading": "Client fall-through", "group": "Operations", "url": "/docs/failure-modes#client-fall-through", "summary": "Client fall-through When the gateway is unreachable, the SDKs retry the call against Turbopuffer directly for operations that need no Layer state — simple vector queries, writes, and raw Turbopuffer-compatible methods (s…", "facts": [ { "kind": "code", "literal": "fallback", "chunkId": "failure-modes#client-fall-through" }, { "kind": "code", "literal": "turbopuffer_direct", "chunkId": "failure-modes#client-fall-through" }, { "kind": "code", "literal": "TURBOPUFFER_API_KEY", "chunkId": "failure-modes#client-fall-through" }, { "kind": "code", "literal": "WithTurbopufferAPIKey", "chunkId": "failure-modes#client-fall-through" }, { "kind": "code", "literal": "turbopuffer_api_key", "chunkId": "failure-modes#client-fall-through" }, { "kind": "code", "literal": "fallback_to_turbopuffer=False", "chunkId": "failure-modes#client-fall-through" }, { "kind": "code", "literal": "AsyncHevlayer", "chunkId": "failure-modes#client-fall-through" }, { "kind": "code", "literal": "WithFallbackToTurbopuffer(false)", "chunkId": "failure-modes#client-fall-through" } ], "sources": [ { "chunkId": "failure-modes#client-fall-through", "url": 
"/docs/failure-modes#client-fall-through", "anchor": "client-fall-through" } ], "mode": "agent-primary", "terms": [ "client", "fall", "through", "gateway", "unreachable", "sdks", "retry", "call", "against", "turbopuffer", "directly", "operations", "need", "layer", "state", "simple", "vector", "queries", "writes", "compatible", "methods", "fallback", "direct", "withturbopufferapikey", "false", "asynchevlayer", "withfallbacktoturbopuffer", "schema", "metadata", "namespace", "listing", "these", "calls", "succeed", "without", "document", "cache", "search", "history", "query" ] }, { "id": "failure-modes#pipeline-stop-writes", "kind": "section", "title": "Failure Modes", "heading": "Pipeline stop-writes", "group": "Operations", "url": "/docs/failure-modes#pipeline-stop-writes", "summary": "Pipeline stop-writes The primary failure mode for writes through a healthy gateway is Aerospike stop-writes during a multi-stage pipeline job: staged documents stay warm in the cache but carry no vector data yet, and onc…", "facts": [ { "kind": "code", "literal": "documentCache.autoRestartOnStopWrites: true", "chunkId": "failure-modes#pipeline-stop-writes" }, { "kind": "code", "literal": "layer_aerospike_op_duration_seconds{status=\"aerospike_stop_writes\"}", "chunkId": "failure-modes#pipeline-stop-writes" }, { "kind": "code", "literal": "hevlayer_cache_cold_responses_total", "chunkId": "failure-modes#pipeline-stop-writes" }, { "kind": "code", "literal": "hevlayer_document_cache_cold_starts_total", "chunkId": "failure-modes#pipeline-stop-writes" }, { "kind": "code", "literal": "hevlayer_document_cache_cold_start_seconds", "chunkId": "failure-modes#pipeline-stop-writes" }, { "kind": "code", "literal": "Aerospike chunk write failed (best-effort)", "chunkId": "failure-modes#pipeline-stop-writes" }, { "kind": "code", "literal": "Aerospike chunk read failed; falling back to S3 backing", "chunkId": "failure-modes#pipeline-stop-writes" }, { "kind": "value", "literal": 
"documentCache.storage.resetOnStart", "chunkId": "failure-modes#pipeline-stop-writes" } ], "sources": [ { "chunkId": "failure-modes#pipeline-stop-writes", "url": "/docs/failure-modes#pipeline-stop-writes", "anchor": "pipeline-stop-writes" } ], "mode": "agent-primary", "terms": [ "pipeline", "stop", "writes", "primary", "failure", "mode", "through", "healthy", "gateway", "aerospike", "during", "multi", "stage", "staged", "documents", "stay", "warm", "cache", "carry", "vector", "data", "documentcache", "autorestartonstopwrites", "true", "layer", "duration", "seconds", "status", "hevlayer", "cold", "responses", "total", "document", "starts", "start", "chunk", "write", "failed", "best", "effort" ] }, { "id": "failure-modes#read", "kind": "section", "title": "Failure Modes", "heading": "Read", "group": "Operations", "url": "/docs/failure-modes#read", "summary": "Read Reads route through the gateway, but a gateway outage does not take your queries dark. The Python and Go SDKs fall through to Turbopuffer direct when the gateway is unreachable, so Turbopuffer-compatible queries kee…", "facts": [], "sources": [ { "chunkId": "failure-modes#read", "url": "/docs/failure-modes#read", "anchor": "read" } ], "mode": "agent-primary", "terms": [ "read", "reads", "route", "through", "gateway", "outage", "does", "take", "queries", "dark", "python", "sdks", "fall", "turbopuffer", "direct", "unreachable", "compatible", "keep", "serving", "rather", "failing", "minus", "document", "cache", "search", "history", "layer", "query", "enhancements", "client", "below", "only", "paths", "fetch", "warm", "jobs", "pipeline", "status", "snapshots", "fail" ] }, { "id": "failure-modes#write", "kind": "section", "title": "Failure Modes", "heading": "Write", "group": "Operations", "url": "/docs/failure-modes#write", "summary": "Write Writes also fall through to Turbopuffer direct when the gateway is unreachable (again, see Client fall-through); the durable upstream still accepts the row, but the write 
skips document-cache warming and pipeline s…", "facts": [], "sources": [ { "chunkId": "failure-modes#write", "url": "/docs/failure-modes#write", "anchor": "write" } ], "mode": "agent-primary", "terms": [ "write", "writes", "also", "fall", "through", "turbopuffer", "direct", "gateway", "unreachable", "again", "client", "durable", "upstream", "still", "accepts", "skips", "document", "cache", "warming", "pipeline", "staging", "until", "returns" ] }, { "id": "faq", "kind": "section", "title": "FAQ", "heading": null, "group": "Overview", "url": "/docs/faq", "summary": "Licensing, pricing, open source plans, and how to get started. This page answers the questions the rest of the docs don't: licensing, pricing, and where the project is headed.", "facts": [], "sources": [ { "chunkId": "faq", "url": "/docs/faq", "anchor": null } ], "mode": "agent-primary", "terms": [ "licensing", "pricing", "open", "source", "plans", "started", "page", "answers", "questions", "rest", "docs", "project", "headed" ] }, { "id": "faq#how-do-i-get-started", "kind": "section", "title": "FAQ", "heading": "How do I get started?", "group": "Overview", "url": "/docs/faq#how-do-i-get-started", "summary": "How do I get started? Sign up and I'll follow up to schedule a discovery call.", "facts": [], "sources": [ { "chunkId": "faq#how-do-i-get-started", "url": "/docs/faq#how-do-i-get-started", "anchor": "how-do-i-get-started" } ], "mode": "agent-primary", "terms": [ "started", "sign", "follow", "schedule", "discovery", "call" ] }, { "id": "faq#how-much-will-it-cost", "kind": "section", "title": "FAQ", "heading": "How much will it cost?", "group": "Overview", "url": "/docs/faq#how-much-will-it-cost", "summary": "How much will it cost? Pricing isn't final. 
The shape will be: a small line item for an enterprise, sized to fund proper support and full-time development.", "facts": [], "sources": [ { "chunkId": "faq#how-much-will-it-cost", "url": "/docs/faq#how-much-will-it-cost", "anchor": "how-much-will-it-cost" } ], "mode": "agent-primary", "terms": [ "much", "cost", "pricing", "final", "shape", "small", "line", "item", "enterprise", "sized", "fund", "proper", "support", "full", "time", "development" ] }, { "id": "faq#is-hev-layer-a-hosted-service", "kind": "section", "title": "FAQ", "heading": "Is hev layer a hosted service?", "group": "Overview", "url": "/docs/faq#is-hev-layer-a-hosted-service", "summary": "Is hev layer a hosted service? No. Layer runs in your Kubernetes cluster, next to your data, against your own Turbopuffer account. See install for what a deployment looks like.", "facts": [], "sources": [ { "chunkId": "faq#is-hev-layer-a-hosted-service", "url": "/docs/faq#is-hev-layer-a-hosted-service", "anchor": "is-hev-layer-a-hosted-service" } ], "mode": "agent-primary", "terms": [ "layer", "hosted", "service", "runs", "kubernetes", "cluster", "next", "data", "against", "turbopuffer", "account", "install", "deployment", "looks", "like" ] }, { "id": "faq#what-is-the-licensing-for-hev-layer", "kind": "section", "title": "FAQ", "heading": "What is the licensing for hev layer?", "group": "Overview", "url": "/docs/faq#what-is-the-licensing-for-hev-layer", "summary": "What is the licensing for hev layer? 
Right now hev layer is distributed under a proprietary license and requires a signed beta agreement.", "facts": [], "sources": [ { "chunkId": "faq#what-is-the-licensing-for-hev-layer", "url": "/docs/faq#what-is-the-licensing-for-hev-layer", "anchor": "what-is-the-licensing-for-hev-layer" } ], "mode": "agent-primary", "terms": [ "licensing", "layer", "right", "distributed", "under", "proprietary", "license", "requires", "signed", "beta", "agreement" ] }, { "id": "faq#who-built-hev-layer", "kind": "section", "title": "FAQ", "heading": "Who built hev layer?", "group": "Overview", "url": "/docs/faq#who-built-hev-layer", "summary": "Who built hev layer? Adam Hevenor. hev layer is a hev mind product.", "facts": [ { "kind": "value", "literal": "hevmind.com", "chunkId": "faq#who-built-hev-layer" } ], "sources": [ { "chunkId": "faq#who-built-hev-layer", "url": "/docs/faq#who-built-hev-layer", "anchor": "who-built-hev-layer" } ], "mode": "agent-primary", "terms": [ "built", "layer", "adam", "hevenor", "mind", "product", "hevmind" ] }, { "id": "faq#will-any-of-it-be-open-source", "kind": "section", "title": "FAQ", "heading": "Will any of it be open source?", "group": "Overview", "url": "/docs/faq#will-any-of-it-be-open-source", "summary": "Will any of it be open source? Some of it. The clients will be published publicly during the design preview, and as commercial adoption grows I plan to open the gateway as well. 
The dashboard, operator, and autoscaling c…", "facts": [], "sources": [ { "chunkId": "faq#will-any-of-it-be-open-source", "url": "/docs/faq#will-any-of-it-be-open-source", "anchor": "will-any-of-it-be-open-source" } ], "mode": "agent-primary", "terms": [ "open", "source", "some", "clients", "published", "publicly", "during", "design", "preview", "commercial", "adoption", "grows", "plan", "gateway", "well", "dashboard", "operator", "autoscaling", "components", "stay", "under", "license", "opening", "also", "ultimate", "form", "graceful", "degradation", "core", "value", "layer", "infrastructure", "query", "path", "shouldn", "depend", "company", "sticking", "around", "including" ] }, { "id": "faq#will-it-be-a-paid-product", "kind": "section", "title": "FAQ", "heading": "Will it be a paid product?", "group": "Overview", "url": "/docs/faq#will-it-be-a-paid-product", "summary": "Will it be a paid product? Yes. I'm building hev layer as a business, not a side project. Paying customers get a vendor with every incentive to support them well.", "facts": [], "sources": [ { "chunkId": "faq#will-it-be-a-paid-product", "url": "/docs/faq#will-it-be-a-paid-product", "anchor": "will-it-be-a-paid-product" } ], "mode": "agent-primary", "terms": [ "paid", "product", "building", "layer", "business", "side", "project", "paying", "customers", "vendor", "every", "incentive", "support", "well" ] }, { "id": "guarantees", "kind": "section", "title": "No Guarantees", "heading": null, "group": "Overview", "url": "/docs/guarantees", "summary": "Layer can't offer guarantees — here's what we commit to instead. Layer can't offer guarantees. We try our best to provide secure, hands-off infrastructure that you are ultimately responsible for. 
While we can't offer gua…", "facts": [ { "kind": "value", "literal": "Callout.astro", "chunkId": "guarantees" } ], "sources": [ { "chunkId": "guarantees", "url": "/docs/guarantees", "anchor": null } ], "mode": "agent-primary", "terms": [ "layer", "offer", "guarantees", "here", "commit", "instead", "best", "provide", "secure", "hands", "infrastructure", "ultimately", "responsible", "while", "callout", "astro", "make", "promises", "design", "distribute", "software", "believe", "easy", "stand", "test", "time", "page", "covers", "specific", "status", "those" ] }, { "id": "guarantees#commitments", "kind": "section", "title": "No Guarantees", "heading": "Commitments", "group": "Overview", "url": "/docs/guarantees#commitments", "summary": "Commitments Your index stays in your search system. We will not reimplement indexing. Layer keeps a copy of your data, but the search index lives in your vector store. Your history is backed up to S3. Search history and…", "facts": [ { "kind": "value", "literal": "hevmind.com", "chunkId": "guarantees#commitments" } ], "sources": [ { "chunkId": "guarantees#commitments", "url": "/docs/guarantees#commitments", "anchor": "commitments" } ], "mode": "agent-primary", "terms": [ "commitments", "index", "stays", "search", "system", "reimplement", "indexing", "layer", "keeps", "copy", "data", "lives", "vector", "store", "history", "backed", "hevmind", "namespace", "snapshots", "written", "bucket", "specify", "format", "change", "nvme", "customer", "document", "chunk", "served", "price", "performance", "stray", "pattern", "though", "some", "cases", "justify", "smaller", "memory", "cache" ] }, { "id": "index", "kind": "section", "title": "Introduction", "heading": null, "group": "Overview", "url": "/docs", "summary": "Layer is a gateway and function runtime for modern retrieval systems. It scales compute for multi-stage indexing pipelines and runs functions across every row of your index, with all durable state in object storage. 
Laye…", "facts": [ { "kind": "value", "literal": "Apache-2", "chunkId": "index" }, { "kind": "value", "literal": "AGPL-3", "chunkId": "index" }, { "kind": "value", "literal": "Diagram.astro", "chunkId": "index" }, { "kind": "value", "literal": "karpenter.sh", "chunkId": "index" }, { "kind": "value", "literal": "Apache-2.0", "chunkId": "index" }, { "kind": "value", "literal": "aerospike.com", "chunkId": "index" }, { "kind": "value", "literal": "AGPL-3.0", "chunkId": "index" }, { "kind": "value", "literal": "www.postgresql.org", "chunkId": "index" }, { "kind": "value", "literal": "victoriametrics.com", "chunkId": "index" }, { "kind": "value", "literal": "2.0", "chunkId": "index" }, { "kind": "value", "literal": "3.0", "chunkId": "index" } ], "sources": [ { "chunkId": "index", "url": "/docs", "anchor": null } ], "mode": "agent-primary", "terms": [ "layer", "gateway", "function", "runtime", "modern", "retrieval", "systems", "scales", "compute", "multi", "stage", "indexing", "pipelines", "runs", "functions", "across", "every", "index", "durable", "state", "object", "storage", "laye", "apache", "agpl", "diagram", "astro", "karpenter", "aerospike", "postgresql", "victoriametrics", "provides", "drop", "enhancements", "favorite", "lets", "scale", "reason", "about", "observe" ] }, { "id": "install", "kind": "section", "title": "Install", "heading": null, "group": "Operations", "url": "/docs/install", "summary": "How to bring up a hev layer environment: AWS resources via Terraform, runtime via Helm. A hev layer install has two stages. 
Terraform provisions the required AWS resources: IAM, S3, ECR, networking, cost-read roles, and,…", "facts": [ { "kind": "value", "literal": "Callout.astro", "chunkId": "install" } ], "sources": [ { "chunkId": "install", "url": "/docs/install", "anchor": null } ], "mode": "agent-primary", "terms": [ "bring", "layer", "environment", "resources", "terraform", "runtime", "helm", "install", "stages", "provisions", "required", "networking", "cost", "read", "roles", "callout", "astro", "recommended", "path", "fresh", "cluster", "installs", "gateway", "operator", "document", "cache", "wires", "produced", "skip", "already", "needs", "minimum", "provide", "bucket", "irsa", "role", "snapshots", "history", "full", "feature" ] }, { "id": "install#cluster-recommended", "kind": "section", "title": "Install", "heading": "Cluster: recommended", "group": "Operations", "url": "/docs/install#cluster-recommended", "summary": "Design-partner installs should use a fresh EKS cluster. The recommended cluster provisions a VPC, EKS control plane, one always-on i4i system node for serving and document cache, NATless public worker subnets, Karpenter CPU/GPU indexing pools, the AWS Load Balancer Controller, and EFS.", "facts": [ { "kind": "code", "literal": "system", "chunkId": "install#cluster-recommended" }, { "kind": "code", "literal": "i4i.large", "chunkId": "install#cluster-recommended" }, { "kind": "code", "literal": "worker-cpu", "chunkId": "install#cluster-recommended" }, { "kind": "code", "literal": "worker-gpu", "chunkId": "install#cluster-recommended" } ], "sources": [ { "chunkId": "install#cluster-recommended", "url": "/docs/install#cluster-recommended", "anchor": "cluster-recommended" } ], "mode": "agent-primary", "terms": [ "cluster", "recommended", "design", "partner", "installs", "should", "fresh", "provisions", "control", "plane", "always", "system", "node", "serving", "document", "cache", "natless", "public", "worker", "subnets", "karpenter", "indexing", "pools", "load", 
"balancer", "controller", "large", "unless", "there", "specific", "reason", "bind", "layer", "existing", "path", "endpoints", "expects", "group", "defaulting", "share" ] }, { "id": "install#cost-notes", "kind": "section", "title": "Install", "heading": "Cost notes", "group": "Operations", "url": "/docs/install#cost-notes", "summary": "The Terraform footprint is cost-oriented: EKS, one i4i system node, the shared ALB, and small storage lines are the baseline, while CPU and GPU indexing workers scale up through Karpenter only during work. Larger search deployments can add gateway replicas, larger always-on nodes, or a dedicated document-cache pool.", "facts": [ { "kind": "code", "literal": "system", "chunkId": "install#cost-notes" }, { "kind": "value", "literal": "us-east-1", "chunkId": "install#cost-notes" } ], "sources": [ { "chunkId": "install#cost-notes", "url": "/docs/install#cost-notes", "anchor": "cost-notes" } ], "mode": "agent-primary", "terms": [ "cost", "notes", "terraform", "footprint", "oriented", "system", "node", "shared", "small", "storage", "lines", "baseline", "while", "indexing", "workers", "scale", "through", "karpenter", "only", "during", "work", "larger", "search", "deployments", "gateway", "replicas", "always", "nodes", "dedicated", "document", "cache", "pool", "east", "designed", "deploy", "efficient", "autoscaling", "demand", "rest", "fixed" ] }, { "id": "install#default-infrarules", "kind": "section", "title": "Install", "heading": "Default InfraRules", "group": "Operations", "url": "/docs/install#default-infrarules", "summary": "Default InfraRules When operator.infraRules.create=true, Helm renders the cluster-scoped InfraRules/default object used by every Pipeline and Function spec.scaling.pool reference. 
If a workload omits scaling.pool, the op…", "facts": [ { "kind": "code", "literal": "operator.infraRules.create=true", "chunkId": "install#default-infrarules" }, { "kind": "code", "literal": "InfraRules/default", "chunkId": "install#default-infrarules" }, { "kind": "code", "literal": "spec.scaling.pool", "chunkId": "install#default-infrarules" }, { "kind": "code", "literal": "scaling.pool", "chunkId": "install#default-infrarules" }, { "kind": "code", "literal": "worker.computeClass: cpu", "chunkId": "install#default-infrarules" }, { "kind": "code", "literal": "gpu", "chunkId": "install#default-infrarules" }, { "kind": "code", "literal": "cpu", "chunkId": "install#default-infrarules" }, { "kind": "code", "literal": "cpu-large", "chunkId": "install#default-infrarules" }, { "kind": "code", "literal": "layer.hev.dev/node-role=worker-cpu", "chunkId": "install#default-infrarules" }, { "kind": "code", "literal": "worker-gpu", "chunkId": "install#default-infrarules" }, { "kind": "code", "literal": "workerKarpenter", "chunkId": "install#default-infrarules" }, { "kind": "code", "literal": "operator.infraRules.computePools", "chunkId": "install#default-infrarules" } ], "sources": [ { "chunkId": "install#default-infrarules", "url": "/docs/install#default-infrarules", "anchor": "default-infrarules" } ], "mode": "agent-primary", "terms": [ "default", "infrarules", "operator", "create", "true", "helm", "renders", "cluster", "scoped", "object", "every", "pipeline", "function", "spec", "scaling", "pool", "reference", "workload", "omits", "worker", "computeclass", "large", "layer", "node", "role", "workerkarpenter", "computepools", "maps", "stock", "compute", "pools", "general", "workers", "such", "extraction", "ingestion", "lightweight", "functions", "need", "local" ] }, { "id": "install#gateway-auth-modes", "kind": "section", "title": "Install", "heading": "Gateway auth modes", "group": "Operations", "url": "/docs/install#gateway-auth-modes", "summary": "Gateway auth modes The 
default deriveFromStore mode is the single-tenant BYOC path: vectorStore: credential: apiKey: tpuf... inboundAuth: mode: deriveFromStore For an install that needs a gateway-only bearer, use keys mo…", "facts": [ { "kind": "code", "literal": "vectorStore:\n credential:\n apiKey: tpuf_...\n inboundAuth:\n mode: deriveFromStore", "chunkId": "install#gateway-auth-modes" }, { "kind": "code", "literal": "vectorStore:\n credential:\n apiKey: tpuf_...\n inboundAuth:\n mode: keys\n workerSecretKey: layer-inbound-worker-api-key\n keys:\n - name: worker\n scopes: [read, write, admin]\n apiKey: layer_worker_...\n secretRef:\n key: layer-inbound-worker-api-key", "chunkId": "install#gateway-auth-modes" }, { "kind": "code", "literal": "deriveFromStore", "chunkId": "install#gateway-auth-modes" }, { "kind": "code", "literal": "keys", "chunkId": "install#gateway-auth-modes" }, { "kind": "code", "literal": "apiKey", "chunkId": "install#gateway-auth-modes" }, { "kind": "code", "literal": "VectorStore", "chunkId": "install#gateway-auth-modes" }, { "kind": "code", "literal": "workerSecretName", "chunkId": "install#gateway-auth-modes" }, { "kind": "code", "literal": "workerSecretKey", "chunkId": "install#gateway-auth-modes" }, { "kind": "code", "literal": "layer-inbound-worker-api-key", "chunkId": "install#gateway-auth-modes" } ], "sources": [ { "chunkId": "install#gateway-auth-modes", "url": "/docs/install#gateway-auth-modes", "anchor": "gateway-auth-modes" } ], "mode": "agent-primary", "terms": [ "gateway", "auth", "modes", "default", "derivefromstore", "mode", "single", "tenant", "byoc", "path", "vectorstore", "credential", "apikey", "tpuf", "inboundauth", "install", "needs", "only", "bearer", "keys", "workersecretkey", "layer", "inbound", "worker", "name", "scopes", "read", "write", "admin", "secretref", "workersecretname", "chart", "renders", "values", "release", "secret", "references", "omit", "pointing", "created" ] }, { "id": "install#helm", "kind": "section", "title": "Install", 
"heading": "Helm", "group": "Operations", "url": "/docs/install#helm", "summary": "Helm The Helm chart at infra/helm/layer/ installs the gateway, operator, and document cache into a cluster that already has the AWS resources from Terraform or equivalent resources you manage.", "facts": [ { "kind": "code", "literal": "infra/helm/layer/", "chunkId": "install#helm" } ], "sources": [ { "chunkId": "install#helm", "url": "/docs/install#helm", "anchor": "helm" } ], "mode": "agent-primary", "terms": [ "helm", "chart", "infra", "layer", "installs", "gateway", "operator", "document", "cache", "cluster", "already", "resources", "terraform", "equivalent", "manage" ] }, { "id": "install#install-shape", "kind": "section", "title": "Install", "heading": "Install shape", "group": "Operations", "url": "/docs/install#install-shape", "summary": "Install shape An install is one Helm release per environment with one S3 bucket for snapshot and history data. The chart renders a default VectorStore from the credential you provide; an install can define additional Vec…", "facts": [ { "kind": "code", "literal": "VectorStore", "chunkId": "install#install-shape" }, { "kind": "code", "literal": "Index.spec.backend.storeRef", "chunkId": "install#install-shape" }, { "kind": "code", "literal": "keys", "chunkId": "install#install-shape" } ], "sources": [ { "chunkId": "install#install-shape", "url": "/docs/install#install-shape", "anchor": "install-shape" } ], "mode": "agent-primary", "terms": [ "install", "shape", "helm", "release", "environment", "bucket", "snapshot", "history", "data", "chart", "renders", "default", "vectorstore", "credential", "provide", "define", "additional", "index", "spec", "backend", "storeref", "keys", "resources", "upstream", "inbound", "auth", "policy", "route", "namespaces", "between", "scoped", "gateway", "only", "bearer", "available", "through", "mode", "described", "below" ] }, { "id": "install#outputs", "kind": "section", "title": "Install", "heading": "Outputs", 
"group": "Operations", "url": "/docs/install#outputs", "summary": "Outputs Terraform emits the values the Helm chart needs to install: the S3 bucket name, gateway IRSA role ARN, dashboard cost-read role ARN, ECR image URLs, and cluster metadata. Pass these into the Helm values file desc…", "facts": [], "sources": [ { "chunkId": "install#outputs", "url": "/docs/install#outputs", "anchor": "outputs" } ], "mode": "agent-primary", "terms": [ "outputs", "terraform", "emits", "values", "helm", "chart", "needs", "install", "bucket", "name", "gateway", "irsa", "role", "dashboard", "cost", "read", "image", "urls", "cluster", "metadata", "pass", "these", "file", "desc", "described", "below" ] }, { "id": "install#required-values", "kind": "section", "title": "Install", "heading": "Required values", "group": "Operations", "url": "/docs/install#required-values", "summary": "Required values Most of the chart is opinionated defaults. In a typical install the credential you bring from outside the cluster becomes the default VectorStore credential. 
Value Required Notes vectorStore.credential.ap…", "facts": [ { "kind": "code", "literal": "VectorStore", "chunkId": "install#required-values" }, { "kind": "code", "literal": "vectorStore.credential.apiKey", "chunkId": "install#required-values" }, { "kind": "code", "literal": "deriveFromStore", "chunkId": "install#required-values" }, { "kind": "code", "literal": "vectorStore.endpoint.url", "chunkId": "install#required-values" }, { "kind": "code", "literal": "vectorStore.endpoint.region", "chunkId": "install#required-values" }, { "kind": "code", "literal": "vectorStore.inboundAuth.mode", "chunkId": "install#required-values" }, { "kind": "code", "literal": "keys", "chunkId": "install#required-values" }, { "kind": "code", "literal": "open", "chunkId": "install#required-values" }, { "kind": "code", "literal": "vectorStore.inboundAuth.keys", "chunkId": "install#required-values" }, { "kind": "code", "literal": "read", "chunkId": "install#required-values" }, { "kind": "code", "literal": "write", "chunkId": "install#required-values" }, { "kind": "code", "literal": "admin", "chunkId": "install#required-values" }, { "kind": "code", "literal": "gateway.image", "chunkId": "install#required-values" }, { "kind": "code", "literal": "s3.bucket", "chunkId": "install#required-values" }, { "kind": "code", "literal": "serviceAccount.roleArn", "chunkId": "install#required-values" }, { "kind": "code", "literal": "gateway.indexNamespace", "chunkId": "install#required-values" }, { "kind": "code", "literal": "Index", "chunkId": "install#required-values" }, { "kind": "code", "literal": "operator.discovery.indexNamespace", "chunkId": "install#required-values" }, { "kind": "code", "literal": "gateway.indexConfig.enabled", "chunkId": "install#required-values" }, { "kind": "code", "literal": "spec.backend.storeRef", "chunkId": "install#required-values" }, { "kind": "code", "literal": "spec.snapshot.facetFields", "chunkId": "install#required-values" }, { "kind": "code", "literal": 
"spec.scan.threads", "chunkId": "install#required-values" }, { "kind": "code", "literal": "gateway.indexGc.enabled", "chunkId": "install#required-values" }, { "kind": "code", "literal": "gateway.consistency.stablePollIntervalMs", "chunkId": "install#required-values" } ], "sources": [ { "chunkId": "install#required-values", "url": "/docs/install#required-values", "anchor": "required-values" } ], "mode": "agent-primary", "terms": [ "required", "values", "most", "chart", "opinionated", "defaults", "typical", "install", "credential", "bring", "outside", "cluster", "becomes", "default", "vectorstore", "value", "notes", "apikey", "derivefromstore", "endpoint", "region", "inboundauth", "mode", "keys", "open", "read", "write", "admin", "gateway", "image", "bucket", "serviceaccount", "rolearn", "indexnamespace", "index", "operator", "discovery", "indexconfig", "enabled", "spec" ] }, { "id": "install#run-the-install", "kind": "section", "title": "Install", "heading": "Run the install", "group": "Operations", "url": "/docs/install#run-the-install", "summary": "Run the install helm upgrade --install layer ./infra/helm/layer \\ --namespace layer --create-namespace \\ -f values.customer.yaml The chart is not published to a public Helm repository — install from the source path or fr…", "facts": [ { "kind": "code", "literal": "helm upgrade --install layer ./infra/helm/layer \\\n --namespace layer --create-namespace \\\n -f values.customer.yaml", "chunkId": "install#run-the-install" } ], "sources": [ { "chunkId": "install#run-the-install", "url": "/docs/install#run-the-install", "anchor": "run-the-install" } ], "mode": "agent-primary", "terms": [ "install", "helm", "upgrade", "layer", "infra", "namespace", "create", "values", "customer", "yaml", "chart", "published", "public", "repository", "source", "path", "artifact", "provided", "during", "onboarding" ] }, { "id": "install#terraform", "kind": "section", "title": "Install", "heading": "Terraform", "group": "Operations", "url": 
"/docs/install#terraform", "summary": "Terraform The Terraform configuration in infra/terraform/ provisions the AWS resources that the gateway and operator need. It is opinionated about the resources hev layer needs to behave correctly and conservative about…", "facts": [ { "kind": "code", "literal": "infra/terraform/", "chunkId": "install#terraform" } ], "sources": [ { "chunkId": "install#terraform", "url": "/docs/install#terraform", "anchor": "terraform" } ], "mode": "agent-primary", "terms": [ "terraform", "configuration", "infra", "provisions", "resources", "gateway", "operator", "need", "opinionated", "about", "layer", "needs", "behave", "correctly", "conservative", "around", "route53", "hosted", "zones", "certificates", "most", "installs", "bring", "existing" ] }, { "id": "install#what-gets-installed", "kind": "section", "title": "Install", "heading": "What gets installed", "group": "Operations", "url": "/docs/install#what-gets-installed", "summary": "The chart installs the gateway, operator, scale-to-zero Aerospike document cache, optional Karpenter CPU/GPU worker pools, and supporting service accounts, ingress, and CRDs. 
In the baseline profile the document cache schedules onto the always-on i4i system node; larger installs can opt into a dedicated document-cache Karpenter pool with documentCache.nodeRole=document-cache and documentCache.karpenter.enabled=true.", "facts": [ { "kind": "code", "literal": "layer-gateway", "chunkId": "install#what-gets-installed" }, { "kind": "code", "literal": "layer-operator", "chunkId": "install#what-gets-installed" }, { "kind": "code", "literal": "layer-document-cache", "chunkId": "install#what-gets-installed" }, { "kind": "code", "literal": "NodePool", "chunkId": "install#what-gets-installed" }, { "kind": "code", "literal": "EC2NodeClass", "chunkId": "install#what-gets-installed" }, { "kind": "code", "literal": "worker-cpu", "chunkId": "install#what-gets-installed" }, { "kind": "code", "literal": "worker-gpu", "chunkId": "install#what-gets-installed" }, { "kind": "code", "literal": "workerKarpenter.enabled=true", "chunkId": "install#what-gets-installed" }, { "kind": "code", "literal": "document-cache", "chunkId": "install#what-gets-installed" }, { "kind": "code", "literal": "documentCache.nodeRole=document-cache", "chunkId": "install#what-gets-installed" }, { "kind": "code", "literal": "documentCache.karpenter.enabled=true", "chunkId": "install#what-gets-installed" } ], "sources": [ { "chunkId": "install#what-gets-installed", "url": "/docs/install#what-gets-installed", "anchor": "what-gets-installed" } ], "mode": "agent-primary", "terms": [ "gets", "installed", "chart", "installs", "gateway", "operator", "scale", "zero", "aerospike", "document", "cache", "optional", "karpenter", "worker", "pools", "supporting", "service", "accounts", "ingress", "crds", "baseline", "profile", "schedules", "onto", "always", "system", "node", "larger", "dedicated", "pool", "documentcache", "noderole", "enabled", "true", "layer", "nodepool", "ec2nodeclass", "workerkarpenter", "rust", "turbopuffer" ] }, { "id": "install#what-it-sets-up", "kind": "section", "title": 
"Install", "heading": "What it sets up", "group": "Operations", "url": "/docs/install#what-it-sets-up", "summary": "What it sets up Resource Purpose S3 bucket Durable storage for namespace snapshots, search history, and clickstream events. IAM roles + IRSA policies Gateway S3 access, dashboard cost-read access, and worker/operator AWS…", "facts": [ { "kind": "code", "literal": "manage_public_dns=true", "chunkId": "install#what-it-sets-up" } ], "sources": [ { "chunkId": "install#what-it-sets-up", "url": "/docs/install#what-it-sets-up", "anchor": "what-it-sets-up" } ], "mode": "agent-primary", "terms": [ "sets", "resource", "purpose", "bucket", "durable", "storage", "namespace", "snapshots", "search", "history", "clickstream", "events", "roles", "irsa", "policies", "gateway", "access", "dashboard", "cost", "read", "worker", "operator", "manage", "public", "true", "repositories", "image", "registry", "customer", "built", "function", "images", "node", "pools", "recommended", "fresh", "cluster", "runtime", "design", "partners" ] }, { "id": "kubernetes/apikey-crd", "kind": "section", "title": "ApiKey CRD", "heading": null, "group": "Operations", "url": "/docs/kubernetes/apikey-crd", "summary": "Minted API keys as Kubernetes resources: lifecycle, entitlements, and opaque claims. An ApiKey is a minted credential as a resource. 
Layer owns the credential lifecycle — mint, verify, revoke, expire — and what the key o…", "facts": [ { "kind": "code", "literal": "ApiKey", "chunkId": "kubernetes/apikey-crd" }, { "kind": "code", "literal": "VectorStore", "chunkId": "kubernetes/apikey-crd" }, { "kind": "code", "literal": "Warehouse", "chunkId": "kubernetes/apikey-crd" }, { "kind": "code", "literal": "kubectl get apikey -o yaml", "chunkId": "kubernetes/apikey-crd" }, { "kind": "code", "literal": "GET /v2/keys/{keyId}", "chunkId": "kubernetes/apikey-crd" } ], "sources": [ { "chunkId": "kubernetes/apikey-crd", "url": "/docs/kubernetes/apikey-crd", "anchor": null } ], "mode": "agent-primary", "terms": [ "minted", "keys", "kubernetes", "resources", "lifecycle", "entitlements", "opaque", "claims", "apikey", "credential", "resource", "layer", "owns", "mint", "verify", "revoke", "expire", "vectorstore", "warehouse", "kubectl", "yaml", "keyid", "opens", "declared", "entitlement", "names", "itself", "carries", "scopes", "target", "external", "system", "store", "keep", "authorization", "decisions", "authoring", "surfaces", "round", "trip" ] }, { "id": "kubernetes/apikey-crd#bootstrapping", "kind": "section", "title": "ApiKey CRD", "heading": "Bootstrapping", "group": "Operations", "url": "/docs/kubernetes/apikey-crd#bootstrapping", "summary": "Bootstrapping LAYER_GATEWAY_API_KEY is the bootstrap credential: it mints the first admin key — spec: entitlements: layer: scopes: [admin] — after which routine minting uses minted admin keys. 
Cluster operators can equally…", "facts": [ { "kind": "code", "literal": "spec:\n entitlements:\n layer:\n scopes: [admin]", "chunkId": "kubernetes/apikey-crd#bootstrapping" }, { "kind": "code", "literal": "LAYER_GATEWAY_API_KEY", "chunkId": "kubernetes/apikey-crd#bootstrapping" }, { "kind": "code", "literal": "ApiKey", "chunkId": "kubernetes/apikey-crd#bootstrapping" } ], "sources": [ { "chunkId": "kubernetes/apikey-crd#bootstrapping", "url": "/docs/kubernetes/apikey-crd#bootstrapping", "anchor": "bootstrapping" } ], "mode": "agent-primary", "terms": [ "bootstrapping", "layergatewayapikey", "bootstrap", "credential", "mints", "first", "admin", "spec", "entitlements", "layer", "scopes", "after", "routine", "minting", "uses", "minted", "keys", "cluster", "operators", "equally", "gateway", "apikey", "applying", "resource", "since", "authoring", "needs", "only", "kubectl", "access" ] }, { "id": "kubernetes/apikey-crd#entitlements", "kind": "section", "title": "ApiKey CRD", "heading": "Entitlements", "group": "Operations", "url": "/docs/kubernetes/apikey-crd#entitlements", "summary": "Entitlements Key Target vectorstore. Data-plane access through the named store. scopes (read, write) gate routes whose Index resolves to that store; namespaces globs constrain which upstream namespaces. warehouse. 
A list…", "facts": [ { "kind": "code", "literal": "vectorstore.", "chunkId": "kubernetes/apikey-crd#entitlements" }, { "kind": "code", "literal": "scopes", "chunkId": "kubernetes/apikey-crd#entitlements" }, { "kind": "code", "literal": "read", "chunkId": "kubernetes/apikey-crd#entitlements" }, { "kind": "code", "literal": "write", "chunkId": "kubernetes/apikey-crd#entitlements" }, { "kind": "code", "literal": "Index", "chunkId": "kubernetes/apikey-crd#entitlements" }, { "kind": "code", "literal": "namespaces", "chunkId": "kubernetes/apikey-crd#entitlements" }, { "kind": "code", "literal": "warehouse.", "chunkId": "kubernetes/apikey-crd#entitlements" }, { "kind": "code", "literal": "claims", "chunkId": "kubernetes/apikey-crd#entitlements" }, { "kind": "code", "literal": "layer", "chunkId": "kubernetes/apikey-crd#entitlements" }, { "kind": "code", "literal": "scopes: [admin]", "chunkId": "kubernetes/apikey-crd#entitlements" }, { "kind": "code", "literal": "service:resource_type:resource_id:action", "chunkId": "kubernetes/apikey-crd#entitlements" }, { "kind": "code", "literal": "EntitlementTargetMissing", "chunkId": "kubernetes/apikey-crd#entitlements" } ], "sources": [ { "chunkId": "kubernetes/apikey-crd#entitlements", "url": "/docs/kubernetes/apikey-crd#entitlements", "anchor": "entitlements" } ], "mode": "agent-primary", "terms": [ "entitlements", "target", "vectorstore", "data", "plane", "access", "through", "named", "store", "scopes", "read", "write", "gate", "routes", "whose", "index", "resolves", "namespaces", "globs", "constrain", "upstream", "warehouse", "list", "name", "claims", "layer", "admin", "service", "resource", "type", "action", "entitlementtargetmissing", "opaque", "strings", "bound", "source", "system", "stores", "echoes", "application" ] }, { "id": "kubernetes/apikey-crd#kubernetes-rbac", "kind": "section", "title": "ApiKey CRD", "heading": "Kubernetes RBAC", "group": "Operations", "url": "/docs/kubernetes/apikey-crd#kubernetes-rbac", "summary": 
"Kubernetes RBAC CRD authoring makes kubectl a minting surface, so the chart ships roles to delegate key administration without cluster-admin: ClusterRole Grants hevlayer-key-admin Full verbs on apikeys, plus get on deliv…", "facts": [ { "kind": "code", "literal": "hevlayer-key-admin", "chunkId": "kubernetes/apikey-crd#kubernetes-rbac" }, { "kind": "code", "literal": "apikeys", "chunkId": "kubernetes/apikey-crd#kubernetes-rbac" }, { "kind": "code", "literal": "get", "chunkId": "kubernetes/apikey-crd#kubernetes-rbac" }, { "kind": "code", "literal": "hevlayer-key-viewer", "chunkId": "kubernetes/apikey-crd#kubernetes-rbac" }, { "kind": "code", "literal": "list", "chunkId": "kubernetes/apikey-crd#kubernetes-rbac" }, { "kind": "code", "literal": "watch", "chunkId": "kubernetes/apikey-crd#kubernetes-rbac" }, { "kind": "code", "literal": "view", "chunkId": "kubernetes/apikey-crd#kubernetes-rbac" }, { "kind": "code", "literal": "edit", "chunkId": "kubernetes/apikey-crd#kubernetes-rbac" }, { "kind": "code", "literal": "admin", "chunkId": "kubernetes/apikey-crd#kubernetes-rbac" }, { "kind": "code", "literal": "rbac.keyRoleBindings", "chunkId": "kubernetes/apikey-crd#kubernetes-rbac" } ], "sources": [ { "chunkId": "kubernetes/apikey-crd#kubernetes-rbac", "url": "/docs/kubernetes/apikey-crd#kubernetes-rbac", "anchor": "kubernetes-rbac" } ], "mode": "agent-primary", "terms": [ "kubernetes", "rbac", "authoring", "makes", "kubectl", "minting", "surface", "chart", "ships", "roles", "delegate", "administration", "without", "cluster", "admin", "clusterrole", "grants", "hevlayer", "full", "verbs", "apikeys", "plus", "deliv", "viewer", "list", "watch", "view", "edit", "keyrolebindings", "delivered", "token", "secrets", "mint", "revoke", "collect", "tokens", "secret", "access", "status", "hashes" ] }, { "id": "kubernetes/apikey-crd#minting", "kind": "section", "title": "ApiKey CRD", "heading": "Minting", "group": "Operations", "url": "/docs/kubernetes/apikey-crd#minting", "summary": 
"Minting REST. POST /v2/keys generates the token, creates the ApiKey resource, and returns the token in the response — once. The raw token is never persisted; Layer stores only one-way hashes on the resource. POST /v2/key…", "facts": [ { "kind": "code", "literal": "POST /v2/keys # 201 { keyId, …, token } — token returned once\nGET /v2/keys # metadata only; ?includeRevoked\nGET /v2/keys/{keyId}\nPOST /v2/keys/{keyId}/revoke # idempotent\nDELETE /v2/keys/{keyId} # hard delete\nPOST /v2/keys/authenticate # body { token } → 200 { keyId, entitlements, … } | 401", "chunkId": "kubernetes/apikey-crd#minting" }, { "kind": "code", "literal": "POST /v2/keys", "chunkId": "kubernetes/apikey-crd#minting" }, { "kind": "code", "literal": "ApiKey", "chunkId": "kubernetes/apikey-crd#minting" }, { "kind": "code", "literal": "layer", "chunkId": "kubernetes/apikey-crd#minting" }, { "kind": "code", "literal": "admin", "chunkId": "kubernetes/apikey-crd#minting" }, { "kind": "code", "literal": "POST /v2/keys/authenticate", "chunkId": "kubernetes/apikey-crd#minting" }, { "kind": "code", "literal": "status.secretRef", "chunkId": "kubernetes/apikey-crd#minting" }, { "kind": "code", "literal": "token", "chunkId": "kubernetes/apikey-crd#minting" }, { "kind": "code", "literal": "phase", "chunkId": "kubernetes/apikey-crd#minting" }, { "kind": "code", "literal": "Pending", "chunkId": "kubernetes/apikey-crd#minting" }, { "kind": "code", "literal": "Active", "chunkId": "kubernetes/apikey-crd#minting" } ], "sources": [ { "chunkId": "kubernetes/apikey-crd#minting", "url": "/docs/kubernetes/apikey-crd#minting", "anchor": "minting" } ], "mode": "agent-primary", "terms": [ "minting", "rest", "post", "keys", "generates", "token", "creates", "apikey", "resource", "returns", "response", "once", "never", "persisted", "layer", "stores", "only", "hashes", "keyid", "returned", "metadata", "includerevoked", "revoke", "idempotent", "delete", "hard", "authenticate", "body", "entitlements", "admin", "status", 
"secretref", "phase", "pending", "active", "management", "routes", "require", "entitlement", "scope" ] }, { "id": "kubernetes/apikey-crd#spec", "kind": "section", "title": "ApiKey CRD", "heading": "Spec", "group": "Operations", "url": "/docs/kubernetes/apikey-crd#spec", "summary": "Spec Field Purpose owner Optional free-form owner label, echoed in list and authenticate responses. description Optional free-form description. entitlements Map keyed by target resource. Each entry carries scopes, namesp…", "facts": [ { "kind": "code", "literal": "owner", "chunkId": "kubernetes/apikey-crd#spec" }, { "kind": "code", "literal": "description", "chunkId": "kubernetes/apikey-crd#spec" }, { "kind": "code", "literal": "entitlements", "chunkId": "kubernetes/apikey-crd#spec" }, { "kind": "code", "literal": "scopes", "chunkId": "kubernetes/apikey-crd#spec" }, { "kind": "code", "literal": "namespaces", "chunkId": "kubernetes/apikey-crd#spec" }, { "kind": "code", "literal": "claims", "chunkId": "kubernetes/apikey-crd#spec" }, { "kind": "code", "literal": "expiresAfter", "chunkId": "kubernetes/apikey-crd#spec" }, { "kind": "code", "literal": "never", "chunkId": "kubernetes/apikey-crd#spec" }, { "kind": "code", "literal": "365d", "chunkId": "kubernetes/apikey-crd#spec" }, { "kind": "code", "literal": "status.expiresAt", "chunkId": "kubernetes/apikey-crd#spec" } ], "sources": [ { "chunkId": "kubernetes/apikey-crd#spec", "url": "/docs/kubernetes/apikey-crd#spec", "anchor": "spec" } ], "mode": "agent-primary", "terms": [ "spec", "field", "purpose", "owner", "optional", "free", "form", "label", "echoed", "list", "authenticate", "responses", "description", "entitlements", "keyed", "target", "resource", "entry", "carries", "scopes", "namesp", "namespaces", "claims", "expiresafter", "never", "365d", "status", "expiresat", "duration", "defaults", "computed", "mint" ] }, { "id": "kubernetes/apikey-crd#verification", "kind": "section", "title": "ApiKey CRD", "heading": "Verification", "group": 
"Operations", "url": "/docs/kubernetes/apikey-crd#verification", "summary": "Verification External systems present the raw token to POST /v2/keys/authenticate and get back keyId (a stable actor id) plus the full entitlements map, then make their own authorization decisions from the claims. The ga…", "facts": [ { "kind": "code", "literal": "POST /v2/keys/authenticate", "chunkId": "kubernetes/apikey-crd#verification" }, { "kind": "code", "literal": "keyId", "chunkId": "kubernetes/apikey-crd#verification" }, { "kind": "code", "literal": "entitlements", "chunkId": "kubernetes/apikey-crd#verification" }, { "kind": "code", "literal": "Active", "chunkId": "kubernetes/apikey-crd#verification" }, { "kind": "code", "literal": "status.lastSeenAt", "chunkId": "kubernetes/apikey-crd#verification" }, { "kind": "code", "literal": "Pending", "chunkId": "kubernetes/apikey-crd#verification" }, { "kind": "code", "literal": "Revoked", "chunkId": "kubernetes/apikey-crd#verification" }, { "kind": "code", "literal": "POST /v2/keys/{keyId}/revoke", "chunkId": "kubernetes/apikey-crd#verification" }, { "kind": "code", "literal": "Expired", "chunkId": "kubernetes/apikey-crd#verification" }, { "kind": "code", "literal": "status.expiresAt", "chunkId": "kubernetes/apikey-crd#verification" }, { "kind": "code", "literal": "VectorStore", "chunkId": "kubernetes/apikey-crd#verification" }, { "kind": "code", "literal": "Warehouse", "chunkId": "kubernetes/apikey-crd#verification" } ], "sources": [ { "chunkId": "kubernetes/apikey-crd#verification", "url": "/docs/kubernetes/apikey-crd#verification", "anchor": "verification" } ], "mode": "agent-primary", "terms": [ "verification", "external", "systems", "present", "token", "post", "keys", "authenticate", "back", "keyid", "stable", "actor", "plus", "full", "entitlements", "make", "their", "authorization", "decisions", "claims", "active", "status", "lastseenat", "pending", "revoked", "revoke", "expired", "expiresat", "vectorstore", "warehouse", "gateway", 
"also", "accepts", "bearer", "routes", "enforcing", "entitlement", "store", "control", "plane" ] }, { "id": "kubernetes/function-crd", "kind": "section", "title": "Function CRD", "heading": null, "group": "Operations", "url": "/docs/kubernetes/function-crd", "summary": "Stateless user-defined functions declared as Kubernetes resources. The Function CRD is a User Defined Function (UDF) that runs over rows that already exist in an Index. It is the right shape for classifiers, enrichment,…", "facts": [ { "kind": "code", "literal": "Function", "chunkId": "kubernetes/function-crd" }, { "kind": "value", "literal": "CodeTabs.astro", "chunkId": "kubernetes/function-crd" } ], "sources": [ { "chunkId": "kubernetes/function-crd", "url": "/docs/kubernetes/function-crd", "anchor": null } ], "mode": "agent-primary", "terms": [ "stateless", "user", "defined", "functions", "declared", "kubernetes", "resources", "function", "runs", "rows", "already", "exist", "index", "right", "shape", "classifiers", "enrichment", "codetabs", "astro", "backfills", "existing", "deterministic", "upserts", "udfs", "best", "yaml", "invoked", "layer", "operator", "creates", "worker", "gateway", "owns", "discovery", "queueing", "retries", "leases", "completion", "markers", "workers" ] }, { "id": "kubernetes/function-crd#gpu-classifier", "kind": "section", "title": "Function CRD", "heading": "GPU classifier", "group": "Operations", "url": "/docs/kubernetes/function-crd#gpu-classifier", "summary": "GPU classifier More complicated classifiers (e.g. a vision-language classifier) may require a model to run on a GPU. 
apiVersion: hevlayer.com/v1alpha1 kind: Function metadata: name: product-color namespace: layer spec: t…", "facts": [ { "kind": "code", "literal": "worker.computeClass: gpu", "chunkId": "kubernetes/function-crd#gpu-classifier" }, { "kind": "code", "literal": "scaling.pool", "chunkId": "kubernetes/function-crd#gpu-classifier" }, { "kind": "code", "literal": "gpu", "chunkId": "kubernetes/function-crd#gpu-classifier" }, { "kind": "code", "literal": "InfraRules/default", "chunkId": "kubernetes/function-crd#gpu-classifier" }, { "kind": "code", "literal": "layer.hev.dev/node-role=worker-gpu", "chunkId": "kubernetes/function-crd#gpu-classifier" }, { "kind": "code", "literal": "torch", "chunkId": "kubernetes/function-crd#gpu-classifier" }, { "kind": "code", "literal": "transformers", "chunkId": "kubernetes/function-crd#gpu-classifier" }, { "kind": "code", "literal": "pillow", "chunkId": "kubernetes/function-crd#gpu-classifier" }, { "kind": "code", "literal": "httpx", "chunkId": "kubernetes/function-crd#gpu-classifier" }, { "kind": "code", "literal": "hevlayer", "chunkId": "kubernetes/function-crd#gpu-classifier" }, { "kind": "code", "literal": "worker.batchSize", "chunkId": "kubernetes/function-crd#gpu-classifier" }, { "kind": "code", "literal": "worker.timeoutSeconds", "chunkId": "kubernetes/function-crd#gpu-classifier" }, { "kind": "code", "literal": "schedule.leaseSeconds", "chunkId": "kubernetes/function-crd#gpu-classifier" }, { "kind": "code", "literal": "replicas.min: 1", "chunkId": "kubernetes/function-crd#gpu-classifier" }, { "kind": "code", "literal": "min: 0", "chunkId": "kubernetes/function-crd#gpu-classifier" }, { "kind": "value", "literal": "e.g", "chunkId": "kubernetes/function-crd#gpu-classifier" } ], "sources": [ { "chunkId": "kubernetes/function-crd#gpu-classifier", "url": "/docs/kubernetes/function-crd#gpu-classifier", "anchor": "gpu-classifier" } ], "mode": "agent-primary", "terms": [ "classifier", "more", "complicated", "classifiers", "vision", 
"language", "require", "model", "apiversion", "hevlayer", "v1alpha1", "kind", "function", "metadata", "name", "product", "color", "namespace", "layer", "spec", "worker", "computeclass", "scaling", "pool", "infrarules", "default", "node", "role", "torch", "transformers", "pillow", "httpx", "batchsize", "timeoutseconds", "schedule", "leaseseconds", "replicas", "targetnamespaces", "amazon", "products" ] }, { "id": "kubernetes/function-crd#lifecycle", "kind": "section", "title": "Function CRD", "heading": "Lifecycle", "group": "Operations", "url": "/docs/kubernetes/function-crd#lifecycle", "summary": "Lifecycle kubectl get function product-tags kubectl describe function product-tags layer udf get product-tags kubectl patch function product-tags --type=merge -p '{\"spec\":{\"paused\":true}}' kubectl patch function product-…", "facts": [], "sources": [ { "chunkId": "kubernetes/function-crd#lifecycle", "url": "/docs/kubernetes/function-crd#lifecycle", "anchor": "lifecycle" } ], "mode": "agent-primary", "terms": [ "lifecycle", "kubectl", "function", "product", "tags", "describe", "layer", "patch", "type", "merge", "spec", "paused", "true", "false", "curl", "post", "authorization", "bearer", "layergatewayapikey", "layergatewayurl", "udfs", "reset", "failed", "delete" ] }, { "id": "kubernetes/function-crd#scaling", "kind": "section", "title": "Function CRD", "heading": "Scaling", "group": "Operations", "url": "/docs/kubernetes/function-crd#scaling", "summary": "Scaling spec.scaling is the same scaling config Pipelines use: a pool from InfraRules/default, a mode, and replica bounds. For Functions, mode: autoscale emits a KEDA ScaledObject triggered by layerudfqueuedepth. 
Replica…", "facts": [ { "kind": "code", "literal": "spec.scaling", "chunkId": "kubernetes/function-crd#scaling" }, { "kind": "code", "literal": "InfraRules/default", "chunkId": "kubernetes/function-crd#scaling" }, { "kind": "code", "literal": "mode: autoscale", "chunkId": "kubernetes/function-crd#scaling" }, { "kind": "code", "literal": "ScaledObject", "chunkId": "kubernetes/function-crd#scaling" }, { "kind": "code", "literal": "layer_udf_queue_depth", "chunkId": "kubernetes/function-crd#scaling" }, { "kind": "code", "literal": "maxReplicasPerWorkload", "chunkId": "kubernetes/function-crd#scaling" } ], "sources": [ { "chunkId": "kubernetes/function-crd#scaling", "url": "/docs/kubernetes/function-crd#scaling", "anchor": "scaling" } ], "mode": "agent-primary", "terms": [ "scaling", "spec", "same", "config", "pipelines", "pool", "infrarules", "default", "mode", "replica", "bounds", "functions", "autoscale", "emits", "keda", "scaledobject", "triggered", "layerudfqueuedepth", "layer", "queue", "depth", "maxreplicasperworkload", "maxima", "above", "rejected", "status" ] }, { "id": "kubernetes/function-crd#selection", "kind": "section", "title": "Function CRD", "heading": "Selection", "group": "Operations", "url": "/docs/kubernetes/function-crd#selection", "summary": "Selection Use targetNamespaces for explicit namespaces. Use indexSelector when labels on Index resources should choose the namespaces. filter preserves arbitrary JSON, including array-form Turbopuffer filters. 
The operat…", "facts": [ { "kind": "code", "literal": "targetNamespaces", "chunkId": "kubernetes/function-crd#selection" }, { "kind": "code", "literal": "indexSelector", "chunkId": "kubernetes/function-crd#selection" }, { "kind": "code", "literal": "Index", "chunkId": "kubernetes/function-crd#selection" }, { "kind": "code", "literal": "filter", "chunkId": "kubernetes/function-crd#selection" }, { "kind": "code", "literal": "spec.version", "chunkId": "kubernetes/function-crd#selection" } ], "sources": [ { "chunkId": "kubernetes/function-crd#selection", "url": "/docs/kubernetes/function-crd#selection", "anchor": "selection" } ], "mode": "agent-primary", "terms": [ "selection", "targetnamespaces", "explicit", "namespaces", "indexselector", "labels", "index", "resources", "should", "choose", "filter", "preserves", "arbitrary", "json", "including", "array", "form", "turbopuffer", "filters", "operat", "spec", "version", "operator", "stores", "shape", "gateway", "evaluates", "during", "discovery", "after", "generated", "completion", "marker", "predicate", "include", "creates" ] }, { "id": "kubernetes/function-crd#simple-classifier", "kind": "section", "title": "Function CRD", "heading": "Simple classifier", "group": "Operations", "url": "/docs/kubernetes/function-crd#simple-classifier", "summary": "Simple classifier The Python client turns a normal function into the claim/process/complete loop. output=\"tags\" is client-side metadata: the CRD does not declare an output attribute. 
run_udf_worker sends the returned value…", "facts": [ { "kind": "code", "literal": "output=\"tags\"", "chunkId": "kubernetes/function-crd#simple-classifier" }, { "kind": "code", "literal": "run_udf_worker", "chunkId": "kubernetes/function-crd#simple-classifier" }, { "kind": "code", "literal": "attributes.tags", "chunkId": "kubernetes/function-crd#simple-classifier" }, { "kind": "code", "literal": "inputs", "chunkId": "kubernetes/function-crd#simple-classifier" }, { "kind": "code", "literal": "TransientError", "chunkId": "kubernetes/function-crd#simple-classifier" }, { "kind": "code", "literal": "PermanentError", "chunkId": "kubernetes/function-crd#simple-classifier" }, { "kind": "code", "literal": "FailUdfItems", "chunkId": "kubernetes/function-crd#simple-classifier" }, { "kind": "code", "literal": "failUdfItems", "chunkId": "kubernetes/function-crd#simple-classifier" }, { "kind": "code", "literal": "kind: \"transient\"", "chunkId": "kubernetes/function-crd#simple-classifier" }, { "kind": "code", "literal": "kind: \"permanent\"", "chunkId": "kubernetes/function-crd#simple-classifier" } ], "sources": [ { "chunkId": "kubernetes/function-crd#simple-classifier", "url": "/docs/kubernetes/function-crd#simple-classifier", "anchor": "simple-classifier" } ], "mode": "agent-primary", "terms": [ "simple", "classifier", "python", "client", "turns", "normal", "function", "claim", "process", "complete", "loop", "output", "tags", "side", "metadata", "does", "declare", "attribute", "runudfworker", "sends", "returned", "value", "worker", "attributes", "inputs", "transienterror", "permanenterror", "failudfitems", "kind", "transient", "permanent", "completion", "patch", "gateway", "stamps", "reserved", "marker", "same", "drives", "protocol" ] }, { "id": "kubernetes/function-crd#tuning-knobs", "kind": "section", "title": "Function CRD", "heading": "Tuning knobs", "group": "Operations", "url": "/docs/kubernetes/function-crd#tuning-knobs", "summary": "Tuning knobs Knob What it bounds 
worker.batchSize Rows per worker batch. worker.timeoutSeconds Worker call timeout. schedule.leaseSeconds How long a claim is held before reissue. schedule.discoveryIntervalSeconds Time be…", "facts": [ { "kind": "code", "literal": "worker.batchSize", "chunkId": "kubernetes/function-crd#tuning-knobs" }, { "kind": "code", "literal": "worker.timeoutSeconds", "chunkId": "kubernetes/function-crd#tuning-knobs" }, { "kind": "code", "literal": "schedule.leaseSeconds", "chunkId": "kubernetes/function-crd#tuning-knobs" }, { "kind": "code", "literal": "schedule.discoveryIntervalSeconds", "chunkId": "kubernetes/function-crd#tuning-knobs" }, { "kind": "code", "literal": "schedule.maxInFlightBatches", "chunkId": "kubernetes/function-crd#tuning-knobs" }, { "kind": "code", "literal": "schedule.maxConcurrentScans", "chunkId": "kubernetes/function-crd#tuning-knobs" }, { "kind": "code", "literal": "retry.maxAttempts", "chunkId": "kubernetes/function-crd#tuning-knobs" }, { "kind": "code", "literal": "failed", "chunkId": "kubernetes/function-crd#tuning-knobs" } ], "sources": [ { "chunkId": "kubernetes/function-crd#tuning-knobs", "url": "/docs/kubernetes/function-crd#tuning-knobs", "anchor": "tuning-knobs" } ], "mode": "agent-primary", "terms": [ "tuning", "knobs", "knob", "bounds", "worker", "batchsize", "rows", "batch", "timeoutseconds", "call", "timeout", "schedule", "leaseseconds", "long", "claim", "held", "before", "reissue", "discoveryintervalseconds", "time", "maxinflightbatches", "maxconcurrentscans", "retry", "maxattempts", "failed", "between", "discovery", "scan", "jobs", "concurrent", "batches", "namespace", "tries", "lands" ] }, { "id": "kubernetes/function-crd#version-markers", "kind": "section", "title": "Function CRD", "heading": "Version markers", "group": "Operations", "url": "/docs/kubernetes/function-crd#version-markers", "summary": "Version markers spec.version is the re-run safety rail and defaults to v1. 
On completion, the gateway stamps _hevlayer_udf__v with that version, normalizing hyphens in the Function name to underscores. For metadata.name: pr…", "facts": [ { "kind": "code", "literal": "spec.version", "chunkId": "kubernetes/function-crd#version-markers" }, { "kind": "code", "literal": "v1", "chunkId": "kubernetes/function-crd#version-markers" }, { "kind": "code", "literal": "_hevlayer_udf__v", "chunkId": "kubernetes/function-crd#version-markers" }, { "kind": "code", "literal": "metadata.name: product-color", "chunkId": "kubernetes/function-crd#version-markers" }, { "kind": "code", "literal": "_hevlayer_udf_product_color_v", "chunkId": "kubernetes/function-crd#version-markers" }, { "kind": "code", "literal": "_hevlayer_udf__stale_after", "chunkId": "kubernetes/function-crd#version-markers" } ], "sources": [ { "chunkId": "kubernetes/function-crd#version-markers", "url": "/docs/kubernetes/function-crd#version-markers", "anchor": "version-markers" } ], "mode": "agent-primary", "terms": [ "version", "markers", "spec", "safety", "rail", "defaults", "completion", "gateway", "stamps", "hevlayerudf", "normalizing", "hyphens", "function", "name", "underscores", "metadata", "hevlayer", "product", "color", "stale", "after", "marker", "hevlayerudfproductcolorv", "discovery", "automatically", "looks", "rows", "whose", "missing", "differs", "expired", "staleafter", "bump", "model", "taxonomy", "prompt", "changes" ] }, { "id": "kubernetes/function-crd#worker", "kind": "section", "title": "Function CRD", "heading": "Worker", "group": "Operations", "url": "/docs/kubernetes/function-crd#worker", "summary": "Worker Field Purpose image Worker image. dispatch pull for SDK claim/poll workers, push for HTTP /run workers. computeClass cpu or gpu. 
Defaults to cpu; when scaling.pool is omitted, the operator maps this to the stock c…", "facts": [ { "kind": "code", "literal": "image", "chunkId": "kubernetes/function-crd#worker" }, { "kind": "code", "literal": "dispatch", "chunkId": "kubernetes/function-crd#worker" }, { "kind": "code", "literal": "pull", "chunkId": "kubernetes/function-crd#worker" }, { "kind": "code", "literal": "push", "chunkId": "kubernetes/function-crd#worker" }, { "kind": "code", "literal": "/run", "chunkId": "kubernetes/function-crd#worker" }, { "kind": "code", "literal": "computeClass", "chunkId": "kubernetes/function-crd#worker" }, { "kind": "code", "literal": "cpu", "chunkId": "kubernetes/function-crd#worker" }, { "kind": "code", "literal": "gpu", "chunkId": "kubernetes/function-crd#worker" }, { "kind": "code", "literal": "scaling.pool", "chunkId": "kubernetes/function-crd#worker" }, { "kind": "code", "literal": "port", "chunkId": "kubernetes/function-crd#worker" }, { "kind": "code", "literal": "batchSize", "chunkId": "kubernetes/function-crd#worker" }, { "kind": "code", "literal": "timeoutSeconds", "chunkId": "kubernetes/function-crd#worker" }, { "kind": "code", "literal": "podSpec", "chunkId": "kubernetes/function-crd#worker" }, { "kind": "code", "literal": "layer run -f", "chunkId": "kubernetes/function-crd#worker" }, { "kind": "code", "literal": "HEVLAYER_UDF_ID", "chunkId": "kubernetes/function-crd#worker" }, { "kind": "code", "literal": "HEVLAYER_BASE_URL", "chunkId": "kubernetes/function-crd#worker" }, { "kind": "code", "literal": "HEVLAYER_UDF_BATCH_SIZE", "chunkId": "kubernetes/function-crd#worker" }, { "kind": "code", "literal": "HEVLAYER_UDF_TIMEOUT_SECONDS", "chunkId": "kubernetes/function-crd#worker" }, { "kind": "code", "literal": "HEVLAYER_UDF_LEASE_SECONDS", "chunkId": "kubernetes/function-crd#worker" }, { "kind": "code", "literal": "LAYER_GATEWAY_API_KEY", "chunkId": "kubernetes/function-crd#worker" }, { "kind": "code", "literal": "VectorStore", "chunkId": 
"kubernetes/function-crd#worker" }, { "kind": "code", "literal": "deriveFromStore", "chunkId": "kubernetes/function-crd#worker" }, { "kind": "code", "literal": "keys", "chunkId": "kubernetes/function-crd#worker" } ], "sources": [ { "chunkId": "kubernetes/function-crd#worker", "url": "/docs/kubernetes/function-crd#worker", "anchor": "worker" } ], "mode": "agent-primary", "terms": [ "worker", "field", "purpose", "image", "dispatch", "pull", "claim", "poll", "workers", "push", "http", "computeclass", "defaults", "scaling", "pool", "omitted", "operator", "maps", "stock", "port", "batchsize", "timeoutseconds", "podspec", "layer", "hevlayer", "base", "batch", "size", "timeout", "seconds", "lease", "gateway", "vectorstore", "derivefromstore", "keys", "service", "rows", "call", "optional", "level" ] }, { "id": "kubernetes/function-crd#writeback", "kind": "section", "title": "Function CRD", "heading": "Writeback", "group": "Operations", "url": "/docs/kubernetes/function-crd#writeback", "summary": "Writeback Workers own data writes. 
The common single-attribute case uses the Python client's sugar: @udf(output=\"tags\") makes run_udf_worker send returned values as attributes.tags in the completion call — in Go (or over R…", "facts": [ { "kind": "code", "literal": "@udf(output=\"tags\")", "chunkId": "kubernetes/function-crd#writeback" }, { "kind": "code", "literal": "run_udf_worker", "chunkId": "kubernetes/function-crd#writeback" }, { "kind": "code", "literal": "attributes.tags", "chunkId": "kubernetes/function-crd#writeback" }, { "kind": "code", "literal": "attributes", "chunkId": "kubernetes/function-crd#writeback" }, { "kind": "code", "literal": "patch_columns", "chunkId": "kubernetes/function-crd#writeback" }, { "kind": "code", "literal": "_hevlayer_*", "chunkId": "kubernetes/function-crd#writeback" }, { "kind": "code", "literal": "tpuf", "chunkId": "kubernetes/function-crd#writeback" }, { "kind": "code", "literal": "None", "chunkId": "kubernetes/function-crd#writeback" } ], "sources": [ { "chunkId": "kubernetes/function-crd#writeback", "url": "/docs/kubernetes/function-crd#writeback", "anchor": "writeback" } ], "mode": "agent-primary", "terms": [ "writeback", "workers", "data", "writes", "common", "single", "attribute", "case", "uses", "python", "client", "sugar", "output", "tags", "makes", "runudfworker", "send", "returned", "values", "attributes", "completion", "call", "worker", "patch", "columns", "hevlayer", "tpuf", "none", "rest", "same", "thing", "item", "gateway", "applies", "those", "reserved", "marker", "patchcolumns", "write", "must" ] }, { "id": "kubernetes/index-crd", "kind": "section", "title": "Index CRD", "heading": null, "group": "Operations", "url": "/docs/kubernetes/index-crd", "summary": "Declarative representation of a namespace managed by Layer. An Index represents one namespace exposed through the gateway. 
It declares which upstream namespace to use, snapshot policy, cache posture, and consistency mode…", "facts": [ { "kind": "code", "literal": "Index", "chunkId": "kubernetes/index-crd" } ], "sources": [ { "chunkId": "kubernetes/index-crd", "url": "/docs/kubernetes/index-crd", "anchor": null } ], "mode": "agent-primary", "terms": [ "declarative", "representation", "namespace", "managed", "layer", "index", "represents", "exposed", "through", "gateway", "declares", "upstream", "snapshot", "policy", "cache", "posture", "consistency", "mode", "backend", "connection", "itself", "lives", "vectorstore", "apiversion", "hevlayer", "kind", "metadata", "name", "products", "spec", "storeref", "turbopuffer", "default", "distancemetric", "cosinedistance", "labels", "shop", "tags", "catalog", "interval" ] }, { "id": "kubernetes/index-crd#backend", "kind": "section", "title": "Index CRD", "heading": "Backend", "group": "Operations", "url": "/docs/kubernetes/index-crd#backend", "summary": "Backend Field Purpose backend.storeRef Optional VectorStore name in the same namespace. The gateway routes requests for this upstream namespace to that store. Defaults to the namespace's default store. 
backend.namespace…", "facts": [ { "kind": "code", "literal": "backend.storeRef", "chunkId": "kubernetes/index-crd#backend" }, { "kind": "code", "literal": "VectorStore", "chunkId": "kubernetes/index-crd#backend" }, { "kind": "code", "literal": "backend.namespace", "chunkId": "kubernetes/index-crd#backend" }, { "kind": "code", "literal": "backend.distanceMetric", "chunkId": "kubernetes/index-crd#backend" }, { "kind": "code", "literal": "cosine_distance", "chunkId": "kubernetes/index-crd#backend" } ], "sources": [ { "chunkId": "kubernetes/index-crd#backend", "url": "/docs/kubernetes/index-crd#backend", "anchor": "backend" } ], "mode": "agent-primary", "terms": [ "backend", "field", "purpose", "storeref", "optional", "vectorstore", "name", "same", "namespace", "gateway", "routes", "requests", "upstream", "store", "defaults", "default", "distancemetric", "cosine", "distance", "override", "index", "vector", "metric", "cosinedistance" ] }, { "id": "kubernetes/index-crd#cache-policy", "kind": "section", "title": "Index CRD", "heading": "Cache policy", "group": "Operations", "url": "/docs/kubernetes/index-crd#cache-policy", "summary": "Cache policy Aerospike remains an ephemeral cache; durable snapshot history stays in S3. 
Cache warming uses the same scan fan-out policy as other origin scans.", "facts": [], "sources": [ { "chunkId": "kubernetes/index-crd#cache-policy", "url": "/docs/kubernetes/index-crd#cache-policy", "anchor": "cache-policy" } ], "mode": "agent-primary", "terms": [ "cache", "policy", "aerospike", "remains", "ephemeral", "durable", "snapshot", "history", "stays", "warming", "uses", "same", "scan", "other", "origin", "scans" ] }, { "id": "kubernetes/index-crd#scan-policy", "kind": "section", "title": "Index CRD", "heading": "Scan policy", "group": "Operations", "url": "/docs/kubernetes/index-crd#scan-policy", "summary": "Scan policy scan.threads sets the per-namespace default for origin scan fan-out: the maximum concurrent upstream requests one scan may issue during scatter/gather. It defaults to 8 and is clamped by the gateway's server…", "facts": [ { "kind": "code", "literal": "scan.threads", "chunkId": "kubernetes/index-crd#scan-policy" }, { "kind": "code", "literal": "threads", "chunkId": "kubernetes/index-crd#scan-policy" } ], "sources": [ { "chunkId": "kubernetes/index-crd#scan-policy", "url": "/docs/kubernetes/index-crd#scan-policy", "anchor": "scan-policy" } ], "mode": "agent-primary", "terms": [ "scan", "policy", "threads", "sets", "namespace", "default", "origin", "maximum", "concurrent", "upstream", "requests", "issue", "during", "scatter", "gather", "defaults", "clamped", "gateway", "server", "active", "shard", "count", "request", "level", "overrides" ] }, { "id": "kubernetes/index-crd#snapshot-policy", "kind": "section", "title": "Index CRD", "heading": "Snapshot policy", "group": "Operations", "url": "/docs/kubernetes/index-crd#snapshot-policy", "summary": "Snapshot policy Field Default Purpose snapshot.facetFields [] Fields the gateway materializes into durable facet snapshots. Empty disables the automatic writer. 
snapshot.interval 5m Minimum spacing between automatic snap…", "facts": [ { "kind": "code", "literal": "snapshot.facetFields", "chunkId": "kubernetes/index-crd#snapshot-policy" }, { "kind": "code", "literal": "[]", "chunkId": "kubernetes/index-crd#snapshot-policy" }, { "kind": "code", "literal": "snapshot.interval", "chunkId": "kubernetes/index-crd#snapshot-policy" }, { "kind": "code", "literal": "5m", "chunkId": "kubernetes/index-crd#snapshot-policy" }, { "kind": "code", "literal": "snapshot.retention", "chunkId": "kubernetes/index-crd#snapshot-policy" }, { "kind": "code", "literal": "never", "chunkId": "kubernetes/index-crd#snapshot-policy" }, { "kind": "code", "literal": "30d", "chunkId": "kubernetes/index-crd#snapshot-policy" } ], "sources": [ { "chunkId": "kubernetes/index-crd#snapshot-policy", "url": "/docs/kubernetes/index-crd#snapshot-policy", "anchor": "snapshot-policy" } ], "mode": "agent-primary", "terms": [ "snapshot", "policy", "field", "default", "purpose", "facetfields", "fields", "gateway", "materializes", "durable", "facet", "snapshots", "empty", "disables", "automatic", "writer", "interval", "minimum", "spacing", "between", "snap", "retention", "never", "writes", "after", "upstream", "stable", "advances", "keeps", "bodies", "duration", "such", "prunes", "older", "while", "keeping", "latest" ] }, { "id": "kubernetes/index-crd#status", "kind": "section", "title": "Index CRD", "heading": "Status", "group": "Operations", "url": "/docs/kubernetes/index-crd#status", "summary": "Status The operator reports observed generation, metadata sync state, and conditions. 
status.snapshot.lastRun and lastSuccess are reserved for the gateway history bridge.", "facts": [ { "kind": "code", "literal": "status.snapshot.lastRun", "chunkId": "kubernetes/index-crd#status" }, { "kind": "code", "literal": "lastSuccess", "chunkId": "kubernetes/index-crd#status" } ], "sources": [ { "chunkId": "kubernetes/index-crd#status", "url": "/docs/kubernetes/index-crd#status", "anchor": "status" } ], "mode": "agent-primary", "terms": [ "status", "operator", "reports", "observed", "generation", "metadata", "sync", "state", "conditions", "snapshot", "lastrun", "lastsuccess", "reserved", "gateway", "history", "bridge" ] }, { "id": "kubernetes/operator", "kind": "section", "title": "Operator Overview", "heading": null, "group": "Operations", "url": "/docs/kubernetes/operator", "summary": "What layer-operator reconciles and how it relates to the gateway. layer-operator manages declarative state for your hev layer deployment. It serves a few crucial functions — monitoring for changes to your indexes and man…", "facts": [ { "kind": "code", "literal": "layer-operator", "chunkId": "kubernetes/operator" } ], "sources": [ { "chunkId": "kubernetes/operator", "url": "/docs/kubernetes/operator", "anchor": null } ], "mode": "agent-primary", "terms": [ "layer", "operator", "reconciles", "relates", "gateway", "manages", "declarative", "state", "deployment", "serves", "crucial", "functions", "monitoring", "changes", "indexes", "managing", "scaling", "does", "through", "abstractions", "known", "custom", "resource", "definitions", "crds", "handles", "read", "write", "path", "everything", "wants", "expressed", "desired", "cluster", "vector", "store", "fronts", "exist", "worker", "pools" ] }, { "id": "kubernetes/operator#crds", "kind": "section", "title": "Operator Overview", "heading": "CRDs", "group": "Operations", "url": "/docs/kubernetes/operator#crds", "summary": "CRDs The operator reconciles five resource kinds, each documented on its own page: VectorStore CRD — the 
upstream store endpoint, credential reference, and gateway inbound auth policy. Index CRD — one resource per Turbop…", "facts": [], "sources": [ { "chunkId": "kubernetes/operator#crds", "url": "/docs/kubernetes/operator#crds", "anchor": "crds" } ], "mode": "agent-primary", "terms": [ "crds", "operator", "reconciles", "five", "resource", "kinds", "documented", "page", "vectorstore", "upstream", "store", "endpoint", "credential", "reference", "gateway", "inbound", "auth", "policy", "index", "turbop", "turbopuffer", "namespace", "should", "manage", "infrarules", "cluster", "wide", "compute", "pools", "document", "cache", "rules", "shared", "scaling", "pipeline", "staged", "work", "changes", "count", "function" ] }, { "id": "kubernetes/operator#relationship-to-the-gateway", "kind": "section", "title": "Operator Overview", "heading": "Relationship to the gateway", "group": "Operations", "url": "/docs/kubernetes/operator#relationship-to-the-gateway", "summary": "Relationship to the gateway The gateway and the operator are decoupled. The operator reconciles declarative state; the gateway serves the read and write path. 
Neither sits in the other's hot path, so the gateway keeps se…", "facts": [], "sources": [ { "chunkId": "kubernetes/operator#relationship-to-the-gateway", "url": "/docs/kubernetes/operator#relationship-to-the-gateway", "anchor": "relationship-to-the-gateway" } ], "mode": "agent-primary", "terms": [ "relationship", "gateway", "operator", "decoupled", "reconciles", "declarative", "state", "serves", "read", "write", "path", "neither", "sits", "other", "keeps", "serving", "even", "restarted", "lagging", "link", "between", "directional", "only", "some", "features", "reads", "status", "such", "indexes", "exist", "worker", "pools", "ready", "inform", "never", "writes", "crds", "authored", "reconciled", "ever" ] }, { "id": "kubernetes/operator#scheduling-and-node-pools", "kind": "section", "title": "Operator Overview", "heading": "Scheduling and node pools", "group": "Operations", "url": "/docs/kubernetes/operator#scheduling-and-node-pools", "summary": "Scheduling and node pools The operator applies the compute pool chosen by each Pipeline and Function. 
A pool can set container resources, nodeSelector, and tolerations, so operators can pin CPU, storage-heavy CPU, and GP…", "facts": [ { "kind": "code", "literal": "nodeSelector", "chunkId": "kubernetes/operator#scheduling-and-node-pools" }, { "kind": "code", "literal": "tolerations", "chunkId": "kubernetes/operator#scheduling-and-node-pools" }, { "kind": "code", "literal": "cpu", "chunkId": "kubernetes/operator#scheduling-and-node-pools" }, { "kind": "code", "literal": "cpu-large", "chunkId": "kubernetes/operator#scheduling-and-node-pools" }, { "kind": "code", "literal": "gpu", "chunkId": "kubernetes/operator#scheduling-and-node-pools" }, { "kind": "code", "literal": "layer.hev.dev/node-role=worker-cpu", "chunkId": "kubernetes/operator#scheduling-and-node-pools" }, { "kind": "code", "literal": "layer.hev.dev/node-role=worker-gpu", "chunkId": "kubernetes/operator#scheduling-and-node-pools" }, { "kind": "code", "literal": "nvidia.com/gpu: \"1\"", "chunkId": "kubernetes/operator#scheduling-and-node-pools" }, { "kind": "code", "literal": "InfraRules/default", "chunkId": "kubernetes/operator#scheduling-and-node-pools" } ], "sources": [ { "chunkId": "kubernetes/operator#scheduling-and-node-pools", "url": "/docs/kubernetes/operator#scheduling-and-node-pools", "anchor": "scheduling-and-node-pools" } ], "mode": "agent-primary", "terms": [ "scheduling", "node", "pools", "operator", "applies", "compute", "pool", "chosen", "pipeline", "function", "container", "resources", "nodeselector", "tolerations", "operators", "storage", "heavy", "large", "layer", "role", "worker", "nvidia", "infrarules", "default", "work", "right", "capacity", "helm", "installs", "stock", "select", "chart", "rendered", "karpenter", "also", "requests", "carries", "standard", "toleration", "configured" ] }, { "id": "kubernetes/pipeline-crd", "kind": "section", "title": "Pipeline CRD", "heading": null, "group": "Operations", "url": "/docs/kubernetes/pipeline-crd", "summary": "Staged row-changing work 
declared as a Kubernetes resource. The Pipeline CRD declares the scaling characteristics you want for ingesting data. Ingestion typically runs in stages: a CPU stage for chunking and extraction,…", "facts": [ { "kind": "code", "literal": "Pipeline", "chunkId": "kubernetes/pipeline-crd" }, { "kind": "code", "literal": "spec.sourceRef", "chunkId": "kubernetes/pipeline-crd" } ], "sources": [ { "chunkId": "kubernetes/pipeline-crd", "url": "/docs/kubernetes/pipeline-crd", "anchor": null } ], "mode": "agent-primary", "terms": [ "staged", "changing", "work", "declared", "kubernetes", "resource", "pipeline", "declares", "scaling", "characteristics", "want", "ingesting", "data", "ingestion", "typically", "runs", "stages", "stage", "chunking", "extraction", "spec", "sourceref", "followed", "embedding", "declare", "yaml", "code", "through", "combination", "both", "recommended", "while", "setting", "namespace", "client", "lets", "upstream", "details", "well", "operator" ] }, { "id": "kubernetes/pipeline-crd#pipeline-id", "kind": "section", "title": "Pipeline CRD", "heading": "Pipeline id", "group": "Operations", "url": "/docs/kubernetes/pipeline-crd#pipeline-id", "summary": "Pipeline id spec.pipelineId names the gateway pipeline (the queue) the worker stages into and scales on. It defaults to the resource name. 
Set it when multiple worker resources share one queue: the extract and embed stag…", "facts": [ { "kind": "code", "literal": "spec.pipelineId", "chunkId": "kubernetes/pipeline-crd#pipeline-id" }, { "kind": "code", "literal": "pipelineId: products", "chunkId": "kubernetes/pipeline-crd#pipeline-id" } ], "sources": [ { "chunkId": "kubernetes/pipeline-crd#pipeline-id", "url": "/docs/kubernetes/pipeline-crd#pipeline-id", "anchor": "pipeline-id" } ], "mode": "agent-primary", "terms": [ "pipeline", "spec", "pipelineid", "names", "gateway", "queue", "worker", "stages", "scales", "defaults", "resource", "name", "multiple", "resources", "share", "extract", "embed", "stag", "products", "stage", "both" ] }, { "id": "kubernetes/pipeline-crd#scaling", "kind": "section", "title": "Pipeline CRD", "heading": "Scaling", "group": "Operations", "url": "/docs/kubernetes/pipeline-crd#scaling", "summary": "Scaling scaling: pool: cpu mode: autoscale replicas: min: 0 max: 8 spec.scaling.pool, when set, must name a pool in InfraRules/default. 
When omitted, the operator uses worker.computeClass to choose the stock cpu or gpu p…", "facts": [ { "kind": "code", "literal": "scaling:\n pool: cpu\n mode: autoscale\n replicas:\n min: 0\n max: 8", "chunkId": "kubernetes/pipeline-crd#scaling" }, { "kind": "code", "literal": "spec.scaling.pool", "chunkId": "kubernetes/pipeline-crd#scaling" }, { "kind": "code", "literal": "InfraRules/default", "chunkId": "kubernetes/pipeline-crd#scaling" }, { "kind": "code", "literal": "worker.computeClass", "chunkId": "kubernetes/pipeline-crd#scaling" }, { "kind": "code", "literal": "cpu", "chunkId": "kubernetes/pipeline-crd#scaling" }, { "kind": "code", "literal": "gpu", "chunkId": "kubernetes/pipeline-crd#scaling" }, { "kind": "code", "literal": "cpu-large", "chunkId": "kubernetes/pipeline-crd#scaling" }, { "kind": "code", "literal": "mode: autoscale", "chunkId": "kubernetes/pipeline-crd#scaling" }, { "kind": "code", "literal": "ScaledObject", "chunkId": "kubernetes/pipeline-crd#scaling" }, { "kind": "code", "literal": "mode: fixed", "chunkId": "kubernetes/pipeline-crd#scaling" }, { "kind": "code", "literal": "replicas.min", "chunkId": "kubernetes/pipeline-crd#scaling" }, { "kind": "code", "literal": "spec.paused: true", "chunkId": "kubernetes/pipeline-crd#scaling" } ], "sources": [ { "chunkId": "kubernetes/pipeline-crd#scaling", "url": "/docs/kubernetes/pipeline-crd#scaling", "anchor": "scaling" } ], "mode": "agent-primary", "terms": [ "scaling", "pool", "mode", "autoscale", "replicas", "spec", "must", "name", "infrarules", "default", "omitted", "operator", "uses", "worker", "computeclass", "choose", "stock", "large", "scaledobject", "fixed", "paused", "true", "helm", "installs", "well", "known", "pools", "creates", "keda", "backed", "pipeline", "queue", "depth", "pins", "deployment", "disabled", "scales", "zero", "also" ] }, { "id": "kubernetes/pipeline-crd#source", "kind": "section", "title": "Pipeline CRD", "heading": "Source", "group": "Operations", "url": 
"/docs/kubernetes/pipeline-crd#source", "summary": "Source spec.sourceRef is intentionally open JSON for the external source that feeds the worker: SQS, Kafka, S3 events, a partner API, or a one-off migration source. The operator injects it into the worker pod verbatim as…", "facts": [ { "kind": "code", "literal": "spec.sourceRef", "chunkId": "kubernetes/pipeline-crd#source" }, { "kind": "code", "literal": "HEVLAYER_SOURCE_REF", "chunkId": "kubernetes/pipeline-crd#source" } ], "sources": [ { "chunkId": "kubernetes/pipeline-crd#source", "url": "/docs/kubernetes/pipeline-crd#source", "anchor": "source" } ], "mode": "agent-primary", "terms": [ "source", "spec", "sourceref", "intentionally", "open", "json", "external", "feeds", "worker", "kafka", "events", "partner", "migration", "operator", "injects", "verbatim", "hevlayer", "hevlayersourceref", "image", "owns", "specific", "behavior", "extract", "chunk", "reading" ] }, { "id": "kubernetes/pipeline-crd#status", "kind": "section", "title": "Pipeline CRD", "heading": "Status", "group": "Operations", "url": "/docs/kubernetes/pipeline-crd#status", "summary": "Status Use the pipeline status API for status: queue counts, stage progress, and worker state. The resource itself reports only managed object references and readiness conditions.", "facts": [], "sources": [ { "chunkId": "kubernetes/pipeline-crd#status", "url": "/docs/kubernetes/pipeline-crd#status", "anchor": "status" } ], "mode": "agent-primary", "terms": [ "status", "pipeline", "queue", "counts", "stage", "progress", "worker", "state", "resource", "itself", "reports", "only", "managed", "object", "references", "readiness", "conditions" ] }, { "id": "kubernetes/pipeline-crd#target", "kind": "section", "title": "Pipeline CRD", "heading": "Target", "group": "Operations", "url": "/docs/kubernetes/pipeline-crd#target", "summary": "Target spec.target.namespace is the Turbopuffer namespace the pipeline writes. 
The gateway pipeline API owns document state, chunks, and vector writes for that target namespace.", "facts": [ { "kind": "code", "literal": "spec.target.namespace", "chunkId": "kubernetes/pipeline-crd#target" } ], "sources": [ { "chunkId": "kubernetes/pipeline-crd#target", "url": "/docs/kubernetes/pipeline-crd#target", "anchor": "target" } ], "mode": "agent-primary", "terms": [ "target", "spec", "namespace", "turbopuffer", "pipeline", "writes", "gateway", "owns", "document", "state", "chunks", "vector" ] }, { "id": "kubernetes/pipeline-crd#worker", "kind": "section", "title": "Pipeline CRD", "heading": "Worker", "group": "Operations", "url": "/docs/kubernetes/pipeline-crd#worker", "summary": "Worker Field Purpose image Worker image. computeClass cpu or gpu. Defaults to cpu; when scaling.pool is omitted, the operator maps this to the stock cpu or gpu pool. batchSize Work items per batch. timeoutSeconds Worker…", "facts": [ { "kind": "code", "literal": "image", "chunkId": "kubernetes/pipeline-crd#worker" }, { "kind": "code", "literal": "computeClass", "chunkId": "kubernetes/pipeline-crd#worker" }, { "kind": "code", "literal": "cpu", "chunkId": "kubernetes/pipeline-crd#worker" }, { "kind": "code", "literal": "gpu", "chunkId": "kubernetes/pipeline-crd#worker" }, { "kind": "code", "literal": "scaling.pool", "chunkId": "kubernetes/pipeline-crd#worker" }, { "kind": "code", "literal": "batchSize", "chunkId": "kubernetes/pipeline-crd#worker" }, { "kind": "code", "literal": "timeoutSeconds", "chunkId": "kubernetes/pipeline-crd#worker" }, { "kind": "code", "literal": "podSpec", "chunkId": "kubernetes/pipeline-crd#worker" }, { "kind": "code", "literal": "HEVLAYER_PIPELINE_ID", "chunkId": "kubernetes/pipeline-crd#worker" }, { "kind": "code", "literal": "spec.pipelineId", "chunkId": "kubernetes/pipeline-crd#worker" }, { "kind": "code", "literal": "HEVLAYER_TARGET_NAMESPACE", "chunkId": "kubernetes/pipeline-crd#worker" }, { "kind": "code", "literal": "spec.target.namespace", 
"chunkId": "kubernetes/pipeline-crd#worker" }, { "kind": "code", "literal": "HEVLAYER_BASE_URL", "chunkId": "kubernetes/pipeline-crd#worker" }, { "kind": "code", "literal": "HEVLAYER_SOURCE_REF", "chunkId": "kubernetes/pipeline-crd#worker" }, { "kind": "code", "literal": "spec.sourceRef", "chunkId": "kubernetes/pipeline-crd#worker" }, { "kind": "code", "literal": "LAYER_GATEWAY_API_KEY", "chunkId": "kubernetes/pipeline-crd#worker" }, { "kind": "code", "literal": "deriveFromStore", "chunkId": "kubernetes/pipeline-crd#worker" }, { "kind": "code", "literal": "VectorStore", "chunkId": "kubernetes/pipeline-crd#worker" }, { "kind": "code", "literal": "keys", "chunkId": "kubernetes/pipeline-crd#worker" } ], "sources": [ { "chunkId": "kubernetes/pipeline-crd#worker", "url": "/docs/kubernetes/pipeline-crd#worker", "anchor": "worker" } ], "mode": "agent-primary", "terms": [ "worker", "field", "purpose", "image", "computeclass", "defaults", "scaling", "pool", "omitted", "operator", "maps", "stock", "batchsize", "work", "items", "batch", "timeoutseconds", "podspec", "hevlayer", "pipeline", "spec", "pipelineid", "target", "namespace", "base", "source", "sourceref", "layer", "gateway", "derivefromstore", "vectorstore", "keys", "call", "timeout", "optional", "level", "merge", "patch", "creates", "deployment" ] }, { "id": "kubernetes/scaling-crd", "kind": "section", "title": "InfraRules CRD", "heading": null, "group": "Operations", "url": "/docs/kubernetes/scaling-crd", "summary": "Cluster-wide compute pools, document cache rules, and workload scaling. InfraRules is the cluster-scoped policy object for Layer-managed runtime infrastructure. There is exactly one object: InfraRules/default. 
Pipelines…", "facts": [ { "kind": "code", "literal": "InfraRules", "chunkId": "kubernetes/scaling-crd" }, { "kind": "code", "literal": "InfraRules/default", "chunkId": "kubernetes/scaling-crd" }, { "kind": "code", "literal": "spec.scaling", "chunkId": "kubernetes/scaling-crd" }, { "kind": "code", "literal": "InfraRules/default.spec.computePools", "chunkId": "kubernetes/scaling-crd" } ], "sources": [ { "chunkId": "kubernetes/scaling-crd", "url": "/docs/kubernetes/scaling-crd", "anchor": null } ], "mode": "agent-primary", "terms": [ "cluster", "wide", "compute", "pools", "document", "cache", "rules", "workload", "scaling", "infrarules", "scoped", "policy", "object", "layer", "managed", "runtime", "infrastructure", "there", "exactly", "default", "pipelines", "spec", "computepools", "functions", "reference", "separate", "autoscaling", "resource", "inline", "choose", "pool" ] }, { "id": "kubernetes/scaling-crd#compute-pools", "kind": "section", "title": "InfraRules CRD", "heading": "Compute pools", "group": "Operations", "url": "/docs/kubernetes/scaling-crd#compute-pools", "summary": "Compute pools The Helm defaults define three well-known pools: Pool Use cpu General CPU workers. cpu-large CPU workers that need local ephemeral-storage headroom. gpu One-NVIDIA-GPU workers for embedding and inference. 
T…", "facts": [ { "kind": "code", "literal": "cpu", "chunkId": "kubernetes/scaling-crd#compute-pools" }, { "kind": "code", "literal": "cpu-large", "chunkId": "kubernetes/scaling-crd#compute-pools" }, { "kind": "code", "literal": "gpu", "chunkId": "kubernetes/scaling-crd#compute-pools" }, { "kind": "code", "literal": "layer.hev.dev/node-role=worker-cpu", "chunkId": "kubernetes/scaling-crd#compute-pools" }, { "kind": "code", "literal": "worker-gpu", "chunkId": "kubernetes/scaling-crd#compute-pools" }, { "kind": "code", "literal": "nvidia.com/gpu: \"1\"", "chunkId": "kubernetes/scaling-crd#compute-pools" }, { "kind": "code", "literal": "nodeSelector", "chunkId": "kubernetes/scaling-crd#compute-pools" }, { "kind": "code", "literal": "gpuType", "chunkId": "kubernetes/scaling-crd#compute-pools" }, { "kind": "code", "literal": "operator.infraRules.computePools", "chunkId": "kubernetes/scaling-crd#compute-pools" }, { "kind": "code", "literal": "name", "chunkId": "kubernetes/scaling-crd#compute-pools" }, { "kind": "code", "literal": "spec.scaling.pool", "chunkId": "kubernetes/scaling-crd#compute-pools" }, { "kind": "code", "literal": "kind", "chunkId": "kubernetes/scaling-crd#compute-pools" }, { "kind": "code", "literal": "tolerations", "chunkId": "kubernetes/scaling-crd#compute-pools" }, { "kind": "code", "literal": "resources", "chunkId": "kubernetes/scaling-crd#compute-pools" }, { "kind": "code", "literal": "maxReplicasPerWorkload", "chunkId": "kubernetes/scaling-crd#compute-pools" } ], "sources": [ { "chunkId": "kubernetes/scaling-crd#compute-pools", "url": "/docs/kubernetes/scaling-crd#compute-pools", "anchor": "compute-pools" } ], "mode": "agent-primary", "terms": [ "compute", "pools", "helm", "defaults", "define", "three", "well", "known", "pool", "general", "workers", "large", "need", "local", "ephemeral", "storage", "headroom", "nvidia", "embedding", "inference", "layer", "node", "role", "worker", "nodeselector", "gputype", "operator", "infrarules", "computepools", 
"name", "spec", "scaling", "kind", "tolerations", "resources", "maxreplicasperworkload", "default", "select", "karpenter", "backed" ] }, { "id": "kubernetes/scaling-crd#document-cache-rules", "kind": "section", "title": "InfraRules CRD", "heading": "Document cache rules", "group": "Operations", "url": "/docs/kubernetes/scaling-crd#document-cache-rules", "summary": "Document cache rules documentCache captures the operator-owned document cache settings: capacity, replication factor, and node count. Helm still renders the document-cache KEDA object directly; InfraRules is the declared…", "facts": [ { "kind": "code", "literal": "documentCache", "chunkId": "kubernetes/scaling-crd#document-cache-rules" }, { "kind": "code", "literal": "InfraRules", "chunkId": "kubernetes/scaling-crd#document-cache-rules" } ], "sources": [ { "chunkId": "kubernetes/scaling-crd#document-cache-rules", "url": "/docs/kubernetes/scaling-crd#document-cache-rules", "anchor": "document-cache-rules" } ], "mode": "agent-primary", "terms": [ "document", "cache", "rules", "documentcache", "captures", "operator", "owned", "settings", "capacity", "replication", "factor", "node", "count", "helm", "still", "renders", "keda", "object", "directly", "infrarules", "declared", "policy", "shape", "reports", "validates", "against" ] }, { "id": "kubernetes/scaling-crd#infrarules", "kind": "section", "title": "InfraRules CRD", "heading": "InfraRules", "group": "Operations", "url": "/docs/kubernetes/scaling-crd#infrarules", "summary": "InfraRules apiVersion: hevlayer.com/v1alpha1 kind: InfraRules metadata: name: default spec: computePools: - name: cpu kind: cpu nodeSelector: layer.hev.dev/node-role: worker-cpu layer.hev.dev/compute: cpu tolerations: -…", "facts": [ { "kind": "code", "literal": "default", "chunkId": "kubernetes/scaling-crd#infrarules" }, { "kind": "code", "literal": "operator.infraRules.create=true", "chunkId": "kubernetes/scaling-crd#infrarules" } ], "sources": [ { "chunkId": 
"kubernetes/scaling-crd#infrarules", "url": "/docs/kubernetes/scaling-crd#infrarules", "anchor": "infrarules" } ], "mode": "agent-primary", "terms": [ "infrarules", "apiversion", "hevlayer", "v1alpha1", "kind", "metadata", "name", "default", "spec", "computepools", "nodeselector", "layer", "node", "role", "worker", "compute", "tolerations", "operator", "create", "true", "equal", "value", "effect", "noschedule", "resources", "requests", "memory", "limits", "maxreplicasperworkload", "large", "ephemeral", "storage", "35gi", "40gi", "nvidia", "exists", "250m", "10gi", "documentcache", "capgib" ] }, { "id": "kubernetes/scaling-crd#workload-scaling", "kind": "section", "title": "InfraRules CRD", "heading": "Workload scaling", "group": "Operations", "url": "/docs/kubernetes/scaling-crd#workload-scaling", "summary": "Workload scaling scaling: pool: cpu mode: autoscale replicas: min: 0 max: 4 Mode Behavior autoscale Emit a KEDA ScaledObject and let queue depth scale the Deployment between min and max. 
fixed Set Deployment replicas to…", "facts": [ { "kind": "code", "literal": "scaling:\n pool: cpu\n mode: autoscale\n replicas:\n min: 0\n max: 4", "chunkId": "kubernetes/scaling-crd#workload-scaling" }, { "kind": "code", "literal": "autoscale", "chunkId": "kubernetes/scaling-crd#workload-scaling" }, { "kind": "code", "literal": "ScaledObject", "chunkId": "kubernetes/scaling-crd#workload-scaling" }, { "kind": "code", "literal": "min", "chunkId": "kubernetes/scaling-crd#workload-scaling" }, { "kind": "code", "literal": "max", "chunkId": "kubernetes/scaling-crd#workload-scaling" }, { "kind": "code", "literal": "fixed", "chunkId": "kubernetes/scaling-crd#workload-scaling" }, { "kind": "code", "literal": "replicas.min", "chunkId": "kubernetes/scaling-crd#workload-scaling" }, { "kind": "code", "literal": "disabled", "chunkId": "kubernetes/scaling-crd#workload-scaling" }, { "kind": "code", "literal": "mode: autoscale", "chunkId": "kubernetes/scaling-crd#workload-scaling" }, { "kind": "code", "literal": "replicas.min: 1", "chunkId": "kubernetes/scaling-crd#workload-scaling" }, { "kind": "code", "literal": "scaling.pool", "chunkId": "kubernetes/scaling-crd#workload-scaling" }, { "kind": "code", "literal": "worker.computeClass", "chunkId": "kubernetes/scaling-crd#workload-scaling" }, { "kind": "code", "literal": "cpu", "chunkId": "kubernetes/scaling-crd#workload-scaling" }, { "kind": "code", "literal": "gpu", "chunkId": "kubernetes/scaling-crd#workload-scaling" } ], "sources": [ { "chunkId": "kubernetes/scaling-crd#workload-scaling", "url": "/docs/kubernetes/scaling-crd#workload-scaling", "anchor": "workload-scaling" } ], "mode": "agent-primary", "terms": [ "workload", "scaling", "pool", "mode", "autoscale", "replicas", "behavior", "emit", "keda", "scaledobject", "queue", "depth", "scale", "deployment", "between", "fixed", "disabled", "worker", "computeclass", "object", "emitted", "paused", "workloads", "also", "keep", "cold", "start", "heavy", "warm", "function", 
"pipeline", "omits", "operator", "uses", "choose", "stock" ] }, { "id": "kubernetes/vectorstore-crd", "kind": "section", "title": "VectorStore CRD", "heading": null, "group": "Operations", "url": "/docs/kubernetes/vectorstore-crd", "summary": "Backend connection and gateway inbound auth policy for a Layer install. A VectorStore is the gateway's upstream connection. It names the store kind, endpoint, credential Secret, and the inbound auth policy the gateway ap…", "facts": [ { "kind": "code", "literal": "apiVersion: hevlayer.com/v1alpha1\nkind: VectorStore\nmetadata:\n name: turbopuffer-default\n namespace: layer\nspec:\n kind: turbopuffer\n default: true\n endpoint:\n url: https://aws-us-east-1.turbopuffer.com\n region: aws-us-east-1\n credential:\n secretRef:\n name: layer\n key: turbopuffer-api-key\n inboundAuth:\n mode: deriveFromStore", "chunkId": "kubernetes/vectorstore-crd" }, { "kind": "code", "literal": "VectorStore", "chunkId": "kubernetes/vectorstore-crd" }, { "kind": "code", "literal": "Index.spec.backend.storeRef", "chunkId": "kubernetes/vectorstore-crd" } ], "sources": [ { "chunkId": "kubernetes/vectorstore-crd", "url": "/docs/kubernetes/vectorstore-crd", "anchor": null } ], "mode": "agent-primary", "terms": [ "backend", "connection", "gateway", "inbound", "auth", "policy", "layer", "install", "vectorstore", "upstream", "names", "store", "kind", "endpoint", "credential", "secret", "apiversion", "hevlayer", "v1alpha1", "metadata", "name", "turbopuffer", "default", "namespace", "spec", "true", "https", "east", "region", "secretref", "inboundauth", "mode", "derivefromstore", "index", "storeref", "applies", "client", "requests", "define", "more" ] }, { "id": "kubernetes/vectorstore-crd#connection", "kind": "section", "title": "VectorStore CRD", "heading": "Connection", "group": "Operations", "url": "/docs/kubernetes/vectorstore-crd#connection", "summary": "Connection Field Purpose kind turbopuffer. 
pinecone is reserved by the schema but rejected by the operator until implemented. default Marks the store used when an Index omits spec.backend.storeRef. A single store is trea…", "facts": [ { "kind": "code", "literal": "kind", "chunkId": "kubernetes/vectorstore-crd#connection" }, { "kind": "code", "literal": "turbopuffer", "chunkId": "kubernetes/vectorstore-crd#connection" }, { "kind": "code", "literal": "pinecone", "chunkId": "kubernetes/vectorstore-crd#connection" }, { "kind": "code", "literal": "default", "chunkId": "kubernetes/vectorstore-crd#connection" }, { "kind": "code", "literal": "Index", "chunkId": "kubernetes/vectorstore-crd#connection" }, { "kind": "code", "literal": "spec.backend.storeRef", "chunkId": "kubernetes/vectorstore-crd#connection" }, { "kind": "code", "literal": "endpoint.url", "chunkId": "kubernetes/vectorstore-crd#connection" }, { "kind": "code", "literal": "endpoint.region", "chunkId": "kubernetes/vectorstore-crd#connection" }, { "kind": "code", "literal": "credential.secretRef", "chunkId": "kubernetes/vectorstore-crd#connection" }, { "kind": "code", "literal": "VectorStore", "chunkId": "kubernetes/vectorstore-crd#connection" } ], "sources": [ { "chunkId": "kubernetes/vectorstore-crd#connection", "url": "/docs/kubernetes/vectorstore-crd#connection", "anchor": "connection" } ], "mode": "agent-primary", "terms": [ "connection", "field", "purpose", "kind", "turbopuffer", "pinecone", "reserved", "schema", "rejected", "operator", "until", "implemented", "default", "marks", "store", "index", "omits", "spec", "backend", "storeref", "single", "trea", "endpoint", "region", "credential", "secretref", "vectorstore", "treated", "upstream", "base", "visible", "label", "secret", "same", "namespace", "never", "stored" ] }, { "id": "kubernetes/vectorstore-crd#inbound-auth", "kind": "section", "title": "VectorStore CRD", "heading": "Inbound auth", "group": "Operations", "url": "/docs/kubernetes/vectorstore-crd#inbound-auth", "summary": "Inbound auth 
inboundAuth.mode controls what bearer token the gateway accepts: Mode Behavior deriveFromStore Default. The gateway accepts the default store's credential as the inbound bearer. This is the single-tenant BYO…", "facts": [ { "kind": "code", "literal": "spec:\n inboundAuth:\n mode: keys\n keys:\n - name: shop-rw\n scopes: [read, write]\n secretRef:\n name: layer\n key: layer-inbound-shop-rw-api-key", "chunkId": "kubernetes/vectorstore-crd#inbound-auth" }, { "kind": "code", "literal": "inboundAuth.mode", "chunkId": "kubernetes/vectorstore-crd#inbound-auth" }, { "kind": "code", "literal": "deriveFromStore", "chunkId": "kubernetes/vectorstore-crd#inbound-auth" }, { "kind": "code", "literal": "keys", "chunkId": "kubernetes/vectorstore-crd#inbound-auth" }, { "kind": "code", "literal": "read", "chunkId": "kubernetes/vectorstore-crd#inbound-auth" }, { "kind": "code", "literal": "write", "chunkId": "kubernetes/vectorstore-crd#inbound-auth" }, { "kind": "code", "literal": "admin", "chunkId": "kubernetes/vectorstore-crd#inbound-auth" }, { "kind": "code", "literal": "open", "chunkId": "kubernetes/vectorstore-crd#inbound-auth" }, { "kind": "code", "literal": "Authorization: Bearer ", "chunkId": "kubernetes/vectorstore-crd#inbound-auth" }, { "kind": "code", "literal": "LAYER_GATEWAY_API_KEY", "chunkId": "kubernetes/vectorstore-crd#inbound-auth" }, { "kind": "code", "literal": "ApiKey", "chunkId": "kubernetes/vectorstore-crd#inbound-auth" }, { "kind": "code", "literal": "vectorstore.", "chunkId": "kubernetes/vectorstore-crd#inbound-auth" } ], "sources": [ { "chunkId": "kubernetes/vectorstore-crd#inbound-auth", "url": "/docs/kubernetes/vectorstore-crd#inbound-auth", "anchor": "inbound-auth" } ], "mode": "agent-primary", "terms": [ "inbound", "auth", "inboundauth", "mode", "controls", "bearer", "token", "gateway", "accepts", "behavior", "derivefromstore", "default", "store", "credential", "single", "tenant", "spec", "keys", "name", "shop", "scopes", "read", "write", "secretref", 
"layer", "admin", "open", "authorization", "apikey", "vectorstore", "byoc", "shape", "listed", "independent", "secrets", "enforces", "their", "only", "explicitly", "environments" ] }, { "id": "kubernetes/vectorstore-crd#routing", "kind": "section", "title": "VectorStore CRD", "heading": "Routing", "group": "Operations", "url": "/docs/kubernetes/vectorstore-crd#routing", "summary": "Routing The gateway builds one upstream client per VectorStore in the namespace. Requests whose namespace has an Index with spec.backend.storeRef use that store; other namespaces use the default store. Two Index objects…", "facts": [ { "kind": "code", "literal": "VectorStore", "chunkId": "kubernetes/vectorstore-crd#routing" }, { "kind": "code", "literal": "Index", "chunkId": "kubernetes/vectorstore-crd#routing" }, { "kind": "code", "literal": "spec.backend.storeRef", "chunkId": "kubernetes/vectorstore-crd#routing" } ], "sources": [ { "chunkId": "kubernetes/vectorstore-crd#routing", "url": "/docs/kubernetes/vectorstore-crd#routing", "anchor": "routing" } ], "mode": "agent-primary", "terms": [ "routing", "gateway", "builds", "upstream", "client", "vectorstore", "namespace", "requests", "whose", "index", "spec", "backend", "storeref", "store", "other", "namespaces", "default", "objects", "cannot", "resolve", "same" ] }, { "id": "kubernetes/vectorstore-crd#status", "kind": "section", "title": "VectorStore CRD", "heading": "Status", "group": "Operations", "url": "/docs/kubernetes/vectorstore-crd#status", "summary": "Status The operator sets status.reachable and a Ready condition after validating the Secret references and probing GET /v1/namespaces on the store endpoint.", "facts": [ { "kind": "code", "literal": "status.reachable", "chunkId": "kubernetes/vectorstore-crd#status" }, { "kind": "code", "literal": "Ready", "chunkId": "kubernetes/vectorstore-crd#status" }, { "kind": "code", "literal": "GET /v1/namespaces", "chunkId": "kubernetes/vectorstore-crd#status" } ], "sources": [ { "chunkId": 
"kubernetes/vectorstore-crd#status", "url": "/docs/kubernetes/vectorstore-crd#status", "anchor": "status" } ], "mode": "agent-primary", "terms": [ "status", "operator", "sets", "reachable", "ready", "condition", "after", "validating", "secret", "references", "probing", "namespaces", "store", "endpoint" ] }, { "id": "kubernetes/warehouse-crd", "kind": "section", "title": "Warehouse CRD", "heading": null, "group": "Operations", "url": "/docs/kubernetes/warehouse-crd", "summary": "Declared upstream data source: identity, credential, and verified reachability. A Warehouse declares an upstream source system — the system of record pipelines extract rows from, plus the verified credential to reach it.…", "facts": [ { "kind": "code", "literal": "apiVersion: hevlayer.com/v1alpha1\nkind: Warehouse\nmetadata:\n name: prod-snowflake\n namespace: layer\nspec:\n kind: snowflake\n snowflake:\n account: acme-xy12345\n user: SVC_LAYER\n role: SVC_LAYER_ROLE\n warehouse: EXTRACT_WH\n keyPairSecretRef:\n name: snowflake-rsa\n pool:\n size: 5\n timeout: 30s\n verifyInterval: 1h", "chunkId": "kubernetes/warehouse-crd" }, { "kind": "code", "literal": "Warehouse", "chunkId": "kubernetes/warehouse-crd" }, { "kind": "code", "literal": "VectorStore", "chunkId": "kubernetes/warehouse-crd" } ], "sources": [ { "chunkId": "kubernetes/warehouse-crd", "url": "/docs/kubernetes/warehouse-crd", "anchor": null } ], "mode": "agent-primary", "terms": [ "declared", "upstream", "data", "source", "identity", "credential", "verified", "reachability", "warehouse", "declares", "system", "record", "pipelines", "extract", "rows", "plus", "reach", "apiversion", "hevlayer", "v1alpha1", "kind", "metadata", "name", "prod", "snowflake", "namespace", "layer", "spec", "account", "acme", "xy12345", "user", "role", "keypairsecretref", "pool", "size", "timeout", "verifyinterval", "vectorstore", "derived" ] }, { "id": "kubernetes/warehouse-crd#connection", "kind": "section", "title": "Warehouse CRD", "heading": 
"Connection", "group": "Operations", "url": "/docs/kubernetes/warehouse-crd#connection", "summary": "Connection Field Purpose kind snowflake. databricks and iceberg are reserved by the schema but rejected by the operator until implemented. snowflake.account Snowflake account identifier. snowflake.user Service user the k…", "facts": [ { "kind": "code", "literal": "kind", "chunkId": "kubernetes/warehouse-crd#connection" }, { "kind": "code", "literal": "snowflake", "chunkId": "kubernetes/warehouse-crd#connection" }, { "kind": "code", "literal": "databricks", "chunkId": "kubernetes/warehouse-crd#connection" }, { "kind": "code", "literal": "iceberg", "chunkId": "kubernetes/warehouse-crd#connection" }, { "kind": "code", "literal": "snowflake.account", "chunkId": "kubernetes/warehouse-crd#connection" }, { "kind": "code", "literal": "snowflake.user", "chunkId": "kubernetes/warehouse-crd#connection" }, { "kind": "code", "literal": "snowflake.role", "chunkId": "kubernetes/warehouse-crd#connection" }, { "kind": "code", "literal": "snowflake.warehouse", "chunkId": "kubernetes/warehouse-crd#connection" }, { "kind": "code", "literal": "snowflake.keyPairSecretRef", "chunkId": "kubernetes/warehouse-crd#connection" }, { "kind": "code", "literal": "private-key.pem", "chunkId": "kubernetes/warehouse-crd#connection" }, { "kind": "code", "literal": "passphrase", "chunkId": "kubernetes/warehouse-crd#connection" }, { "kind": "code", "literal": "snowflake.pool", "chunkId": "kubernetes/warehouse-crd#connection" }, { "kind": "code", "literal": "size", "chunkId": "kubernetes/warehouse-crd#connection" }, { "kind": "code", "literal": "timeout", "chunkId": "kubernetes/warehouse-crd#connection" }, { "kind": "code", "literal": "verifyInterval", "chunkId": "kubernetes/warehouse-crd#connection" }, { "kind": "code", "literal": "1h", "chunkId": "kubernetes/warehouse-crd#connection" } ], "sources": [ { "chunkId": "kubernetes/warehouse-crd#connection", "url": 
"/docs/kubernetes/warehouse-crd#connection", "anchor": "connection" } ], "mode": "agent-primary", "terms": [ "connection", "field", "purpose", "kind", "snowflake", "databricks", "iceberg", "reserved", "schema", "rejected", "operator", "until", "implemented", "account", "identifier", "user", "service", "role", "warehouse", "keypairsecretref", "private", "passphrase", "pool", "size", "timeout", "verifyinterval", "pair", "authenticates", "optional", "assumed", "connect", "compute", "extraction", "queries", "secret", "same", "namespace", "holding", "credential", "never" ] }, { "id": "kubernetes/warehouse-crd#deletion", "kind": "section", "title": "Warehouse CRD", "heading": "Deletion", "group": "Operations", "url": "/docs/kubernetes/warehouse-crd#deletion", "summary": "Deletion Deleting a warehouse fences everything drawing from it. A finalizer blocks deletion while status.consumers is non-zero — pipelines extracting from it or keys entitled to it — annotate with hevlayer.com/force-del…", "facts": [ { "kind": "code", "literal": "status.consumers", "chunkId": "kubernetes/warehouse-crd#deletion" }, { "kind": "code", "literal": "hevlayer.com/force-delete: \"true\"", "chunkId": "kubernetes/warehouse-crd#deletion" } ], "sources": [ { "chunkId": "kubernetes/warehouse-crd#deletion", "url": "/docs/kubernetes/warehouse-crd#deletion", "anchor": "deletion" } ], "mode": "agent-primary", "terms": [ "deletion", "deleting", "warehouse", "fences", "everything", "drawing", "finalizer", "blocks", "while", "status", "consumers", "zero", "pipelines", "extracting", "keys", "entitled", "annotate", "hevlayer", "force", "delete", "true", "override" ] }, { "id": "kubernetes/warehouse-crd#keys", "kind": "section", "title": "Warehouse CRD", "heading": "Keys", "group": "Operations", "url": "/docs/kubernetes/warehouse-crd#keys", "summary": "Keys An ApiKey binds to a warehouse with a warehouse. entitlement carrying a list of opaque claims strings. 
Layer stores and echoes the strings; the application routes on them. No client route reaches a source system — c…", "facts": [ { "kind": "code", "literal": "ApiKey", "chunkId": "kubernetes/warehouse-crd#keys" }, { "kind": "code", "literal": "warehouse.", "chunkId": "kubernetes/warehouse-crd#keys" } ], "sources": [ { "chunkId": "kubernetes/warehouse-crd#keys", "url": "/docs/kubernetes/warehouse-crd#keys", "anchor": "keys" } ], "mode": "agent-primary", "terms": [ "keys", "apikey", "binds", "warehouse", "entitlement", "carrying", "list", "opaque", "claims", "strings", "layer", "stores", "echoes", "application", "routes", "client", "route", "reaches", "source", "system", "name", "clients", "touch", "indexes", "warehouses", "grants", "nothing", "inerts", "deleted" ] }, { "id": "kubernetes/warehouse-crd#pipeline-source", "kind": "section", "title": "Warehouse CRD", "heading": "Pipeline source", "group": "Operations", "url": "/docs/kubernetes/warehouse-crd#pipeline-source", "summary": "Pipeline source A pipeline extracting from a warehouse names it in spec.sourceRef. 
The source block owns the what — database, schema, query, cursor — and the warehouse owns the where and who: spec: sourceRef: kind: snowf…", "facts": [ { "kind": "code", "literal": "spec:\n sourceRef:\n kind: snowflake\n warehouseRef: prod-snowflake\n database: ANALYTICS\n query: >-\n SELECT ID, TITLE, BODY, REFRESH_ID FROM PUBLIC.NOTES\n WHERE REFRESH_ID > :cursor\n cursor:\n column: REFRESH_ID", "chunkId": "kubernetes/warehouse-crd#pipeline-source" }, { "kind": "code", "literal": "spec.sourceRef", "chunkId": "kubernetes/warehouse-crd#pipeline-source" }, { "kind": "code", "literal": "sourceRef.kind", "chunkId": "kubernetes/warehouse-crd#pipeline-source" }, { "kind": "code", "literal": "snowflake", "chunkId": "kubernetes/warehouse-crd#pipeline-source" }, { "kind": "code", "literal": "warehouseRef", "chunkId": "kubernetes/warehouse-crd#pipeline-source" }, { "kind": "code", "literal": "Verified", "chunkId": "kubernetes/warehouse-crd#pipeline-source" }, { "kind": "code", "literal": "/var/run/hevlayer/warehouse/", "chunkId": "kubernetes/warehouse-crd#pipeline-source" }, { "kind": "code", "literal": "HEVLAYER_WAREHOUSE", "chunkId": "kubernetes/warehouse-crd#pipeline-source" }, { "kind": "code", "literal": "HEVLAYER_SOURCE_REF", "chunkId": "kubernetes/warehouse-crd#pipeline-source" } ], "sources": [ { "chunkId": "kubernetes/warehouse-crd#pipeline-source", "url": "/docs/kubernetes/warehouse-crd#pipeline-source", "anchor": "pipeline-source" } ], "mode": "agent-primary", "terms": [ "pipeline", "source", "extracting", "warehouse", "names", "spec", "sourceref", "block", "owns", "database", "schema", "query", "cursor", "kind", "snowf", "snowflake", "warehouseref", "prod", "analytics", "select", "title", "body", "refresh", "public", "notes", "column", "verified", "hevlayer", "refreshid", "operator", "requires", "name", "same", "namespace", "mounts", "pair", "secret", "worker", "injects", "hevlayerwarehouse" ] }, { "id": "kubernetes/warehouse-crd#rotation", "kind": "section", 
"title": "Warehouse CRD", "heading": "Rotation", "group": "Operations", "url": "/docs/kubernetes/warehouse-crd#rotation", "summary": "Rotation Swap the referenced Secret's content. The operator re-verifies and status.verifiedAt advances; consumers resolve credentials through the warehouse at connection-build time, so new connections pick up the new key…", "facts": [ { "kind": "code", "literal": "status.verifiedAt", "chunkId": "kubernetes/warehouse-crd#rotation" }, { "kind": "code", "literal": "keyPairSecretRef", "chunkId": "kubernetes/warehouse-crd#rotation" } ], "sources": [ { "chunkId": "kubernetes/warehouse-crd#rotation", "url": "/docs/kubernetes/warehouse-crd#rotation", "anchor": "rotation" } ], "mode": "agent-primary", "terms": [ "rotation", "swap", "referenced", "secret", "content", "operator", "verifies", "status", "verifiedat", "advances", "consumers", "resolve", "credentials", "through", "warehouse", "connection", "build", "time", "connections", "pick", "keypairsecretref", "redeploy", "pointing", "different", "name", "spec", "edit", "same", "flow" ] }, { "id": "kubernetes/warehouse-crd#status", "kind": "section", "title": "Warehouse CRD", "heading": "Status", "group": "Operations", "url": "/docs/kubernetes/warehouse-crd#status", "summary": "Status status: phase: Verified verifiedAt: \"2026-06-10T00:00:00Z\" failureReason: null consumers: pipelines: 2 apiKeys: 1 The operator emits Kubernetes Events on phase transitions and counts observed references in status.…", "facts": [ { "kind": "code", "literal": "status:\n phase: Verified\n verifiedAt: \"2026-06-10T00:00:00Z\"\n failureReason: null\n consumers:\n pipelines: 2\n apiKeys: 1", "chunkId": "kubernetes/warehouse-crd#status" }, { "kind": "code", "literal": "status.consumers", "chunkId": "kubernetes/warehouse-crd#status" } ], "sources": [ { "chunkId": "kubernetes/warehouse-crd#status", "url": "/docs/kubernetes/warehouse-crd#status", "anchor": "status" } ], "mode": "agent-primary", "terms": [ "status", 
"phase", "verified", "verifiedat", "2026", "10t00", "failurereason", "null", "consumers", "pipelines", "apikeys", "operator", "emits", "kubernetes", "events", "transitions", "counts", "observed", "references" ] }, { "id": "kubernetes/warehouse-crd#verification", "kind": "section", "title": "Warehouse CRD", "heading": "Verification", "group": "Operations", "url": "/docs/kubernetes/warehouse-crd#verification", "summary": "Verification The operator probes the warehouse on apply, whenever the referenced Secret's content changes, and every verifyInterval. For snowflake the probe opens a key-pair session, runs SELECT 1 on the declared compute…", "facts": [ { "kind": "code", "literal": "verifyInterval", "chunkId": "kubernetes/warehouse-crd#verification" }, { "kind": "code", "literal": "snowflake", "chunkId": "kubernetes/warehouse-crd#verification" }, { "kind": "code", "literal": "SELECT 1", "chunkId": "kubernetes/warehouse-crd#verification" }, { "kind": "code", "literal": "Pending", "chunkId": "kubernetes/warehouse-crd#verification" }, { "kind": "code", "literal": "Verified", "chunkId": "kubernetes/warehouse-crd#verification" }, { "kind": "code", "literal": "status.verifiedAt", "chunkId": "kubernetes/warehouse-crd#verification" }, { "kind": "code", "literal": "Failed", "chunkId": "kubernetes/warehouse-crd#verification" }, { "kind": "code", "literal": "status.failureReason", "chunkId": "kubernetes/warehouse-crd#verification" }, { "kind": "code", "literal": "kubectl get warehouse", "chunkId": "kubernetes/warehouse-crd#verification" } ], "sources": [ { "chunkId": "kubernetes/warehouse-crd#verification", "url": "/docs/kubernetes/warehouse-crd#verification", "anchor": "verification" } ], "mode": "agent-primary", "terms": [ "verification", "operator", "probes", "warehouse", "apply", "whenever", "referenced", "secret", "content", "changes", "every", "verifyinterval", "snowflake", "probe", "opens", "pair", "session", "runs", "select", "declared", "compute", "pending", "verified", 
"status", "verifiedat", "failed", "failurereason", "kubectl", "closes", "phase", "meaning", "probed", "last", "succeeded", "time", "says", "loud", "signal", "outage", "flight" ] }, { "id": "limits", "kind": "section", "title": "Limits", "heading": null, "group": "Overview", "url": "/docs/limits", "summary": "Current ceilings inherited from the components we ship with, and what we don't cap. Layer is limited by certain constraints of the underlying components we ship with. We will lift these as demand increases. Single-node A…", "facts": [ { "kind": "code", "literal": "fields_skipped[]", "chunkId": "limits" }, { "kind": "code", "literal": "fields[]", "chunkId": "limits" }, { "kind": "code", "literal": "truncated: true", "chunkId": "limits" } ], "sources": [ { "chunkId": "limits", "url": "/docs/limits", "anchor": null } ], "mode": "agent-primary", "terms": [ "current", "ceilings", "inherited", "components", "ship", "layer", "limited", "certain", "constraints", "underlying", "lift", "these", "demand", "increases", "single", "node", "fields", "skipped", "truncated", "true", "aerospike", "enforce", "simplicity", "also", "believe", "large", "nvme", "drive", "offers", "enough", "storage", "almost", "every", "dataset", "turbopuffer", "namespaces", "sets", "logical", "separation", "data" ] }, { "id": "limits#no-limits", "kind": "section", "title": "Limits", "heading": "No limits", "group": "Overview", "url": "/docs/limits#no-limits", "summary": "No limits These have no enforced ceiling, but practical limits exist and will show up under load. 
CRD instances (Index, Function, Pipeline, Scaling) — bounded only by the etcd and operator throughput of your Kubernetes c…", "facts": [ { "kind": "code", "literal": "Index", "chunkId": "limits#no-limits" }, { "kind": "code", "literal": "Function", "chunkId": "limits#no-limits" }, { "kind": "code", "literal": "Pipeline", "chunkId": "limits#no-limits" }, { "kind": "code", "literal": "Scaling", "chunkId": "limits#no-limits" }, { "kind": "code", "literal": "spec.snapshot.retention", "chunkId": "limits#no-limits" }, { "kind": "code", "literal": "retention: never", "chunkId": "limits#no-limits" } ], "sources": [ { "chunkId": "limits#no-limits", "url": "/docs/limits#no-limits", "anchor": "no-limits" } ], "mode": "agent-primary", "terms": [ "limits", "these", "enforced", "ceiling", "practical", "exist", "show", "under", "load", "instances", "index", "function", "pipeline", "scaling", "bounded", "only", "etcd", "operator", "throughput", "kubernetes", "spec", "snapshot", "retention", "never", "cluster", "history", "namespace", "durable", "object", "storage", "cost", "search", "accumulates", "indefinitely", "automatic", "expiry", "clickstream", "event", "volume", "concurrency" ] }, { "id": "roadmap", "kind": "section", "title": "Roadmap & Changelog", "heading": null, "group": "Overview", "url": "/docs/roadmap", "summary": "Where hev layer is headed next, and what has shipped.", "facts": [], "sources": [ { "chunkId": "roadmap", "url": "/docs/roadmap", "anchor": null } ], "mode": "agent-primary", "terms": [ "layer", "headed", "next", "shipped" ] }, { "id": "roadmap#01-blockers", "kind": "section", "title": "Roadmap & Changelog", "heading": "0.1 Blockers", "group": "Overview", "url": "/docs/roadmap#01-blockers", "summary": "0.1 Blockers What stands between today and the 0.1 design-partner cut: 📚 Polish documentation 💸 Cost UAT 🪟 Dashboard UAT 🛒 Finalize hev-shop and publish the repo 🏗️ Narrow cluster topology defaults", "facts": [ { "kind": "code", "literal": "hev-shop", 
"chunkId": "roadmap#01-blockers" }, { "kind": "value", "literal": "0.1", "chunkId": "roadmap#01-blockers" } ], "sources": [ { "chunkId": "roadmap#01-blockers", "url": "/docs/roadmap#01-blockers", "anchor": "01-blockers" } ], "mode": "agent-primary", "terms": [ "blockers", "stands", "between", "today", "design", "partner", "polish", "documentation", "cost", "dashboard", "finalize", "shop", "publish", "repo", "narrow", "cluster", "topology", "defaults" ] }, { "id": "roadmap#01-release-uat", "kind": "section", "title": "Roadmap & Changelog", "heading": "0.1 Release (UAT)", "group": "Overview", "url": "/docs/roadmap#01-release-uat", "summary": "0.1 Release (UAT)", "facts": [ { "kind": "value", "literal": "0.1", "chunkId": "roadmap#01-release-uat" } ], "sources": [ { "chunkId": "roadmap#01-release-uat", "url": "/docs/roadmap#01-release-uat", "anchor": "01-release-uat" } ], "mode": "agent-primary", "terms": [ "release" ] }, { "id": "roadmap#api-hardening", "kind": "section", "title": "Roadmap & Changelog", "heading": "API hardening", "group": "Overview", "url": "/docs/roadmap#api-hardening", "summary": "API hardening 🧩 Finalize CRDs 🚆 Wire-compatible pass-through reads and writes 🏷️ Naming things", "facts": [], "sources": [ { "chunkId": "roadmap#api-hardening", "url": "/docs/roadmap#api-hardening", "anchor": "api-hardening" } ], "mode": "agent-primary", "terms": [ "hardening", "finalize", "crds", "wire", "compatible", "pass", "through", "reads", "writes", "naming", "things" ] }, { "id": "roadmap#later", "kind": "section", "title": "Roadmap & Changelog", "heading": "Later", "group": "Overview", "url": "/docs/roadmap#later", "summary": "Later ♻️ Soft delete with TTL + restore 🧮 Gateway-side query embedding — warm model endpoint 🕰️ Temporal queries — asof selector 🐇 Exact kNN result cache 🧪 A/B variant indexes 🦚 Per-query observability with LLM-judged Ta…", "facts": [ { "kind": "code", "literal": "as_of", "chunkId": "roadmap#later" }, { "kind": "code", "literal": "layer 
push", "chunkId": "roadmap#later" } ], "sources": [ { "chunkId": "roadmap#later", "url": "/docs/roadmap#later", "anchor": "later" } ], "mode": "agent-primary", "terms": [ "later", "soft", "delete", "restore", "gateway", "side", "query", "embedding", "warm", "model", "endpoint", "temporal", "queries", "asof", "selector", "exact", "result", "cache", "variant", "indexes", "observability", "judged", "layer", "push", "tail", "quality", "write", "amplification", "baselines", "python", "experience" ] }, { "id": "roadmap#lifecycle-and-operability", "kind": "section", "title": "Roadmap & Changelog", "heading": "Lifecycle and operability", "group": "Overview", "url": "/docs/roadmap#lifecycle-and-operability", "summary": "Lifecycle and operability 🎚️ Autoscaling compute for pipelines and UDFs 🗄️ Document cache endpoint for multi-stage pipelines 📸 Index snapshot history 🧨 Coordinated delete ⛵ Helm and Terraform install scripts 🔐 Scoped API…", "facts": [ { "kind": "code", "literal": "ApiKey", "chunkId": "roadmap#lifecycle-and-operability" } ], "sources": [ { "chunkId": "roadmap#lifecycle-and-operability", "url": "/docs/roadmap#lifecycle-and-operability", "anchor": "lifecycle-and-operability" } ], "mode": "agent-primary", "terms": [ "lifecycle", "operability", "autoscaling", "compute", "pipelines", "udfs", "document", "cache", "endpoint", "multi", "stage", "index", "snapshot", "history", "coordinated", "delete", "helm", "terraform", "install", "scripts", "scoped", "apikey", "keys", "minted", "resources", "warehouse", "declared", "snowflake", "sources" ] }, { "id": "roadmap#search", "kind": "section", "title": "Roadmap & Changelog", "heading": "Search", "group": "Overview", "url": "/docs/roadmap#search", "summary": "Search 🎯 Stable reads during heavy writes 🚦 Ready signal — namespace reports when every row is indexed 📜 Precomputed facet listings in snapshots 🪙 Precomputed facet counts in snapshots 🪃 Scans — row selection by filter,…", "facts": [ { "kind": "code", "literal": 
"fts", "chunkId": "roadmap#search" }, { "kind": "code", "literal": "ann", "chunkId": "roadmap#search" }, { "kind": "code", "literal": "Auto", "chunkId": "roadmap#search" } ], "sources": [ { "chunkId": "roadmap#search", "url": "/docs/roadmap#search", "anchor": "search" } ], "mode": "agent-primary", "terms": [ "search", "stable", "reads", "during", "heavy", "writes", "ready", "signal", "namespace", "reports", "every", "indexed", "precomputed", "facet", "listings", "snapshots", "counts", "scans", "selection", "filter", "auto", "document", "cached", "vector", "hybrid", "text", "fusion", "fuzzy", "bm25", "fused", "query", "routing", "picks", "mode", "history", "saved", "enhanced", "metadata" ] }, { "id": "roadmap#surfaces", "kind": "section", "title": "Roadmap & Changelog", "heading": "Surfaces", "group": "Overview", "url": "/docs/roadmap#surfaces", "summary": "Surfaces 🪟 Dashboard MVP — CRD management and observability 📚 Documentation site 🧰 Official Python, Go, and TypeScript clients", "facts": [], "sources": [ { "chunkId": "roadmap#surfaces", "url": "/docs/roadmap#surfaces", "anchor": "surfaces" } ], "mode": "agent-primary", "terms": [ "surfaces", "dashboard", "management", "observability", "documentation", "site", "official", "python", "typescript", "clients" ] }, { "id": "roadmap#up-next", "kind": "section", "title": "Roadmap & Changelog", "heading": "Up Next", "group": "Overview", "url": "/docs/roadmap#up-next", "summary": "Up Next Planned for 0.2: 🧭 Pinecone VectorStore backend — a second store kind 📊 Performance benchmarks — published pass-through overhead 🌱 Namespace init — first-time embed population 🔥 Trending searches — reduce-shaped…", "facts": [ { "kind": "code", "literal": "VectorStore", "chunkId": "roadmap#up-next" }, { "kind": "value", "literal": "0.2", "chunkId": "roadmap#up-next" } ], "sources": [ { "chunkId": "roadmap#up-next", "url": "/docs/roadmap#up-next", "anchor": "up-next" } ], "mode": "agent-primary", "terms": [ "next", "planned", "pinecone", 
"vectorstore", "backend", "second", "store", "kind", "performance", "benchmarks", "published", "pass", "through", "overhead", "namespace", "init", "first", "time", "embed", "population", "trending", "searches", "reduce", "shaped", "udfs", "license", "validation", "portable", "bundle", "stand", "search", "only", "edges" ] }, { "id": "tradeoffs", "kind": "section", "title": "Tradeoffs", "heading": null, "group": "Overview", "url": "/docs/tradeoffs", "summary": "The current product posture and the cases it is not trying to cover. Layer makes a set of design tradeoffs we believe improve functionality of the search engine. This page makes those tradeoffs explicit. As this list gro…", "facts": [ { "kind": "flag", "literal": "--muted", "chunkId": "tradeoffs" }, { "kind": "flag", "literal": "--signal", "chunkId": "tradeoffs" } ], "sources": [ { "chunkId": "tradeoffs", "url": "/docs/tradeoffs", "anchor": null } ], "mode": "agent-primary", "terms": [ "current", "product", "posture", "cases", "trying", "cover", "layer", "makes", "design", "tradeoffs", "believe", "improve", "functionality", "search", "engine", "page", "those", "explicit", "list", "muted", "signal", "grows", "offer", "configuration", "possible", "allow", "users", "configure", "their", "preference", "adds", "latency", "query", "path", "following", "ways", "additional", "network", "configurable", "plan" ] } ], "edges": [] }
```

---

# Introduction

Source: https://hevlayer.com/docs

import Diagram from "../../components/docs/Diagram.astro";
import { layerMapDiagram } from "../../lib/diagrams";

Layer provides a set of drop-in enhancements to your favorite retrieval systems. Layer lets you scale your own compute over [multi-stage pipelines](/docs/api/pipelines), reason about the [state of your index](/docs/api/namespace-metadata), observe [clickstream](/docs/api/search-history), track [cost](/docs/dashboard), and more.
{layerMapDiagram}

You run two server components in your own cluster: a Rust **gateway** and a Kubernetes **operator**.

The **gateway** is a transparent proxy in front of Turbopuffer. It extends native clients with [fetch](/docs/api/query#fetch), [scans](/docs/api/scans), [snapshots](/docs/api/snapshots), and operator-facing semantics around the cache, write path, and [pipelines](/docs/api/pipelines) — you swap in Layer's drop-in client and change nothing else. It also drives the function runtime: discovering [UDF](/docs/kubernetes/function-crd) work, leasing it to worker pools, retrying, and writing results back, with KEDA scaling each pool to zero between bursts.

You call the gateway four ways: the [Python client](/docs/api/introduction#install), the [Go client](/docs/api/introduction#install), the [TypeScript client](/docs/api/introduction#install), or the REST API directly — the clients are generated from the same OpenAPI spec, and every endpoint page shows them side by side.

Layer also ships an optional GUI [dashboard](/docs/dashboard). The dashboard manages cluster configuration through CRDs; all other state is persisted in object storage (S3). No durable state lives in a Layer process, so the compute tier is stateless and fully elastic.

Because indexing is bursty, especially GPU-bound work, our [Terraform](/docs/install#terraform) installs [Karpenter](https://karpenter.sh) as a cluster autoscaler to provision and scale the nodes Layer's compute runs on. The remaining backing services are the document cache, the indexing-state store, and the metrics store.

Every component Layer runs alongside is open source:

- **[Karpenter](https://karpenter.sh)** — cluster autoscaler that provisions and scales nodes for Layer's bursty, GPU-bound compute (Apache-2.0).
- **[Aerospike](https://aerospike.com)** — NVMe-backed ephemeral document cache (AGPL-3.0).
- **[PostgreSQL](https://www.postgresql.org)** — indexing-state store for the pipeline and embed queue (PostgreSQL License). - **[VictoriaMetrics](https://victoriametrics.com)** — metrics store (Apache-2.0). To get started, see the [install guide](/docs/install). For more technical detail, see [Concepts](/docs/concepts), [Guarantees](/docs/guarantees), and [Tradeoffs](/docs/tradeoffs). --- # Concepts Source: https://hevlayer.com/docs/concepts ## Control loops Layer uses a control loop as a core primitive for managing your indexes. It reconciles index state against metrics emitted by the search system, which is how Layer applies row-level transformations ([UDFs](/docs/kubernetes/function-crd)) and keeps an index's stable view current. Related: [UDFs](/docs/kubernetes/function-crd), [snapshots](/docs/api/snapshots), stable watermark. ## Kubernetes autoscaling Because Layer is stateless, you can autoscale every tier independently. Karpenter handles node-level scaling, and KEDA scales pods against signals from an embedded PostgreSQL queue. The data in that queue is used for scaling decisions only — it carries no non-recoverable system state. ## Gateway enhancements Where helpful, the gateway extends your search system with common query patterns and filtering primitives. Layer's enhancements use reserved `_hevlayer_*` attributes; changing the schema on those attributes breaks Layer's guarantees but should degrade gracefully. All functionality is exposed through one API surface, the [Python, Go, or TypeScript client, or plain REST](/docs/api/introduction#install), so applications can route every call through the gateway. Layer works best when traffic flows through it consistently, even for requests that need no extra behavior. ## Scatter/gather Layer can partition a single namespace into hash buckets, called shards, by assigning each row a reserved `_hevlayer_shard` attribute (xxh64 of its id, modulo the shard count). 
The gateway then scatters a query to every bucket in parallel, one `_hevlayer_shard`-filtered query per shard, and gathers the results: it merges and re-ranks the combined rows down to your requested `top_k` before returning them. Sharding stays invisible to the client — you issue one query and get one ranked result set. The same scatter/gather path backs [scans](/docs/api/scans) (filter, full-text, and radius) and [UDF](/docs/kubernetes/function-crd) discovery scans. ## Document cache The document cache (NVMe-backed Aerospike) does two jobs. Document [reads](/docs/api/query#fetch) are served pull-through: the gateway checks the cache first, and on a miss reads through to Turbopuffer (or S3 for snapshots), returns the row, and backfills the cache best-effort. [Pipeline](/docs/api/pipelines) chunk handoff uses the same store as the queue between CPU and GPU workers. Neither job makes it a hard dependency: document reads fall through to origin if the cache is unavailable, and chunk reads fall back to S3 backing (see [Failure modes](/docs/failure-modes)). One logical cache serves every path, with different uses (document fetch, pipeline chunks, snapshot field-values) separated by Aerospike `set`. ## Glossary | Concept | Current meaning | | --- | --- | | [Namespace](/docs/api/introduction) | A Turbopuffer namespace addressed through `/v2/namespaces/{namespace}`. | | Document | A row id plus attributes, and optionally a vector when writing/searching. | | Document cache | NVMe-backed records keyed by namespace and document id, plus cache sets for pipeline chunks and snapshots. | | Stable watermark | Epoch-ms cut tracked by the consistency watcher when Turbopuffer index status is up-to-date. | | Ready signal | Whether a namespace is fully indexed: `indexed` / `index_lag_rows` on [namespace metadata](/docs/api/namespace-metadata), reconciled from the latest snapshot when every row's vector is indexed. 
| | [Pipeline](/docs/api/pipelines) | A PostgreSQL-backed state machine for CPU extraction and GPU embedding work. | | [Snapshot](/docs/api/snapshots) | A content-addressed S3 facet histogram written after a namespace is observed stable. | | Facet listing | The distinct values for a field, precomputed in snapshots as `fields[].values[].v` or computed on demand by a values scan. | | Facet count | The document count for a facet value, returned as `fields[].values[].n` in snapshots and `values[].n` in values scan results. | | [Scan](/docs/api/scans) | On-demand row selection by filter, full-text (`fts`), or radius (`ann`) that returns matching IDs or field values asynchronously, or a row count synchronously. | | [UDF](/docs/kubernetes/function-crd) | A stateless worker the gateway coordinates over existing rows to enrich, fan out, or re-upsert data. | | Gateway | The Rust proxy fronting Turbopuffer that serves the compatible API plus cache, scans, snapshots, pipelines, and the UDF runtime. | | [Operator](/docs/kubernetes/operator) | The Kubernetes operator that reconciles Layer's CRDs — functions, pipelines, scaling, and cluster config. | | Shard | A hash bucket within a single namespace. Each row carries a reserved `_hevlayer_shard` value (xxh64 of its id, modulo the shard count) so the gateway can scatter/gather a query across buckets. | | Leg | One subquery in a [hybrid text](/docs/api/query#hybrid-text-fusion) expansion: the full-input BM25 leg or one per-token fuzzy leg. Every leg of a query reads the same stable watermark cut. | | RRF | Reciprocal rank fusion — Turbopuffer-native re-ranking (`rerank_by: ["RRF", ...]`) that merges legs into one list. Layer delegates all fusion math upstream. | | Tokenizer policy | The documented transform from a `HybridText` input to query tokens: UAX #29 word boundaries and lowercasing via Turbopuffer's open-source `alyze` tokenizer (the production `word_v4` code), then drop tokens under 2 characters, dedupe, cap at 15. 
| | Route | The retrieval strategy the [query router](/docs/api/query#query-routing) picks for an `Auto` query (`hybrid_text`, `semantic`, or `fused`), chosen from the shape of the input text alone. Vector availability gates execution, not the choice. | | Routing policy | The deterministic, versioned decision function behind `Auto`. The version travels in the `routing` echo block and search history so threshold changes are visible. | | Deferral | The response to a vectorless `Auto` query routed `semantic` or `fused`: the routing decision with `executed: false` and no rows. The application embeds and re-issues with the route forced. | | CRD | Custom Resource Definition: the Kubernetes-native resources the operator reconciles — [functions](/docs/kubernetes/function-crd), [pipelines](/docs/kubernetes/pipeline-crd), [scaling](/docs/kubernetes/scaling-crd), and [indexes](/docs/kubernetes/index-crd). | | PromQL | The Prometheus query language. The gateway proxies it to the embedded VictoriaMetrics so you can query metrics without a separate scraper. | --- # Document model Source: https://hevlayer.com/docs/document-model Layer reserves the `_hevlayer_*` attribute prefix for its own bookkeeping. The gateway manages these attributes: your writes and [UDF](/docs/kubernetes/function-crd) completion patches must not set them, and editing them directly breaks Layer's guarantees. | Attribute | Type | Purpose | | --- | --- | --- | | `_hevlayer_upserted_at` | integer (epoch ms) | Server-stamped on every write. The gateway filters queries to `_hevlayer_upserted_at <= watermark` to hold the read-consistency cut while the upstream index catches up. | | `_hevlayer_shard` | integer | Hash bucket assigned at write time (`xxh64(id) % shard_count`), present only on sharded namespaces. Lets the gateway [scatter/gather](/docs/concepts#scattergather) a query across the shards of one namespace. | | `_hevlayer_udf__v` | string | Function completion marker. 
The gateway stamps the Function's `spec.version` here when a worker completes a row. Hyphens in the Function id are normalized to underscores. | | `_hevlayer_udf__stale_after` | integer or null | Function invalidation marker. Discovery reclaims rows once this epoch-ms timestamp expires; completion clears the marker. | The `_hevlayer_` prefix also namespaces internal cache sets (snapshot field-values and search-history clickstream), but those are cache keys, not part of your document schema. --- # No Guarantees Source: https://hevlayer.com/docs/guarantees import Callout from "../../components/docs/Callout.astro"; Layer can't offer guarantees. We try our best to provide secure, hands-off infrastructure that you are ultimately responsible for. While we can't offer guarantees, we make a set of promises in how we design, secure, and distribute our software that we believe make it easy to use and will stand the test of time. This page covers the specific status of those promises. ## Commitments - Your index stays in your search system. We will not reimplement indexing. Layer keeps a copy of your data, but the search index lives in your vector store. - Your history is backed up to S3. Search history and namespace snapshots are written to the S3 bucket you specify. The format of this data may change. - Data on NVMe. Customer document and chunk data is served from NVMe for price/performance. We try not to stray from this pattern, though some use cases may justify a smaller in-memory document cache. - This documentation is accurate and up to date. When it isn't, that's a bug in the software — report it. - Graceful degradation. We add graceful degradation support whenever possible — the gateway degrades rather than failing hard. The per-scenario behavior and recovery signals live in the [failure-mode runbook](/docs/failure-modes). - Client compatibility. We will (almost) always stay client-compatible with the search systems we front. 
Where we diverge, it's a feature making an explicit tradeoff we believe is an improvement.
- One consistency cut per query. When the gateway expands a query into multiple legs ([hybrid text fusion](/docs/api/query#hybrid-text-fusion), [scatter/gather](/docs/concepts#scattergather)), every leg is filtered at the same stable watermark, injected from a single read. Legs never see different cuts.

Layer was developed by a [single person](https://hevmind.com/about) orchestrating agentic coding tools and building automation. Not a single line of code was hand-written. That said, it was made with ❤️ by a human as much as it is built by AI.

---

# Tradeoffs

Source: https://hevlayer.com/docs/tradeoffs

Layer makes a set of design tradeoffs we believe improve the functionality of the search engine. This page makes those tradeoffs explicit. As this list grows, we will offer configuration where possible so users can set their own preference.

Layer adds latency to the query path in the following ways.

- An additional network hop (not configurable).
- A query plan that allows for [stable reads](/docs/api/query#stable-reads) during heavy writes (configurable per [index](/docs/kubernetes/index-crd)).

Layer also increases index storage requirements in the following ways.

- A secondary index for filtering by upsert time (not configurable).
- A secondary index used for scatter/gather sharding (not configurable).

---

# Limits

Source: https://hevlayer.com/docs/limits

Layer is limited by certain constraints of the underlying components we ship with. We will lift these as demand increases.

- **Single-node Aerospike.** We enforce this for simplicity and also believe that a single large NVMe drive offers enough storage for almost every dataset.
- **~4,090 Turbopuffer namespaces.** We use Aerospike sets for logical separation of data, which are limited by the Aerospike Community Edition AGPL license.
- **~3 TB cache size.** Another limitation of the Aerospike license.
- **10,000 distinct values per scan facet field.** Pre-computed snapshot scans cap each facet field's cardinality. If a field exceeds the cap, it is noted in `fields_skipped[]` rather than `fields[]`, so readers can treat every emitted field as complete. See [snapshots](/docs/api/snapshots). - **1,000,000 distinct values per values scan.** On-demand values scans accumulate their histogram in gateway memory. A job that crosses the cap completes with `truncated: true`: the cap applies after the full pass, keeping the top values by count — each with an exact count — and dropping the low-count tail. See [scans](/docs/api/scans#values-mode). ## No limits These have no enforced ceiling, but practical limits exist and will show up under load. - **CRD instances** (`Index`, `Function`, `Pipeline`, `Scaling`) — bounded only by the etcd and operator throughput of your Kubernetes cluster. - **Snapshot history per namespace** — durable in S3; bounded by `spec.snapshot.retention` when set, or by object storage cost under `retention: never`. - **Search history retention** — accumulates indefinitely in S3; no automatic expiry. - **Clickstream event volume** — accumulates indefinitely in S3; no automatic expiry. - **UDF concurrency per function** — KEDA scales replicas to match queue depth, bounded by your cluster's capacity. - **Pipeline queue depth** — pipeline queues, including chunked document queues, store document IDs and chunk ID lists in S3 manifests and keep only segment state and counters in Postgres. - **Document size and attribute count** — bounded by Turbopuffer and Aerospike record limits, not by Layer. --- # Agents Source: https://hevlayer.com/docs/agents import Callout from "../../components/docs/Callout.astro"; These docs are queryable from the command line. The same engine behind the `⌘K` search on this site ships as a CLI, so your coding agent can search, read, and cite the Layer docs directly — no scraping, no MCP server, no API key. 
The `layer` CLI also lets agents operate environments, indexes, pipelines, UDFs, and Function runs.

The skill bodies below are plain `SKILL.md` files. Use your agent harness' native skill directory when it has one, or paste the same Markdown into `AGENTS.md` or the harness equivalent.

## 1. Install the CLIs

```sh
go install github.com/hev/ask/cmd/ask@latest
```

The `ask` binary is self-contained; any agent harness that can run a shell command can use it.

From a Layer checkout, build the `layer` CLI when the agent should operate Layer environments instead of only searching docs:

```sh
go build -o layer ./apps/layer-cli
```

## 2. Add the docs skill

Set `AGENT_SKILL_HOME` to your harness's skill directory, such as `~/.codex/skills` for Codex or `~/.claude/skills` for Claude Code.

```sh
AGENT_SKILL_HOME="${AGENT_SKILL_HOME:-${CODEX_HOME:-$HOME/.codex}/skills}"
mkdir -p "$AGENT_SKILL_HOME/hevlayer-docs"
cat > "$AGENT_SKILL_HOME/hevlayer-docs/SKILL.md" <<'EOF'
---
name: hevlayer-docs
description: >-
  Query the hev layer docs. Use when the user asks about Layer — the
  Turbopuffer gateway, stable reads, the stable watermark, the document cache,
  warm jobs, scans (filter, full-text, and radius), snapshots, pipelines, UDFs,
  the Index/InfraRules/Pipeline/Function CRDs, compute pools, install via
  Terraform or Helm, failure modes, or the dashboard.
---

# hev layer docs

Answer Layer questions from the docs, not from memory. Every verb is a keyless read:

    ask --endpoint https://hevlayer.com/api/ask search ""
    ask --endpoint https://hevlayer.com/api/ask section get ""
    ask --endpoint https://hevlayer.com/api/ask overview
    ask --endpoint https://hevlayer.com/api/ask glossary get ""

Start with `search`; fetch sections for detail; use `overview` when you need
the full map. Section ids look like `api/query#stable-reads`. Cite sections in
your answer as https://hevlayer.com plus the returned `url` field.

If `ask` is missing, install it: `go install github.com/hev/ask/cmd/ask@latest`
EOF
```

## 3. Add the layer CLI skill

Use this skill when an agent should inspect or operate Layer through the `layer` CLI. The skill keeps read-only inspection, docs lookup, and mutating operations separate.

```sh
AGENT_SKILL_HOME="${AGENT_SKILL_HOME:-${CODEX_HOME:-$HOME/.codex}/skills}"
mkdir -p "$AGENT_SKILL_HOME/hevlayer-layer-cli"
cat > "$AGENT_SKILL_HOME/hevlayer-layer-cli/SKILL.md" <<'EOF'
---
name: hevlayer-layer-cli
description: >-
  Use the hevlayer layer CLI. Use when the user asks an agent to inspect Layer
  environments, query docs through layer ask, list or get indexes, pipelines,
  or UDFs, open the operations TUI, delete indexes, or run Function manifests
  with the layer CLI.
---

# hevlayer layer CLI

Use `layer` to operate hevlayer from the terminal. In a Layer checkout, prefer a repo-local binary:

    go build -o layer ./apps/layer-cli
    ./layer --help

Use `./layer ...` for a repo-local binary and `layer ...` for one on `PATH`.
Prefer `-o json` for agent parsing and do not print API keys.

For docs questions, start with `layer ask` against the committed digest:

    layer ask grep ""
    layer ask cat ""
    layer ask tree
    layer ask glossary get ""

For read-only operational inspection, prefer:

    layer -o json env ls
    layer -o json env show [NAME]
    layer -o json index list
    layer -o json index get NAME
    layer -o json pipeline list
    layer -o json pipeline get ID
    layer -o json udf list
    layer -o json udf get UDF_ID

Only `layer run` needs Kubernetes access by default. It applies a Function CR,
registers the UDF spec with the gateway, triggers discovery, and optionally
watches until the queue drains.

Confirm the target environment, gateway URL, kube context, and Kubernetes
namespace before mutating state. Mutating commands include `layer env add`,
`layer env use`, `layer env rm`, `layer index delete`, `layer run`, and
`layer run --rm`.

Resolve configuration in this order: explicit flags, `LAYER_*` or `HEVLAYER_*`
environment variables, `--env` or `LAYER_ENV`, the active
`~/.hevlayer/config.toml` environment, then the built-in base URL.
EOF
```

## 4. Ask

```sh
ask --endpoint https://hevlayer.com/api/ask search "cache is down"
```

```json
{
  "results": [
    {
      "title": "Concepts",
      "heading": "Document cache",
      "url": "/docs/concepts#document-cache",
      "group": "Overview",
      "snippet": "The document cache does two jobs: pull-through document reads..."
    }
  ]
}
```

From here your agent typically runs `section get` on the winning id and answers with the citation.

## The verbs

| Verb | Returns |
| --- | --- |
| `overview` | Orientation context plus the full section map with stable ids |
| `search ""` | Ranked sections with snippets and deep links |
| `section get ""` | One section: summary, exact identifiers, source URL |
| `glossary get ""` | A product term resolved through its aliases (`watermark` → stable watermark) |

## Why answers stay grounded

Search runs over a committed, reviewable digest of these docs — the same corpus, heading by heading, that renders on this site. Every anchor in it is verified against the rendered pages in CI, so a cited deep link like [/docs/api/query#stable-reads](/docs/api/query#stable-reads) always resolves. When the docs change, the digest is rebuilt and recommitted with them.

Every verb above is a read against the public docs. Nothing to sign up for, nothing to configure beyond the endpoint URL.

The docs are also available as plain text for direct ingestion: [/llms.txt](/llms.txt) (index) and [/llms-full.txt](/llms-full.txt) (full corpus). The CLI is the better path for agents that can run commands — it ranks, resolves aliases, and costs a fraction of the tokens.
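
The citation rule from the docs skill (site origin plus the returned `url` field) can be sketched in a few lines of Python. This is only an illustration for agents that post-process the CLI output themselves: it assumes the JSON shape shown in the search example above, and the names `DOCS_ORIGIN` and `citation_urls` are hypothetical helpers, not part of `ask` or any Layer client.

```python
import json

# Site origin used by the docs skill's citation rule.
DOCS_ORIGIN = "https://hevlayer.com"


def citation_urls(search_output: str) -> list[str]:
    """Turn the JSON printed by `ask ... search` into absolute citation links."""
    payload = json.loads(search_output)
    # Each result carries a site-relative `url`; prefix it with the origin.
    return [DOCS_ORIGIN + result["url"] for result in payload.get("results", [])]


# Sample modelled on the search example above (snippet shortened).
sample = """
{
  "results": [
    {
      "title": "Concepts",
      "heading": "Document cache",
      "url": "/docs/concepts#document-cache",
      "group": "Overview",
      "snippet": "The document cache does two jobs..."
    }
  ]
}
"""

for link in citation_urls(sample):
    print(link)
```

Keeping the origin in one constant means the same helper works if an agent is pointed at a mirror of the docs, as long as the `url` field stays site-relative.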
--- # Roadmap & Changelog Source: https://hevlayer.com/docs/roadmap ## 0.1 Blockers What stands between today and the 0.1 design-partner cut: - 📚 Polish documentation - 💸 Cost UAT - 🪟 Dashboard UAT - 🛒 Finalize `hev-shop` and publish the repo - 🏗️ Narrow cluster topology defaults ## Up Next Planned for 0.2: - 🧭 Pinecone `VectorStore` backend — a second store kind - 📊 Performance benchmarks — published pass-through overhead - 🌱 Namespace init — first-time embed population - 🔥 Trending searches — reduce-shaped UDFs - 🔑 License key validation - 🧳 Portable key bundle — stand up search-only edges ### Later - ♻️ Soft delete with TTL + restore - 🧮 Gateway-side query embedding — warm model endpoint - 🕰️ Temporal queries — `as_of` selector - 🐇 Exact kNN result cache - 🧪 A/B variant indexes - 🦚 Per-query observability with LLM-judged Tail Quality - 📣 Write amplification baselines - 📮 `layer push` — Python UDF dev experience ## 0.1 Release (UAT) ### API hardening - 🧩 Finalize CRDs - 🚆 Wire-compatible pass-through reads and writes - 🏷️ Naming things ### Lifecycle and operability - 🎚️ [Autoscaling compute](/docs/kubernetes/scaling-crd) for pipelines and UDFs - 🗄️ [Document cache endpoint](/docs/api/query#fetch) for multi-stage pipelines - 📸 [Index snapshot history](/docs/api/snapshots) - 🧨 Coordinated delete - ⛵ [Helm and Terraform install](/docs/install) scripts - 🔐 [Scoped API keys](/docs/api/keys) — minted [`ApiKey` resources](/docs/kubernetes/apikey-crd) - 🏭 [Warehouse CRD](/docs/kubernetes/warehouse-crd) — declared Snowflake sources ### Surfaces - 🪟 [Dashboard MVP](/docs/dashboard) — CRD management and observability - 📚 Documentation site - 🧰 Official Python, Go, and TypeScript clients ### Search - 🎯 [Stable reads](/docs/api/query#stable-reads) during heavy writes - 🚦 [Ready signal](/docs/api/namespace-metadata) — namespace reports when every row is indexed - 📜 Precomputed facet listings in [snapshots](/docs/api/snapshots) - 🪙 Precomputed facet counts in 
[snapshots](/docs/api/snapshots) - 🪃 [Scans](/docs/api/scans) — row selection by filter, `fts`, or `ann` - 🆔 Search by id via document-cached vector - 🪢 [Hybrid text fusion](/docs/api/query#hybrid-text-fusion) — fuzzy + BM25 fused by RRF - 🧭 [Query routing](/docs/api/query#query-routing) — `Auto` picks the search mode - 📰 [Search history](/docs/api/search-history) saved to S3 - 🗂️ Enhanced [namespace metadata](/docs/api/namespace-metadata) --- # FAQ Source: https://hevlayer.com/docs/faq This page answers the questions the rest of the docs don't: licensing, pricing, and where the project is headed. ## What is the licensing for hev layer? Right now hev layer is distributed under a proprietary license and requires a signed beta agreement. ## Will it be a paid product? Yes. I'm building hev layer as a business, not a side project. Paying customers get a vendor with every incentive to support them well. ## Will any of it be open source? Some of it. The clients will be published publicly during the design preview, and as commercial adoption grows I plan to open the gateway as well. The dashboard, operator, and autoscaling components will stay under a commercial license. Opening the gateway is also the ultimate form of [graceful degradation](/docs/guarantees) — a core value of hev layer. Infrastructure on your query path shouldn't depend on any one company sticking around, including mine. ## How much will it cost? Pricing isn't final. The shape will be: a small line item for an enterprise, sized to fund proper support and full-time development. ## Is hev layer a hosted service? No. Layer runs in your Kubernetes cluster, next to your data, against your own Turbopuffer account. See [install](/docs/install) for what a deployment looks like. ## How do I get started? [Sign up](/#design-preview) and I'll follow up to schedule a discovery call. ## Who built hev layer? [Adam Hevenor](https://hevmind.com/about). hev layer is a [hev mind](https://hevmind.com) product. 
--- # Install Source: https://hevlayer.com/docs/install import Callout from "../../components/docs/Callout.astro"; A hev layer install has two stages. **Terraform** provisions the required AWS resources: IAM, S3, ECR, networking, cost-read roles, and, for the recommended path, a fresh EKS cluster. **Helm** installs the gateway, operator, and document cache into that cluster and wires them to the AWS resources Terraform produced. You can skip Terraform if you already have the AWS resources hev layer needs. At minimum, provide an S3 bucket and gateway IRSA role for snapshots and history. For the full feature set, also provide dashboard cost-read IAM, image registry locations, and cluster-level components equivalent to the Terraform outputs. ## Install shape An install is one Helm release per environment with one S3 bucket for snapshot and history data. The chart renders a default [`VectorStore`](/docs/kubernetes/vectorstore-crd) from the credential you provide; an install can define additional `VectorStore` resources, each with its own upstream credential and inbound auth policy, and route namespaces between them with `Index.spec.backend.storeRef`. Scoped gateway-only bearer keys are available through the `keys` inbound auth mode described below. ## Terraform The Terraform configuration in `infra/terraform/` provisions the AWS resources that the gateway and operator need. It is opinionated about the resources hev layer needs to behave correctly and conservative about resources around it. Route53 hosted zones and ACM certificates are opt-in; most installs bring existing DNS and TLS. ### What it sets up | Resource | Purpose | | --- | --- | | S3 bucket | Durable storage for namespace snapshots, search history, and clickstream events. | | IAM roles + IRSA policies | Gateway S3 access, dashboard cost-read access, and worker/operator AWS access. | | ECR repositories | Image registry for the gateway, operator, and customer-built function images. 
| | EKS + VPC + node pools | Recommended fresh-cluster runtime for design partners. | | Route53 + ACM | Optional DNS zones, records, and TLS certificates when `manage_public_dns=true`. | ### Cluster: recommended Design-partner installs should use a fresh EKS cluster unless there is a specific reason to bind hev layer to an existing one. The cluster path provisions: - a VPC with the subnets and endpoints hev layer expects - an EKS control plane and one always-on `system` node group, defaulting to an `i4i.large` so the serving path and document cache share local NVMe - public worker subnets by default, with no NAT Gateway in the fresh cluster path - Karpenter for scale-from-zero `worker-cpu` and `worker-gpu` indexing capacity - the AWS Load Balancer Controller for ingress - EFS for shared persistent volumes If you already operate an EKS cluster, you can disable the cluster modules and point hev layer at the existing cluster. You are still responsible for the functional prerequisites: an S3 bucket for snapshots/history, gateway IRSA that can read/write that bucket, dashboard IRSA for AWS cost and pricing reads, image registry access, Karpenter or equivalent node autoscaling for workers, and the AWS Load Balancer Controller if you use public ingress. For design partners, deploy hev layer to a fresh cluster. The baseline is one always-on i4i node for the gateway, dashboard, control loops, and document cache. CPU and GPU indexing workers scale from zero, so embedding and extraction cost follows indexing duty cycle instead of becoming a standing line item. The fresh path also avoids a NAT Gateway: workers run in public subnets, so the large data volumes that flow during indexing skip NAT's per-GB processing charge. Existing clusters typically route worker egress through NAT already, which turns every pipeline run into a metered transfer. ### Cost notes The Terraform is designed to deploy a cost-efficient AWS footprint with autoscaling for on-demand indexing work. 
At rest, the fixed costs are mostly EKS, one i4i `system` node, the shared ALB, and small storage lines. On current us-east-1 on-demand pricing, that baseline is roughly the low hundreds of dollars per month before variable traffic, object storage, and upstream vector-store usage.

Indexing bursts scale CPU or GPU worker nodes up through Karpenter and back down when queues drain. If you switch workers to private subnets, enabling NAT adds a standing hourly and egress cost.

Heavier search use cases may need more read-side infrastructure: additional gateway replicas, larger always-on nodes, or a dedicated document-cache pool for steady cache pressure. Contact hev layer for help sizing read-heavy deployments.

### Outputs

Terraform emits the values the Helm chart needs to install: the S3 bucket name, gateway IRSA role ARN, dashboard cost-read role ARN, ECR image URLs, and cluster metadata. Pass these into the Helm values file described below.

## Helm

The Helm chart at `infra/helm/layer/` installs the gateway, operator, and document cache into a cluster that already has the AWS resources from [Terraform](#terraform) or equivalent resources you manage.

### Required values

Most of the chart is opinionated defaults. In a typical install the credential you bring from outside the cluster becomes the default `VectorStore` credential.

| Value | Required | Notes |
| --- | --- | --- |
| `vectorStore.credential.apiKey` | yes | Upstream store credential. With the default `deriveFromStore` auth mode, clients also send this as the gateway bearer token. |
| `vectorStore.endpoint.url` | yes | Upstream store API base URL. Defaults to Turbopuffer's AWS us-east-1 endpoint. |
| `vectorStore.endpoint.region` | yes | Region label for the rendered `VectorStore`. |
| `vectorStore.inboundAuth.mode` | no | `deriveFromStore`, `keys`, or `open`. Defaults to `deriveFromStore`. |
| `vectorStore.inboundAuth.keys` | for `keys` mode | Gateway-only bearer keys with `read`, `write`, and `admin` scopes. |
| `gateway.image` | yes | Gateway image URL — Terraform emits this as an ECR output. |
| `s3.bucket` | yes | S3 bucket Terraform created for snapshots and history. |
| `serviceAccount.roleArn` | yes | IRSA role ARN that grants the gateway access to the S3 bucket. |
| `gateway.indexNamespace` | no | Namespace containing `Index` CRs. Blank follows `operator.discovery.indexNamespace`, then the Helm release namespace. |
| `gateway.indexConfig.enabled` | no | Enables gateway reads of `Index` CR routing and policy such as `spec.backend.storeRef`, `spec.snapshot.facetFields`, and `spec.scan.threads`. |
| `gateway.indexGc.enabled` | no | Enables namespace hard-delete cleanup of operator-discovered `Index` CRs. |
| `gateway.consistency.stablePollIntervalMs` | no | Slow polling cadence for namespaces last observed stable. Defaults to `60000`; cold and updating namespaces keep the fast gateway default. |
| `dashboard.serviceAccount.roleArn` | for cost tab | IRSA role ARN with AWS pricing, CloudWatch, and cost read access. |
| `ingress.host` | optional | Set when you want a public ingress; use your DNS/TLS or enable Terraform-managed Route53/ACM. |

Most other Helm inputs are wiring between resources the install process already produced. The store API key is the credential hev layer cannot generate for you. The chart stores it in a Kubernetes Secret, points the default `VectorStore` at that Secret, and the gateway derives its default inbound bearer from the same key.

### Gateway auth modes

The default `deriveFromStore` mode is the single-tenant BYOC path:

```yaml
vectorStore:
  credential:
    apiKey: tpuf_...
  inboundAuth:
    mode: deriveFromStore
```

For an install that needs a gateway-only bearer, use `keys` mode. The chart renders `apiKey` values into the release Secret and references them from the `VectorStore`; omit `apiKey` when pointing at a pre-created Secret.

```yaml
vectorStore:
  credential:
    apiKey: tpuf_...
  inboundAuth:
    mode: keys
    workerSecretKey: layer-inbound-worker-api-key
    keys:
      - name: worker
        scopes: [read, write, admin]
        apiKey: layer_worker_...
        secretRef:
          key: layer-inbound-worker-api-key
```

In `keys` mode, operator workers, KEDA, and the dashboard use `workerSecretName` / `workerSecretKey` as their gateway bearer. Blank `workerSecretName` uses the release Secret; blank `workerSecretKey` uses `layer-inbound-worker-api-key`.

### Run the install

```sh
helm upgrade --install layer ./infra/helm/layer \
  --namespace layer --create-namespace \
  -f values.customer.yaml
```

The chart is not published to a public Helm repository — install from the source path or from the chart artifact provided during onboarding.

### What gets installed

- `layer-gateway` — Rust gateway for Turbopuffer-compatible routes, fetch, scans, snapshots, warm jobs, and pipeline state.
- `layer-operator` — reconciler for VectorStore, Index, InfraRules, Pipeline, and Function CRDs documented in [Kubernetes](/docs/kubernetes/operator).
- `layer-document-cache` — Aerospike-backed document cache, scale-to-zero by default, scheduled onto the always-on i4i system node in the baseline profile.
- Optional Karpenter `NodePool` / `EC2NodeClass` resources for `worker-cpu` and `worker-gpu` indexing capacity when `workerKarpenter.enabled=true`. A dedicated `document-cache` pool is still available for larger installs by setting `documentCache.nodeRole=document-cache` and `documentCache.karpenter.enabled=true`.
- Supporting resources: service accounts, IRSA bindings, ingress, and CRDs.

### Default InfraRules

When `operator.infraRules.create=true`, Helm renders the cluster-scoped `InfraRules/default` object used by every Pipeline and Function `spec.scaling.pool` reference. If a workload omits `scaling.pool`, the operator maps `worker.computeClass: cpu` or `gpu` to the stock `cpu` or `gpu` pool.
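
The pool-selection rule in the paragraph above can be sketched as a small Python function. This is a minimal illustration, not the operator's code: `resolve_pool` is a hypothetical name, the nesting of `computeClass` under `spec.worker` is an assumption, and raising an error for an unrecognised class is a placeholder because the docs do not say what the operator does in that case.

```python
# Stock pool names from InfraRules/default, keyed by worker.computeClass.
STOCK_POOL_FOR_CLASS = {"cpu": "cpu", "gpu": "gpu"}


def resolve_pool(spec: dict) -> str:
    """Pick the compute pool for a Pipeline or Function spec (illustrative only)."""
    # An explicit spec.scaling.pool always wins.
    pool = (spec.get("scaling") or {}).get("pool")
    if pool:
        return pool
    # Assumption: computeClass sits under spec.worker.
    compute_class = (spec.get("worker") or {}).get("computeClass")
    if compute_class in STOCK_POOL_FOR_CLASS:
        return STOCK_POOL_FOR_CLASS[compute_class]
    # Assumption: the real operator's behaviour here is not documented.
    raise ValueError(f"no pool and no recognised computeClass: {compute_class!r}")


# A workload that names a pool keeps it; one that only sets a class gets the stock pool.
print(resolve_pool({"scaling": {"pool": "cpu-large"}, "worker": {"computeClass": "cpu"}}))
print(resolve_pool({"worker": {"computeClass": "gpu"}}))
```

Note that `cpu-large` is only reachable through an explicit `scaling.pool`, since the class mapping covers `cpu` and `gpu` alone.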
The default compute pools are: | Pool | Use | | --- | --- | | `cpu` | General CPU workers such as extraction, ingestion, and lightweight Functions. | | `cpu-large` | CPU workers that need local ephemeral-storage headroom for per-pod source caches. | | `gpu` | One-NVIDIA-GPU workers for embedding and model inference. | The stock pools select `layer.hev.dev/node-role=worker-cpu` or `worker-gpu`, matching the chart's `workerKarpenter` NodePools. Override `operator.infraRules.computePools` to tune resource requests, limits, node selectors, tolerations, GPU SKU hints, or per-workload replica ceilings for your cluster. See [InfraRules CRD](/docs/kubernetes/scaling-crd) for the full field shape. --- # Operator Overview Source: https://hevlayer.com/docs/kubernetes/operator `layer-operator` manages declarative state for your hev layer deployment. It serves a few crucial functions — monitoring for changes to your indexes and managing scaling. It does this through a set of abstractions known as [custom resource definitions (CRDs)](/docs/concepts#glossary). The gateway handles the read and write path; the operator handles everything that wants to be expressed as desired state in the cluster: which vector store the gateway fronts, which indexes exist, how worker pools scale, and which stateless functions run against which indexes. ## CRDs The operator reconciles five resource kinds, each documented on its own page: - [VectorStore CRD](/docs/kubernetes/vectorstore-crd) — the upstream store endpoint, credential reference, and gateway inbound auth policy. - [Index CRD](/docs/kubernetes/index-crd) — one resource per Turbopuffer namespace the gateway should manage. - [InfraRules CRD](/docs/kubernetes/scaling-crd) — cluster-wide compute pools, document cache rules, and shared scaling policy. - [Pipeline CRD](/docs/kubernetes/pipeline-crd) — staged work that changes row count. 
- [Function CRD](/docs/kubernetes/function-crd) — stateless user-defined functions that read and write attributes on an index. ## Relationship to the gateway The gateway and the operator are decoupled. The operator reconciles declarative state; the gateway serves the read and write path. Neither sits in the other's hot path, so the gateway keeps serving even if the operator is restarted or lagging. The link between them is one-directional and read-only. For some features the gateway reads CRD status, such as which indexes exist and which worker pools are ready, to inform what it serves. It never writes to the CRDs; declarative state is authored by you and reconciled by the operator, and the gateway is only ever a reader of it. ## Scheduling and node pools The operator applies the compute pool chosen by each Pipeline and Function. A pool can set container resources, `nodeSelector`, and `tolerations`, so operators can pin CPU, storage-heavy CPU, and GPU work to the right node capacity. Helm installs `cpu`, `cpu-large`, and `gpu` pools by default. The stock pools select the chart-rendered Karpenter worker pools: `layer.hev.dev/node-role=worker-cpu` for CPU and `layer.hev.dev/node-role=worker-gpu` for GPU. The GPU pool also requests `nvidia.com/gpu: "1"` and carries the standard NVIDIA toleration. This is configured once on `InfraRules/default`, not per workload — see [InfraRules](/docs/kubernetes/scaling-crd) for the compute-pool fields and how Pipelines and Functions choose a pool. --- # VectorStore CRD Source: https://hevlayer.com/docs/kubernetes/vectorstore-crd A `VectorStore` is the gateway's upstream connection. It names the store kind, endpoint, credential Secret, and the inbound auth policy the gateway applies to client requests. An install may define more than one `VectorStore`; each `Index.spec.backend.storeRef` selects which store serves that upstream namespace. 
```yaml apiVersion: hevlayer.com/v1alpha1 kind: VectorStore metadata: name: turbopuffer-default namespace: layer spec: kind: turbopuffer default: true endpoint: url: https://aws-us-east-1.turbopuffer.com region: aws-us-east-1 credential: secretRef: name: layer key: turbopuffer-api-key inboundAuth: mode: deriveFromStore ``` ## Connection | Field | Purpose | | --- | --- | | `kind` | `turbopuffer`. `pinecone` is reserved by the schema but rejected by the operator until implemented. | | `default` | Marks the store used when an `Index` omits `spec.backend.storeRef`. A single store is treated as the default. | | `endpoint.url` | Upstream API base URL. | | `endpoint.region` | Operator-visible region label for this store. | | `credential.secretRef` | Secret key in the same namespace as the `VectorStore`. The credential is never stored in the CRD. | ## Routing The gateway builds one upstream client per `VectorStore` in the namespace. Requests whose namespace has an `Index` with `spec.backend.storeRef` use that store; other namespaces use the default store. Two `Index` objects cannot resolve to the same upstream namespace. ## Inbound auth `inboundAuth.mode` controls what bearer token the gateway accepts: | Mode | Behavior | | --- | --- | | `deriveFromStore` | Default. The gateway accepts the default store's credential as the inbound bearer. This is the single-tenant BYOC shape. | | `keys` | The gateway accepts the listed independent key Secrets and enforces their `read`, `write`, and `admin` scopes. | | `open` | No inbound auth. Use only for explicitly open environments. | Under `deriveFromStore`, clients set `Authorization: Bearer ` when calling the gateway. Operator-managed workers and KEDA use the same Secret through `LAYER_GATEWAY_API_KEY`. 
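As a minimal sketch of the client side of `deriveFromStore`, the Python below builds (but does not send) a request that carries the store credential as the inbound bearer. The gateway URL and the `tpuf_example` key are placeholders. The `/v1/namespaces` path is borrowed from the operator's store probe, on the assumption that the wire-compatible gateway serves the same route.

```python
import urllib.request


def build_gateway_request(
    base_url: str, store_api_key: str, path: str = "/v1/namespaces"
) -> urllib.request.Request:
    """Build a GET request that presents the store credential as the bearer."""
    url = base_url.rstrip("/") + path
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {store_api_key}"},
        method="GET",
    )


# Placeholder values; substitute your gateway address and store API key.
req = build_gateway_request("http://layer-gateway.layer.svc:8080", "tpuf_example")
print(req.get_method(), req.full_url)
```

Passing the request to `urllib.request.urlopen` would send it. The sketch stops short of that so it runs without a cluster.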
Under `keys`, each key points at a Secret in the same namespace: ```yaml spec: inboundAuth: mode: keys keys: - name: shop-rw scopes: [read, write] secretRef: name: layer key: layer-inbound-shop-rw-api-key ``` `read` covers GET/HEAD routes and read-shaped POST routes such as query, batch fetch, scans, and metrics proxy queries. `write` covers namespace writes and worker queue claim/complete routes. `admin` covers Pipeline and Function create/delete/control routes and also satisfies `read` and `write`. In every mode the gateway also accepts a minted [`ApiKey`](/docs/kubernetes/apikey-crd) token whose `vectorstore.` entitlement names this store, enforcing that entitlement's scopes and namespace globs. ## Status The operator sets `status.reachable` and a `Ready` condition after validating the Secret references and probing `GET /v1/namespaces` on the store endpoint. --- # ApiKey CRD Source: https://hevlayer.com/docs/kubernetes/apikey-crd An `ApiKey` is a minted credential as a resource. Layer owns the credential lifecycle — mint, verify, revoke, expire — and what the key opens is declared per resource: each entitlement names a [`VectorStore`](/docs/kubernetes/vectorstore-crd), a [`Warehouse`](/docs/kubernetes/warehouse-crd), or Layer itself, and carries the scopes and claims for that target. Claims are opaque to Layer — an external system can use Layer as its key store and keep authorization decisions to itself. Keys have two authoring surfaces that round-trip through one schema: `kubectl get apikey -o yaml` and `GET /v2/keys/{keyId}` are two spellings of the same object. 
```yaml apiVersion: hevlayer.com/v1alpha1 kind: ApiKey metadata: name: cohort-reader namespace: layer spec: owner: acme description: cohort read access entitlements: vectorstore.prod-turbopuffer: scopes: [read] namespaces: ["cohort-*"] warehouse.prod-snowflake: claims: - "notes:cohort:*:read" expiresAfter: 365d status: keyId: 0a1b2c3d-… phase: Active lookupHash: sha256:… createdAt: "2026-06-10T00:00:00Z" expiresAt: "2027-06-10T00:00:00Z" secretRef: name: apikey-cohort-reader ``` ## Spec | Field | Purpose | | --- | --- | | `owner` | Optional free-form owner label, echoed in list and authenticate responses. | | `description` | Optional free-form description. | | `entitlements` | Map keyed by target resource. Each entry carries `scopes`, `namespaces`, and `claims` for that target. | | `expiresAfter` | Duration or `never`. Defaults to `365d`; `status.expiresAt` is computed at mint. | ## Entitlements | Key | Target | | --- | --- | | `vectorstore.` | Data-plane access through the named store. `scopes` (`read`, `write`) gate routes whose `Index` resolves to that store; `namespaces` globs constrain which upstream namespaces. | | `warehouse.` | A list of opaque `claims` strings bound to the source system. Layer stores and echoes them; the application routes on them. No client route reaches a source — clients touch indexes, not warehouses — so the entitlement grants nothing in Layer and inerts when the warehouse is deleted. | | `layer` | The control plane itself. `scopes: [admin]` covers key management and Pipeline/Function create/delete/control routes, and satisfies `read` and `write` everywhere. | Scope meanings match [inbound auth](/docs/kubernetes/vectorstore-crd#inbound-auth): `read` covers query, fetch, scans, and metrics; `write` covers namespace writes and worker routes. `claims` is a list of opaque strings, allowed on any entitlement and the only field on a warehouse entitlement. 
Layer stores them, returns them from list, get, and authenticate, and never interprets them — an existing permission grammar (`service:resource_type:resource_id:action` strings, a legacy entitlement vocabulary) drops in verbatim, and the consuming application maps them to its own authorization. An entitlement whose target does not exist grants nothing and surfaces as a status condition (`EntitlementTargetMissing`) — not an admission error, so keys and their targets can be applied in either order. Check the condition after applying: a typo in a target name looks the same as a missing target. A key whose entitlements carry only claims — no scopes — is a pure external-store key: it authenticates, but opens no Layer route. ## Minting **REST.** `POST /v2/keys` generates the token, creates the `ApiKey` resource, and returns the token in the response — once. The raw token is never persisted; Layer stores only one-way hashes on the resource. ```http POST /v2/keys # 201 { keyId, …, token } — token returned once GET /v2/keys # metadata only; ?includeRevoked GET /v2/keys/{keyId} POST /v2/keys/{keyId}/revoke # idempotent DELETE /v2/keys/{keyId} # hard delete POST /v2/keys/authenticate # body { token } → 200 { keyId, entitlements, … } | 401 ``` Key-management routes require a key with the `layer` entitlement at `admin` scope. `POST /v2/keys/authenticate` is unauthenticated by construction — the token is the credential. **CRD.** Apply an `ApiKey` with no credential. The operator mints the token, writes it to a Secret named in `status.secretRef` (key `token`), and moves `phase` from `Pending` to `Active`. The Secret is the token delivery; it is owned by the `ApiKey` and garbage-collected with it. Rotation is delete-and-reapply — a new key, a new token. ## Verification External systems present the raw token to `POST /v2/keys/authenticate` and get back `keyId` (a stable actor id) plus the full `entitlements` map, then make their own authorization decisions from the claims. 
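The external verification flow can be sketched in Python as follows. This is an illustration under assumptions, not the client library: the JSON body key `token` and the `keyId` / `entitlements` response fields come from the route listing above, while the `Content-Type` header, the gateway URL, the helper names, and the sample response are hypothetical.

```python
import json
import urllib.request


def build_authenticate_request(base_url: str, token: str) -> urllib.request.Request:
    """Build POST /v2/keys/authenticate; the token in the body is the credential."""
    body = json.dumps({"token": token}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/v2/keys/authenticate",
        data=body,
        headers={"Content-Type": "application/json"},  # assumed, not documented
        method="POST",
    )


def claims_for(auth_response: dict, target: str) -> list[str]:
    """Return the opaque claims stored under one entitlement target, or []."""
    entitlement = auth_response.get("entitlements", {}).get(target, {})
    return list(entitlement.get("claims", []))


# Hypothetical 200 response, shaped like the cohort-reader example key.
sample = {
    "keyId": "example-key-id",
    "entitlements": {
        "vectorstore.prod-turbopuffer": {"scopes": ["read"], "namespaces": ["cohort-*"]},
        "warehouse.prod-snowflake": {"claims": ["notes:cohort:*:read"]},
    },
}
request = build_authenticate_request("http://layer-gateway.layer.svc:8080", "example-token")
print(request.get_method(), request.full_url)
print(claims_for(sample, "warehouse.prod-snowflake"))
```

Because Layer never interprets claims, mapping the returned strings to permissions stays entirely in the calling application.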
The gateway also accepts any `Active` key's token as a bearer on its own routes, enforcing the entitlement for the store or control-plane surface the route resolves to. Verification is one indexed lookup plus one hash check against a watch-fed in-memory map — the hot path never reads the control plane per request. `status.lastSeenAt` advances at most once per five minutes per key. | Phase | Meaning | | --- | --- | | `Pending` | CRD-authored key awaiting mint. | | `Active` | Verifiable; token works. | | `Revoked` | `POST /v2/keys/{keyId}/revoke` was called; token refused. | | `Expired` | `status.expiresAt` passed; token refused. | Deleting a `VectorStore` or `Warehouse` inerts every entitlement that names it: the keys stay `Active` for their other entitlements, and the deletion is finalizer-guarded on the target's side while keys still reference it. ## Kubernetes RBAC CRD authoring makes kubectl a minting surface, so the chart ships roles to delegate key administration without cluster-admin: | ClusterRole | Grants | | --- | --- | | `hevlayer-key-admin` | Full verbs on `apikeys`, plus `get` on delivered token Secrets. Can mint, revoke, and collect tokens. | | `hevlayer-key-viewer` | `get`/`list`/`watch` on `apikeys`. No Secret access — status hashes are one-way, so viewing is audit, not credential access. | Neither role aggregates into the built-in `view`/`edit`/`admin` ClusterRoles: namespace viewer never silently means key viewer. Bindings are the cluster operator's explicit act; set `rbac.keyRoleBindings` in Helm values to render them for the single-team case. ## Bootstrapping `LAYER_GATEWAY_API_KEY` is the bootstrap credential: it mints the first admin key — ```yaml spec: entitlements: layer: scopes: [admin] ``` — after which routine minting uses minted admin keys. Cluster operators can equally bootstrap by applying an `ApiKey` resource, since CRD authoring needs only kubectl access. 
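The entitlement rules for data-plane routes can be summarized in a short Python sketch. It is not the gateway's implementation. It assumes that namespace globs behave like shell-style patterns (`fnmatch`), that an entitlement without a `namespaces` list matches nothing (a conservative guess), and that spec and status are flattened into one dictionary. `key_allows` is a hypothetical helper name.

```python
import fnmatch


def key_allows(key: dict, store: str, scope: str, namespace: str) -> bool:
    """Decide whether a key may use `scope` on `namespace` served by `store`."""
    # Only Active keys verify; Pending, Revoked, and Expired tokens are refused.
    if key.get("phase") != "Active":
        return False
    entitlements = key.get("entitlements", {})
    # admin on the `layer` entitlement satisfies read and write everywhere.
    if "admin" in entitlements.get("layer", {}).get("scopes", []):
        return True
    store_entitlement = entitlements.get(f"vectorstore.{store}")
    if store_entitlement is None:
        return False  # no entitlement for this store grants nothing
    if scope not in store_entitlement.get("scopes", []):
        return False
    globs = store_entitlement.get("namespaces", [])  # assumed: empty matches nothing
    return any(fnmatch.fnmatchcase(namespace, pattern) for pattern in globs)


cohort_reader = {
    "phase": "Active",
    "entitlements": {
        "vectorstore.prod-turbopuffer": {"scopes": ["read"], "namespaces": ["cohort-*"]},
        "warehouse.prod-snowflake": {"claims": ["notes:cohort:*:read"]},
    },
}
print(key_allows(cohort_reader, "prod-turbopuffer", "read", "cohort-2026"))
print(key_allows(cohort_reader, "prod-turbopuffer", "write", "cohort-2026"))
```

A claims-only key falls through every branch and returns `False`, which matches the pure external-store case: it authenticates but opens no Layer route.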
--- # Warehouse CRD Source: https://hevlayer.com/docs/kubernetes/warehouse-crd A `Warehouse` declares an upstream source system — the system of record pipelines extract rows from, plus the verified credential to reach it. Data in Layer is derived from a warehouse and reconstructible from it. The serving side is the [`VectorStore`](/docs/kubernetes/vectorstore-crd); the two sit on opposite sides of the gateway. ```yaml apiVersion: hevlayer.com/v1alpha1 kind: Warehouse metadata: name: prod-snowflake namespace: layer spec: kind: snowflake snowflake: account: acme-xy12345 user: SVC_LAYER role: SVC_LAYER_ROLE warehouse: EXTRACT_WH keyPairSecretRef: name: snowflake-rsa pool: size: 5 timeout: 30s verifyInterval: 1h ``` ## Connection | Field | Purpose | | --- | --- | | `kind` | `snowflake`. `databricks` and `iceberg` are reserved by the schema but rejected by the operator until implemented. | | `snowflake.account` | Snowflake account identifier. | | `snowflake.user` | Service user the key pair authenticates. | | `snowflake.role` | Optional role assumed on connect. | | `snowflake.warehouse` | Snowflake compute warehouse extraction queries run on. | | `snowflake.keyPairSecretRef` | Secret in the same namespace holding `private-key.pem` and optional `passphrase`. The credential is never stored in the CRD. | | `snowflake.pool` | Connection pool tuning: `size`, `timeout`. | | `verifyInterval` | Probe cadence. Defaults to `1h`. | A warehouse is account, credential, and compute — not a catalog. Which database, schema, or table to read belongs to the [pipeline source](#pipeline-source); one credential reaches many databases. ## Verification The operator probes the warehouse on apply, whenever the referenced Secret's content changes, and every `verifyInterval`. For `snowflake` the probe opens a key-pair session, runs `SELECT 1` on the declared compute warehouse, and closes. | Phase | Meaning | | --- | --- | | `Pending` | Not yet probed. 
| | `Verified` | Last probe succeeded; `status.verifiedAt` is the probe time. | | `Failed` | Last probe failed; `status.failureReason` says why. | `Failed` is a loud signal, not an outage: in-flight pipeline runs keep their connections, new runs refuse to start, and the condition surfaces in `kubectl get warehouse` and the dashboard. Pipelines start only against a `Verified` warehouse. ## Rotation Swap the referenced Secret's content. The operator re-verifies and `status.verifiedAt` advances; consumers resolve credentials through the warehouse at connection-build time, so new connections pick up the new key with no redeploy. Pointing `keyPairSecretRef` at a different Secret name is a spec edit with the same flow. ## Pipeline source A pipeline extracting from a warehouse names it in `spec.sourceRef`. The source block owns the *what* — database, schema, query, cursor — and the warehouse owns the *where* and *who*: ```yaml spec: sourceRef: kind: snowflake warehouseRef: prod-snowflake database: ANALYTICS query: >- SELECT ID, TITLE, BODY, REFRESH_ID FROM PUBLIC.NOTES WHERE REFRESH_ID > :cursor cursor: column: REFRESH_ID ``` When `sourceRef.kind` is `snowflake`, the operator requires `warehouseRef` to name a `Verified` warehouse in the same namespace. It mounts the warehouse's key-pair Secret into the worker pod at `/var/run/hevlayer/warehouse/` and injects `HEVLAYER_WAREHOUSE` — connection JSON resolved from the warehouse spec (account, user, role, compute warehouse, pool), no credential material. The worker builds its own connection from the two; `HEVLAYER_SOURCE_REF` carries the source block verbatim as for [any other source](/docs/kubernetes/pipeline-crd#source). ## Keys An [`ApiKey`](/docs/kubernetes/apikey-crd) binds to a warehouse with a `warehouse.` entitlement carrying a list of opaque claims strings. Layer stores and echoes the strings; the application routes on them. 
No client route reaches a source system (clients touch indexes, not warehouses), so the entitlement grants nothing in Layer, and it inerts when the warehouse is deleted. ## Deletion Deleting a warehouse fences everything drawing from it. A finalizer blocks deletion while `status.consumers` is non-zero, that is, while pipelines still extract from the warehouse or keys are still entitled to it. Annotate the warehouse with `hevlayer.com/force-delete: "true"` to override. ## Status ```yaml status: phase: Verified verifiedAt: "2026-06-10T00:00:00Z" failureReason: null consumers: pipelines: 2 apiKeys: 1 ``` The operator emits Kubernetes Events on phase transitions and counts observed references in `status.consumers`. --- # Index CRD Source: https://hevlayer.com/docs/kubernetes/index-crd An `Index` represents one namespace exposed through the gateway. It declares which upstream namespace to use, snapshot policy, cache posture, and consistency mode. The backend connection itself lives in a [VectorStore](/docs/kubernetes/vectorstore-crd). ```yaml apiVersion: hevlayer.com/v1 kind: Index metadata: name: products namespace: layer spec: backend: storeRef: turbopuffer-default namespace: products distanceMetric: cosine_distance metadata: labels: app: shop tags: - catalog snapshot: interval: 5m retention: never facetFields: - category - brand scan: threads: 8 cache: ttl: 24h capGiB: 64 mode: standard consistency: strong ``` ## Backend | Field | Purpose | | --- | --- | | `backend.storeRef` | Optional `VectorStore` name in the same namespace. The gateway routes requests for this upstream namespace to that store. Defaults to the namespace's default store. | | `backend.namespace` | Optional upstream namespace override. Defaults to the Index name. | | `backend.distanceMetric` | Vector metric, default `cosine_distance`. | ## Snapshot policy | Field | Default | Purpose | | --- | --- | --- | | `snapshot.facetFields` | `[]` | Fields the gateway materializes into durable facet snapshots. Empty disables the automatic writer.
| | `snapshot.interval` | `5m` | Minimum spacing between automatic snapshot writes after upstream-stable advances. | | `snapshot.retention` | `never` | `never` keeps all snapshot bodies; a duration such as `30d` prunes older bodies while keeping the latest. | ## Scan policy `scan.threads` sets the per-namespace default for origin scan fan-out: the maximum concurrent upstream requests one scan may issue during scatter/gather. It defaults to `8` and is clamped by the gateway's server cap and the active shard count. Request-level `threads` overrides this default for one scan. ## Cache policy Aerospike remains an ephemeral cache; durable snapshot history stays in S3. Cache warming uses the same scan fan-out policy as other origin scans. ## Status The operator reports observed generation, metadata sync state, and conditions. `status.snapshot.lastRun` and `lastSuccess` are reserved for the gateway history bridge. --- # InfraRules CRD Source: https://hevlayer.com/docs/kubernetes/scaling-crd `InfraRules` is the cluster-scoped policy object for Layer-managed runtime infrastructure. There is exactly one object: `InfraRules/default`. Pipelines and Functions do not reference a separate autoscaling resource. They set `spec.scaling` inline and choose a pool from `InfraRules/default.spec.computePools`. 
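How a workload ends up on a pool can be sketched in a few lines of Python. This is an illustration of the rules this page documents (fallback from `worker.computeClass`, unknown pools, and the `maxReplicasPerWorkload` ceiling), not the operator's code. The replica ceilings mirror the default manifest in the next section; `ComputePool`, `resolve_pool`, and the reason strings are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ComputePool:
    name: str
    kind: str
    max_replicas_per_workload: int


# Ceilings copied from the default InfraRules manifest.
STOCK_POOLS = {
    "cpu": ComputePool("cpu", "cpu", 32),
    "cpu-large": ComputePool("cpu-large", "cpu", 8),
    "gpu": ComputePool("gpu", "gpu", 4),
}


def resolve_pool(
    pools: dict,
    scaling_pool: Optional[str],
    compute_class: str = "cpu",
    max_replicas: int = 0,
):
    """Return (pool, None) when the workload can schedule, else (None, reason)."""
    # An omitted scaling.pool falls back to the stock pool named after computeClass.
    name = scaling_pool or compute_class
    pool = pools.get(name)
    if pool is None:
        return None, f"unknown pool {name!r}"
    if max_replicas > pool.max_replicas_per_workload:
        return None, (
            f"replicas.max {max_replicas} exceeds pool ceiling "
            f"{pool.max_replicas_per_workload}"
        )
    return pool, None


print(resolve_pool(STOCK_POOLS, None, compute_class="gpu", max_replicas=2))
print(resolve_pool(STOCK_POOLS, "cpu-large", max_replicas=9))
```

The `(None, reason)` result stands in for the unready state and status condition the operator records; the exact condition wording is not specified here.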
## InfraRules ```yaml apiVersion: hevlayer.com/v1alpha1 kind: InfraRules metadata: name: default spec: computePools: - name: cpu kind: cpu nodeSelector: layer.hev.dev/node-role: worker-cpu layer.hev.dev/compute: cpu tolerations: - key: layer.hev.dev/node-role operator: Equal value: worker-cpu effect: NoSchedule resources: requests: cpu: "1" memory: 2Gi limits: cpu: "2" memory: 4Gi maxReplicasPerWorkload: 32 - name: cpu-large kind: cpu nodeSelector: layer.hev.dev/node-role: worker-cpu layer.hev.dev/compute: cpu tolerations: - key: layer.hev.dev/node-role operator: Equal value: worker-cpu effect: NoSchedule resources: requests: cpu: "1" memory: 2Gi ephemeral-storage: 35Gi limits: cpu: "4" memory: 4Gi ephemeral-storage: 40Gi maxReplicasPerWorkload: 8 - name: gpu kind: gpu nodeSelector: layer.hev.dev/node-role: worker-gpu layer.hev.dev/compute: gpu tolerations: - key: layer.hev.dev/node-role operator: Equal value: worker-gpu effect: NoSchedule - key: nvidia.com/gpu operator: Exists effect: NoSchedule resources: requests: cpu: 250m memory: 4Gi nvidia.com/gpu: "1" limits: cpu: "2" memory: 10Gi nvidia.com/gpu: "1" maxReplicasPerWorkload: 4 documentCache: capGiB: 256 replicationFactor: 1 scaling: mode: autoscale nodes: min: 0 max: 1 ``` The operator validates that the object is named `default`. Helm can render the default object with `operator.infraRules.create=true`. ## Compute pools The Helm defaults define three well-known pools: | Pool | Use | | --- | --- | | `cpu` | General CPU workers. | | `cpu-large` | CPU workers that need local ephemeral-storage headroom. | | `gpu` | One-NVIDIA-GPU workers for embedding and inference. | The default pools select the Karpenter-backed worker nodes with `layer.hev.dev/node-role=worker-cpu` or `worker-gpu`. The default `gpu` pool also requests `nvidia.com/gpu: "1"` and includes the standard NVIDIA toleration. 
Override `nodeSelector`, `gpuType`, or resource envelopes in `operator.infraRules.computePools` when your cluster uses different worker pool names or specific SKUs. | Field | Purpose | | --- | --- | | `name` | Referenced by `spec.scaling.pool` on Pipeline and Function resources. | | `kind` | Pool class label such as `cpu` or `gpu`. | | `gpuType` | Optional descriptive GPU type for GPU pools. | | `nodeSelector` | Applied to worker pods that choose the pool. | | `tolerations` | Applied to worker pods that choose the pool. | | `resources` | Container resources applied to worker pods. | | `maxReplicasPerWorkload` | Hard ceiling for one Pipeline or Function. | If a workload names an unknown pool or asks for more replicas than the pool ceiling, the operator leaves the workload unready and records a condition on its status. ## Workload scaling ```yaml scaling: pool: cpu mode: autoscale replicas: min: 0 max: 4 ``` | Mode | Behavior | | --- | --- | | `autoscale` | Emit a KEDA `ScaledObject` and let queue depth scale the Deployment between `min` and `max`. | | `fixed` | Set Deployment replicas to `replicas.min`; no KEDA object is emitted. | | `disabled` | Scale the Deployment to 0; no KEDA object is emitted. | Paused workloads also scale to 0. To keep a cold-start-heavy worker warm, set `mode: autoscale` and `replicas.min: 1`. When a Function or Pipeline omits `scaling.pool`, the operator uses `worker.computeClass` to choose the stock `cpu` or `gpu` pool. ## Document cache rules `documentCache` captures the operator-owned document cache settings: capacity, replication factor, and node count. Helm still renders the document-cache KEDA object directly; `InfraRules` is the declared policy shape the operator reports and validates against. --- # Pipeline CRD Source: https://hevlayer.com/docs/kubernetes/pipeline-crd The `Pipeline` CRD declares the scaling characteristics you want for ingesting data. 
Ingestion typically runs in stages: a CPU stage for chunking and extraction, followed by a GPU stage for embedding. You can declare the spec in YAML, from code through the [pipeline API](/docs/api/pipelines), or with a combination of both. The recommended split is to declare the pipeline's scaling characteristics in YAML and set the namespace from the client. `spec.sourceRef` also lets you declare the pipeline's upstream details: the operator hands it to the worker as an environment variable, so the worker reads its source from config instead of hardcoding it. ```yaml apiVersion: hevlayer.com/v1alpha1 kind: Pipeline metadata: name: product-images namespace: layer spec: target: namespace: products sourceRef: kind: sqs queueUrl: https://sqs.us-east-1.amazonaws.com/123456789/product-images worker: image: ghcr.io/hev/product-image-worker:latest computeClass: cpu batchSize: 64 timeoutSeconds: 60 scaling: pool: cpu mode: autoscale replicas: min: 0 max: 8 ``` ## Target `spec.target.namespace` is the Turbopuffer namespace the pipeline writes to. The gateway pipeline API owns document state, chunks, and vector writes for that target namespace. ## Pipeline id `spec.pipelineId` names the gateway pipeline (the queue) the worker stages into and scales on. It defaults to the resource name. Set it when multiple worker resources share one queue: the extract and embed stages of a [two-stage pipeline](/docs/api/pipelines) both set `pipelineId: products`. ## Source `spec.sourceRef` is intentionally open JSON for the external source that feeds the worker: SQS, Kafka, S3 events, a partner API, or a one-off migration source. The operator injects it into the worker pod verbatim as `HEVLAYER_SOURCE_REF`; the worker image owns source-specific behavior. See [Extract and chunk](/docs/api/pipelines#extract-and-chunk) for a worker reading it. ## Worker | Field | Purpose | | --- | --- | | `image` | Worker image. | | `computeClass` | `cpu` or `gpu`.
Defaults to `cpu`; when `scaling.pool` is omitted, the operator maps this to the stock `cpu` or `gpu` pool. | | `batchSize` | Work items per batch. | | `timeoutSeconds` | Worker call timeout. | | `podSpec` | Optional pod-level merge patch. | The operator creates one Deployment per Pipeline and injects: | Variable | Value | | --- | --- | | `HEVLAYER_PIPELINE_ID` | `spec.pipelineId`, defaulting to the resource name. | | `HEVLAYER_TARGET_NAMESPACE` | `spec.target.namespace`. | | `HEVLAYER_BASE_URL` | The gateway base URL. | | `HEVLAYER_SOURCE_REF` | `spec.sourceRef` as JSON, when set. | | `LAYER_GATEWAY_API_KEY` | Gateway bearer token. In `deriveFromStore` mode this is the default `VectorStore` credential; in `keys` mode it is the configured inbound worker key. | ## Scaling ```yaml scaling: pool: cpu mode: autoscale replicas: min: 0 max: 8 ``` `spec.scaling.pool`, when set, must name a pool in [`InfraRules/default`](/docs/kubernetes/scaling-crd). When omitted, the operator uses `worker.computeClass` to choose the stock `cpu` or `gpu` pool. Helm installs the well-known `cpu`, `cpu-large`, and `gpu` pools by default. `mode: autoscale` creates a KEDA `ScaledObject` backed by pipeline queue depth. `mode: fixed` pins the Deployment to `replicas.min`; `mode: disabled` scales it to zero. `spec.paused: true` also scales the worker to zero. ## Status Use the [pipeline status API](/docs/api/pipelines#wait-for-completion) for status: queue counts, stage progress, and worker state. The resource itself reports only managed object references and readiness conditions. --- # Function CRD Source: https://hevlayer.com/docs/kubernetes/function-crd The `Function` CRD is a User Defined Function (UDF) that runs over rows that already exist in an [Index](/docs/kubernetes/index-crd). It is the right shape for classifiers, enrichment, backfills, fan-out from an existing row, and deterministic re-upserts.
UDFs are best defined in YAML and invoked by the [layer CLI](/docs/cli#run-a-function). The operator creates worker resources; the gateway owns discovery, queueing, retries, leases, and completion markers. Workers own their data writes. Use a [Pipeline](/docs/kubernetes/pipeline-crd) when external data becomes rows in Layer. Use a Function when compute starts from rows that are already in Layer. ```yaml apiVersion: hevlayer.com/v1alpha1 kind: Function metadata: name: tag-products namespace: layer spec: targetNamespaces: - products inputs: - id - title version: v1 filter: - category - Eq - outdoor worker: image: ghcr.io/hev/tag-products:latest dispatch: pull computeClass: cpu batchSize: 32 timeoutSeconds: 30 schedule: discoveryIntervalSeconds: 300 leaseSeconds: 120 maxInFlightBatches: 8 maxConcurrentScans: 1 retry: maxAttempts: 8 initialBackoffSeconds: 5 maxBackoffSeconds: 300 triggers: - discovery scaling: pool: cpu mode: autoscale replicas: min: 0 max: 6 ``` ## Selection Use `targetNamespaces` for explicit namespaces. Use `indexSelector` when labels on `Index` resources should choose the namespaces. `filter` preserves arbitrary JSON, including array-form Turbopuffer filters. The operator stores the shape as-is; the gateway evaluates it during discovery after AND-ing it with the generated completion-marker predicate. Do not include a version-marker predicate in `filter`; the gateway creates that from `spec.version`. ## Worker | Field | Purpose | | --- | --- | | `image` | Worker image. | | `dispatch` | `pull` for SDK claim/poll workers, `push` for HTTP `/run` workers. | | `computeClass` | `cpu` or `gpu`. Defaults to `cpu`; when `scaling.pool` is omitted, the operator maps this to the stock `cpu` or `gpu` pool. | | `port` | Push-dispatch service port. | | `batchSize` | Rows per batch. | | `timeoutSeconds` | Worker call timeout. | | `podSpec` | Optional pod-level merge patch. 
| To apply the CR, register the gateway UDF, trigger discovery, and watch the queue with one command, use [`layer run -f`](/docs/cli#run-a-function). The worker pod receives `HEVLAYER_UDF_ID`, `HEVLAYER_BASE_URL`, `HEVLAYER_UDF_BATCH_SIZE`, `HEVLAYER_UDF_TIMEOUT_SECONDS`, `HEVLAYER_UDF_LEASE_SECONDS`, and `LAYER_GATEWAY_API_KEY`. The gateway bearer is sourced from the default `VectorStore` credential in `deriveFromStore` mode, or from the configured inbound worker key in `keys` mode. ## Simple classifier The Python client turns a normal function into the claim/process/complete loop. `output="tags"` is client-side metadata: the CRD does not declare an output attribute. `run_udf_worker` sends the returned value as a completion `attributes.tags` patch, and the gateway stamps the reserved completion marker in the same patch. The Go client drives the same worker protocol directly, as does the TypeScript client — claim a batch, process rows, report completions and failures. ```python import asyncio from hevlayer.udf import PermanentError, TransientError, run_udf_worker, udf @udf(inputs=["id", "title", "description"], output="tags", kind="tags") def tag_product(*, id: str, title: str | None, description: str | None) -> list[str]: if not title: raise PermanentError(f"{id}: missing title") try: text = f"{title} {description or ''}".lower() except TypeError as exc: raise TransientError(str(exc)) from exc tags: list[str] = [] if "wireless" in text: tags.append("wireless") if "waterproof" in text: tags.append("waterproof") return tags or ["uncategorized"] if __name__ == "__main__": asyncio.run(run_udf_worker(tag_product, udf_id="product-tags")) ``` ```go package main import ( "context" "os" "strings" hevlayer "github.com/hev/layer/clients/go" ) func tags(title, description string) []string { text := strings.ToLower(title + " " + description) var out []string if strings.Contains(text, "wireless") { out = append(out, "wireless") } if strings.Contains(text, "waterproof") { out = 
append(out, "waterproof") } if len(out) == 0 { out = []string{"uncategorized"} } return out } func main() { ctx := context.Background() udfID := os.Getenv("HEVLAYER_UDF_ID") layer := hevlayer.NewClient( hevlayer.WithBaseURL(os.Getenv("HEVLAYER_BASE_URL")), hevlayer.WithAPIKey(os.Getenv("LAYER_GATEWAY_API_KEY")), ) for { claimed, err := layer.ClaimUdfItems(ctx, udfID, &hevlayer.UdfClaimRequest{ WorkerID: "tag-products-0", Limit: 32, }) if err != nil { continue } var done []hevlayer.UdfCompleteItem var failed []hevlayer.UdfFailItem for _, item := range claimed.Items { title, _ := item.Input["title"].(string) description, _ := item.Input["description"].(string) if title == "" { failed = append(failed, hevlayer.UdfFailItem{ Namespace: item.Namespace, ID: item.ID, Kind: "permanent", Message: "missing title", }) continue } done = append(done, hevlayer.UdfCompleteItem{ Namespace: item.Namespace, ID: item.ID, Attributes: map[string]interface{}{"tags": tags(title, description)}, }) } if len(done) > 0 { layer.CompleteUdfItems(ctx, udfID, &hevlayer.UdfCompleteRequest{ WorkerID: "tag-products-0", Items: done, }) } if len(failed) > 0 { layer.FailUdfItems(ctx, udfID, &hevlayer.UdfFailRequest{ WorkerID: "tag-products-0", Items: failed, }) } } } ``` ```typescript import { Hevlayer } from "hevlayer"; function tags(title: string, description: string): string[] { const text = `${title} ${description}`.toLowerCase(); const out: string[] = []; if (text.includes("wireless")) out.push("wireless"); if (text.includes("waterproof")) out.push("waterproof"); return out.length ? out : ["uncategorized"]; } const udfId = process.env.HEVLAYER_UDF_ID!; const layer = new Hevlayer({ baseUrl: process.env.HEVLAYER_BASE_URL, apiKey: process.env.LAYER_GATEWAY_API_KEY, }); while (true) { const claimed = await layer.claimUdfItems(udfId, { worker_id: "tag-products-0", limit: 32, }); const done = []; const failed = []; for (const item of claimed.items) { const title = typeof item.input.title === "string" ? 
item.input.title : ""; const description = typeof item.input.description === "string" ? item.input.description : ""; if (!title) { failed.push({ namespace: item.namespace, id: item.id, kind: "permanent", message: "missing title", }); continue; } done.push({ namespace: item.namespace, id: item.id, attributes: { tags: tags(title, description) }, }); } if (done.length > 0) { await layer.completeUdfItems(udfId, { worker_id: "tag-products-0", items: done }); } if (failed.length > 0) { await layer.failUdfItems(udfId, { worker_id: "tag-products-0", items: failed }); } } ``` In Python, function parameters are keyword-only and named to match `inputs`; raise `TransientError` for retryable work and `PermanentError` for unrecoverable input. In Go and TypeScript, report the same split through `FailUdfItems` / `failUdfItems` with `kind: "transient"` or `kind: "permanent"`. ## GPU classifier More complicated classifiers (e.g. a vision-language classifier) may require a model to run on a GPU. ```yaml apiVersion: hevlayer.com/v1alpha1 kind: Function metadata: name: product-color namespace: layer spec: targetNamespaces: - amazon-products inputs: - id - image_url version: v1 worker: image: ghcr.io/hev/hev-shop-udf-product-color:latest dispatch: pull computeClass: gpu batchSize: 8 timeoutSeconds: 120 schedule: leaseSeconds: 300 maxInFlightBatches: 2 triggers: - discovery scaling: pool: gpu mode: autoscale replicas: min: 0 max: 2 ``` `worker.computeClass: gpu` defaults omitted `scaling.pool` to the `gpu` pool from [`InfraRules/default`](/docs/kubernetes/scaling-crd). 
The stock pool selects `layer.hev.dev/node-role=worker-gpu`, requests one NVIDIA GPU, and carries the worker and NVIDIA tolerations: ```yaml computePools: - name: gpu kind: gpu maxReplicasPerWorkload: 4 nodeSelector: layer.hev.dev/node-role: worker-gpu layer.hev.dev/compute: gpu tolerations: - key: layer.hev.dev/node-role operator: Equal value: worker-gpu effect: NoSchedule - key: nvidia.com/gpu operator: Exists effect: NoSchedule resources: requests: { memory: 4Gi, nvidia.com/gpu: "1" } limits: { memory: 10Gi, nvidia.com/gpu: "1" } ``` The worker loads the model once at startup and classifies per row. CLIP zero-shot classification labels each product image with its dominant color: ```python import asyncio import io import httpx import torch from PIL import Image from transformers import pipeline from hevlayer.udf import PermanentError, TransientError, run_udf_worker, udf COLORS = ["black", "white", "gray", "red", "blue", "green", "brown", "multicolor"] classifier = pipeline( "zero-shot-image-classification", model="openai/clip-vit-large-patch14", device="cuda" if torch.cuda.is_available() else "cpu", ) @udf(inputs=["id", "image_url"], output="color", kind="classification") def classify_color(*, id: str, image_url: str | None) -> str: if not image_url: raise PermanentError(f"{id}: missing image_url") try: resp = httpx.get(image_url, timeout=10.0, follow_redirects=True) resp.raise_for_status() image = Image.open(io.BytesIO(resp.content)).convert("RGB") except httpx.HTTPError as exc: raise TransientError(f"{id}: image fetch failed: {exc}") from exc except OSError as exc: raise PermanentError(f"{id}: undecodable image: {exc}") from exc scores = classifier(image, candidate_labels=COLORS) return scores[0]["label"] if __name__ == "__main__": asyncio.run(run_udf_worker(classify_color, udf_id="product-color")) ``` The worker image needs `torch`, `transformers`, `pillow`, and `httpx` alongside the `hevlayer` Python client. 
Bake the model weights into the image so autoscaled pods do not re-download them on every cold start.

Sizing for inference: keep `worker.batchSize` low and `worker.timeoutSeconds` high enough for one batch of forward passes, and make `schedule.leaseSeconds` outlast a full batch so claims do not reissue mid-inference. `replicas.min: 1` keeps a warm worker when model cold-start dominates; `min: 0` scales to zero between sweeps.

## Scaling

`spec.scaling` is the same scaling config [Pipelines use](/docs/kubernetes/pipeline-crd#scaling): a pool from `InfraRules/default`, a mode, and replica bounds. For Functions, `mode: autoscale` emits a KEDA `ScaledObject` triggered by `layer_udf_queue_depth`. Replica maxima above the pool's `maxReplicasPerWorkload` are rejected in status.

## Writeback

Workers own data writes. The common single-attribute case uses the Python client's sugar: `@udf(output="tags")` makes `run_udf_worker` send returned values as `attributes.tags` in the completion call; in Go (or over REST) the same thing is `attributes` on each completion item. The gateway applies those attributes and the reserved completion marker in one `patch_columns` write. Completion attributes must not use the reserved `_hevlayer_*` prefix.

Python workers that need more control can declare the `tpuf` parameter, write through the client, and return `None`; completion then stamps only the marker. Use deterministic IDs when a Function creates rows so at-least-once retries remain idempotent.

Deleting a Function garbage-collects operator-managed Kubernetes resources. It does not delete already-written attributes.
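
The deterministic-ID advice above can be sketched as follows. This is a minimal illustration under stated assumptions, not part of the `hevlayer` client: `child_id` and `DERIVED_ROWS_NAMESPACE` are hypothetical names, and the sketch derives each new row's ID from the Function name, the source row ID, and a per-row ordinal using `uuid.uuid5`, so a retried batch produces the same IDs again instead of duplicates.

```python
import json
import uuid

# Hypothetical namespace UUID for rows derived by Functions (an assumption,
# not a Layer constant). Any fixed UUID works as long as it never changes.
DERIVED_ROWS_NAMESPACE = uuid.uuid5(
    uuid.NAMESPACE_URL, "https://example.com/udf-derived-rows"
)


def child_id(function_name: str, source_id: str, ordinal: int) -> str:
    """Derive a stable ID for the ordinal-th row fanned out from source_id."""
    # JSON-encode the parts so different inputs can never collapse into
    # the same key string.
    key = json.dumps([function_name, source_id, ordinal])
    return str(uuid.uuid5(DERIVED_ROWS_NAMESPACE, key))


# The first attempt and a retry of the same work yield identical IDs.
first_attempt = [child_id("product-tags", "asin-B08N5WRWNW", i) for i in range(3)]
retry_attempt = [child_id("product-tags", "asin-B08N5WRWNW", i) for i in range(3)]
```

Because the ID depends only on its inputs, re-upserting after a lease expiry overwrites the same rows. Including the Function name keeps two Functions that fan out from the same source row from colliding.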
## Lifecycle

```sh
kubectl get function product-tags
kubectl describe function product-tags
layer udf get product-tags
kubectl patch function product-tags --type=merge -p '{"spec":{"paused":true}}'
kubectl patch function product-tags --type=merge -p '{"spec":{"paused":false}}'
curl -X POST -H "authorization: Bearer $LAYER_GATEWAY_API_KEY" \
  $LAYER_GATEWAY_URL/v2/udfs/product-tags/reset-failed
kubectl delete function product-tags
```

## Version markers

`spec.version` is the re-run safety rail and defaults to `v1`. On completion, the gateway stamps `_hevlayer_udf_<name>_v` with that version, normalizing hyphens in the Function name to underscores. For `metadata.name: product-color`, the marker is `_hevlayer_udf_product_color_v`. Discovery automatically looks for rows whose marker is missing, differs from `spec.version`, or has an expired `_hevlayer_udf_<name>_stale_after` marker. Bump `spec.version` when a model, taxonomy, or prompt changes.

## Tuning knobs

| Knob | What it bounds |
| --- | --- |
| `worker.batchSize` | Rows per worker batch. |
| `worker.timeoutSeconds` | Worker call timeout. |
| `schedule.leaseSeconds` | How long a claim is held before reissue. |
| `schedule.discoveryIntervalSeconds` | Time between discovery scan jobs. |
| `schedule.maxInFlightBatches` | Concurrent worker batches per UDF. |
| `schedule.maxConcurrentScans` | Concurrent namespace discovery jobs. |
| `retry.maxAttempts` | Tries before a row lands in `failed`. |

---

# Failure Modes

Source: https://hevlayer.com/docs/failure-modes

Layer strives to degrade gracefully: queries and document fetches served from Turbopuffer keep functioning when components around them fail. This page details the scenarios where that does not apply.

## Read

Reads route through the gateway, but a gateway outage does not take your queries dark.
The Python and Go SDKs fall through to Turbopuffer direct when the gateway is unreachable, so Turbopuffer-compatible queries keep serving rather than failing, minus the document cache, search history, and Layer's query enhancements (see [Client fall-through](#client-fall-through) below). Layer-only read paths (document fetch, warm jobs, pipeline and UDF status, snapshots, and search history) fail fast, because they depend on gateway-owned cache, queue, history, and consistency state. The document cache is stateless and can scale to zero with no disruption: document fetches fall through to origin (Turbopuffer, or S3 for snapshots) on a miss or cache outage, so a cache failure degrades latency, not availability. ## Write Writes also fall through to Turbopuffer direct when the gateway is unreachable (again, see [Client fall-through](#client-fall-through)); the durable upstream still accepts the row, but the write skips document-cache warming and pipeline staging until the gateway returns. ### Pipeline stop-writes The primary failure mode for writes through a healthy gateway is Aerospike stop-writes during a multi-stage pipeline job: staged documents stay warm in the cache but carry no vector data yet, and once that data exceeds the Aerospike drive allocation the cache rejects further writes. The pipeline does not stall. Each stage persists its chunk bodies to S3 before it touches the cache, and pipeline state lives in PostgreSQL, so the Aerospike write is best-effort: on stop-writes the gateway logs the skipped write and the stage still completes. Downstream chunk reads degrade to the S3 backing for as long as the cache is rejecting writes. Recovery is automatic. The Helm document cache restarts on stop-writes by default (`documentCache.autoRestartOnStopWrites: true`) and clears its Aerospike backing file on pod start (`documentCache.storage.resetOnStart: true`); the gateway reconnects in the background and refills the cache from S3 on demand. 
No pipeline work is lost — S3 and PostgreSQL are the durable recovery boundary and must stay healthy. Operator signals: - `layer_aerospike_op_duration_seconds{status="aerospike_stop_writes"}` — the stop-writes condition itself, the same series the [dashboard](/docs/dashboard) charts. - `hevlayer_cache_cold_responses_total` — reads being served from S3 backing instead of the cache while it recovers. - `hevlayer_document_cache_cold_starts_total` and `hevlayer_document_cache_cold_start_seconds` — the demand-triggered reconnect-and-refill cycle after the cache restarts. - Gateway warn logs `Aerospike chunk write failed (best-effort)` and `Aerospike chunk read failed; falling back to S3 backing`. ## Client fall-through When the gateway is unreachable, the SDKs retry the call against Turbopuffer directly for operations that need no Layer state — simple vector queries, writes, and raw Turbopuffer-compatible methods (schema, metadata, namespace listing). These calls succeed without the document cache, search history, or Layer's query enhancements, and set the perf `fallback` field to `turbopuffer_direct`. Fall-through requires Turbopuffer credentials (`TURBOPUFFER_API_KEY`, or `WithTurbopufferAPIKey` / `turbopuffer_api_key`); without them the original gateway error propagates unchanged. Fall-through is on by default. Disable it with `fallback_to_turbopuffer=False` on `AsyncHevlayer` or `WithFallbackToTurbopuffer(false)` on the Go client. For the exact list of which operations fall through and which fail fast, see [Client fall-through](/docs/api/introduction#client-fall-through) in the API introduction. --- # Layer CLI Source: https://hevlayer.com/docs/cli The `layer` CLI operates hevlayer from the terminal. It manages named environments, observes index, pipeline, and UDF state from the gateway, mints and revokes API keys, and runs Function manifests. 
Every read goes through the gateway API with an API key. Only `run` touches Kubernetes: it applies the Function CR, registers the UDF spec with the gateway, triggers discovery, and optionally watches until the queue drains. It is therefore the only command that needs a kube context; set it on the environment with `--kube-context`/`--kube-namespace` or per invocation with `--context`/`--kube-namespace`.

## Install

From the repository root:

```sh
go build -o layer ./apps/layer-cli
```

## Configuration

`layer` reads named environments from `~/.hevlayer/config.toml`. The directory is created with mode `0700`; the config file is written with mode `0600`.

```toml
active = "partner"

[envs.partner]
base_url = "https://aws-us-east-1.hevlayer.com"
api_key = "..."
kube_context = "partner-cluster"
kube_namespace = "hevlayer"

[envs.local]
base_url = "http://localhost:8080"
api_key = "dev"
kube_context = "kind-hevlayer"
```

Resolution order is:

| Priority | Source |
| --- | --- |
| 1 | Explicit flags such as `--base-url`, `--api-key`, `--context`, and `--kube-namespace` |
| 2 | `LAYER_BASE_URL`, `LAYER_API_KEY`, and the `HEVLAYER_` twins |
| 3 | Environment selected by `--env` or `LAYER_ENV` |
| 4 | Active environment in `~/.hevlayer/config.toml` |
| 5 | Built-in base URL default |

A shell exporting `LAYER_BASE_URL` or `LAYER_API_KEY` keeps the env-var-only behavior and does not need a config file. `--env` and `LAYER_ENV` select an environment for one invocation without changing the active environment.

| Flag | Environment | Default |
| --- | --- | --- |
| `--base-url` | `LAYER_BASE_URL`, `HEVLAYER_BASE_URL` | `https://aws-us-east-1.hevlayer.com` |
| `--api-key` | `LAYER_API_KEY`, `HEVLAYER_API_KEY` | none |
| `--env` | `LAYER_ENV` | active config env |
| `-o`, `--output` | none | `table` |

Output formats are `table`, `json`, and `names`.
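
The resolution order in the Configuration section can be sketched as a small function. This is an illustrative model of the priority table, not the CLI's actual implementation (the CLI is built with Go). `resolve_base_url` is a hypothetical name, and two details are assumptions: that `LAYER_BASE_URL` is checked before its `HEVLAYER_` twin, and that the `--env` flag wins over `LAYER_ENV`.

```python
from typing import Any, Mapping, Optional

# Built-in default from the flag table.
DEFAULT_BASE_URL = "https://aws-us-east-1.hevlayer.com"


def resolve_base_url(
    flag_base_url: Optional[str],
    flag_env: Optional[str],
    environ: Mapping[str, str],
    config: Mapping[str, Any],
) -> str:
    """Resolve the gateway base URL following the documented priority table."""
    # 1. An explicit --base-url flag wins.
    if flag_base_url:
        return flag_base_url
    # 2. Environment variables (LAYER_ before HEVLAYER_ is an assumption).
    for name in ("LAYER_BASE_URL", "HEVLAYER_BASE_URL"):
        value = environ.get(name)
        if value:
            return value
    # 3. Environment selected by --env or LAYER_ENV, then
    # 4. the active environment from the config file.
    envs = config.get("envs", {})
    selected = flag_env or environ.get("LAYER_ENV")
    for env_name in (selected, config.get("active")):
        base_url = envs.get(env_name, {}).get("base_url") if env_name else None
        if base_url:
            return base_url
    # 5. Built-in default.
    return DEFAULT_BASE_URL


config = {
    "active": "partner",
    "envs": {
        "partner": {"base_url": "https://aws-us-east-1.hevlayer.com"},
        "local": {"base_url": "http://localhost:8080"},
    },
}
local_url = resolve_base_url(None, "local", {}, config)
```

Here `local_url` resolves to the `local` environment's URL because no flag or `LAYER_BASE_URL` overrides it, and the active environment stays `partner` for later invocations.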
## Environments

```sh
layer env add partner --base-url https://aws-us-east-1.hevlayer.com \
  --api-key "$LAYER_API_KEY" --kube-context partner-cluster \
  --kube-namespace hevlayer
layer env use partner
layer env ls
layer env show partner -o json
layer env rm partner
```

`env add` prompts for missing values on a TTY. On a non-TTY, the required values must be supplied by flags. API keys are masked in `env ls` and `env show`.

## Run A Function

```sh
layer run -f tag-products.yaml
layer run -f tag-products.yaml --index amazon-products-staging
layer run -f tag-products.yaml --detach
layer run -f tag-products.yaml --rm
```

The input is a Kubernetes `Function` manifest. `--index` overrides `spec.targetNamespaces` with one target. `--context` selects a kubeconfig context; `--kube-namespace` selects the Kubernetes namespace for the Function CR. `--no-apply` skips the Kubernetes apply step for workers managed outside the operator.

`spec.version` is registered with the gateway as the Function completion marker version. Bump it before re-running a Function after changing a model, prompt, taxonomy, or worker write contract.

`--detach` returns after registration and discovery. Without `--detach`, the CLI polls UDF status until discovery has completed and `pending_count` and `processing_count` are both zero. A drained queue with failures exits non-zero.

`--rm` deletes the gateway registration and, unless `--no-apply` is set, the Function CR after the queue drains cleanly. A drain with failures leaves both in place so you can inspect them.

Watch a run from another terminal:

```sh
layer udf list
layer udf get product-tags --watch
```

`udf list` lists registered UDFs with pending, processing, failed, discovery sweep count, and indexed rate. `udf get` shows those fields for one UDF; `--watch` polls until `pending_count` and `processing_count` are both zero.
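
The drain condition that `layer run` waits on can be sketched as a polling loop. This is a rough model of the behavior described above, not the CLI's code: `wait_for_drain` and `get_status` are hypothetical names, `pending_count` and `processing_count` come from the text, and the `discovery_complete` and `failed_count` field names are assumptions made for illustration.

```python
import time
from typing import Callable, Mapping


def wait_for_drain(
    get_status: Callable[[], Mapping[str, object]],
    poll_seconds: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Poll UDF status until the queue drains; return a process-style exit code."""
    while True:
        status = get_status()
        drained = (
            bool(status.get("discovery_complete"))  # assumed field name
            and status["pending_count"] == 0
            and status["processing_count"] == 0
        )
        if drained:
            # A drained queue with failures exits non-zero.
            return 1 if status.get("failed_count", 0) else 0  # assumed field name
        sleep(poll_seconds)


# Simulated status responses: still discovering, then busy, then drained cleanly.
polls = iter([
    {"discovery_complete": False, "pending_count": 0, "processing_count": 0},
    {"discovery_complete": True, "pending_count": 3, "processing_count": 1},
    {"discovery_complete": True, "pending_count": 0, "processing_count": 0, "failed_count": 0},
])
exit_code = wait_for_drain(lambda: next(polls), sleep=lambda _seconds: None)
```

Requiring discovery to finish before checking the counts avoids declaring success on a queue that is empty only because nothing has been enqueued yet.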
## TUI

Bare `layer` on a TTY opens the read-only operations TUI (`layer browse` is the explicit spelling); on a non-TTY it prints usage and exits `2`. Press `i`/`f`/`p`/`k`/`e` to switch between indexes, functions, pipelines, keys, and environments from any view, `enter` to open a detail view, and `esc`/`q` to back out.

Every view has a non-interactive command twin with the same data: the TUI humanizes timestamps and sizes, while the commands emit raw values for scripting.

| TUI view | Command |
| --- | --- |
| Environments | `layer env ls` |
| Functions | `layer udf list` |
| Function detail | `layer udf get UDF_ID [--watch]` |
| Indexes | `layer index list` |
| Index detail | `layer index get NAME` |
| Pipelines | `layer pipeline list` |
| Pipeline detail | `layer pipeline get ID` |
| Keys | `layer keys ls` |
| Key detail | `layer keys get KEY_ID` |

The keys views are read-only like the rest of the TUI: minting and revoking stay in the commands.

## Keys

```sh
layer keys mint cohort-reader --owner acme \
  --entitle vectorstore.prod-turbopuffer=read \
  --namespaces "cohort-*" \
  --claim warehouse.prod-snowflake="notes:cohort:*:read"
layer keys ls
layer keys get cohort-reader
layer keys revoke cohort-reader
layer keys rm cohort-reader
```

`keys mint` creates the key through the gateway and prints the token once, alone on stdout, so `layer keys mint … | pbcopy` captures it; the metadata table goes to stderr. There is no way to print it again.

| Flag | Shape |
| --- | --- |
| `--entitle` | `TARGET[=SCOPE[+SCOPE]]`, repeatable. Targets are `vectorstore.<name>`, `warehouse.<name>`, or `layer`. |
| `--namespaces` | Upstream-namespace globs for the vectorstore entitlement, comma-separated. |
| `--claim` | `TARGET=STRING`, repeatable. Appends an opaque claim string to that target's entitlement. |
| `--expires-after` | Duration or `never`; defaults to `365d`. |

`--entitle layer=admin` mints an admin key.
For anything longer than a couple of flags, write the object instead: `layer keys mint -f key.yaml` takes the same `ApiKey` manifest `kubectl apply` does. `keys ls` and `keys get` show metadata only — key id, owner, phase, entitlement targets, expiry, last seen — never tokens or hashes. `revoke` is idempotent and keeps the record; `rm` hard-deletes it. All `keys` commands call the gateway key routes, which require a key with the `layer` entitlement at `admin` scope (or the bootstrap gateway key); no kube access is involved. ## Ask The Docs `layer ask` queries the committed docs digest with the `ask` CLI. It is keyless and local by default: from a checkout, it finds `site/.hev-ask`, prefers a sibling `../ask` source checkout, and falls back to the docs site's installed `@hevmind/ask` package or an `ask` binary on `PATH`. ```sh layer ask tree layer ask grep "warm cache" layer ask cat api/query layer ask glossary get watermark layer -o json ask tree ``` Use `--endpoint` to query a deployed hev ask endpoint instead of the local digest: ```sh layer ask --endpoint https://hevlayer.com/api/ask tree ``` ## Inspect An Index ```sh layer index get shop-products layer index get shop-products -o json ``` `index get` reports row count, size, schema summary, last write, stable watermark and lag, index (WAL) status, cache state, and snapshot history. Timestamps and sizes are raw (epoch-ms, bytes); `-o json` carries the full snapshot list. ## Pipelines ```sh layer pipeline list layer pipeline get product-images ``` `pipeline list` reads registered pipelines and fans out to the [pipeline status API](/docs/api/pipelines#wait-for-completion) for each one's live queue depth (`pending`, `processing`, `failed`, `rate/min`); `pipeline get` adds target namespace, distance metric, and created-at. A pipeline with no worker staged into it yet renders without queue counts rather than erroring. Reads need only an API key — no kube access. 
`layer push` is deferred to the managed build/dev-loop milestone.

---

# Dashboard

Source: https://hevlayer.com/docs/dashboard

The Layer dashboard is the operator UI that ships in-cluster alongside the gateway, as the `layer-dashboard` Deployment and Service. This page covers running it: the access it needs, how to reach it, how to gate it, and how to turn it off.

## Access it needs

The dashboard is read-mostly and backed by three sources, each with its own grant:

- **The gateway API**: the same endpoints customers use, plus the Prometheus-compatible metrics proxy at `/v2/metrics`. Authenticated with a gateway bearer (`LAYER_GATEWAY_API_KEY`). In `deriveFromStore` mode this is the default `VectorStore` credential; in `keys` mode it is the configured inbound worker key. It does not touch PostgreSQL, Aerospike, or VictoriaMetrics directly; metrics arrive through the gateway proxy.
- **The Kubernetes API**: reads `hevlayer.com` CRDs (VectorStores, Indexes, InfraRules) and the workload objects behind them (pods, deployments/statefulsets, HPAs, KEDA ScaledObjects, nodes) through RBAC bound to its ServiceAccount. `dashboard.kubeAccess.enabled` grants the read role; with it off the dashboard still runs but the cluster/scaling views show a "kube access not configured" banner. `dashboard.writeAccess.enabled` adds a narrow write role for operator controls (Index spec patches, Karpenter NodePool disruption); set it `false` for a read-only install.
- **AWS cost APIs**: the cost view reads the AWS Pricing API and CloudWatch via IRSA (`dashboard.serviceAccount.roleArn`). Attribution is infra-level only; there is no per-namespace cost modeling.

## Networking

The dashboard is an operator tool. **Reach it over a port-forward** rather than exposing it publicly:

```sh
kubectl port-forward -n <namespace> svc/layer-dashboard 8081:8081
```

Then open `http://localhost:8081`.
Customer workloads only ever receive the gateway base URL and credentials — never the dashboard. ## Basic auth HTTP Basic auth sits in front of every dashboard route and is **required** — the dashboard refuses to start without it. Set credentials through the chart: ```yaml dashboard: basicAuth: user: ops password: ``` The chart render fails if either field is blank while the dashboard is enabled. ## Disabling the dashboard The dashboard is optional. Disable it and the Deployment, Service, RBAC, and ingress all skip rendering: ```yaml dashboard: enabled: false ``` The gateway and transform runtime run unchanged without it; you lose only the operator UI. ## Operational notes The dashboard is intentionally read-mostly. Mutating actions (UDF pause, InfraRules or scaling edits) are gated through CRD apply or explicit confirm dialogs, and write access is governed separately by `dashboard.writeAccess.enabled`. --- # Introduction Source: https://hevlayer.com/docs/api/introduction import CodeTabs from "../../../components/docs/CodeTabs.astro"; import Upstream from "../../../components/docs/Upstream.astro"; Layer matches the Turbopuffer wire contract so existing clients keep working when you point them at the gateway. Where a route has an upstream equivalent, the site documents what Layer adds — not the upstream behavior itself. Follow the **Upstream docs** link on each page for the underlying request/response shape. ## Install There are four ways to call Layer: the Python client, the Go client, the TypeScript client, and the REST API itself. The clients are generated from `apps/layer-gateway/openapi.yaml`, so all four expose the same operations — every endpoint page on this site shows them side by side. Anything the clients can do, plain HTTP can do. 
```sh pip install hevlayer # Python 3.11+ go get github.com/hev/layer/clients/go # Go 1.22+ npm install hevlayer # Node 18+ ``` Point a client at the gateway: ```python import os from hevlayer import AsyncHevlayer client = AsyncHevlayer( base_url=os.environ["LAYER_GATEWAY_URL"], api_key=os.environ["LAYER_GATEWAY_API_KEY"], ) ``` ```go import ( "os" hevlayer "github.com/hev/layer/clients/go" ) client := hevlayer.NewClient( hevlayer.WithBaseURL(os.Getenv("LAYER_GATEWAY_URL")), hevlayer.WithAPIKey(os.Getenv("LAYER_GATEWAY_API_KEY")), ) ``` ```typescript import { Hevlayer } from "hevlayer"; const client = new Hevlayer({ baseUrl: process.env.LAYER_GATEWAY_URL, apiKey: process.env.LAYER_GATEWAY_API_KEY, }); ``` ```bash curl "$LAYER_GATEWAY_URL/v2/namespaces" \ -H "Authorization: Bearer $LAYER_GATEWAY_API_KEY" ``` Code examples across these pages assume this `client` — and in Go, a `ctx context.Context`. The cURL tab on each page is the bare REST contract; any HTTP stack works the same way. ## Authentication Every request carries `Authorization: Bearer `. The gateway accepts two kinds of bearer: - **The store key.** The default `VectorStore` credential (the Turbopuffer key you already own) is accepted as an admin bearer. This is the drop-in default: point an existing client at the gateway and keep your key. No setup, full access. - **A minted key.** Admin can mint keys scoped to a set of namespaces crossed with `read`/`write` — hand one to a team or a service without exposing the rest of the store. Minted keys are gateway-only and never work against the upstream directly. See [API keys](/docs/api/keys). Routes are classified `read`, `write`, or `admin`; each endpoint page notes anything beyond the obvious (GET/query-shaped routes are `read`, namespace writes are `write`, Pipeline/Function/key management is `admin`). A request past a key's scope or namespace grant answers 403 with the reason named. 
Connection environment variables: | Variable | Purpose | | --- | --- | | `LAYER_GATEWAY_URL` | Base URL of the gateway. | | `LAYER_GATEWAY_API_KEY` | Bearer token sent on every gateway request. In `deriveFromStore` mode this is the default `VectorStore` credential; in `keys` mode it is one of the configured inbound keys. | | `TURBOPUFFER_API_KEY` | Optional direct fallback key for Turbopuffer-compatible SDK calls when the gateway is unreachable. | | `TURBOPUFFER_API_URL` | Optional direct fallback base URL; defaults to `https://aws-us-east-1.turbopuffer.com`. | Additional language targets are added through the SDK harness rather than maintained by hand. ## Client fall-through The Python, Go, and TypeScript SDKs can fall through to Turbopuffer direct when the gateway is unreachable. The fallback is limited to calls that can be satisfied without Layer state: simple vector queries and raw Turbopuffer-compatible methods such as `write_namespace` / `WriteNamespace` / `writeNamespace`, `query_turbopuffer_namespace` / `QueryTurbopufferNamespace` / `queryTurbopufferNamespace`, and namespace schema/listing calls. The clients emit a log warning and set the perf fallback field to `turbopuffer_direct` when perf collection is enabled. Fetches, warm jobs, pipelines, UDFs, `nearest_to_id` queries, and other Layer-only workflows still fail fast because they depend on gateway-owned cache, queue, history, or consistency state. Set `fallback_to_turbopuffer=False` on `AsyncHevlayer`, or `WithFallbackToTurbopuffer(false)` on the Go client, or `fallbackToTurbopuffer: false` on the TypeScript client, to disable direct fallback. ## Enhancements to upstream routes Each of the routes below is wire-compatible with Turbopuffer. The body of each section describes only what Layer overlays on top. ### Write — `POST /v2/namespaces/{ns}` Upstream contract for upsert, delete, and `patch_rows`. - Best-effort Aerospike document-cache mirror before explicit-id upstream writes. 
- Server-stamped `_hevlayer_upserted_at` on every upsert and patch, which powers the consistency watermark on the query path. - `_hevlayer_*` attributes are reserved — writes to them are rejected. Page: [Write](/docs/api/write). ### Query — `POST /v2/namespaces/{ns}/query` Upstream contract for vector and FTS queries — request shape, ranking, filters, attribute selection. - Stable reads via an injected `_hevlayer_upserted_at <= watermark` predicate while the upstream index is `updating`. - One-shot 429 retry with the watermark filter forced on, for queries that race a write storm. - `x-layer-stable-as-of` returned on stable-read responses so callers can correlate freshness across reads. Page: [Query](/docs/api/query). ### Metadata — `GET /v2/namespaces/{ns}/metadata` Upstream contract for namespace metadata — schema, row count, index status, timestamps. - Proxied upstream verbatim, then enriched with a `layer` block containing `stable_as_of` and `is_stable`. Page: [Namespace metadata](/docs/api/namespace-metadata). ### Cache warm hint — `GET /v1/namespaces/{ns}/hint_cache_warm` Upstream contract for the cache warm hint. - With no query parameters: a raw upstream passthrough, response returned verbatim. - With any warm option supplied: forwards the hint upstream and runs Layer-side warm steps — a warm job to backfill the Aerospike document cache from origin, plus a mirror of the latest S3 snapshot body into Aerospike. Each step is independently toggleable per request. Page: [Warm cache](/docs/api/warm-cache). ## Cross-cutting conventions These apply to every endpoint Layer proxies, whether the route is upstream-compatible or Layer-only. - **`_hevlayer_*` reserved.** Document attributes prefixed with `_hevlayer_` are reserved for the proxy layer. Writing to them is a validation error; reading them is fine when explicitly requested. 
The gateway stamps `_hevlayer_upserted_at` itself on every upsert and patch — a caller-supplied value is ignored and overwritten with the server's epoch-ms watermark. - **Hard vs soft failures.** Turbopuffer write/query failures are hard failures and return 5xx. Aerospike document-cache failures are soft and never block the response. - **`x-layer-cache` header.** Fetch responses include `hit`, `miss`, or `miss-on-error` so callers can distinguish a cold cache from an outage. - **Response headers.** Reads that go through the watermark path include `x-layer-stable-as-of`; query pagination uses `x-layer-next-cursor`. See [Response headers](/docs/api/response-headers). ## Compatibility posture Layer aims to be a drop-in for existing Turbopuffer clients. Routes that the upstream does not implement are namespaced under `/v2/` and do not shadow upstream behavior. If a Turbopuffer client sends a request to a route Layer doesn't proxy, the gateway returns 404 — it does not silently re-route to an upstream that might handle it differently. --- # Write & Stage Source: https://hevlayer.com/docs/api/write import Upstream from "../../../components/docs/Upstream.astro"; import CodeTabs from "../../../components/docs/CodeTabs.astro"; Writes are wire-compatible with the upstream `POST /v2/namespaces/{ns}` endpoint. The request body (upserts, deletes, patches, and filter writes, combined in one request) is documented upstream. The sections below are what Layer adds on top. Send native write bodies with `write_namespace`. Layer stamps every row-producing write with `_hevlayer_upserted_at` and mirrors it to the document cache. The stamp is what holds the [read watermark](/docs/api/query); the full set of reserved attributes Layer manages on a row lives in the [document model](/docs/document-model). 
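
Because reserved attribute names are rejected on the write path, a client can check for them before sending a body. The sketch below is a hypothetical pre-flight helper, not part of the `hevlayer` client: `reserved_attribute_names` is an invented name, and it assumes row-oriented upserts where each row is a dict of attribute name to value. It does not replace the gateway's own validation.

```python
from typing import Any, Iterable, List, Mapping

RESERVED_PREFIX = "_hevlayer_"


def reserved_attribute_names(rows: Iterable[Mapping[str, Any]]) -> List[str]:
    """Return the sorted reserved attribute names found in row-oriented upserts."""
    found = set()
    for row in rows:
        for name in row:
            if name.startswith(RESERVED_PREFIX):
                found.add(name)
    return sorted(found)


rows = [
    {"id": "asin-B08N5WRWNW", "title": "Wireless noise-cancelling headphones"},
    {"id": "asin-B07PXGQC1Q", "title": "Speaker", "_hevlayer_upserted_at": 1715600400000},
]
offending = reserved_attribute_names(rows)
if offending:
    print(f"remove reserved attributes before writing: {offending}")
```

Running the check locally surfaces the problem with the attribute names attached, instead of waiting for a validation error from the gateway.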
## Status

Layer validates the body before forwarding and can fail independently of Turbopuffer, so the write path carries a few statuses a plain proxy wouldn't:

- **200 OK**: applied upstream and stamped.
- **422 Unprocessable Entity**: Layer rejected the body before forwarding because it contained no recognized native write operation, a reserved `_hevlayer_*` attribute name, or a removed custom-write key. The body is a Layer error (`{ "error": "validation_error", … }`), not a Turbopuffer one.
- **Upstream passthrough**: any non-2xx response Turbopuffer returns is relayed verbatim, including a failed conditional write (`upsert_condition`, `patch_condition`, `delete_condition`).
- **502 Bad Gateway**: Layer could not reach Turbopuffer (`{ "error": "upstream_error", … }`); the write did not apply.

## Stage

Stage caches a document before it's upserted upstream into your vector store. That O(1) read/write is especially useful for queuing chunks in a [two-stage pipeline](/docs/api/pipelines), where a CPU worker stages chunks and a GPU worker reads them back to write vectors. Staged documents are ephemeral until they're upserted, though: a Layer document cache outage loses anything still staged.
```python await client.put_pipeline_document_chunks("product-images", "asin-B08N5WRWNW", { "chunks": [ {"id": "asin-B08N5WRWNW-0", "text": "Wireless noise-cancelling headphones"}, {"id": "asin-B08N5WRWNW-1", "text": "40-hour battery life", "metadata": {"page": 2}}, ], }) ``` ```go client.PutPipelineDocumentChunks(ctx, "product-images", "asin-B08N5WRWNW", &hevlayer.PutChunksRequest{ Chunks: []hevlayer.Chunk{ {ID: "asin-B08N5WRWNW-0", Text: "Wireless noise-cancelling headphones"}, {ID: "asin-B08N5WRWNW-1", Text: "40-hour battery life", Metadata: map[string]interface{}{"page": 2}}, }, }) ``` ```typescript await client.putPipelineDocumentChunks("product-images", "asin-B08N5WRWNW", { chunks: [ { id: "asin-B08N5WRWNW-0", text: "Wireless noise-cancelling headphones" }, { id: "asin-B08N5WRWNW-1", text: "40-hour battery life", metadata: { page: 2 } }, ], }); ``` ```bash curl -X PUT "$LAYER_GATEWAY_URL/v2/pipelines/product-images/documents/asin-B08N5WRWNW" \ -H "Authorization: Bearer $LAYER_GATEWAY_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "chunks": [ {"id": "asin-B08N5WRWNW-0", "text": "Wireless noise-cancelling headphones"}, {"id": "asin-B08N5WRWNW-1", "text": "40-hour battery life", "metadata": {"page": 2}} ] }' ``` Staging stores chunks in the Aerospike document cache and marks the document `pending`. Re-staging the same document ID replaces the chunks and resets state to `pending`. The full pipeline API is documented under [Pipelines](/docs/api/pipelines). --- # Query & Fetch Source: https://hevlayer.com/docs/api/query import Upstream from "../../../components/docs/Upstream.astro"; import CodeTabs from "../../../components/docs/CodeTabs.astro"; Query response bodies are wire-compatible with the upstream `POST /v2/namespaces/{ns}/query` endpoint. Layer metadata is reported in `x-layer-*` response headers. ## Stable reads Layer uses the same query syntax as upstream but defaults to [stable reads](/docs/concepts#control-loops). 
Every response carries an `x-layer-stable-as-of` watermark: the point the upstream index is known to be caught up to. A query issued right after an upsert never returns partially-indexed rows and never 429s under write pressure, so derived views like [facets](/docs/api/snapshots) and [counts](/docs/api/scans) stay in sync with your index. ```http HTTP/1.1 200 OK x-layer-stable-as-of: 1715600400000 {"rows":[{"id":"asin-B08N5WRWNW","$dist":0.42,"title":"..."}]} ``` This is achieved by: 1. Queries run at `consistency=eventual` upstream, so they never block on indexing. 2. A [control loop](/docs/concepts#control-loops) polls each registered namespace's `index.status` and records the latest status plus, when stable, a watermark equal to `poll_start - safety_margin`. Cold or updating namespaces use the fast polling interval; stable namespaces back off to the stable interval until the next write re-arms the fast tier. 3. Per-query decision: - `Updating` → inject a hidden `_hevlayer_upserted_at <= watermark` predicate so the read never sees partially-indexed rows. - `Stable` or `Unknown` → run without the predicate. The upstream index is caught up (or no contrary evidence exists). 4. On a 429 to an unfiltered query, Layer retries once with the watermark filter forced on. Responses report `x-layer-stable-as-of` (epoch ms) when the watcher has a watermark for the namespace. It is omitted on a cold-start gateway that has not yet observed a stable poll. Paginated single-query responses return the next page token in `x-layer-next-cursor`; pass that value back as `cursor` in the next request body. Stable-read behavior is set per namespace with the `consistency` field on the [Index CRD](/docs/kubernetes/index-crd). Two gateway tunables control the watcher: | Variable | Default | Purpose | | --- | --- | --- | | `CONSISTENCY_POLL_INTERVAL_MS` | 1000 | Fast cadence for cold and updating namespaces. 
| | `CONSISTENCY_STABLE_POLL_INTERVAL_MS` | 60000 | Slow cadence for namespaces last observed stable. Set equal to the fast interval to restore one uniform cadence. | | `CONSISTENCY_SAFETY_MARGIN_MS` | 500 | Cushion between poll time and watermark to cover in-flight upserts. | ## Query by id Pass `nearest_to_id` in place of `vector` to rank by stored document vectors instead of a raw query vector — exactly one of the two is required. `nearest_to_id` takes an **array of document ids**: the gateway resolves each id's vector (document cache first, Turbopuffer on miss with a cache backfill) and averages them component-wise into a single centroid, then ranks nearest neighbors to that centroid. Pass one id to rank by a single document; pass several to get "more like these" over a set of seeds. ```python response = await client.query_namespace("products", { "nearest_to_id": ["asin-B08N5WRWNW", "asin-B07PXGQC1Q"], "top_k": 10, "include_attributes": ["title", "category"], }) ``` ```go response, err := client.QueryNamespace(ctx, "products", &hevlayer.QueryRequest{ NearestToID: []string{"asin-B08N5WRWNW", "asin-B07PXGQC1Q"}, TopK: 10, IncludeAttributes: []string{"title", "category"}, }) ``` ```typescript const response = await client.queryNamespace("products", { nearest_to_id: ["asin-B08N5WRWNW", "asin-B07PXGQC1Q"], top_k: 10, include_attributes: ["title", "category"], }); ``` ```bash curl -X POST "$LAYER_GATEWAY_URL/v2/namespaces/products/query" \ -H "Authorization: Bearer $LAYER_GATEWAY_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "nearest_to_id": ["asin-B08N5WRWNW", "asin-B07PXGQC1Q"], "top_k": 10, "include_attributes": ["title", "category"] }' ``` | Outcome | Status | | --- | --- | | Every id resolved (cache or origin) | 200, ranked results | | Any id has no stored vector anywhere | 404 (names the missing ids) | | `nearest_to_id` empty, or both/neither of `vector` / `nearest_to_id` | 422 | The centroid is an unweighted mean, so seed ids contribute equally 
regardless of how many you pass. All resolved vectors share the namespace's dimensionality, so no reconciliation is needed across seeds. This fuses the seeds into one ranking; to run several *independent* rankings in a single request, see [multi-query](#multi-query). ## Multi-query `nearest_to_id` fuses several seeds into a **single** ranking. To run several **independent** queries in one round trip, each with its own ranking, post a `queries` array. The response is a parallel `results` array: one ranked result set per query, in request order. Multi-query is wire-compatible with the upstream multi-query overload. The request shape and per-leg `rank_by` vocabulary are documented upstream. Layer wraps non-fused batches so every leg reads the same stable cut and the response body remains upstream-shaped: `{ "results": [{ "rows": ... }] }`. ```python batch = await client.multi_query_turbopuffer_namespace("products", { "queries": [ {"rank_by": ["vector", "ANN", [0.1, 0.2, 0.3]], "top_k": 10}, {"rank_by": ["title", "BM25", "wireless earbuds"], "top_k": 10}, ], }) # batch.results[0].rows ranked by vector; batch.results[1].rows by text ``` ```go batch, err := client.MultiQueryTurbopufferNamespace(ctx, "products", &hevlayer.TurbopufferMultiQueryRequest{ Queries: []hevlayer.TurbopufferQueryRequest{ {"rank_by": []any{"vector", "ANN", []float64{0.1, 0.2, 0.3}}, "top_k": 10}, {"rank_by": []any{"title", "BM25", "wireless earbuds"}, "top_k": 10}, }, }) ``` ```typescript const batch = await client.multiQueryTurbopufferNamespace("products", { queries: [ { rank_by: ["vector", "ANN", [0.1, 0.2, 0.3]], top_k: 10 }, { rank_by: ["title", "BM25", "wireless earbuds"], top_k: 10 }, ], }); // batch.results[0].rows ranked by vector; batch.results[1].rows by text ``` ```bash curl -X POST "$LAYER_GATEWAY_URL/v2/namespaces/products/query?stainless_overload=multiQuery" \ -H "Authorization: Bearer $LAYER_GATEWAY_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "queries": [ {"rank_by": 
["vector", "ANN", [0.1, 0.2, 0.3]], "top_k": 10}, {"rank_by": ["title", "BM25", "wireless earbuds"], "top_k": 10} ] }' ``` All legs in a non-fused batch share one `x-layer-stable-as-of` value. A leg may use native `rank_by`, or the Layer `vector` / `nearest_to_id` single-query shape; `nearest_to_id` is resolved before the leg is sent upstream. Batches must contain 2 to 16 legs. `cursor` is rejected at the top level and per leg because pagination is single-query only. When `rerank_by` is present, Layer treats the request as an upstream fused query and passes the body through unchanged. Reach for multi-query when you genuinely need N rankings — distinct user queries batched into one round trip, or hybrid retrieval fused upstream with RRF. Reach for `nearest_to_id` when many seeds should collapse into one "more like these" ranking. To get typo-tolerant text search without building the fused query yourself, see [hybrid text fusion](#hybrid-text-fusion). ## Hybrid text fusion BM25 misses typos and morphological variants; fuzzy matching alone loses the relevance signal BM25 provides. `HybridText` runs both in one request: the gateway tokenizes your input string, expands it into one BM25 leg plus one fuzzy leg per token, and Turbopuffer fuses the legs with reciprocal rank fusion (RRF). One expression in, typo-tolerant ranked results out. `HybridText` is a Layer-only `rank_by` spelling on the existing query route — no new endpoint, no client changes beyond the expression. The gateway tokenizes with [`alyze`](https://github.com/turbopuffer/alyze), Turbopuffer's own open-source tokenizer and the same code that segmented your text at index time, so query tokens match index terms by construction. 
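To make the expansion concrete, here is a rough sketch of the planning step, following the tokenization policy and option defaults documented further down this page. It is an illustration under assumptions, not the gateway's code: the regex split is only a stand-in for `alyze`'s UAX #29 segmentation, order-preserving dedupe is assumed, and `plan_hybrid_text` with its return shape is a hypothetical helper rather than a gateway or SDK API. It reports leg counts only and does not build upstream query syntax.

```python
import re

MAX_TOKENS = 15  # 15 fuzzy legs + 1 BM25 leg = 16, the upstream subquery limit


def plan_hybrid_text(text: str, top_k: int, fuzziness="auto") -> dict:
    """Approximate the HybridText expansion plan for one input string."""
    # Stand-in for UAX #29 word segmentation: lowercase, keep word runs.
    words = re.findall(r"\w+", text.lower())
    # Drop tokens shorter than 2 characters, then dedupe (order kept: assumption).
    tokens = list(dict.fromkeys(w for w in words if len(w) >= 2))
    # Only tokens cut by the cap count as dropped.
    dropped = max(0, len(tokens) - MAX_TOKENS)
    tokens = tokens[:MAX_TOKENS]
    if not tokens:
        raise ValueError("input yields zero tokens (the gateway returns 422)")

    def edit_distance(token: str) -> int:
        # "auto": 1 for tokens of 5 characters or fewer, 2 for longer tokens.
        if fuzziness == "auto":
            return 1 if len(token) <= 5 else 2
        return fuzziness

    return {
        "tokens": tokens,
        "tokens_dropped": dropped,
        "fuzzy_distance": {t: edit_distance(t) for t in tokens},
        "legs": len(tokens) + 1,  # one fuzzy leg per token plus one BM25 leg
        "per_leg_limit": min(max(5 * top_k, 50), 200),  # clamp(5 x top_k, 50, 200)
    }
```

Calling `plan_hybrid_text("conection timout kubernets", top_k=10)` yields three tokens, four legs, and a `per_leg_limit` of 50, which lines up with the `hybrid` echo block shown in the Response section.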
```python response = await client.query_namespace("support-tickets", { "rank_by": ["content", "HybridText", "conection timout kubernets"], "top_k": 10, "filters": ["tenant", "Eq", "t-42"], "include_attributes": ["content", "title"], }) ``` ```go response, err := client.QueryNamespace(ctx, "support-tickets", &hevlayer.QueryRequest{ RankBy: []any{"content", "HybridText", "conection timout kubernets"}, TopK: 10, Filters: []any{"tenant", "Eq", "t-42"}, IncludeAttributes: []string{"content", "title"}, }) ``` ```typescript const response = await client.queryNamespace("support-tickets", { rank_by: ["content", "HybridText", "conection timout kubernets"], top_k: 10, filters: ["tenant", "Eq", "t-42"], include_attributes: ["content", "title"], }); ``` ```bash curl -X POST "$LAYER_GATEWAY_URL/v2/namespaces/support-tickets/query" \ -H "Authorization: Bearer $LAYER_GATEWAY_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "rank_by": ["content", "HybridText", "conection timout kubernets"], "top_k": 10, "filters": ["tenant", "Eq", "t-42"], "include_attributes": ["content", "title"] }' ``` An optional fourth tuple element tunes the expansion. Defaults: ```json ["content", "HybridText", "conection timout kubernets", { "fuzziness": "auto", "rank_constant": 60, "per_leg_limit": null }] ``` | Option | Default | Meaning | | --- | --- | --- | | `fuzziness` | `"auto"` | Max edit distance for fuzzy legs. `"auto"` maps per token: 1 for tokens of 5 characters or fewer, 2 for longer tokens. Fixed `0`, `1`, or `2` applies to all tokens. | | `rank_constant` | `60` | Turbopuffer's RRF constant, passed through verbatim. Integer > 0. | | `per_leg_limit` | `clamp(5 × top_k, 50, 200)` | How deep each leg retrieves before fusion. Integer > 0. | | `threads` | `Index.spec.scan.threads`, else `8` | Maximum concurrent upstream requests when the gateway scatter/gathers the expansion across a [sharded](/docs/concepts#scattergather) namespace — the same fan-out control as [scans](/docs/api/scans). 
Clamped to active shards. No effect on unsharded namespaces, where the expansion is a single fused upstream call. | ### Tokenization The input string becomes tokens under a fixed, documented policy: 1. Split on Unicode (UAX #29) word boundaries and lowercase, using `alyze` — the code behind Turbopuffer's production `word_v4` tokenizer. Punctuation-only tokens never survive the split. 2. Drop tokens shorter than 2 characters. 3. Dedupe. 4. Cap at 15 tokens (15 fuzzy legs + 1 BM25 leg = 16, the upstream subquery limit). Tokens cut by the cap are counted in `tokens_dropped`. Stemming, stopword removal, and language detection are not applied. The input must yield at least one token; one token is fine (that is still two legs, the RRF minimum). ### Response Results are the upstream RRF-fused list. A `hybrid` block echoes the effective expansion so defaults are never invisible: ```json { "rows": [ { "id": "ticket-4117", "$score": 0.0639, "content": "...", "title": "Connection timeout on Kubernetes ingress" } ], "hybrid": { "tokens": ["conection", "timout", "kubernets"], "tokens_dropped": 0, "fuzziness": "auto", "rank_constant": 60, "legs": 4, "per_leg_limit": 50 } } ``` | Field | Meaning | | --- | --- | | `$score` | Upstream RRF score. Comparable **within** a response, not across requests — do not threshold on it. | | `tokens` | Tokens that produced fuzzy legs, post-policy. | | `tokens_dropped` | Tokens removed by the 15-token cap (not by the length or punctuation rules). | | `legs` | Total subqueries sent upstream (fuzzy legs + 1 BM25 leg). | The `hybrid` block appears only on `HybridText` responses. On sharded namespaces it also reports the effective `threads` fan-out width. Requests without a `HybridText` expression, including native multi-query + `rerank_by` bodies, keep their upstream-shaped responses byte-for-byte. ### Semantics - **One round trip.** The expansion is a single upstream multi-query fused by `rerank_by: ["RRF", ...]`. 
Layer implements no fusion math and does not reorder results. - **One consistency cut.** Request-level `filters` are replicated to every leg, and the [stable-read](#stable-reads) watermark predicate is injected into every leg from a single read — all legs see the same cut. Responses carry `x-layer-stable-as-of` as usual. - **All-or-nothing.** Upstream multi-query has no partial results; any leg failing fails the request. - **Replay as a unit.** The query logs to [search history](/docs/api/search-history) as one entry carrying the `HybridText` expression, so replaying it reproduces the whole expansion. ### Validation All return `422`: | Condition | Why | | --- | --- | | Input yields zero tokens under the policy | Nothing to expand. | | `cursor` present | Fused scores do not form the monotone bands pagination relies on. | | `HybridText` inside a `queries` array | The expansion is one multi-query deep by construction. | | `fuzziness` not in `"auto" \| 0 \| 1 \| 2`; `rank_constant` ≤ 0; `per_leg_limit` ≤ 0; `threads` < 1 | Out of range. | To let the gateway pick between hybrid text and semantic retrieval per query, see [query routing](#query-routing). ## Query routing Real search boxes receive both `"timout"` and `"why do pods lose their connection during deploys"`. The first wants [hybrid text fusion](#hybrid-text-fusion); the second wants semantic retrieval — lexical legs add noise on long conversational input, and ANN underperforms on short identifier-shaped tokens. `Auto` is a Layer-only `rank_by` spelling that makes that call per query, so the branch doesn't live ad hoc in your application code. The gateway never embeds. The route is chosen from the shape of the input alone, and when the chosen route needs a query vector the request didn't include, the response is the routing decision instead of results — your application embeds and re-issues with the route forced. Short keyword traffic executes immediately and never pays for an embedding. 
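As a rough client-side mental model of that decision, the sketch below applies the v1 thresholds listed under Routing policy to a token count and reports whether the request would execute or defer. `decide_route` is a hypothetical helper for illustration, not part of the SDK; the gateway makes this decision itself, and the token count is assumed to come from the tokenizer policy.

```python
from typing import Optional, Sequence


def decide_route(token_count: int, vector: Optional[Sequence[float]] = None) -> dict:
    """Mirror the v1 policy: route from token count, execution from vector presence."""
    if token_count < 1:
        raise ValueError("input yields zero tokens (the gateway returns 422)")
    if token_count <= 2:
        route = "hybrid_text"
    elif token_count >= 8:
        route = "semantic"
    else:
        route = "fused"
    # The vector never changes the route, only whether it runs in this request.
    executed = route == "hybrid_text" or vector is not None
    return {"route": route, "policy": "v1", "tokens": token_count, "executed": executed}
```

An eight-token input without a vector comes back as `route: "semantic"` with `executed: False`, which is the deferral case the client examples handle by embedding and re-issuing.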
```python response = await client.query_namespace("support-tickets", { "rank_by": ["content", "Auto", user_input], "top_k": 10, "filters": ["tenant", "Eq", "t-42"], }) if not response.routing.executed: vector = await embed(user_input) # your model or API response = await client.query_namespace("support-tickets", { "rank_by": ["content", "Auto", user_input, { "route": response.routing.route, "vector": vector, }], "top_k": 10, "filters": ["tenant", "Eq", "t-42"], }) ``` ```go response, err := client.QueryNamespace(ctx, "support-tickets", &hevlayer.QueryRequest{ RankBy: []any{"content", "Auto", userInput}, TopK: 10, Filters: []any{"tenant", "Eq", "t-42"}, }) if err == nil && !response.Routing.Executed { vector := embed(userInput) // your model or API response, err = client.QueryNamespace(ctx, "support-tickets", &hevlayer.QueryRequest{ RankBy: []any{"content", "Auto", userInput, map[string]any{ "route": response.Routing.Route, "vector": vector, }}, TopK: 10, Filters: []any{"tenant", "Eq", "t-42"}, }) } ``` ```typescript let response = await client.queryNamespace("support-tickets", { rank_by: ["content", "Auto", userInput], top_k: 10, filters: ["tenant", "Eq", "t-42"], }); if (!response.routing.executed) { const vector = await embed(userInput); // your model or API response = await client.queryNamespace("support-tickets", { rank_by: ["content", "Auto", userInput, { route: response.routing.route, vector, }], top_k: 10, filters: ["tenant", "Eq", "t-42"], }); } ``` ```bash # First request: no vector. Executes lexically, or returns the decision. 
curl -X POST "$LAYER_GATEWAY_URL/v2/namespaces/support-tickets/query" \ -H "Authorization: Bearer $LAYER_GATEWAY_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "rank_by": ["content", "Auto", "why do pods lose their connection during deploys"], "top_k": 10, "filters": ["tenant", "Eq", "t-42"] }' # Routed semantic without a vector, so the body is the decision, not rows: # {"rows": [], "routing": {"route": "semantic", "policy": "v1", "tokens": 8, "executed": false}} # Embed, then re-issue with the route forced: curl -X POST "$LAYER_GATEWAY_URL/v2/namespaces/support-tickets/query" \ -H "Authorization: Bearer $LAYER_GATEWAY_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "rank_by": ["content", "Auto", "why do pods lose their connection during deploys", { "route": "semantic", "vector": [0.0012, -0.043] }], "top_k": 10, "filters": ["tenant", "Eq", "t-42"] }' ``` ### Routing policy The v1 policy reads the token count of the input under the same [tokenizer policy](#tokenization) as hybrid text fusion: | Tokens | Route | Runs | | --- | --- | --- | | ≤ 2 | `hybrid_text` | The [hybrid text fusion](#hybrid-text-fusion) expansion. | | ≥ 8 | `semantic` | ANN over the supplied query vector. | | 3 – 7 | `fused` | Both, merged upstream by RRF. | Vector availability never changes which route is chosen — only whether it executes in this request. `hybrid_text` always executes; `semantic` and `fused` execute when the request supplies a `vector` and defer otherwise. The policy is versioned (`"policy": "v1"`) so threshold changes are visible in [search history](/docs/api/search-history). ### Options The optional fourth tuple element: | Option | Default | Meaning | | --- | --- | --- | | `route` | `"auto"` | Force `"hybrid_text"`, `"semantic"`, or `"fused"` instead of applying the policy. Used on re-issue after a deferral, and for A/B comparison of strategies on the same input. | | `vector` | — | The query vector for the semantic leg. Dimensionality must match the namespace. 
| When the chosen route expands hybrid-text legs, the hybrid defaults apply and the [`hybrid` echo block](#response) appears alongside `routing`. ### Response Every `Auto` response carries a `routing` block: ```json { "rows": [{"id": "ticket-4117", "$score": 0.0639, "title": "..."}], "routing": { "route": "hybrid_text", "policy": "v1", "tokens": 1, "executed": true }, "hybrid": {"tokens": ["timout"], "tokens_dropped": 0, "fuzziness": "auto", "rank_constant": 60, "legs": 2, "per_leg_limit": 50} } ``` | Field | Meaning | | --- | --- | | `route` | The strategy chosen (or forced). | | `policy` | Routing policy version that made the decision. `"forced"` when `route` was supplied. | | `tokens` | Token count the policy read, post tokenizer policy. | | `executed` | `false` on a deferral: the route needs a vector the request didn't supply. `rows` is empty; embed and re-issue with the route forced. | Routed queries follow the same semantics as their underlying strategy: one consistency cut across all legs, all-or-nothing leg failure, and a single [search history](/docs/api/search-history) entry carrying the `Auto` expression and the decision. ### Validation All return `422`: | Condition | Why | | --- | --- | | Forced `"semantic"` or `"fused"` without `vector` | Forcing asserts you have the vector; only auto-routing defers. | | Input yields zero tokens under the policy | Nothing to route. | | `vector` dimensionality mismatch | Same check as a plain vector query. | | `cursor` present, or `Auto` inside a `queries` array | Inherited from [hybrid text fusion](#validation). | ## Counting matches To count how many rows match a full-text or vector query, use [scan](/docs/api/scans) count mode with the `fts` or `ann` selector. Ranked counts share the single `/scans` endpoint with filter counts — `fts` is exact, `ann` is a radius scan flagged `approximate`, and both honor the `exhaustive` flag and the count deadline. ## Fetch Fetch is a Layer-only endpoint with no upstream equivalent. 
The NVMe cache is checked first; on miss or error the gateway falls through to Turbopuffer and backfills the cache best-effort. ### Single fetch ```python doc = await client.fetch_document( "products", "asin-B08N5WRWNW", include_attributes=["title", "category"], ) ``` ```go doc, err := client.FetchDocument(ctx, "products", "asin-B08N5WRWNW", &hevlayer.FetchDocumentParams{ IncludeAttributes: []string{"title", "category"}, }) ``` ```typescript const doc = await client.fetchDocument("products", "asin-B08N5WRWNW", { includeAttributes: ["title", "category"], }); ``` ```bash curl "$LAYER_GATEWAY_URL/v2/namespaces/products/documents/asin-B08N5WRWNW?include_attributes=title,category" \ -H "Authorization: Bearer $LAYER_GATEWAY_API_KEY" ``` | Outcome | Status | Header | | --- | --- | --- | | Cached hit | 200 | `x-layer-cache: hit` | | Cache miss, upstream hit, cache backfilled | 200 | `x-layer-cache: miss` | | Cache unavailable, upstream hit | 200 | `x-layer-cache: miss-on-error` | | Missing from both layers | 404 | — | ### Batch fetch ```python batch = await client.fetch_documents("products", { "ids": ["asin-1", "asin-2", "asin-3"], "include_attributes": ["title"], }) ``` ```go batch, err := client.FetchDocuments(ctx, "products", &hevlayer.FetchDocumentsRequest{ Ids: []string{"asin-1", "asin-2", "asin-3"}, IncludeAttributes: []string{"title"}, }) ``` ```typescript const batch = await client.fetchDocuments("products", { ids: ["asin-1", "asin-2", "asin-3"], include_attributes: ["title"], }); ``` ```bash curl -X POST "$LAYER_GATEWAY_URL/v2/namespaces/products/documents" \ -H "Authorization: Bearer $LAYER_GATEWAY_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "ids": ["asin-1", "asin-2", "asin-3"], "include_attributes": ["title"] }' ``` ```json { "documents": [ {"id": "asin-1", "attributes": {"title": "..."}}, {"id": "asin-3", "attributes": {"title": "..."}} ], "missing": ["asin-2"] } ``` Batch fetch returns found documents and missing ids inline instead of a partial 
404. `documents` preserves request order; ids the gateway could not find anywhere land in `missing`. Because order is preserved, batch fetch is a convenient way to reassemble a [pipeline](/docs/api/pipelines)'s chunks back into their original document — request the chunk ids in sequence and concatenate the results.

### Behavior matrix

| Cache state | Single fetch | Batch fetch |
| --- | --- | --- |
| Hit | cache | cache |
| Miss, upstream present | upstream + backfill | upstream + backfill |
| Miss, upstream absent | 404 | inline `missing` |
| Cache unavailable | upstream, `miss-on-error` | upstream, `miss-on-error` |

---

# Scan

Source: https://hevlayer.com/docs/api/scans

A scan is on-demand row selection over a namespace. It picks rows by one of three **selectors** and returns their IDs (`mode: ids`, an asynchronous job), their count (`mode: count`, synchronous), or the distinct values of one attribute field (`mode: values`, an asynchronous job):

| Input | Field | Meaning | Notes |
| --- | --- | --- | --- |
| Filter selector | `filters` | An attribute predicate, or all rows when omitted. | Exact |
| Full-text selector | `fts` | A BM25 predicate against a text field. | Exact |
| Radius selector | `ann` | Rows within `radius` of a query vector. | Approximate (ANN recall) |
| Fan-out control | `threads` | Maximum concurrent upstream requests for origin scatter/gather. | Origin only; defaults from `Index.spec.scan.threads`, then `8`. |

A request carries **at most one** ranked selector (`fts` or `ann`). `filters` is always optional and, when present alongside a ranked selector, is ANDed onto the match set as an extra constraint. A request with both `fts` and `ann` is a `422`. At cutover, `mode: ids` is filter-only (ranked IDs are a defined fast-follow), while `mode: count` and `mode: values` support all three selectors.
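The selector rules can be sketched as a small pre-flight check. This is an illustration under assumptions: `check_selectors` is a hypothetical helper, not an SDK function, and the source only states a status code for the `fts` plus `ann` case, so the ranked-IDs case is reported without one.

```python
from typing import List


def check_selectors(request: dict) -> List[str]:
    """Return the selector-rule violations for a scan request body."""
    problems = []
    mode = request.get("mode", "ids")  # `mode` defaults to `ids`
    ranked = [name for name in ("fts", "ann") if name in request]
    if len(ranked) > 1:
        # At most one ranked selector per request.
        problems.append("both fts and ann present (422)")
    if mode == "ids" and ranked:
        # Ranked IDs are a fast-follow; ID mode accepts filters only at cutover.
        problems.append("mode ids is filter-only at cutover")
    return problems
```

A count request carrying `fts` plus `filters` passes cleanly, since `filters` is simply ANDed onto the ranked match set.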
Use scans for bulk exports, manual inspection, UDF discovery debugging, cache/origin consistency checks, exact or approximate counts, and field value discovery.

## Routes

| Route | Method | Behavior |
| --- | --- | --- |
| `POST /v2/namespaces/{ns}/scans` | POST | Create an ID or values scan job, or return a count. |
| `GET /v2/namespaces/{ns}/scans` | GET | List scan jobs for the namespace. |
| `GET /v2/namespaces/{ns}/scans/{id}` | GET | Read one scan job. |
| `GET /v2/namespaces/{ns}/scans/{id}/results` | GET | Read completed scan IDs or values. |
| `DELETE /v2/namespaces/{ns}/scans/{id}` | DELETE | Drop the in-memory scan job. |

## ID Mode

```python
job = await client.create_scan("products", {
    "source": "auto",
    "mode": "ids",
    "filters": ["category", "Eq", "Electronics"],
    "threads": 8,
    "page_size": 1000,
})
```

```go
job, err := client.CreateScan(ctx, "products", &hevlayer.CreateScanRequest{
    Source:   "auto",
    Mode:     "ids",
    Filters:  []interface{}{"category", "Eq", "Electronics"},
    Threads:  8,
    PageSize: 1000,
})
```

```typescript
const job = await client.createScan("products", {
  source: "auto",
  mode: "ids",
  filters: ["category", "Eq", "Electronics"],
  threads: 8,
  page_size: 1000,
});
```

```bash
curl -X POST "$LAYER_GATEWAY_URL/v2/namespaces/products/scans" \
  -H "Authorization: Bearer $LAYER_GATEWAY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "source": "auto",
    "mode": "ids",
    "filters": ["category", "Eq", "Electronics"],
    "threads": 8,
    "page_size": 1000
  }'
```

`mode` defaults to `ids`. Valid ID-mode sources are `auto`, `cache`, and `origin`. The Python and TypeScript clients also ship `scan(...)` helpers that create the job and poll until it completes; in Go, poll `GetScan` until `status` is `completed`.
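Where the `scan(...)` helper is not available, a polling loop might look roughly like the following. `fetch_job` is a hypothetical callable standing in for whatever job-read call your client exposes (the `GET /v2/namespaces/{ns}/scans/{id}` route). The only statuses this page names are `running` and `completed`, so treating any other status as a failure is an assumption.

```python
import asyncio
from typing import Awaitable, Callable


async def wait_for_scan(
    fetch_job: Callable[[], Awaitable[dict]],
    interval_s: float = 1.0,
    max_polls: int = 600,
) -> dict:
    """Poll a scan job until its status is `completed`."""
    for _ in range(max_polls):
        job = await fetch_job()
        if job["status"] == "completed":
            return job
        if job["status"] != "running":
            # Assumption: any other status is terminal and unsuccessful.
            raise RuntimeError(f"scan ended with status {job['status']!r}")
        await asyncio.sleep(interval_s)
    raise TimeoutError("scan did not complete within the polling budget")
```

Bounding the loop with `max_polls` keeps a stuck job from blocking the caller indefinitely; tune the interval and budget to the size of the namespace being scanned.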
The create response is `202 Accepted`:

```json
{
  "id": "scan-uuid",
  "namespace": "products",
  "source": "auto",
  "effective_source": "origin",
  "status": "running",
  "progress": 0,
  "documents_scanned": 0,
  "threads": 8,
  "created_at": "2026-05-26T10:00:00Z"
}
```

Read IDs after `status` is `completed`:

```python
results = await client.get_scan_results("products", job.id, limit=1000, offset=0)
```

```go
results, err := client.GetScanResults(ctx, "products", scanID, &hevlayer.GetScanResultsParams{Limit: 1000, Offset: 0})
```

```typescript
const results = await client.getScanResults("products", job.id, {
  limit: 1000,
  offset: 0,
});
```

```bash
curl "$LAYER_GATEWAY_URL/v2/namespaces/products/scans/scan-uuid/results?limit=1000&offset=0" \
  -H "Authorization: Bearer $LAYER_GATEWAY_API_KEY"
```

```json
{
  "ids": ["doc-1", "doc-2"],
  "total": 2
}
```

## Count Mode

```python
count = await client.create_scan("products", {
    "mode": "count",
    "source": "auto",
    "filters": ["category", "Eq", "Electronics"],
    "threads": 8,
    "timeout_seconds": 30,
})
```

```go
count, err := client.CreateScan(ctx, "products", &hevlayer.CreateScanRequest{
    Mode:           "count",
    Source:         "auto",
    Filters:        []interface{}{"category", "Eq", "Electronics"},
    Threads:        8,
    TimeoutSeconds: 30,
})
```

```typescript
const count = await client.createScan("products", {
  mode: "count",
  source: "auto",
  filters: ["category", "Eq", "Electronics"],
  threads: 8,
  timeout_seconds: 30,
});
```

```bash
curl -X POST "$LAYER_GATEWAY_URL/v2/namespaces/products/scans" \
  -H "Authorization: Bearer $LAYER_GATEWAY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "mode": "count",
    "source": "auto",
    "filters": ["category", "Eq", "Electronics"],
    "threads": 8,
    "timeout_seconds": 30
  }'
```

```json
{
  "count": 4210,
  "served_by": "snapshot",
  "snapshot_sha": "3f9e8b21",
  "watermark_ms": 1747300000123,
  "elapsed_ms": 3
}
```

When `watermark_ms` is present, the response also includes `x-layer-stable-as-of` with the same epoch-ms value.
Count-mode sources are `auto`, `snapshot`, `cache`, and `origin`. Snapshot reads are eligible only for a single leaf `Eq` or `In` filter on a field present in the latest snapshot `fields[]`. `And`, `Or`, `Not`, range operators, fields absent from the snapshot, and skipped fields fall through under `auto` and fail with `412 precondition_failed` under `source: snapshot`.

Live count responses include:

```json
{
  "count": 4210,
  "served_by": "origin",
  "bounded": false,
  "timed_out": false,
  "shards_saturated": 0,
  "shards_total": 1,
  "threads": 1,
  "elapsed_ms": 42
}
```

## Values Mode

A values scan enumerates the distinct values of one attribute `field` over the rows the selector picks, each with its document count. Use it to discover a field's value set — what product categories exist, what tags appear on rows matching a query — instead of confirming values you already know with counts.

`field` is required for `mode: values` (and rejected on other modes with `422`). It must name a scalar string or integer attribute, or an array of strings — each array element counts once per containing document. Vector fields are a `422`.
```python
job = await client.create_scan("products", {
    "mode": "values",
    "field": "category",
    "source": "auto",
    "filters": ["in_stock", "Eq", True],
})
```

```go
job, err := client.CreateScan(ctx, "products", &hevlayer.CreateScanRequest{
    Mode:    "values",
    Field:   "category",
    Source:  "auto",
    Filters: []interface{}{"in_stock", "Eq", true},
})
```

```typescript
const job = await client.createScan("products", {
  mode: "values",
  field: "category",
  source: "auto",
  filters: ["in_stock", "Eq", true],
});
```

```bash
curl -X POST "$LAYER_GATEWAY_URL/v2/namespaces/products/scans" \
  -H "Authorization: Bearer $LAYER_GATEWAY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "mode": "values",
    "field": "category",
    "source": "auto",
    "filters": ["in_stock", "Eq", true]
  }'
```

Like ID mode, the create response is a `202 Accepted` job, and the `scan(...)` SDK helpers poll it to completion:

```json
{
  "id": "scan-uuid",
  "namespace": "products",
  "mode": "values",
  "field": "category",
  "source": "auto",
  "effective_source": "origin",
  "status": "running",
  "progress": 0,
  "documents_scanned": 0,
  "threads": 8,
  "created_at": "2026-05-26T10:00:00Z"
}
```

Read values from the same results route after `status` is `completed`, with the same `limit`/`offset` pagination as scan IDs:

```json
{
  "values": [
    {"v": "electronics", "n": 4210},
    {"v": "books", "n": 1240}
  ],
  "total": 2,
  "truncated": false
}
```

`v`/`n` is the same vocabulary [snapshot](/docs/api/snapshots) facet histograms use: `v` is the value, `n` its document count. Ordering is deterministic — `n` descending, then `v` ascending. Counts are exact for filter-selector scans; on a ranked scan with a saturated shard the job carries `bounded: true` and each `n` is a `>=` lower bound.
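The counting and ordering rules can be illustrated with a small in-memory sketch. `values_histogram` is a hypothetical helper operating on plain dicts, not how the gateway is implemented. Treating a repeated element within one row's array as a single occurrence is one reading of "once per containing document", and skipping rows that lack the field is an assumption.

```python
from collections import Counter
from typing import Iterable, List


def values_histogram(rows: Iterable[dict], field: str) -> List[dict]:
    """Count documents per distinct value of `field`, in the documented order."""
    counts = Counter()
    for row in rows:
        value = row.get(field)
        if value is None:
            continue  # assumption: rows without the field contribute nothing
        # An array contributes each distinct element once for this document.
        members = set(value) if isinstance(value, list) else {value}
        counts.update(members)
    # Deterministic order: n descending, then v ascending.
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [{"v": v, "n": n} for v, n in ordered]
```

Sorting on the pair `(-n, v)` gives the stable listing order in a single pass, so paginating with `limit`/`offset` over the result stays consistent between calls.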
### Precomputed serving

An unfiltered values scan (no `filters`, no ranked selector) on a field present in the latest snapshot `fields[]` is answered straight from the snapshot's facet histogram: the job completes during the create call — the `202` body already shows `status: completed` — and carries `effective_source: snapshot` with `snapshot_sha` and `watermark_ms`. Fields in `fields_skipped[]` or absent from the snapshot fall through to cache/origin under `auto` and fail with `412 precondition_failed` under explicit `source: snapshot`, as do scans carrying any selector.

### High cardinality

Snapshot facet histograms cap each field at 10,000 distinct values and skip fields beyond it; values scans are the enumeration path for exactly those fields. A values job accumulates its histogram in gateway memory and caps the listing at **1,000,000 distinct values**. A scan that crosses the cap completes rather than failing:

- The cap applies after the full pass, so every emitted `n` stays exact.
- The listing truncates deterministically to the top 1,000,000 values by count (value-ascending tiebreak); the low-count tail is dropped.
- The job and its results carry `truncated: true`, meaning the listing is incomplete.

`truncated`, `bounded`, and `approximate` are independent flags: `truncated` is a gateway memory bound on the listing, `bounded` is upstream `top_k` saturation on a ranked scan's counts, and `approximate` is ANN recall fuzz on a radius ball's membership.

## Fan-out width

Origin scans fan out one upstream request per active shard. `threads` sets the maximum number of those upstream requests a single scan may have in flight at once. It means concurrent requests, not operating-system threads; the gateway is async.

Resolution order:

1. `threads` on the scan request.
2. `spec.scan.threads` on the namespace's `Index` resource.
3. The gateway default, `8`.
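The resolution order, together with the clamp described in the next paragraph, can be sketched as follows. `effective_threads` is a hypothetical function for illustration; the default of `8` and the cap of `32` come from this page, while the floor of `1` is an assumption for the edge case of no active shards.

```python
from typing import Optional

GATEWAY_DEFAULT_THREADS = 8
SERVER_CAP_THREADS = 32


def effective_threads(
    request_threads: Optional[int],
    index_threads: Optional[int],
    active_shards: int,
) -> int:
    """Resolve fan-out width: request, then Index spec, then gateway default."""
    if request_threads is not None:
        chosen = request_threads
    elif index_threads is not None:
        chosen = index_threads
    else:
        chosen = GATEWAY_DEFAULT_THREADS
    # Clamp to the active shard count and the server cap (floor of 1 assumed).
    return max(1, min(chosen, active_shards, SERVER_CAP_THREADS))
```

On an unsharded namespace this resolves to `1` regardless of the requested value, which matches the `"threads": 1` echoed alongside `"shards_total": 1` in the live count examples.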
The effective value is clamped to the active shard count and the server cap, `32`, then echoed as `threads` on origin responses and completed scan jobs. Snapshot and cache reads do not fan out, so they ignore this field and omit the echo.

## Full-text count

Count rows matching a BM25 query with the `fts` selector. Full-text counts are exact and always run origin scatter/gather, so `source` must be omitted, `auto`, or `origin`. A `filters` array, when present, is ANDed on as an extra constraint.

```python
count = await client.create_scan("products", {
    "mode": "count",
    "fts": {"field": "title", "query": "wireless headphones"},
    "filters": ["category", "Eq", "Electronics"],
    "exhaustive": True,
})
```

```go
count, err := client.CreateScan(ctx, "products", &hevlayer.CreateScanRequest{
    Mode:       "count",
    Fts:        &hevlayer.FtsScan{Field: "title", Query: "wireless headphones"},
    Filters:    []interface{}{"category", "Eq", "Electronics"},
    Exhaustive: true,
})
```

```typescript
const count = await client.createScan("products", {
  mode: "count",
  fts: { field: "title", query: "wireless headphones" },
  filters: ["category", "Eq", "Electronics"],
  exhaustive: true,
});
```

```bash
curl -X POST "$LAYER_GATEWAY_URL/v2/namespaces/products/scans" \
  -H "Authorization: Bearer $LAYER_GATEWAY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "mode": "count",
    "fts": {"field": "title", "query": "wireless headphones"},
    "filters": ["category", "Eq", "Electronics"],
    "exhaustive": true
  }'
```

## Radius count

Count rows within `radius` of a query vector with the `ann` selector — a distance-ball scan. `radius` is required and finite (without an upper bound every row is in the ball); `field` defaults to `vector`. Like `fts`, radius counts always run origin scatter/gather. The count is **approximate**: ANN recall means the index's membership of the ball may differ from the true set, independent of saturation, so the response carries `approximate: true`.
```python count = await client.create_scan("products", { "mode": "count", "ann": {"field": "vector", "vector": [0.12, -0.3, 0.88], "radius": 0.25}, }) ``` ```go count, err := client.CreateScan(ctx, "products", &hevlayer.CreateScanRequest{ Mode: "count", Ann: &hevlayer.AnnScan{Field: "vector", Vector: []float64{0.12, -0.3, 0.88}, Radius: 0.25}, }) ``` ```typescript const count = await client.createScan("products", { mode: "count", ann: { field: "vector", vector: [0.12, -0.3, 0.88], radius: 0.25 }, }); ``` ```bash curl -X POST "$LAYER_GATEWAY_URL/v2/namespaces/products/scans" \ -H "Authorization: Bearer $LAYER_GATEWAY_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "mode": "count", "ann": {"field": "vector", "vector": [0.12, -0.3, 0.88], "radius": 0.25} }' ``` ```json { "count": 980, "served_by": "origin", "approximate": true, "bounded": false, "timed_out": false, "shards_saturated": 0, "shards_total": 1, "threads": 1, "elapsed_ms": 51 } ``` ### Bounding ranked scans Ranked selectors fan out one Turbopuffer query per shard, each capped at `top_k = 10_000`. `threads` bounds fan-out width: how many shard requests can run at once. `exhaustive` and `timeout_seconds` bound depth: what happens when a shard hits that cap and how long recursion can run. - `exhaustive: false` (default) — one scatter/gather. A saturated shard contributes its cap as a lower bound; the response carries `bounded: true` with `shards_saturated > 0`. - `exhaustive: true` — recurse on each saturated shard via score-band pagination (BM25: `$score < last` with an `id` tiebreak; ANN: `$dist > last`) until every page is short or `timeout_seconds` elapses. The same `threads` value applies to the initial round and every exhaustive round over the remaining saturated shards. `bounded` and `approximate` are independent. `bounded` means a shard saturated and the count is a `>=` lower bound for the rows the index returned; `approximate` means the distance ball's membership is itself fuzzy. 
An `ann` count can be `bounded: false` yet still `approximate: true`. ## Sources | Source | ID mode | Count mode | Values mode | | --- | --- | --- | --- | | `auto` | Cache when fresh enough, otherwise origin | Snapshot first, then cache/origin. | Snapshot when eligible, then cache/origin. | | `snapshot` | Not supported | Latest snapshot only; requires eligible `Eq` or `In`. | Latest snapshot facet listing; requires an unfiltered scan on a field in `fields[]`. | | `cache` | Aerospike document cache only | Aerospike document cache only | Aerospike document cache only. | | `origin` | Turbopuffer paginated scan | Turbopuffer paginated scan | Turbopuffer paginated scan with gateway-side dedupe. | This table covers the filter selector. The `fts` and `ann` selectors have no snapshot or cache evaluator, so they always run origin scatter/gather: omitted, `auto`, and `origin` all resolve to origin, and `snapshot` or `cache` returns `422`. ## Filters Scans accept the same Turbopuffer filter array as [query](/docs/api/query). On origin scans, the filter is pushed to Turbopuffer. On cache scans, the gateway evaluates it against cached document attributes. Supported cache operators are `Eq`, `NotEq`, `Gt`, `Gte`, `Lt`, `Lte`, `In`, `NotIn`, `And`, `Or`, and `Not`. If `auto` sees a filter the cache cannot evaluate, it uses origin. Explicit `source: cache` with an unsupported filter fails rather than returning partial results. ## Auto-Mode Policy Auto ties cache freshness to the same consistency watermark used by [stable reads](/docs/api/query#stable-reads). The gateway tracks per-namespace `cache_warmed_through`, the watermark observed at the end of the last successful origin warm. | Cache state | Watermark state | Action | | --- | --- | --- | | Empty | any | Run origin and stamp `cache_warmed_through`. | | Populated, `cache_warmed_through >= watermark` | observed | Serve cache. 
| | Populated, `cache_warmed_through < watermark` | observed | Serve cache and start a background origin warm. | | Populated, no `cache_warmed_through` yet | observed | Serve cache and start a background origin warm. | | Populated | not yet observed | Serve cache. |

When cache is used, `_hevlayer_upserted_at <= cache_warmed_through` is added before the user filter so the scan is a stable warmed view.

## Operational notes

- ID and values scan state is in-memory and ephemeral; it resets on gateway restart.
- Count scans have a deadline, default 30s and maximum 300s.
- Values jobs cap at 1,000,000 distinct values per scan and set `truncated: true` when crossed; the listing keeps the top values by count, each with an exact count.
- Origin scan fan-out defaults to 8 concurrent upstream requests per scan unless the request or `Index.spec.scan.threads` sets a different value.
- Snapshot-served count scans are exact at the snapshot `watermark_ms`.

---

# Pipelines

Source: https://hevlayer.com/docs/api/pipelines

The pipeline API keeps the code you need to index data simple and organized. A typical pipeline has two stages: extraction and chunking on CPU, followed by embedding on GPU. This guide walks through a best-practice layout for that pipeline; the concepts expand to N stages.

## Document lifecycle

```
          put chunks           put vectors
(new doc) ──────────► pending ──────────────► indexed
                         ▲
                         │
               re-stage (idempotent)
```

- **pending**: chunks stored, waiting for embedding.
- **indexed**: vectors written to Turbopuffer.

`embedding` is a claim stage: documents sit in it only while leased to a worker, and recover to `pending` when a lease expires. Re-staging a document resets it to `pending` with new chunks, which is how you reprocess after source data changes.
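The lifecycle can be modeled as a small state machine. The sketch below is a toy in-memory model written for illustration, not the gateway's implementation: `ToyPipeline`, its method names, and the explicit `now` and `lease_seconds` arguments are all hypothetical, and the real lease duration is not stated on this page.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Doc:
    stage: str = "pending"
    lease_expires_at: Optional[float] = None


class ToyPipeline:
    """Toy model of the pending -> embedding -> indexed lifecycle."""

    def __init__(self) -> None:
        self.docs: Dict[str, Doc] = {}

    def stage(self, doc_id: str) -> None:
        # Staging or re-staging always resets the document to pending.
        self.docs[doc_id] = Doc()

    def recover_expired(self, now: float) -> None:
        # Documents whose lease ran out go back to pending.
        for doc in self.docs.values():
            if doc.stage == "embedding" and doc.lease_expires_at <= now:
                doc.stage = "pending"
                doc.lease_expires_at = None

    def claim(self, doc_id: str, lease_seconds: float, now: float) -> bool:
        # Expired leases are swept before the claim is attempted.
        self.recover_expired(now)
        doc = self.docs[doc_id]
        if doc.stage != "pending":
            return False
        doc.stage = "embedding"
        doc.lease_expires_at = now + lease_seconds
        return True

    def put_vectors(self, doc_id: str) -> None:
        # Writing vectors marks the document indexed and drops the lease.
        doc = self.docs[doc_id]
        doc.stage = "indexed"
        doc.lease_expires_at = None
```

Passing `now` explicitly keeps the model deterministic: a document claimed at `now=0` with a 30-second lease cannot be claimed again at `now=10`, but can be at `now=31`.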
## File tree

```
indexer/
├── pipelines/
│   ├── extract-chunk.yaml   # CPU stage (Pipeline resource)
│   └── embed.yaml           # GPU stage (Pipeline resource)
├── extract_chunk.py         # read the source, stage chunks
├── embed.py                 # claim pending docs, write vectors
└── app.py                   # REST API: trigger a run, wait for completion
```

The two YAML files declare the worker images, pools, and scaling; see the [Pipeline CRD](/docs/kubernetes/pipeline-crd) for the fields. Both set `pipelineId: products` so the two workers share one queue. The rest of this page is the worker code, shown in Python, Go, and TypeScript; every call is also a plain REST endpoint (see [Write & Stage](/docs/api/write)).

## Extract and chunk

The CPU worker reads the source, splits text into chunks, and stages them. Staging chunks stores them durably (S3, cached in the document cache) and marks the document `pending`.

The worker hardcodes nothing: the operator injects the pipeline id, the gateway URL, and `spec.sourceRef` as environment variables; see the [worker variables](/docs/kubernetes/pipeline-crd#worker) on the CRD page. The queue URL below comes from the `sourceRef` declared in `pipelines/extract-chunk.yaml`.
```python # extract_chunk.py import asyncio import json import os import boto3 from hevlayer import AsyncHevlayer PIPELINE = os.environ["HEVLAYER_PIPELINE_ID"] SOURCE = json.loads(os.environ["HEVLAYER_SOURCE_REF"]) sqs = boto3.client("sqs") def chunks(text: str, size: int = 800) -> list[str]: return [text[i : i + size] for i in range(0, len(text), size)] async def main() -> None: async with AsyncHevlayer( base_url=os.environ["HEVLAYER_BASE_URL"], api_key=os.environ.get("LAYER_GATEWAY_API_KEY"), ) as layer: while True: batch = sqs.receive_message( QueueUrl=SOURCE["queueUrl"], MaxNumberOfMessages=10, ).get("Messages", []) for m in batch: doc = json.loads(m["Body"]) await layer.put_pipeline_document_chunks(PIPELINE, doc["id"], { "chunks": [ {"id": f"{doc['id']}-{i}", "text": t} for i, t in enumerate(chunks(doc["text"])) ], }) sqs.delete_message(QueueUrl=SOURCE["queueUrl"], ReceiptHandle=m["ReceiptHandle"]) asyncio.run(main()) ``` ```go // extract_chunk.go package main import ( "context" "encoding/json" "fmt" "os" "github.com/aws/aws-sdk-go-v2/config" "github.com/aws/aws-sdk-go-v2/service/sqs" hevlayer "github.com/hev/layer/clients/go" ) func chunks(text string, size int) []string { var out []string for i := 0; i < len(text); i += size { out = append(out, text[i:min(i+size, len(text))]) } return out } func main() { ctx := context.Background() pipeline := os.Getenv("HEVLAYER_PIPELINE_ID") var source struct { QueueURL string `json:"queueUrl"` } json.Unmarshal([]byte(os.Getenv("HEVLAYER_SOURCE_REF")), &source) cfg, _ := config.LoadDefaultConfig(ctx) queue := sqs.NewFromConfig(cfg) layer := hevlayer.NewClient( hevlayer.WithBaseURL(os.Getenv("HEVLAYER_BASE_URL")), hevlayer.WithAPIKey(os.Getenv("LAYER_GATEWAY_API_KEY")), ) for { batch, err := queue.ReceiveMessage(ctx, &sqs.ReceiveMessageInput{ QueueUrl: &source.QueueURL, MaxNumberOfMessages: 10, }) if err != nil { continue } for _, m := range batch.Messages { var doc struct { ID string `json:"id"` Text string `json:"text"` } 
json.Unmarshal([]byte(*m.Body), &doc) var staged []hevlayer.Chunk for i, t := range chunks(doc.Text, 800) { staged = append(staged, hevlayer.Chunk{ID: fmt.Sprintf("%s-%d", doc.ID, i), Text: t}) } layer.PutPipelineDocumentChunks(ctx, pipeline, doc.ID, &hevlayer.PutChunksRequest{Chunks: staged}) queue.DeleteMessage(ctx, &sqs.DeleteMessageInput{ QueueUrl: &source.QueueURL, ReceiptHandle: m.ReceiptHandle, }) } } } ``` ```typescript // extract_chunk.ts import { DeleteMessageCommand, ReceiveMessageCommand, SQSClient, } from "@aws-sdk/client-sqs"; import { Hevlayer } from "hevlayer"; const PIPELINE = process.env.HEVLAYER_PIPELINE_ID!; const SOURCE = JSON.parse(process.env.HEVLAYER_SOURCE_REF!); const sqs = new SQSClient({}); const layer = new Hevlayer({ baseUrl: process.env.HEVLAYER_BASE_URL, apiKey: process.env.LAYER_GATEWAY_API_KEY, }); function chunks(text: string, size = 800): string[] { const out: string[] = []; for (let i = 0; i < text.length; i += size) out.push(text.slice(i, i + size)); return out; } while (true) { const batch = await sqs.send(new ReceiveMessageCommand({ QueueUrl: SOURCE.queueUrl, MaxNumberOfMessages: 10, })); for (const message of batch.Messages ?? []) { const doc = JSON.parse(message.Body ?? "{}"); await layer.putPipelineDocumentChunks(PIPELINE, doc.id, { chunks: chunks(doc.text).map((text, i) => ({ id: `${doc.id}-${i}`, text })), }); await sqs.send(new DeleteMessageCommand({ QueueUrl: SOURCE.queueUrl, ReceiptHandle: message.ReceiptHandle, })); } } ``` ## Embed The GPU worker claims pending documents, reads their chunks back, and writes vectors. Writing vectors upserts to Turbopuffer and marks the document `indexed`. Claims are leased, so a worker that crashes loses nothing. 
```python # embed.py import asyncio import os from hevlayer import AsyncHevlayer from sentence_transformers import SentenceTransformer PIPELINE = os.environ["HEVLAYER_PIPELINE_ID"] model = SentenceTransformer("all-MiniLM-L6-v2") async def main() -> None: async with AsyncHevlayer( base_url=os.environ["HEVLAYER_BASE_URL"], api_key=os.environ.get("LAYER_GATEWAY_API_KEY"), ) as layer: while True: claimed = await layer.claim_documents(PIPELINE, { "stage": "pending", "claim_stage": "embedding", "limit": 16, "worker_id": "embed-0", }) for doc_id in claimed.documents: doc_chunks = await layer.get_pipeline_document_chunks(PIPELINE, doc_id) vectors = model.encode([c.text for c in doc_chunks]) await layer.put_pipeline_document_vectors(PIPELINE, doc_id, { "vectors": [ {"id": c.id, "vector": v.tolist(), "attributes": {"text": c.text}} for c, v in zip(doc_chunks, vectors) ], }) asyncio.run(main()) ``` ```go // embed.go package main import ( "context" "os" hevlayer "github.com/hev/layer/clients/go" ) func main() { ctx := context.Background() pipeline := os.Getenv("HEVLAYER_PIPELINE_ID") layer := hevlayer.NewClient( hevlayer.WithBaseURL(os.Getenv("HEVLAYER_BASE_URL")), hevlayer.WithAPIKey(os.Getenv("LAYER_GATEWAY_API_KEY")), ) for { claimed, err := layer.ClaimDocuments(ctx, pipeline, &hevlayer.ClaimDocumentsRequest{ Stage: "pending", ClaimStage: "embedding", Limit: 16, WorkerID: "embed-0", }) if err != nil { continue } for _, docID := range claimed.Documents { docChunks, err := layer.GetPipelineDocumentChunks(ctx, pipeline, docID) if err != nil { continue } texts := make([]string, len(*docChunks)) for i, c := range *docChunks { texts[i] = c.Text } vectors := embed(texts) // your embedding model or service entries := make([]hevlayer.VectorEntry, len(*docChunks)) for i, c := range *docChunks { entries[i] = hevlayer.VectorEntry{ ID: c.ID, Vector: vectors[i], Attributes: map[string]interface{}{"text": c.Text}, } } layer.PutPipelineDocumentVectors(ctx, pipeline, docID, 
&hevlayer.PutVectorsRequest{Vectors: entries}) } } } ``` ```typescript // embed.ts import { Hevlayer } from "hevlayer"; const PIPELINE = process.env.HEVLAYER_PIPELINE_ID!; const layer = new Hevlayer({ baseUrl: process.env.HEVLAYER_BASE_URL, apiKey: process.env.LAYER_GATEWAY_API_KEY, }); while (true) { const claimed = await layer.claimDocuments(PIPELINE, { stage: "pending", claim_stage: "embedding", limit: 16, worker_id: "embed-0", }); for (const docId of claimed.documents) { const docChunks = await layer.getPipelineDocumentChunks(PIPELINE, docId); const vectors = await embed(docChunks.map((chunk) => chunk.text)); await layer.putPipelineDocumentVectors(PIPELINE, docId, { vectors: docChunks.map((chunk, i) => ({ id: chunk.id, vector: vectors[i], attributes: { text: chunk.text }, })), }); } } ``` ## Deploy Build the two workers into the images your YAML references and push them to a registry your cluster can pull — Layer does not build images. Then apply the resources: ```sh kubectl apply -f pipelines/ ``` The operator creates one Deployment per resource and the embed pool's KEDA object. Order doesn't matter here: the app creates the gateway pipeline before it enqueues a batch (staging into a pipeline id that doesn't exist returns 404), so workers never see a missing pipeline. Nothing else to wire: the CRD [types themselves](/docs/install#helm) install with the Helm chart. ## Trigger a run The app exposes the pipeline to the rest of your system as one endpoint: `POST /index-runs` sends a batch to the source queue, then waits for the run to complete and returns the snapshot it produced. The pipeline is created on first use — this is where the target namespace is set in code. 
```python # app.py import asyncio import json import os import time import boto3 from fastapi import FastAPI from hevlayer import AsyncHevlayer, HevlayerError QUEUE = "https://sqs.us-east-1.amazonaws.com/123456789/product-updates" sqs = boto3.client("sqs") app = FastAPI() layer = AsyncHevlayer( base_url=os.environ["HEVLAYER_BASE_URL"], api_key=os.environ.get("LAYER_GATEWAY_API_KEY"), ) @app.post("/index-runs") async def index_run(documents: list[dict]) -> dict: started_ms = int(time.time() * 1000) try: await layer.create_pipeline({"id": "products", "target_namespace": "products"}) except HevlayerError as e: if e.status_code != 409: # 409: already exists raise for doc in documents: sqs.send_message(QueueUrl=QUEUE, MessageBody=json.dumps(doc)) await drain() sha = await next_snapshot(after_ms=started_ms) return {"documents": len(documents), "snapshot": sha} ``` ```go // app.go var ( queueURL = "https://sqs.us-east-1.amazonaws.com/123456789/product-updates" queue *sqs.Client // sqs.NewFromConfig in main layer = hevlayer.NewClient( hevlayer.WithBaseURL(os.Getenv("HEVLAYER_BASE_URL")), hevlayer.WithAPIKey(os.Getenv("LAYER_GATEWAY_API_KEY")), ) ) func indexRun(w http.ResponseWriter, r *http.Request) { ctx := r.Context() startedMs := time.Now().UnixMilli() var documents []map[string]interface{} json.NewDecoder(r.Body).Decode(&documents) _, err := layer.CreatePipeline(ctx, &hevlayer.CreatePipelineRequest{ ID: "products", TargetNamespace: "products", }) var herr *hevlayer.HevlayerError if err != nil && !(errors.As(err, &herr) && herr.StatusCode == 409) { // 409: already exists http.Error(w, err.Error(), http.StatusBadGateway) return } for _, doc := range documents { body, _ := json.Marshal(doc) mb := string(body) queue.SendMessage(ctx, &sqs.SendMessageInput{QueueUrl: &queueURL, MessageBody: &mb}) } drain(ctx) sha := nextSnapshot(ctx, startedMs) json.NewEncoder(w).Encode(map[string]interface{}{ "documents": len(documents), "snapshot": sha, }) } ``` ```typescript // app.ts 
import { SendMessageCommand, SQSClient } from "@aws-sdk/client-sqs"; import { Hevlayer } from "hevlayer"; const queueUrl = "https://sqs.us-east-1.amazonaws.com/123456789/product-updates"; const queue = new SQSClient({}); const layer = new Hevlayer({ baseUrl: process.env.HEVLAYER_BASE_URL, apiKey: process.env.LAYER_GATEWAY_API_KEY, }); async function indexRun(documents: Record<string, unknown>[]) { const startedMs = Date.now(); await layer.ensurePipeline({ id: "products", target_namespace: "products" }); for (const doc of documents) { await queue.send(new SendMessageCommand({ QueueUrl: queueUrl, MessageBody: JSON.stringify(doc), })); } await drain(); return { documents: documents.length, snapshot: await nextSnapshot(startedMs) }; } ``` ## Wait for completion A run is complete in two steps: the queue drains, then the consistency watcher observes the namespace stable and writes a [snapshot](/docs/api/snapshots) past the run's watermark. `pending_count` is the same signal KEDA scales on; when it reaches zero, the embed pool scales back to zero. The snapshot SHA addresses facet listings and counts that are exact at that watermark; point your application at it. 
```python
# app.py
async def drain() -> None:
    while True:
        status = await layer.get_pipeline_status("products")
        if status.pending_count == 0:  # status.counts: {"pending": 0, "indexed": 8530}
            return
        await asyncio.sleep(10)


async def next_snapshot(after_ms: int) -> str:
    while True:
        history = await layer.list_namespace_history("products", limit=1)
        if history and history[0].watermark_ms >= after_ms:
            return history[0].sha
        await asyncio.sleep(30)
```

```go
// app.go
func drain(ctx context.Context) {
	for {
		status, err := layer.GetPipelineStatus(ctx, "products")
		if err == nil && status.PendingCount == 0 { // status.Counts: {"pending": 0, "indexed": 8530}
			return
		}
		time.Sleep(10 * time.Second)
	}
}

func nextSnapshot(ctx context.Context, afterMs int64) string {
	for {
		history, err := layer.ListNamespaceHistory(ctx, "products",
			&hevlayer.ListNamespaceHistoryParams{Limit: 1})
		if err == nil && len(history) > 0 && history[0].WatermarkMs >= afterMs {
			return history[0].Sha
		}
		time.Sleep(30 * time.Second)
	}
}
```

```typescript
// app.ts
const sleep = (ms: number) => new Promise((resolve) => setTimeout(resolve, ms));

async function drain() {
  while (true) {
    const status = await layer.getPipelineStatus("products");
    if (status.pending_count === 0) return;
    await sleep(10_000);
  }
}

async function nextSnapshot(afterMs: number): Promise<string> {
  while (true) {
    const history = await layer.listNamespaceHistory("products", { limit: 1 });
    if (history.length > 0 && history[0].watermark_ms >= afterMs) {
      return history[0].sha;
    }
    await sleep(30_000);
  }
}
```

Once vectors are indexed, query and fetch them through the namespace API; see [Query & Fetch](/docs/api/query).

## Failure model

- Turbopuffer write failures are hard: the vectors route returns 502 and the document stays in `embedding` for re-claim.
- Aerospike cache failures do not block chunk reads when S3 backing is present; PostgreSQL connectivity failures return 500 and should be retried with backoff.
The stop-writes recovery path and the metrics to watch live in the [failure-mode runbook](/docs/failure-modes#pipeline-stop-writes). - Lease expiry is handled server-side. A worker that crashes mid-embedding has its documents recovered on the next claim sweep. --- # Namespace metadata Source: https://hevlayer.com/docs/api/namespace-metadata The metadata payload is proxied verbatim from the upstream `/v2/namespaces/{ns}/metadata` endpoint. Schema, row counts, index status, and timestamps follow the upstream contract. Layer adds a single sub-object on top. ## Request ```python metadata = await client.get_namespace_metadata("products") ``` ```go metadata, err := client.GetNamespaceMetadata(ctx, "products") ``` ```typescript const metadata = await client.getNamespaceMetadata("products"); ``` ```bash curl "$LAYER_GATEWAY_URL/v2/namespaces/products/metadata" \ -H "Authorization: Bearer $LAYER_GATEWAY_API_KEY" ``` ```jsonc { // Proxied from Turbopuffer verbatim "schema": { }, "approx_row_count": 12500, "approx_logical_bytes": 48800000, "created_at": "2026-03-15T10:30:45Z", "updated_at": "2026-05-12T18:49:00Z", "last_write_at": "2026-05-12T18:48:30Z", "index": { "status": "up-to-date" }, // Layer enhancement "layer": { "stable_as_of": 1715600400000, "is_stable": true, "indexed": true, "index_lag_rows": 0 } } ``` ## The `layer` block | Field | Meaning | | --- | --- | | `stable_as_of` | Epoch-ms watermark from the most recent stable poll. Null on cold start before the watcher has observed a stable namespace. | | `is_stable` | Whether the most recent poll observed `index.status == "up-to-date"`. False on cold start, true once the watcher catches up. | | `indexed` | Whether every row in the namespace carries an indexed vector. 
True once the snapshot's indexed-vector row count has caught up to the namespace row count; false while rows are still awaiting their first index, as during a bulk load or a [pipeline](/docs/api/pipelines) mid-flight. Null for FTS-only namespaces, which have no vector column to reconcile. | | `index_lag_rows` | Count of rows present in the namespace that do not yet have an indexed vector. Zero when `indexed` is true. Reconciled from the most recent [snapshot](/docs/api/snapshots), so it trails live writes by the snapshot cadence. | A read for a namespace that does not exist returns upstream's 404, matching Turbopuffer's own metadata endpoint. `is_stable` is the *current* signal — it drives the per-query filter-skip decision on the query path. `stable_as_of` is the *historical* watermark — the cut a filtered query would apply. After a namespace is observed stable, the watcher refreshes this watermark on the stable-tier cadence (`CONSISTENCY_STABLE_POLL_INTERVAL_MS`, default 60000 ms). Writes re-arm the fast tier, so active namespaces are polled on `CONSISTENCY_POLL_INTERVAL_MS`. `indexed` answers a different question than `is_stable`. `is_stable` reports whether the upstream index has caught up on the rows it has *seen*, which is what read-after-write depends on. `indexed` reports whether every row that *should* be present is present and queryable, which is what a bulk load or a [pipeline](/docs/api/pipelines) needs to know it has finished: rows can be staged and counted before their vectors are indexed, so a namespace can read `is_stable: true` while `indexed: false` with a non-zero `index_lag_rows`. The reconciliation runs against the latest [snapshot](/docs/api/snapshots), so `indexed` advances on the snapshot cadence rather than per write. For snapshot history derived from these freshness signals, see [Snapshots](/docs/api/snapshots). ## List namespaces `GET /v2/namespaces` is a Layer-only augmented listing. 
It pages the upstream namespace list and enriches each row with stability and cache signals. It is the endpoint the dashboard's inventory view reads. ```python namespaces = await client.list_namespaces(prefix="prod", page_size=100) ``` ```go namespaces, err := client.ListNamespaces(ctx, &hevlayer.ListNamespacesParams{ Prefix: "prod", PageSize: 100, }) ``` ```typescript const namespaces = await client.listNamespaces({ prefix: "prod", pageSize: 100, }); ``` ```bash curl "$LAYER_GATEWAY_URL/v2/namespaces?prefix=prod&page_size=100" \ -H "Authorization: Bearer $LAYER_GATEWAY_API_KEY" ``` ```jsonc { "namespaces": [ { "name": "products", "row_count": 12500, "size_bytes": 48800000, "stable_as_of_ms": 1715600400000, "is_stable": true, "index": { "status": "up-to-date" }, "cache_state": {"state": "warm", "warm_inflight": false}, "last_write_ms": 1715600399000, "shadow": false, "labels": {} } ], "next_cursor": "..." } ``` Each row carries freshness signals derived from that row's metadata fetch. `is_stable` is true when `index.status` is `"up-to-date"`, false when it is `"updating"`, and omitted when metadata has no index signal or the fetch failed. `stable_as_of_ms` is set to the metadata observation time for rows reported up to date. `indexed` and `index_lag_rows` live on `GET /v2/namespaces/{namespace}/metadata`, where the gateway can do the snapshot lookup for one namespace without adding object-store reads to the high-fanout list path. `index` is Turbopuffer's indexing state, passed through verbatim: | Field | Meaning | | --- | --- | | `index.status` | `"updating"` or `"up-to-date"`. | | `index.unindexed_bytes` | Write-ahead-log bytes not yet indexed. Present only while `updating` (omitted once caught up). Unindexed data is still searched by queries, so a non-zero value means *behind, but serving* — watch it fall to confirm indexing is draining rather than wedged. | Listing is read-only and does not register namespaces with the consistency watcher. 
Write traffic and snapshot facet configuration register the namespaces that need durable watermarks. | Query param | Purpose | | --- | --- | | `prefix` | Restrict to namespaces whose name starts with this string. | | `cursor` | Pagination cursor from a prior `next_cursor`. | | `page_size` | Page size; the upstream list page is capped at 1000. | A per-row metadata failure degrades to a row with `metadata_error` set rather than dropping the namespace, so the list stays complete even when a single namespace's metadata call fails. Responses are served from a short-TTL cache (`NAMESPACE_LIST_CACHE_TTL_MS`, default `10000`) so dashboard polling does not fan out a metadata call per namespace per refresh. --- # Warm cache Source: https://hevlayer.com/docs/api/warm-cache Layer exposes two warm endpoints. `hint_cache_warm` is the Turbopuffer-compatible hint; `warm` is the Layer-only shortcut that creates a gateway warm job. `GET /v1/namespaces/{ns}/hint_cache_warm` matches Turbopuffer's warm-cache hint. The upstream call advises the index to pre-load. Layer additionally runs cache-warm steps on the gateway side. ## Hint-cache warm With no query parameters, the call is a raw passthrough: the gateway forwards it to Turbopuffer unchanged and returns the upstream response verbatim. Existing Turbopuffer clients keep their exact wire behavior. ```bash curl "$LAYER_GATEWAY_URL/v1/namespaces/products/hint_cache_warm" \ -H "Authorization: Bearer $LAYER_GATEWAY_API_KEY" ``` Supplying any warm option (`turbopuffer`, `documents`, `snapshots`, `page_size`) switches the call into Layer orchestration. Steps then default on; each is independently toggleable: | Step | What it does | | --- | --- | | `turbopuffer=true` | Forwards the warm hint upstream. 
| | `documents=true` | Starts an origin warm job to backfill the NVMe cache. | | `snapshots=true` | Mirrors the latest S3 snapshot body into NVMe. | ```python result = await client.hint_cache_warm( "products", turbopuffer=False, documents=False, snapshots=True, ) ``` ```typescript const result = await client.hintCacheWarm("products", { turbopuffer: false, documents: false, snapshots: true, }); ``` ```bash curl "$LAYER_GATEWAY_URL/v1/namespaces/products/hint_cache_warm?turbopuffer=false&documents=false&snapshots=true" \ -H "Authorization: Bearer $LAYER_GATEWAY_API_KEY" ``` The generated Go client omits `false` query parameters, so it cannot turn steps off — disable steps over REST (or the Python client) instead. The orchestrated response reports per-step status: ```json { "namespace": "products", "turbopuffer": { "enabled": true, "status": "completed" }, "documents": { "enabled": true, "status": "started", "job": { "id": "warm-job-uuid", "status": "running" } }, "snapshots": { "enabled": true, "status": "completed", "key": "snapshots/products/...", "watermark_ms": 1715600400000, "sha": "..." } } ``` If `documents` is enabled, the response includes a warm job; poll it through `/warm-jobs/{id}`. ## Layer warm `POST /v2/namespaces/{ns}/warm` creates an asynchronous job that pages through Turbopuffer, backfills Aerospike, and refreshes `cache_warmed_through`. Use it when bootstrapping a namespace whose data was written outside the gateway. 
```python job = await client.warm_cache("products", page_size=1000) ``` ```go job, err := client.WarmCache(ctx, "products", &hevlayer.WarmCacheParams{ PageSize: 1000, }) ``` ```typescript const job = await client.warmCache("products", { pageSize: 1000 }); ``` ```bash curl -X POST "$LAYER_GATEWAY_URL/v2/namespaces/products/warm?page_size=1000" \ -H "Authorization: Bearer $LAYER_GATEWAY_API_KEY" ``` The response is `202 Accepted` with the warm job: ```json { "id": "warm-job-uuid", "namespace": "products", "status": "running", "progress": 0, "documents_scanned": 0, "created_at": "2026-05-26T10:00:00Z" } ``` Poll it through: ```python job = await client.get_warm_job("products", job.id) ``` ```go job, err := client.GetWarmJob(ctx, "products", jobID) ``` ```typescript const job = await client.getWarmJob("products", jobId); ``` ```bash curl "$LAYER_GATEWAY_URL/v2/namespaces/products/warm-jobs/warm-job-uuid" \ -H "Authorization: Bearer $LAYER_GATEWAY_API_KEY" ``` ## Cache-cold behavior Warm jobs, cache scans, cache snapshot jobs, and pipeline chunk reads return 503 `cache_cold` when the NVMe cache is unavailable. Fetch and fetch-many fall through to Turbopuffer with `x-layer-cache: miss-on-error` instead. The split is deliberate. Fetch is correctness-first: a cache outage must not turn into a missing document. Warm is throughput-first: warming on a cold cache would be wasted work, so the gateway reports the cold state to the caller rather than silently no-op-ing. A bare `hint_cache_warm` passthrough never touches the gateway cache, so it succeeds even while the cache is cold. The orchestrated form returns 503 `cache_cold` only when `documents` or `snapshots` is requested. For how the cache recovers from an outage and the signals to watch, see the [failure-mode runbook](/docs/failure-modes#read). 
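The interaction between the passthrough form, the orchestrated form, and a cold cache can be summarized as a decision function. This is a sketch of the rules as described on this page, not gateway code: `plan_hint_cache_warm` is a hypothetical name, query parameters are modeled as Python booleans rather than query strings, and it assumes a step left at its default-on value counts as "requested".

```python
WARM_OPTIONS = ("turbopuffer", "documents", "snapshots", "page_size")
STEPS = ("turbopuffer", "documents", "snapshots")


def plan_hint_cache_warm(params: dict, cache_cold: bool) -> dict:
    """Model how one hint_cache_warm call is handled (hypothetical helper)."""
    # No warm option at all: raw passthrough. The gateway cache is never
    # touched, so a cold cache does not matter.
    if not any(name in params for name in WARM_OPTIONS):
        return {"mode": "passthrough"}
    # Any warm option switches to orchestration; steps default on.
    steps = {name: bool(params.get(name, True)) for name in STEPS}
    # Assumption: a default-on step counts as requested.
    if cache_cold and (steps["documents"] or steps["snapshots"]):
        return {"mode": "orchestrated", "error": "503 cache_cold"}
    return {"mode": "orchestrated", "steps": steps}
```

Under this reading, an orchestrated call that only forwards the upstream hint (`documents` and `snapshots` both off) still succeeds while the cache is cold.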
---

# Snapshot History

Source: https://hevlayer.com/docs/api/snapshots

Snapshots are materialized facet histograms for a namespace. They carry facet listings in `values[].v` and facet counts in `values[].n`, stored durably in S3 and mirrored into Aerospike for the latest body.

Use `POST /snapshots` to materialize a field now. Use history and body routes to read the durable chronology written by the consistency watcher.

## Snapshot policy

Configure automatic snapshot writes on the namespace's `Index` CR:

```yaml
apiVersion: hevlayer.com/v1
kind: Index
metadata:
  name: products
spec:
  backend:
    namespace: products
  snapshot:
    interval: 5m
    retention: 30d
    facetFields:
      - category
      - brand
```

| Field | Default | Behavior |
| --- | --- | --- |
| `facetFields` | `[]` | Facet fields to histogram. Empty or unset disables the automatic snapshot writer for the namespace, so history and activity stay empty. |
| `interval` | `5m` | Minimum spacing between automatic snapshot writes. The writer fires on each upstream-stable advance; `interval` only floors how often a write lands. The gateway fallback is `LAYER_SNAPSHOT_MIN_INTERVAL_MS`. |
| `retention` | `never` | `never` keeps all history. A duration such as `30d` prunes S3 bodies older than the window, while always keeping the most recent body. |

Snapshots are event-driven, not scheduled: an idle namespace does not get a new snapshot just because `interval` elapsed. The gateway refreshes Index policy periodically, so edits take effect without a pod restart.

Manual `POST /snapshots` jobs with `source: origin` and the automatic writer use the same shard fan-out path. Origin work is bounded by `spec.scan.threads`; stored and cache snapshot reads do not fan out.

## Routes

| Route | Method | Behavior |
| --- | --- | --- |
| `POST /v2/namespaces/{ns}/snapshots` | POST | Create an on-demand snapshot job for one field. 
| | `GET /v2/namespaces/{ns}/snapshot-jobs` | GET | List in-memory snapshot jobs. | | `GET /v2/namespaces/{ns}/snapshot-jobs/{id}` | GET | Read one snapshot job. | | `GET /v2/namespaces/{ns}/history` | GET | Newest-first durable snapshot history. | | `GET /v2/namespaces/{ns}/snapshots/{sha}` | GET | Full snapshot body by full SHA or 7-char prefix. | | `GET /v2/activity/snapshots` | GET | Cross-namespace snapshot-write activity stream. | ## Manual snapshot ```python job = await client.create_snapshot("products", { "field": "category", "source": "auto", "filters": ["brand", "Eq", "Acme"], "page_size": 1000, }) ``` ```go job, err := client.CreateSnapshot(ctx, "products", &hevlayer.CreateSnapshotRequest{ Field: "category", Source: "auto", Filters: []interface{}{"brand", "Eq", "Acme"}, PageSize: 1000, }) ``` ```typescript const job = await client.createSnapshot("products", { field: "category", source: "auto", filters: ["brand", "Eq", "Acme"], page_size: 1000, }); ``` ```bash curl -X POST "$LAYER_GATEWAY_URL/v2/namespaces/products/snapshots" \ -H "Authorization: Bearer $LAYER_GATEWAY_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "field": "category", "source": "auto", "filters": ["brand", "Eq", "Acme"], "page_size": 1000 }' ``` Valid sources are `auto`, `stored`, `cache`, and `origin`. | Source | Reads from | Notes | | --- | --- | --- | | `auto` | Stored snapshot when possible, otherwise cache/origin policy | Default. Stored snapshots only support unfiltered configured fields. | | `stored` | Latest S3 snapshot body, with Aerospike mirror as a cache | Fastest path for configured facet fields. | | `cache` | Aerospike document cache | Supports filters the cache can evaluate. | | `origin` | Turbopuffer paginated scan | Authoritative. Persists the computed snapshot body to S3. 
| The response is `202 Accepted`: ```json { "id": "snapshot-job-uuid", "namespace": "products", "field": "category", "source": "auto", "status": "running", "progress": 0, "documents_scanned": 0, "created_at": "2026-05-26T10:00:00Z" } ``` Poll the job: ```python job = await client.get_snapshot_job("products", job.id) ``` ```go job, err := client.GetSnapshotJob(ctx, "products", jobID) ``` ```typescript const job = await client.getSnapshotJob("products", jobId); ``` ```bash curl "$LAYER_GATEWAY_URL/v2/namespaces/products/snapshot-jobs/snapshot-job-uuid" \ -H "Authorization: Bearer $LAYER_GATEWAY_API_KEY" ``` Completed jobs include `sha` when a body was materialized: ```json { "id": "snapshot-job-uuid", "namespace": "products", "field": "category", "source": "origin", "status": "completed", "documents_scanned": 12844, "sha": "3f9e8b21", "stable_as_of": 1747300000123 } ``` ## History ```python history = await client.list_namespace_history("products", limit=20) ``` ```go history, err := client.ListNamespaceHistory(ctx, "products", &hevlayer.ListNamespaceHistoryParams{Limit: 20}) ``` ```typescript const history = await client.listNamespaceHistory("products", { limit: 20 }); ``` ```bash curl "$LAYER_GATEWAY_URL/v2/namespaces/products/history?limit=20" \ -H "Authorization: Bearer $LAYER_GATEWAY_API_KEY" ``` ```json [ {"watermark_ms": 1747300000123, "sha": "3f9e8b21..."}, {"watermark_ms": 1747299600045, "sha": "a1c5b09f..."} ] ``` | Query param | Default | Purpose | | --- | --- | --- | | `limit` | 50 | Maximum entries returned. Capped at 500. | | `before` | none | Return entries older than this SHA. 7-char prefixes are accepted. | The history endpoint lists S3 keys only; it does not read every snapshot body. 
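
Walking the full history means following the `before` cursor page by page. The sketch below shows one way to do that under two assumptions: each page is the decoded JSON array shown above, and `before` is exclusive, as "older than this SHA" suggests. `iter_history` and `fetch_page` are hypothetical names; `fetch_page(before, limit)` stands in for whatever issues the `GET /v2/namespaces/{ns}/history` request.

```python
from typing import Any, Callable, Iterator, Optional

Entry = dict[str, Any]
FetchPage = Callable[[Optional[str], int], list[Entry]]


def iter_history(fetch_page: FetchPage, page_size: int = 50) -> Iterator[Entry]:
    """Yield snapshot history entries newest-first across all pages.

    Assumption: `before` is exclusive, so passing the last SHA of a page
    returns only strictly older entries.
    """
    if not 1 <= page_size <= 500:
        raise ValueError("page_size must be between 1 and 500 (the documented cap)")
    before: Optional[str] = None
    while True:
        page = fetch_page(before, page_size)
        if not page:
            return  # an empty page means nothing older remains
        yield from page
        before = page[-1]["sha"]
```

The loop stops on an empty page rather than on a short one. That costs one extra request but does not rely on the gateway always filling a page when more entries exist.
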

## Snapshot body

```python
body = await client.get_namespace_snapshot("products", "3f9e8b2")
```

```go
body, err := client.GetNamespaceSnapshot(ctx, "products", "3f9e8b2")
```

```typescript
const body = await client.getNamespaceSnapshot("products", "3f9e8b2");
```

```bash
curl "$LAYER_GATEWAY_URL/v2/namespaces/products/snapshots/3f9e8b2" \
  -H "Authorization: Bearer $LAYER_GATEWAY_API_KEY"
```

```json
{
  "namespace": "products",
  "watermark_ms": 1747300000123,
  "sha": "3f9e8b21",
  "row_count": 12500,
  "fields": [
    {
      "name": "category",
      "values": [
        {"v": "books", "n": 1240},
        {"v": "electronics", "n": 873}
      ]
    }
  ],
  "fields_skipped": [
    {
      "name": "tags",
      "reason": "exceeded_cap",
      "distinct_observed": 247000,
      "cap": 10000
    }
  ]
}
```

`fields[].values[].v` is the facet listing. `fields[].values[].n` is the facet count. `row_count` is the number of rows scanned into the snapshot; for vector namespaces, [namespace metadata](/docs/api/namespace-metadata) compares it with the upstream namespace row count to report `indexed` and `index_lag_rows`.

Fields present in `fields[]` are complete. Fields above the 10,000 distinct-value cap are listed in `fields_skipped[]` instead of being partially materialized. A skipped field is still enumerable on demand with a [values scan](/docs/api/scans#values-mode), which carries a 1,000,000-value cap instead.

## Activity

```python
activity = await client.list_snapshot_activity(since=1747200000000, limit=50)
```

```go
activity, err := client.ListSnapshotActivity(ctx, &hevlayer.ListSnapshotActivityParams{Since: 1747200000000, Limit: 50})
```

```typescript
const activity = await client.listSnapshotActivity({
  since: 1747200000000,
  limit: 50,
});
```

```bash
curl "$LAYER_GATEWAY_URL/v2/activity/snapshots?since=1747200000000&limit=50" \
  -H "Authorization: Bearer $LAYER_GATEWAY_API_KEY"
```

| Query param | Required | Purpose |
| --- | --- | --- |
| `since` | yes | Epoch-ms lower bound on `ts_ms`. |
| `limit` | no | Cap 500, default 50. |
| `namespace` | no | Exact namespace filter. |
| `cursor` | no | Pagination cursor from `next_cursor`. |

Activity is snapshot lifecycle only. Search history and clickstream events have separate feeds.

---

# Query History

Source: https://hevlayer.com/docs/api/search-history

Layer logs every query the gateway serves into a durable JSONL trail in S3, mirrored into the NVMe cache for fast recent reads. Fetch events that downstream consumers tag back to a query land in a sibling clickstream feed. Together they make a search session reconstructable after the fact, for relevance tuning, A/B comparison, or incident review.

Both feeds are Layer-only.

## Routes

| Route | Behavior |
| --- | --- |
| `GET /v2/namespaces/{ns}/search-history` | Per-namespace query log, newest first. |
| `GET /v2/namespaces/{ns}/clickstream` | Fetch events correlated to a search, newest first. |

The `/v1/` versions of both routes are identical aliases kept for client compatibility.

## Search history entry

```json
{
  "entries": [
    {
      "timestamp": "2025-05-22T08:00:00.000Z",
      "timestamp_nanos": 1747900800000000000,
      "namespace": "products",
      "trace_id": "f81d4fae-7dec-11d0-a765-00a0c91e6bf6",
      "raw_query": "wireless headphones",
      "stable_as_of": 1747900700000,
      "query": {"vector": "[…]", "top_k": 10, "filters": "[…]"},
      "top_result_ids": ["asin-B08N5WRWNW", "asin-B07PXGQC1Q"],
      "tags": ["app:hev-shop", "route:search", "surface:storefront"]
    }
  ],
  "next_cursor": "1747900799000000000"
}
```

| Field | Meaning |
| --- | --- |
| `timestamp` / `timestamp_nanos` | Wall-clock and nanosecond timestamps. `timestamp_nanos` is the pagination cursor. |
| `trace_id` | Trace context propagated or generated for the query. Joins to the clickstream feed. |
| `raw_query` | Caller-supplied query string from the `x-hevlayer-search-query` header (e.g. the BM25 input). Omitted when the header is absent. |
| `stable_as_of` | Epoch-ms namespace watermark used by the served response. Omitted on cold-start gateways before the namespace has a watermark. |
| `query` | Structured query summary: vector shape, filters, ranking. |
| `top_result_ids` | IDs from the served response, in rank order. |
| `tags` | Caller-supplied labels propagated through request headers. Used for ad-hoc segmentation. |

[Hybrid text](/docs/api/query#hybrid-text-fusion) queries log as a single entry whose `query` carries the `HybridText` expression, not the expanded legs, so re-issuing the logged query reproduces the whole expansion (tokenization, fuzzy legs, fusion) as a unit. [Routed](/docs/api/query#query-routing) queries additionally carry the routing decision (route, policy version, executed), so per-route engagement can be measured against the clickstream and a logged query can be replayed under a forced route.

### Writing metadata

Set `x-hevlayer-search-query` on query requests to capture the human input, and set `x-hevlayer-tags` to a comma-separated list of segmentation tags.
The Python client exposes these as the `raw_query` and `tags` keyword arguments; the Go client as the `WithSearchQuery` and `WithSearchTags` request options:

```python
query = await client.query_namespace(
    "products",
    {"vector": embedding, "top_k": 10, "include_attributes": ["title"]},
    raw_query="wireless headphones",
    tags=["app:hev-shop", "surface:storefront", "route:search", "page:first"],
)

history = await client.list_search_history(
    "products",
    tags=["app:hev-shop", "route:search", "page:first"],
    limit=20,
)
```

```go
query, err := client.QueryNamespace(ctx, "products",
    &hevlayer.QueryRequest{Vector: embedding, TopK: 10, IncludeAttributes: []string{"title"}},
    hevlayer.WithSearchQuery("wireless headphones"),
    hevlayer.WithSearchTags([]string{"app:hev-shop", "surface:storefront", "route:search", "page:first"}),
)

history, err := client.ListSearchHistory(ctx, "products", &hevlayer.ListSearchHistoryParams{
    Tag:   []string{"app:hev-shop", "route:search", "page:first"},
    Limit: 20,
})
```

```typescript
const query = await client.queryNamespace(
  "products",
  { vector: embedding, top_k: 10, include_attributes: ["title"] },
  {
    searchQuery: "wireless headphones",
    tags: ["app:hev-shop", "surface:storefront", "route:search", "page:first"],
  },
);

const history = await client.listSearchHistory("products", {
  tags: ["app:hev-shop", "route:search", "page:first"],
  limit: 20,
});
```

```bash
curl -X POST "$LAYER_GATEWAY_URL/v2/namespaces/products/query" \
  -H "Authorization: Bearer $LAYER_GATEWAY_API_KEY" \
  -H "Content-Type: application/json" \
  -H "x-hevlayer-search-query: wireless headphones" \
  -H "x-hevlayer-tags: app:hev-shop,surface:storefront,route:search,page:first" \
  -d '{"vector": [0.0012, -0.043], "top_k": 10, "include_attributes": ["title"]}'

curl "$LAYER_GATEWAY_URL/v2/namespaces/products/search-history?tag=app:hev-shop,route:search,page:first&limit=20" \
  -H "Authorization: Bearer $LAYER_GATEWAY_API_KEY"
```

Keep the query text in `raw_query`; use tags for segmentation, not for duplicating the query string.

### Tag contract

Layer splits `x-hevlayer-tags` and `?tag=` on commas, trims whitespace, drops empty values, then sorts and dedupes tags before storing or matching them. Commas are separators and cannot be escaped.

Limits:

| Limit | Value |
| --- | --- |
| Max tags | 32 unique tags per request or filter |
| Max tag length | 128 bytes |
| Allowed characters | ASCII letters, digits, `:`, `_`, `-`, `.`, `/`, `=`, `+` |

The list filter uses AND semantics: `?tag=a,b` returns only entries that carry both `a` and `b`.

### Query parameters

| Param | Purpose |
| --- | --- |
| `tag` | Comma-separated tag filter. AND semantics: every tag must match. |
| `from` / `to` | RFC3339 time bounds. |
| `before` | Pagination cursor; return entries strictly older than the given `timestamp_nanos`. |
| `limit` | Cap 500, default 50. |

## Clickstream entry

```json
{
  "events": [
    {
      "timestamp": "2025-05-22T08:00:02.143Z",
      "timestamp_nanos": 1747900802143000000,
      "trace_id": "f81d4fae-7dec-11d0-a765-00a0c91e6bf6",
      "namespace": "products",
      "doc_id": "asin-B08N5WRWNW",
      "tags": ["session:abc123"],
      "source": "fetch",
      "served_from": "cache"
    }
  ],
  "next_cursor": "1747900802142000000"
}
```

`trace_id` joins to the search-history entry that produced the result; `served_from` distinguishes a cache hit from an upstream fetch.
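
The `trace_id` join can be done client-side once both feeds are fetched. The sketch below is illustrative only: `join_fetches` is a hypothetical helper, and it assumes the `entries` and `events` arrays have the shapes shown in the JSON examples above. For each logged query it reports which ranks in `top_result_ids` were later fetched.

```python
from collections import defaultdict
from typing import Any


def join_fetches(
    entries: list[dict[str, Any]],
    events: list[dict[str, Any]],
) -> list[dict[str, Any]]:
    """Annotate each search-history entry with the ranks of fetched results."""
    # Group fetched document IDs by the trace that produced them.
    fetched_by_trace: dict[str, set[str]] = defaultdict(set)
    for event in events:
        fetched_by_trace[event["trace_id"]].add(event["doc_id"])

    joined = []
    for entry in entries:
        fetched = fetched_by_trace.get(entry["trace_id"], set())
        # Ranks are 1-based and follow the order of top_result_ids.
        ranks = [
            rank
            for rank, doc_id in enumerate(entry.get("top_result_ids", []), start=1)
            if doc_id in fetched
        ]
        joined.append({
            "trace_id": entry["trace_id"],
            "raw_query": entry.get("raw_query"),
            "fetched_ranks": ranks,
        })
    return joined
```

Applied to the two example payloads on this page, the single entry comes back with `fetched_ranks` equal to `[1]`, because `asin-B08N5WRWNW` is the first ID in `top_result_ids`.
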

`trace_id` is also a supported query parameter so you can pull every event for a single search session:

```python
events = await client.list_clickstream(
    "products",
    trace_id="f81d4fae-7dec-11d0-a765-00a0c91e6bf6",
)
```

```go
events, err := client.ListClickstream(ctx, "products", &hevlayer.ListClickstreamParams{
    TraceID: "f81d4fae-7dec-11d0-a765-00a0c91e6bf6",
})
```

```typescript
const events = await client.listClickstream("products", {
  traceId: "f81d4fae-7dec-11d0-a765-00a0c91e6bf6",
});
```

```bash
curl "$LAYER_GATEWAY_URL/v2/namespaces/products/clickstream?trace_id=f81d4fae-7dec-11d0-a765-00a0c91e6bf6" \
  -H "Authorization: Bearer $LAYER_GATEWAY_API_KEY"
```

## Storage

```text
search-history/{namespace}/{YYYY-MM-DD}/{timestamp_nanos}.jsonl
```

Writes are best-effort and never block the query response. Aerospike holds a recent window for fast reads; S3 is the durable store. A cache outage degrades read latency but not durability: list calls walk the S3 prefix and merge inline.
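
If you read the trail directly from S3 for offline analysis, the key layout above can be reproduced from a `timestamp_nanos` value. This is a sketch under one stated assumption: that the `{YYYY-MM-DD}` segment is the UTC date of the timestamp, which the page does not spell out. `history_object_key` and `history_day_prefix` are hypothetical helpers, and any bucket name or leading prefix used by your install is left out.

```python
from datetime import datetime, timezone

NANOS_PER_SECOND = 1_000_000_000


def history_day_prefix(namespace: str, timestamp_nanos: int) -> str:
    """Key prefix covering one day of a namespace's search history.

    Assumption: the date segment is derived from the timestamp in UTC.
    """
    seconds = timestamp_nanos // NANOS_PER_SECOND
    day = datetime.fromtimestamp(seconds, tz=timezone.utc).strftime("%Y-%m-%d")
    return f"search-history/{namespace}/{day}/"


def history_object_key(namespace: str, timestamp_nanos: int) -> str:
    """Full object key for the given timestamp, following the layout above."""
    return f"{history_day_prefix(namespace, timestamp_nanos)}{timestamp_nanos}.jsonl"
```

Listing by the day prefix is the natural way to pull one day of history without knowing individual timestamps.
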