Operations
Install
A hev layer install has two stages. Terraform provisions the required AWS resources: IAM, S3, ECR, networking, cost-read roles, and, for the recommended path, a fresh EKS cluster. Helm installs the gateway, operator, and document cache into that cluster and wires them to the AWS resources Terraform produced.
You can skip Terraform if you already have the AWS resources hev layer needs. At minimum, provide an S3 bucket and gateway IRSA role for snapshots and history. For the full feature set, also provide dashboard cost-read IAM, image registry locations, and cluster-level components equivalent to the Terraform outputs.
Install shape
An install is one Helm release per environment with one S3 bucket
for snapshot and history data. The chart renders a default
VectorStore from the credential you
provide; an install can define additional VectorStore resources, each
with its own upstream credential and inbound auth policy, and route
namespaces between them with Index.spec.backend.storeRef. Scoped
gateway-only bearer keys are available through the keys inbound auth
mode described below.
Terraform
The Terraform configuration in infra/terraform/ provisions the AWS
resources that the gateway and operator need. It is opinionated about
the resources hev layer needs to behave correctly and conservative about
resources around it. Route53 hosted zones and ACM certificates are
opt-in; most installs bring existing DNS and TLS.
What it sets up
| Resource | Purpose |
|---|---|
| S3 bucket | Durable storage for namespace snapshots, search history, and clickstream events. |
| IAM roles + IRSA policies | Gateway S3 access, dashboard cost-read access, and worker/operator AWS access. |
| ECR repositories | Image registry for the gateway, operator, and customer-built function images. |
| EKS + VPC + node pools | Recommended fresh-cluster runtime for design partners. |
| Route53 + ACM | Optional DNS zones, records, and TLS certificates when manage_public_dns=true. |
Cluster: recommended
Design-partner installs should use a fresh EKS cluster unless there is a specific reason to bind hev layer to an existing one. The cluster path provisions:
- a VPC with the subnets and endpoints hev layer expects
- an EKS control plane and one always-on system node group, defaulting to an i4i.large so the serving path and document cache share local NVMe
- public worker subnets by default, with no NAT Gateway in the fresh cluster path
- Karpenter for scale-from-zero worker-cpu and worker-gpu indexing capacity
- the AWS Load Balancer Controller for ingress
- EFS for shared persistent volumes
If you already operate an EKS cluster, you can disable the cluster modules and point hev layer at the existing cluster. You are still responsible for the functional prerequisites: an S3 bucket for snapshots/history, gateway IRSA that can read/write that bucket, dashboard IRSA for AWS cost and pricing reads, image registry access, Karpenter or equivalent node autoscaling for workers, and the AWS Load Balancer Controller if you use public ingress.
Cost notes
The Terraform is designed to deploy a cost-efficient AWS footprint with
autoscaling for on-demand indexing work. At rest, the fixed costs are
mostly EKS, one i4i system node, the shared ALB, and small storage
line items. On current us-east-1 on-demand pricing, that baseline is
roughly in the low hundreds of dollars per month before variable traffic, object
storage, and upstream vector-store usage. Indexing bursts scale CPU or
GPU worker nodes up through Karpenter and back down when queues drain.
If you switch workers to private subnets, enabling NAT adds a standing
hourly and egress cost.
Heavier search use cases may need more read-side infrastructure: additional gateway replicas, larger always-on nodes, or a dedicated document-cache pool for steady cache pressure. Contact hev layer for help sizing read-heavy deployments.
Outputs
Terraform emits the values the Helm chart needs to install: the S3 bucket name, gateway IRSA role ARN, dashboard cost-read role ARN, ECR image URLs, and cluster metadata. Pass these into the Helm values file described below.
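As a rough sketch of that hand-off, the function below renders a values fragment from four inputs. The Helm keys are the ones listed under Required values; the assumption that gateway.image takes a plain URL string, the Terraform output name in the comment, and all example values are hypothetical placeholders, so check the real output names with terraform output in infra/terraform/ before relying on this.

```shell
#!/bin/sh
# Render a minimal Helm values fragment from Terraform-provided inputs.
# In a real install each argument would come from Terraform, for example:
#   bucket=$(terraform output -raw s3_bucket_name)   # output name is an assumption
render_values() {
  bucket=$1
  gateway_role=$2
  gateway_image=$3
  dashboard_role=$4
  cat <<EOF
gateway:
  image: ${gateway_image}
s3:
  bucket: ${bucket}
serviceAccount:
  roleArn: ${gateway_role}
dashboard:
  serviceAccount:
    roleArn: ${dashboard_role}
EOF
}

# Example call with placeholder values (not real resources).
render_values \
  "example-layer-bucket" \
  "arn:aws:iam::123456789012:role/example-gateway" \
  "123456789012.dkr.ecr.us-east-1.amazonaws.com/example/gateway:latest" \
  "arn:aws:iam::123456789012:role/example-dashboard"
```

Redirecting the output into a file gives a starting point to merge into values.customer.yaml alongside the vectorStore settings.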
Helm
The Helm chart at infra/helm/layer/ installs the gateway, operator, and
document cache into a cluster that already has the AWS resources from
Terraform or equivalent resources you manage.
Required values
Most of the chart is opinionated defaults. In a typical install the
credential you bring from outside the cluster becomes the default
VectorStore credential.
| Value | Required | Notes |
|---|---|---|
| vectorStore.credential.apiKey | yes | Upstream store credential. With the default deriveFromStore auth mode, clients also send this as the gateway bearer token. |
| vectorStore.endpoint.url | yes | Upstream store API base URL. Defaults to Turbopuffer’s AWS us-east-1 endpoint. |
| vectorStore.endpoint.region | yes | Region label for the rendered VectorStore. |
| vectorStore.inboundAuth.mode | no | deriveFromStore, keys, or open. Defaults to deriveFromStore. |
| vectorStore.inboundAuth.keys | for keys mode | Gateway-only bearer keys with read, write, and admin scopes. |
| gateway.image | yes | Gateway image URL; Terraform emits this as an ECR output. |
| s3.bucket | yes | S3 bucket Terraform created for snapshots and history. |
| serviceAccount.roleArn | yes | IRSA role ARN that grants the gateway access to the S3 bucket. |
| gateway.indexNamespace | no | Namespace containing Index CRs. Blank follows operator.discovery.indexNamespace, then the Helm release namespace. |
| gateway.indexConfig.enabled | no | Enables gateway reads of Index CR routing and policy such as spec.backend.storeRef, spec.snapshot.facetFields, and spec.scan.threads. |
| gateway.indexGc.enabled | no | Enables namespace hard-delete cleanup of operator-discovered Index CRs. |
| gateway.consistency.stablePollIntervalMs | no | Slow polling cadence for namespaces last observed stable. Defaults to 60000; cold and updating namespaces keep the fast gateway default. |
| dashboard.serviceAccount.roleArn | for cost tab | IRSA role ARN with AWS pricing, CloudWatch, and cost read access. |
| ingress.host | optional | Set when you want a public ingress; use your DNS/TLS or enable Terraform-managed Route53/ACM. |
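The fallback order for gateway.indexNamespace in the table above can be sketched with nested shell parameter expansion. This only illustrates the documented order (explicit value, then operator.discovery.indexNamespace, then the release namespace); the function name is hypothetical and the chart's own templating may implement it differently.

```shell
#!/bin/sh
# Pick the namespace the gateway would read Index CRs from.
#   $1: gateway.indexNamespace (may be blank)
#   $2: operator.discovery.indexNamespace (may be blank)
#   $3: Helm release namespace
resolve_index_namespace() {
  gateway_ns=$1
  operator_ns=$2
  release_ns=$3
  # ${a:-b} falls through to b when a is unset or empty.
  printf '%s\n' "${gateway_ns:-${operator_ns:-$release_ns}}"
}

resolve_index_namespace "" "" "layer"          # prints: layer
resolve_index_namespace "" "indexes" "layer"   # prints: indexes
```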
Gateway auth modes
The default deriveFromStore mode is the single-tenant BYOC path:
    vectorStore:
      credential:
        apiKey: tpuf_...
      inboundAuth:
        mode: deriveFromStore
For an install that needs a gateway-only bearer, use keys mode. The
chart renders apiKey values into the release Secret and references them
from the VectorStore; omit apiKey when pointing at a pre-created Secret.
    vectorStore:
      credential:
        apiKey: tpuf_...
      inboundAuth:
        mode: keys
        workerSecretKey: layer-inbound-worker-api-key
        keys:
          - name: worker
            scopes: [read, write, admin]
            apiKey: layer_worker_...
            secretRef:
              key: layer-inbound-worker-api-key
In keys mode, operator workers, KEDA, and the dashboard use
workerSecretName / workerSecretKey as their gateway bearer. Blank
workerSecretName uses the release Secret; blank workerSecretKey uses
layer-inbound-worker-api-key.
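The defaulting described above can be sketched as a small helper. It is an illustration only: the function name is hypothetical, and the release Secret name is passed in as an argument because this page does not state what the chart calls it.

```shell
#!/bin/sh
# Resolve the Secret name and key that operator workers, KEDA, and the
# dashboard would read their gateway bearer from in keys mode.
#   $1: workerSecretName from values (may be blank)
#   $2: workerSecretKey from values (may be blank)
#   $3: name of the Secret rendered by the release (placeholder input)
resolve_worker_secret() {
  worker_secret_name=$1
  worker_secret_key=$2
  release_secret=$3
  printf '%s/%s\n' \
    "${worker_secret_name:-$release_secret}" \
    "${worker_secret_key:-layer-inbound-worker-api-key}"
}

# "example-release-secret" is a made-up name used only for this example.
resolve_worker_secret "" "" "example-release-secret"
# prints: example-release-secret/layer-inbound-worker-api-key
```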
Run the install
    helm upgrade --install layer ./infra/helm/layer \
      --namespace layer --create-namespace \
      -f values.customer.yaml
The chart is not published to a public Helm repository — install from the source path or from the chart artifact provided during onboarding.
What gets installed
- layer-gateway: Rust gateway for Turbopuffer-compatible routes, fetch, scans, snapshots, warm jobs, and pipeline state.
- layer-operator: reconciler for VectorStore, Index, InfraRules, Pipeline, and Function CRDs documented in Kubernetes.
- layer-document-cache: Aerospike-backed document cache, scale-to-zero by default, scheduled onto the always-on i4i system node in the baseline profile.
- Optional Karpenter NodePool/EC2NodeClass resources for worker-cpu and worker-gpu indexing capacity when workerKarpenter.enabled=true. A dedicated document-cache pool is still available for larger installs by setting documentCache.nodeRole=document-cache and documentCache.karpenter.enabled=true.
- Supporting resources: service accounts, IRSA bindings, ingress, and CRDs.
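One way to toggle the optional Karpenter resources is with Helm's --set flag rather than editing the values file. The sketch below only assembles and prints the command so it can be reviewed first; the helper name is hypothetical, and the value names are the ones listed above.

```shell
#!/bin/sh
# Print the extra --set flags for the optional Karpenter capacity.
#   $1: "true" to also request the dedicated document-cache pool
karpenter_flags() {
  dedicated_cache=$1
  printf '%s' "--set workerKarpenter.enabled=true"
  if [ "$dedicated_cache" = "true" ]; then
    printf ' %s' "--set documentCache.nodeRole=document-cache"
    printf ' %s' "--set documentCache.karpenter.enabled=true"
  fi
  printf '\n'
}

# Echo the full command instead of running it.
# The unquoted substitution is deliberate so the flags split into words.
echo helm upgrade --install layer ./infra/helm/layer \
  --namespace layer -f values.customer.yaml $(karpenter_flags true)
```

Keeping these settings in values.customer.yaml instead is equally valid and easier to track in version control.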
Default InfraRules
When operator.infraRules.create=true, Helm renders the cluster-scoped
InfraRules/default object used by every Pipeline and Function
spec.scaling.pool reference. If a workload omits scaling.pool, the
operator maps worker.computeClass: cpu or gpu to the stock cpu or
gpu pool.
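The pool selection rule above can be sketched as follows. This is an illustration rather than the operator's code: the function name is hypothetical, and rejecting an unknown computeClass is an assumption, since this page does not say how the operator handles that case.

```shell
#!/bin/sh
# Choose the InfraRules compute pool for a workload.
#   $1: spec.scaling.pool (may be blank)
#   $2: worker.computeClass (cpu or gpu)
resolve_pool() {
  scaling_pool=$1
  compute_class=$2
  # An explicit scaling.pool always wins.
  if [ -n "$scaling_pool" ]; then
    printf '%s\n' "$scaling_pool"
    return 0
  fi
  # Otherwise map the compute class to the stock pool of the same name.
  case "$compute_class" in
    cpu|gpu) printf '%s\n' "$compute_class" ;;
    *) echo "unknown computeClass: $compute_class" >&2; return 1 ;;  # assumed behavior
  esac
}

resolve_pool "" "gpu"            # prints: gpu
resolve_pool "cpu-large" "cpu"   # prints: cpu-large
```

Under this reading, a pool other than the stock cpu or gpu pool, such as cpu-large, is only selected when a workload names it in scaling.pool.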
The default compute pools are:
| Pool | Use |
|---|---|
| cpu | General CPU workers such as extraction, ingestion, and lightweight Functions. |
| cpu-large | CPU workers that need local ephemeral-storage headroom for per-pod source caches. |
| gpu | One-NVIDIA-GPU workers for embedding and model inference. |
The stock pools select layer.hev.dev/node-role=worker-cpu or
worker-gpu, matching the chart’s workerKarpenter NodePools. Override
operator.infraRules.computePools to tune resource requests, limits,
node selectors, tolerations, GPU SKU hints, or per-workload replica
ceilings for your cluster.
See InfraRules CRD for the full field shape.