
Operations

Install

A hev layer install has two stages. Terraform provisions the required AWS resources: IAM, S3, ECR, networking, cost-read roles, and, for the recommended path, a fresh EKS cluster. Helm installs the gateway, operator, and document cache into that cluster and wires them to the AWS resources Terraform produced.

You can skip Terraform if you already have the AWS resources hev layer needs. At minimum, provide an S3 bucket and a gateway IRSA role for snapshots and history. For the full feature set, also provide dashboard cost-read IAM, image registry locations, and cluster-level components equivalent to the Terraform outputs.

Install shape

An install is one Helm release per environment with one S3 bucket for snapshot and history data. The chart renders a default VectorStore from the credential you provide; an install can define additional VectorStore resources, each with its own upstream credential and inbound auth policy, and route namespaces between them with Index.spec.backend.storeRef. Scoped gateway-only bearer keys are available through the keys inbound auth mode described below.

Terraform

The Terraform configuration in infra/terraform/ provisions the AWS resources that the gateway and operator need. It is opinionated about the resources hev layer needs to behave correctly and conservative about resources around it. Route53 hosted zones and ACM certificates are opt-in; most installs bring existing DNS and TLS.

What it sets up

Resource | Purpose
S3 bucket | Durable storage for namespace snapshots, search history, and clickstream events.
IAM roles + IRSA policies | Gateway S3 access, dashboard cost-read access, and worker/operator AWS access.
ECR repositories | Image registry for the gateway, operator, and customer-built function images.
EKS + VPC + node pools | Recommended fresh-cluster runtime for design partners.
Route53 + ACM | Optional DNS zones, records, and TLS certificates when manage_public_dns=true.

Design-partner installs should use a fresh EKS cluster unless there is a specific reason to bind hev layer to an existing one. The cluster path provisions:

  • a VPC with the subnets and endpoints hev layer expects
  • an EKS control plane and one always-on system node group, defaulting to an i4i.large so the serving path and document cache share local NVMe
  • public worker subnets by default, with no NAT Gateway in the fresh cluster path
  • Karpenter for scale-from-zero worker-cpu and worker-gpu indexing capacity
  • the AWS Load Balancer Controller for ingress
  • EFS for shared persistent volumes

If you already operate an EKS cluster, you can disable the cluster modules and point hev layer at the existing cluster. You are still responsible for the functional prerequisites: an S3 bucket for snapshots/history, gateway IRSA that can read/write that bucket, dashboard IRSA for AWS cost and pricing reads, image registry access, Karpenter or equivalent node autoscaling for workers, and the AWS Load Balancer Controller if you use public ingress.

Cost notes

The Terraform configuration is designed to deploy a cost-efficient AWS footprint with autoscaling for on-demand indexing work. At rest, the fixed costs are mostly EKS, one i4i system node, the shared ALB, and small storage lines. At current us-east-1 on-demand pricing, that baseline is roughly in the low hundreds of dollars per month, before variable traffic, object storage, and upstream vector-store usage. During indexing bursts, Karpenter scales CPU or GPU worker nodes up and scales them back down when queues drain. If you switch workers to private subnets, enabling NAT adds a standing hourly cost plus egress charges.

Heavier search use cases may need more read-side infrastructure: additional gateway replicas, larger always-on nodes, or a dedicated document-cache pool for steady cache pressure. Contact hev layer for help sizing read-heavy deployments.

Outputs

Terraform emits the values the Helm chart needs to install: the S3 bucket name, gateway IRSA role ARN, dashboard cost-read role ARN, ECR image URLs, and cluster metadata. Pass these into the Helm values file described below.

Helm

The Helm chart at infra/helm/layer/ installs the gateway, operator, and document cache into a cluster that already has the AWS resources from Terraform or equivalent resources you manage.

Required values

Most chart values ship with opinionated defaults, so only a handful must be set. In a typical install, the credential you bring from outside the cluster becomes the default VectorStore credential.

Value | Required | Notes
vectorStore.credential.apiKey | yes | Upstream store credential. With the default deriveFromStore auth mode, clients also send this as the gateway bearer token.
vectorStore.endpoint.url | yes | Upstream store API base URL. Defaults to Turbopuffer’s AWS us-east-1 endpoint.
vectorStore.endpoint.region | yes | Region label for the rendered VectorStore.
vectorStore.inboundAuth.mode | no | deriveFromStore, keys, or open. Defaults to deriveFromStore.
vectorStore.inboundAuth.keys | for keys mode | Gateway-only bearer keys with read, write, and admin scopes.
gateway.image | yes | Gateway image URL. Terraform emits this as an ECR output.
s3.bucket | yes | S3 bucket Terraform created for snapshots and history.
serviceAccount.roleArn | yes | IRSA role ARN that grants the gateway access to the S3 bucket.
gateway.indexNamespace | no | Namespace containing Index CRs. Blank follows operator.discovery.indexNamespace, then the Helm release namespace.
gateway.indexConfig.enabled | no | Enables gateway reads of Index CR routing and policy such as spec.backend.storeRef, spec.snapshot.facetFields, and spec.scan.threads.
gateway.indexGc.enabled | no | Enables namespace hard-delete cleanup of operator-discovered Index CRs.
gateway.consistency.stablePollIntervalMs | no | Slow polling cadence for namespaces last observed stable. Defaults to 60000; cold and updating namespaces keep the fast gateway default.
dashboard.serviceAccount.roleArn | for cost tab | IRSA role ARN with AWS pricing, CloudWatch, and cost read access.
ingress.host | optional | Set when you want a public ingress; use your DNS/TLS or enable Terraform-managed Route53/ACM.
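
Put together, the values above might look like the following values.customer.yaml. This is a minimal sketch, not a tested configuration. The key paths come from the table, but every value is a placeholder: the AWS account ID, bucket name, role names, image tag, hostnames, and endpoint URL are all hypothetical. It also assumes gateway.image takes a single image URL string, as the table describes.

```yaml
# values.customer.yaml (sketch; every value below is a placeholder)
vectorStore:
  credential:
    apiKey: tpuf_...                  # upstream store credential; in deriveFromStore mode clients send it as the bearer
  endpoint:
    url: https://upstream-store.example.com   # hypothetical; use your upstream store API base URL
    region: us-east-1                 # region label for the rendered VectorStore
  inboundAuth:
    mode: deriveFromStore             # the default; shown for clarity

gateway:
  # hypothetical ECR URL; take the real one from the Terraform outputs
  image: 123456789012.dkr.ecr.us-east-1.amazonaws.com/layer-gateway:TAG
  consistency:
    stablePollIntervalMs: 60000       # documented default; optional

s3:
  bucket: example-layer-snapshots     # hypothetical; the bucket Terraform created

serviceAccount:
  # hypothetical; gateway IRSA role with access to the S3 bucket
  roleArn: arn:aws:iam::123456789012:role/example-layer-gateway

dashboard:
  serviceAccount:
    # hypothetical; only needed for the dashboard cost tab
    roleArn: arn:aws:iam::123456789012:role/example-layer-dashboard

ingress:
  host: layer.example.com             # hypothetical; optional, set only for public ingress
```

The bucket, role ARNs, and image URL are the values Terraform emits as outputs. If you manage the AWS resources yourself, substitute the equivalents.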

Gateway auth modes

The default deriveFromStore mode is the single-tenant BYOC path:

vectorStore:
  credential:
    apiKey: tpuf_...
  inboundAuth:
    mode: deriveFromStore

For an install that needs a gateway-only bearer, use keys mode. The chart renders apiKey values into the release Secret and references them from the VectorStore; omit apiKey when pointing at a pre-created Secret.

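vectorStore:
  credential:
    apiKey: tpuf_...                  # upstream store credential
  inboundAuth:
    mode: keys
    # Secret key that operator workers, KEDA, and the dashboard read their bearer from
    workerSecretKey: layer-inbound-worker-api-key
    keys:
      - name: worker
        scopes: [read, write, admin]
        # rendered into the release Secret; omit when pointing at a pre-created Secret
        apiKey: layer_worker_...
        secretRef:
          key: layer-inbound-worker-api-key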

In keys mode, operator workers, KEDA, and the dashboard use workerSecretName / workerSecretKey as their gateway bearer. A blank workerSecretName falls back to the release Secret; a blank workerSecretKey falls back to layer-inbound-worker-api-key.
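
Because keys carry read, write, and admin scopes, a keys-mode install can plausibly issue a narrower key alongside the worker key. The sketch below adds a read-only key for a search client. It assumes that extra entries follow the same shape as the worker entry above and that scopes accepts a subset. The search-client name, its key value, and its Secret key are hypothetical.

```yaml
vectorStore:
  credential:
    apiKey: tpuf_...                  # upstream store credential, never handed to clients in keys mode
  inboundAuth:
    mode: keys
    workerSecretKey: layer-inbound-worker-api-key
    keys:
      - name: worker                  # full-scope key used by operator workers, KEDA, and the dashboard
        scopes: [read, write, admin]
        apiKey: layer_worker_...
        secretRef:
          key: layer-inbound-worker-api-key
      - name: search-client           # hypothetical read-only key for a query-only client
        scopes: [read]                # assumption: a subset of scopes is accepted
        apiKey: layer_search_...      # hypothetical value
        secretRef:
          key: layer-inbound-search-api-key   # hypothetical Secret key name
```

Check the chart's values schema before relying on this shape.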

Run the install

helm upgrade --install layer ./infra/helm/layer \
  --namespace layer --create-namespace \
  -f values.customer.yaml

The chart is not published to a public Helm repository — install from the source path or from the chart artifact provided during onboarding.

What gets installed

  • layer-gateway — Rust gateway for Turbopuffer-compatible routes, fetch, scans, snapshots, warm jobs, and pipeline state.
  • layer-operator — reconciler for VectorStore, Index, InfraRules, Pipeline, and Function CRDs documented in Kubernetes.
  • layer-document-cache — Aerospike-backed document cache, scale-to-zero by default, scheduled onto the always-on i4i system node in the baseline profile.
  • Optional Karpenter NodePool / EC2NodeClass resources for worker-cpu and worker-gpu indexing capacity when workerKarpenter.enabled=true. A dedicated document-cache pool is still available for larger installs by setting documentCache.nodeRole=document-cache and documentCache.karpenter.enabled=true.
  • Supporting resources: service accounts, IRSA bindings, ingress, and CRDs.
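
The optional capacity described in the list maps to a few values. As a sketch that uses only the keys named in the list, a larger install that wants Karpenter-managed workers and a dedicated document-cache pool might add this to its values file. Whether other keys must accompany these is not covered here.

```yaml
# Render Karpenter NodePool / EC2NodeClass resources for worker-cpu and worker-gpu
workerKarpenter:
  enabled: true

# Move the document cache off the always-on system node onto its own pool
documentCache:
  nodeRole: document-cache
  karpenter:
    enabled: true
```

Baseline installs can leave the documentCache keys unset, since the cache is scheduled onto the always-on i4i system node by default.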

Default InfraRules

When operator.infraRules.create=true, Helm renders the cluster-scoped InfraRules/default object that every Pipeline and Function spec.scaling.pool reference resolves against. If a workload omits scaling.pool, the operator maps worker.computeClass: cpu or gpu to the stock cpu or gpu pool.

The default compute pools are:

Pool | Use
cpu | General CPU workers such as extraction, ingestion, and lightweight Functions.
cpu-large | CPU workers that need local ephemeral-storage headroom for per-pod source caches.
gpu | One-NVIDIA-GPU workers for embedding and model inference.
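
A workload selects one of these pools through spec.scaling.pool. The fragment below is a sketch of that one field only. It omits apiVersion, kind, metadata, and the rest of the spec, and it assumes pool takes the pool name as a plain string.

```yaml
# Fragment of a Pipeline or Function manifest; all other fields are omitted
spec:
  scaling:
    pool: cpu-large   # one of the stock pools: cpu, cpu-large, gpu
```

Leaving scaling.pool out lets the operator pick the stock cpu or gpu pool from worker.computeClass.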

The stock pools select layer.hev.dev/node-role=worker-cpu or worker-gpu, matching the chart’s workerKarpenter NodePools. Override operator.infraRules.computePools to tune resource requests, limits, node selectors, tolerations, GPU SKU hints, or per-workload replica ceilings for your cluster.

See InfraRules CRD for the full field shape.
