Kubernetes deployment of the Runtime AI Gateway

Deploy the Penaxtra Runtime AI Gateway as a sidecar container or as a cluster-internal Deployment + Service. Both patterns keep prompt content inside the customer VPC.

Pattern A: cluster-internal Deployment + Service

Recommended for shared use across multiple LLM client applications.

apiVersion: v1
kind: Secret
metadata:
  name: penaxtra-enroll
  namespace: ai-platform
type: Opaque
stringData:
  enroll-token: "<paste from workspace>"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: penaxtra-gateway
  namespace: ai-platform
spec:
  replicas: 2
  selector:
    matchLabels: { app: penaxtra-gateway }
  template:
    metadata:
      labels: { app: penaxtra-gateway }
    spec:
      containers:
        - name: gateway
          image: registry.penaxtra.com/runtime-gateway:latest
          args:
            - "--listen=0.0.0.0:8443"
            - "--upstream=https://api.upstream-llm.example"
          env:
            - name: PNX_ENROLL_TOKEN
              valueFrom:
                secretKeyRef:
                  name: penaxtra-enroll
                  key: enroll-token
          resources:
            requests: { cpu: "100m", memory: "128Mi" }
            limits:   { cpu: "500m", memory: "256Mi" }
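          readinessProbe:
            # httpGet probes default to plain HTTP. scheme: HTTPS assumes the listener
            # serves TLS, as the https:// client URL below implies.
            httpGet: { path: /healthz, port: 8443, scheme: HTTPS }
            initialDelaySeconds: 3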
---
apiVersion: v1
kind: Service
metadata:
  name: penaxtra-gateway
  namespace: ai-platform
spec:
  selector: { app: penaxtra-gateway }
  ports:
    - port: 443
      targetPort: 8443

LLM client applications inside the cluster point to https://penaxtra-gateway.ai-platform.svc.cluster.local.
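
As a rough illustration, a client workload could receive the gateway address through its container environment. The container name, image, and the LLM_BASE_URL variable below are hypothetical; use whatever setting your client library actually reads for its base URL.

```yaml
# Fragment of a client Deployment's Pod template (all names hypothetical).
spec:
  template:
    spec:
      containers:
        - name: app
          image: registry.example.com/llm-client:1.0.0   # hypothetical image
          env:
            - name: LLM_BASE_URL                          # hypothetical variable
              # Service port 443 is the HTTPS default, so no port is needed in the URL.
              value: "https://penaxtra-gateway.ai-platform.svc.cluster.local"
```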

Pattern B: sidecar container

Useful when an application owns its own gateway lifecycle. Add the gateway container to the application Pod spec with the same args as in Pattern A. Containers in a Pod share a network namespace, so the application reaches the gateway at localhost:8443.
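
A minimal sketch of such a Pod, reusing the gateway container from Pattern A. The Pod name, the application container, its image, and the LLM_BASE_URL variable are hypothetical, and the https scheme assumes the gateway serves TLS on its listen port.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: llm-client                 # hypothetical
  namespace: ai-platform           # must match the namespace of the penaxtra-enroll Secret
spec:
  containers:
    - name: app                    # hypothetical application container
      image: registry.example.com/llm-client:1.0.0
      env:
        - name: LLM_BASE_URL       # hypothetical variable read by the application
          value: "https://localhost:8443"
    - name: gateway                # same image, args, and env as Pattern A
      image: registry.penaxtra.com/runtime-gateway:latest
      args:
        - "--listen=0.0.0.0:8443"
        - "--upstream=https://api.upstream-llm.example"
      env:
        - name: PNX_ENROLL_TOKEN
          valueFrom:
            secretKeyRef:
              name: penaxtra-enroll
              key: enroll-token
      resources:
        requests: { cpu: "100m", memory: "128Mi" }
        limits:   { cpu: "500m", memory: "256Mi" }
```

With 0.0.0.0 as the listen address the gateway is also reachable on the Pod IP. Binding to 127.0.0.1:8443 instead would limit it to the containers in this Pod.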

NetworkPolicy

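Limit the gateway Pods' egress to public addresses on TCP 443. NetworkPolicy selects destinations by IP block, not by domain name, so this policy does not pin traffic to the upstream LLM domain, and it does not stop other Pods from calling the upstream directly. That requires a separate egress policy on the client workloads.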

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: penaxtra-gateway-egress
  namespace: ai-platform
spec:
  podSelector: { matchLabels: { app: penaxtra-gateway } }
  policyTypes: ["Egress"]
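  egress:
    # Allow DNS lookups so the gateway can resolve the upstream hostname.
    # Assumes cluster DNS runs in kube-system with the conventional k8s-app: kube-dns label.
    - to:
        - namespaceSelector: { matchLabels: { kubernetes.io/metadata.name: kube-system } }
          podSelector: { matchLabels: { k8s-app: kube-dns } }
      ports:
        - { protocol: UDP, port: 53 }
        - { protocol: TCP, port: 53 }
    # Allow HTTPS to public addresses only; private ranges stay blocked.
    - to:
        - ipBlock: { cidr: 0.0.0.0/0, except: [10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16] }
      ports:
        - { protocol: TCP, port: 443 }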
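
The policy above governs only the gateway Pods. One way to keep client workloads from reaching the upstream LLM directly is a second policy that limits their egress to cluster DNS and the gateway. The sketch below assumes a hypothetical client namespace named apps, a CNI plugin that enforces NetworkPolicy, and the conventional kube-dns label. It blocks all other egress from those Pods, so add rules for anything else they need.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: llm-clients-egress         # hypothetical
  namespace: apps                  # hypothetical client namespace
spec:
  podSelector: {}                  # applies to every Pod in the namespace
  policyTypes: ["Egress"]
  egress:
    # Cluster DNS, needed to resolve the gateway Service name.
    - to:
        - namespaceSelector: { matchLabels: { kubernetes.io/metadata.name: kube-system } }
          podSelector: { matchLabels: { k8s-app: kube-dns } }
      ports:
        - { protocol: UDP, port: 53 }
        - { protocol: TCP, port: 53 }
    # Gateway Pods in ai-platform; no other destination is allowed.
    - to:
        - namespaceSelector: { matchLabels: { kubernetes.io/metadata.name: ai-platform } }
          podSelector: { matchLabels: { app: penaxtra-gateway } }
```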

Security notes

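  • The enrollment token is stored as a Kubernetes Secret. Rotate it with a Job that calls the workspace API, and restart the gateway Pods after each rotation, because environment variables read from a Secret are not refreshed in running containers.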
  • Use readOnlyRootFilesystem: true and a non-root securityContext (runAsNonRoot: true, runAsUser: 65532).
  • For multi-tenant cluster sharing, deploy one gateway per tenant namespace to keep policy isolated.
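
The securityContext settings from the notes above might be applied to the gateway container as follows. This is a fragment of the Pattern A Deployment, not a complete manifest. The last two fields are not from the notes; they are common hardening additions included here as an assumption.

```yaml
# Fragment: spec.template.spec of the penaxtra-gateway Deployment.
containers:
  - name: gateway
    image: registry.penaxtra.com/runtime-gateway:latest
    securityContext:
      runAsNonRoot: true
      runAsUser: 65532
      readOnlyRootFilesystem: true
      # Assumed additions, not part of the notes above:
      allowPrivilegeEscalation: false
      capabilities: { drop: ["ALL"] }
```

If the gateway needs a writable path at runtime, a read-only root filesystem would require mounting an emptyDir volume at that path. Whether it does is not stated here.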

Last reviewed: 2026-06-13. Reviewed by: Engineering. Content type: Developer documentation. Reach the maintainers: [email protected] .