Runtime gateway API

The runtime gateway is an on-prem Go agent that sits in front of the customer's LLM endpoint. It emits two consumer-readable streams: enrolled-agent inventory and block-event records. Both are exposed through the API and require a tenant-scoped bearer token.

GET /api/v2/gateway-agents

curl -sS "https://penaxtra.com/api/v2/gateway-agents" \
  -H "Authorization: Bearer $TOKEN"

Required scope: gateway:read.

Response

{
  "data": [
    {
      "id": "<uuid>",
      "uid": "gw-c34a",
      "hostname": "edge-llm-1",
      "os": "linux",
      "arch": "amd64",
      "agent_version": "0.9.4",
      "last_seen_at": "2026-05-22T10:14:00Z",
      "enrolled_at": "2026-04-08T08:00:00Z",
      "revoked_at": null,
      "current_blob_version": 4521,
      "local_port": 9087
    }
  ]
}
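
The inventory response lends itself to a simple liveness check. The following is a minimal sketch, not an official client: it assumes that revoked_at is null for active agents, that timestamps always use the second-precision UTC form shown in the sample, and that 15 minutes of silence is a reasonable threshold. The helper names (fetch_agents, parse_ts, stale_agents) are hypothetical.

```python
import json
import urllib.request
from datetime import datetime, timedelta, timezone

# Host and path are taken from the curl example above.
BASE_URL = "https://penaxtra.com/api/v2"


def fetch_agents(token):
    """GET /gateway-agents and return the contents of the "data" array."""
    req = urllib.request.Request(
        f"{BASE_URL}/gateway-agents",
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)["data"]


def parse_ts(value):
    """Parse a timestamp such as 2026-05-22T10:14:00Z (assumed format)."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def stale_agents(agents, now, max_silence=timedelta(minutes=15)):
    """Return non-revoked agents not seen within max_silence.

    The 15-minute default is an arbitrary choice for illustration.
    """
    return [
        agent
        for agent in agents
        if agent["revoked_at"] is None
        and now - parse_ts(agent["last_seen_at"]) > max_silence
    ]
```

A caller would typically pass datetime.now(timezone.utc) as now and alert on any agent the function returns.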

GET /api/v2/gateway-events

curl -sS "https://penaxtra.com/api/v2/gateway-events?severity=high" \
  -H "Authorization: Bearer $TOKEN"

Required scope: gateway:read. Optional severity filter: critical, high, medium, low, info.

Response

{
  "data": [
    {
      "id": 18421,
      "agent_id": "<uuid>",
      "rule_id": "<uuid>",
      "scope": "request",
      "matched_in": "body",
      "request_method": "POST",
      "occurred_at": "2026-05-22T10:14:00Z",
      "reason_code": "DLP.PII.IBAN",
      "severity": "high",
      "retry": false,
      "layer": "post-template"
    }
  ]
}
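
Because the severity filter accepts a fixed set of values, a client can reject typos before making a request. The sketch below builds the events URL used in the curl example; events_url is a hypothetical helper, and how the server responds to an unknown severity is not documented here, so the local check is a convenience only.

```python
from typing import Optional
from urllib.parse import urlencode

# Host and path are taken from the curl example above.
BASE_URL = "https://penaxtra.com/api/v2"

# Severity values listed in this section.
SEVERITIES = ("critical", "high", "medium", "low", "info")


def events_url(severity: Optional[str] = None) -> str:
    """Build the gateway-events URL, optionally filtered by severity."""
    url = f"{BASE_URL}/gateway-events"
    if severity is None:
        return url
    if severity not in SEVERITIES:
        raise ValueError(
            f"unknown severity {severity!r}; expected one of {SEVERITIES}"
        )
    return f"{url}?{urlencode({'severity': severity})}"
```

Calling events_url("high") produces the same URL as the curl example, and events_url() omits the filter entirely.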

Fields

Field           Notes
reason_code     Stable taxonomy across forty-plus codes (DLP.PII.*, INJECTION.*, EGRESS.*, POLICY.*).
scope           request or response; which leg of the call matched the rule.
matched_in      headers, body, url, tools.
layer           pre-template, post-template, tool-call.
retry           Whether the agent will retransmit with a sanitised payload (only valid for request-scope blocks).
request_method  The HTTP verb used by the upstream caller (no full URL or body is persisted).
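
The enumerated values in the field table can be checked on the consumer side. This is a rough sketch under stated assumptions: the sets below contain only the values listed in the table, the rule that retry applies only to request-scope events is taken from the retry note, and grouping by the first dot-separated segment of reason_code is one possible reading of the taxonomy. All function names are hypothetical.

```python
from collections import Counter

# Allowed values as listed in the field table.
SCOPES = {"request", "response"}
MATCHED_IN = {"headers", "body", "url", "tools"}
LAYERS = {"pre-template", "post-template", "tool-call"}


def field_problems(event):
    """Return human-readable problems found in one event record."""
    problems = []
    checks = (("scope", SCOPES), ("matched_in", MATCHED_IN), ("layer", LAYERS))
    for field, allowed in checks:
        value = event.get(field)
        if value not in allowed:
            problems.append(f"{field}={value!r} is not a documented value")
    # The table states retry is only valid for request-scope blocks.
    if event.get("retry") and event.get("scope") != "request":
        problems.append("retry is set on a non-request-scope event")
    return problems


def reason_family(reason_code):
    """Return the top-level family, e.g. 'DLP.PII.IBAN' -> 'DLP'."""
    return reason_code.split(".", 1)[0]


def count_by_family(events):
    """Count events per top-level reason_code family."""
    return Counter(reason_family(event["reason_code"]) for event in events)
```

Applied to the sample event above, field_problems returns an empty list and count_by_family reports a single DLP event.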

Notes

  • Block events redact the prompt and response bodies. The matched rule and the request method are retained, and the request path is stored only as a hash so cardinality stays bounded.
  • Events are returned newest first, and the events endpoint caps at 200 rows per call. For time-window scans larger than that, a from query parameter is planned as part of the cursor build.
  • Rule CRUD via API is planned; today rules are managed from the console.
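
Until a time-window parameter is available, a consumer can filter a page by occurred_at locally and detect when the 200-row cap may have cut the window short. The sketch below assumes the newest-first ordering and the cap described in the notes, and the second-precision UTC timestamp form shown in the sample response; events_in_window is a hypothetical helper.

```python
from datetime import datetime, timezone

# Row cap per call, as stated in the notes above.
PAGE_CAP = 200


def parse_ts(value):
    """Parse a timestamp such as 2026-05-22T10:14:00Z (assumed format)."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def events_in_window(events, start, end):
    """Filter one page of events to [start, end].

    Returns (matching, maybe_truncated). Events arrive newest first, so
    the last row is the oldest. If the page is full and its oldest row
    is still at or after start, older in-window events may be missing.
    """
    matching = [
        event
        for event in events
        if start <= parse_ts(event["occurred_at"]) <= end
    ]
    maybe_truncated = (
        len(events) >= PAGE_CAP
        and parse_ts(events[-1]["occurred_at"]) >= start
    )
    return matching, maybe_truncated
```

When maybe_truncated is true, narrowing the scan with the severity filter is one way to bring the result under the cap.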

Last reviewed: 2026-06-13. Reviewed by: Engineering. Content type: Developer documentation. Reach the maintainers: [email protected] .