Error Envelope

Every gateway error is returned in the OpenAI-compatible envelope:

{
  "error": {
    "type":    "<type>",
    "code":    "<code>",
    "message": "<human-readable>",
    "param":   "<optional field name>"
  }
}

This matches what the openai Python and TypeScript SDKs expect, and the Anthropic SDK parses a superset of it, so existing client code raises its usual typed exceptions unchanged.
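For clients that do not go through one of those SDKs, the envelope can be decoded by hand. The following is a minimal sketch using only the Python standard library; the `GatewayError` class and `parse_error_envelope` function are illustrative names, not part of any gateway SDK.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class GatewayError:
    """Fields of the error envelope. The class name is hypothetical."""
    type: str
    code: str
    message: str
    param: Optional[str] = None


def parse_error_envelope(body: str) -> Optional[GatewayError]:
    """Return a GatewayError if `body` is an error envelope, else None."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return None
    err = payload.get("error") if isinstance(payload, dict) else None
    if not isinstance(err, dict) or "type" not in err:
        return None
    return GatewayError(
        type=err["type"],
        code=err.get("code") or "",
        message=err.get("message") or "",
        param=err.get("param"),  # JSON null becomes None
    )
```

Returning `None` for anything that is not an envelope lets the caller fall back to generic HTTP status handling.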

Response headers

Errors (like successes) always carry:

X-LangWatch-Request-Id: grq_<ULID> — use this when filing support tickets.
X-LangWatch-Provider — present when the error originated from an upstream provider (absent for gateway-internal errors).
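As a sketch of how a client might use these two headers on an error response, the helper below builds a one-line reference suitable for a support ticket or a log entry. The function name `support_reference` and the output format are assumptions made for illustration.

```python
from typing import Mapping


def support_reference(headers: Mapping[str, str]) -> str:
    """Summarize the request ID and error origin from response headers."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {k.lower(): v for k, v in headers.items()}
    request_id = lowered.get("x-langwatch-request-id", "<missing>")
    provider = lowered.get("x-langwatch-provider")
    # The provider header is absent for gateway-internal errors.
    origin = f"upstream provider {provider}" if provider else "gateway-internal"
    return f"request_id={request_id} origin={origin}"
```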

Type enum

`type`	HTTP	When
`invalid_api_key`	`401`	The `Authorization: Bearer …` / `x-api-key` / `api-key` header is missing, malformed, or points to a non-existent virtual key.
`virtual_key_revoked`	`403`	The VK exists but has been revoked.
`model_not_allowed`	`403`	The VK has a `models_allowed` allowlist and the requested model is not in it, or the model matched a `policy_rules.models` deny regex (or fell outside its allow regex). Also used when the model/alias doesn’t resolve to any configured provider.
`permission_denied`	`403`	The principal lacks the RBAC permission required for the endpoint.
`budget_exceeded`	`402`	Any hard-cap budget scope that applies to this request is over its limit. `message` names the scope (e.g. `Budget exceeded for scope=project window=month`).
`rate_limit_exceeded`	`429`	Raised by the gateway (per-VK RPM/RPD) or by an upstream provider. Gateway-level limits add `code = vk_rate_limit_exceeded`, `Retry-After: <seconds>` (RFC 7231), and `X-LangWatch-RateLimit-Dimension: rpm|rpd` telling you which ceiling fired. TPM is deferred to v1.1.
`guardrail_blocked`	`403`	A pre- or post-call guardrail returned `block`. `message` references which guardrail and why. Post-block also records a zero-cost `blocked_by_guardrail` debit on the budget ledger.
`guardrail_upstream_unavailable`	`503`	A pre- or post-call guardrail’s evaluator service was unreachable or errored, and the VK’s `guardrails.{request,response}_fail_open` is `false` (the fail-closed default). Flip to fail-open on the VK to pass through on guardrail outages.
`tool_not_allowed`	`403`	The request references a tool name matched by the VK’s `policy_rules.tools.deny` (or absent from `allow` if set).
`url_not_allowed`	`403`	Any `http://` / `https://` URL extracted from the request body (user messages, tool-call args, system prompts — anywhere) matched `policy_rules.urls.deny` or fell outside a non-null `allow` list.
`cache_override_invalid`	`400`	The `X-LangWatch-Cache` header was malformed or used an unknown mode.
`cache_override_not_implemented`	`400`	The `X-LangWatch-Cache` header was well-formed but named a mode deferred to v1.1 (`force` or `ttl=NNN`). `respect` and `disable` are the v1 modes.
`provider_error`	`502`	An upstream provider returned a non-recoverable error and fallback (if any) was exhausted.
`upstream_timeout`	`504`	An upstream provider exceeded `timeout_ms` and fallback (if any) was exhausted.
`bad_request`	`400`	Validation error on the incoming payload (e.g. missing `model`).
`payload_too_large`	`413`	Request body exceeded `GATEWAY_MAX_REQUEST_BODY_BYTES` (default 10 MiB). Rejected at the edge — before auth, resolve-key, or any upstream dispatch — so a 1 GB drive-by scan never pressures the pod memory limit. Declared `Content-Length` above the cap returns 413 immediately without draining the socket; chunked unknown-length bodies trip a `*http.MaxBytesError` at the cap.
`internal_error`	`500`	Unclassified gateway error. `X-LangWatch-Request-Id` is how we trace it.
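One way a client might act on the table above is to decide, per `type`, whether a retry is worthwhile and how long to wait. The sketch below is an illustration only: the set of types treated as transient is an assumption made here, not something the gateway prescribes, and `retry_delay_seconds` is a hypothetical helper. It honors `Retry-After` when the header holds a whole number of seconds.

```python
from typing import Mapping, Optional

# Assumption: these types describe transient conditions worth retrying.
TRANSIENT_TYPES = frozenset({
    "rate_limit_exceeded",
    "provider_error",
    "upstream_timeout",
    "guardrail_upstream_unavailable",
})


def retry_delay_seconds(
    error_type: str,
    headers: Mapping[str, str],
    default_delay: float = 1.0,
) -> Optional[float]:
    """Return seconds to wait before retrying, or None if a retry won't help."""
    if error_type not in TRANSIENT_TYPES:
        return None
    lowered = {k.lower(): v for k, v in headers.items()}
    retry_after = lowered.get("retry-after", "").strip()
    # Gateway-level rate limits send Retry-After as a number of seconds.
    if error_type == "rate_limit_exceeded" and retry_after.isdigit():
        return float(retry_after)
    return default_delay
```

Policy and budget errors such as `budget_exceeded` or `tool_not_allowed` return `None` here because repeating the same request would fail the same way.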

Budget-warning headers (not errors)

These are soft signals on successful responses:

X-LangWatch-Budget-Warning: <scope>:<pct> — a budget scope is over its soft threshold. Multiple can be present.

A warn breach never turns into an error envelope; it’s only a header.
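A client that wants to surface these warnings could parse the header values as sketched below. Two assumptions are made here: that `<pct>` is numeric (optionally with a trailing `%`), and that the HTTP client may fold repeated headers into one comma-separated value. The function name and the scope names used in testing are hypothetical.

```python
from typing import Iterable, List, Tuple


def parse_budget_warnings(values: Iterable[str]) -> List[Tuple[str, float]]:
    """Parse X-LangWatch-Budget-Warning values into (scope, pct) pairs."""
    warnings: List[Tuple[str, float]] = []
    for raw in values:
        # Some HTTP clients fold repeated headers into one comma-separated value.
        for item in raw.split(","):
            scope, sep, pct = item.strip().rpartition(":")
            if not sep or not scope:
                continue  # not in <scope>:<pct> form
            try:
                warnings.append((scope, float(pct.rstrip("%"))))
            except ValueError:
                continue  # percentage was not numeric; skip rather than guess
    return warnings
```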

Examples

Invalid key

HTTP/1.1 401 Unauthorized
Content-Type: application/json
X-LangWatch-Request-Id: grq_01HZX9K3MNM...

{
  "error": {
    "type":    "invalid_api_key",
    "code":    "invalid_api_key",
    "message": "No active virtual key matches the presented credential.",
    "param":   null
  }
}

Budget exceeded

HTTP/1.1 402 Payment Required
Content-Type: application/json
X-LangWatch-Request-Id: grq_01HZX9K3MNN...

{
  "error": {
    "type":    "budget_exceeded",
    "code":    "budget_exceeded",
    "message": "Budget exceeded for scope=project window=month",
    "param":   null
  }
}

Blocked tool

HTTP/1.1 403 Forbidden
Content-Type: application/json
X-LangWatch-Request-Id: grq_01HZX9K3MNO...

{
  "error": {
    "type":    "tool_not_allowed",
    "code":    "tool_not_allowed",
    "message": "Tool 'shell.exec' is blocked by VK policy policy_rules.tools.",
    "param":   "tools[0].function.name"
  }
}

Upstream timeout after fallback exhaustion

HTTP/1.1 504 Gateway Timeout
Content-Type: application/json
X-LangWatch-Request-Id: grq_01HZX9K3MNP...
X-LangWatch-Provider: anthropic
X-LangWatch-Fallback-Count: 2

{
  "error": {
    "type":    "upstream_timeout",
    "code":    "upstream_timeout",
    "message": "All 3 providers in the fallback chain timed out after 30000ms.",
    "param":   null
  }
}

Streaming errors

For SSE streaming, a terminal `event: error` frame carries the same envelope, and then the stream ends:

event: error
data: {"error":{"type":"provider_error","code":"upstream_mid_stream_failure","message":"Upstream connection reset after 2 chunks","param":null}}

Clients that receive chunks and then an error frame should treat the response as incomplete (partial). The `X-LangWatch-Request-Id` response header still identifies the session in traces.
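The behaviour described above can be sketched as a small SSE reader that yields data payloads and raises once the terminal error frame arrives. This is a simplified illustration that handles only the `event:` and `data:` fields; `StreamInterrupted` and `iter_sse_data` are hypothetical names, and the shape of non-error payloads is left to the caller.

```python
import json
from typing import Iterable, Iterator, List


class StreamInterrupted(Exception):
    """Raised on a terminal `event: error` frame. The name is illustrative."""

    def __init__(self, error: dict):
        super().__init__(error.get("message", ""))
        self.type = error.get("type")
        self.code = error.get("code")


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield each frame's data payload; raise StreamInterrupted on an error frame."""
    event = "message"
    data: List[str] = []
    for line in lines:
        line = line.rstrip("\r\n")
        if line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data.append(line[len("data:"):].lstrip(" "))
        elif line == "":
            # A blank line ends the current frame.
            if data:
                payload = "\n".join(data)
                if event == "error":
                    raise StreamInterrupted(json.loads(payload)["error"])
                yield payload
            event, data = "message", []
```

A caller that has already collected payloads when `StreamInterrupted` is raised should mark the accumulated output as partial rather than discard it silently.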

Mid-stream `code` values

Once bytes are flowing, the HTTP status is already 200, so the distinguishing signal for clients is the `code` field inside the terminal frame:

`code`	Source	Meaning
`upstream_mid_stream_failure`	provider path	Upstream errored, reset, or closed unexpectedly after at least one chunk was emitted. Pre-connection failures fall through the transparent-fallback path and never reach the client, so seeing this code means that fallback was not configured, was already exhausted, or the failure happened too late to fall back.
`stream_chunk_blocked`	guardrail path	A `stream_chunk` guardrail returned `block` on a visible-text frame before emit. Subsequent upstream chunks are discarded (see Guardrails → stream_chunk). The channel is closed immediately after the frame.
`guardrail_upstream_unavailable`	guardrail path	Terminal path is flag-only for streaming (see Guardrails → fail-open vs fail-closed); you’ll see this code only if a future iteration wires post-stream enforcement.

The `type` on a streaming terminal frame always reflects the category (`provider_error`, `guardrail_blocked`), so clients keying off `type` already have a usable classification. The `code` is the granular discriminator if you need it (for example, a retry policy that distinguishes “upstream flaked, retry with a different VK” from “guardrail policy said no, don’t retry”).
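A retry policy of that kind might look like the sketch below, which checks the granular `code` first and falls back to the coarser `type`. The `StreamAction` enum and the function name are hypothetical, and treating every unrecognized `provider_error` as retryable is an assumption made for this example.

```python
from enum import Enum


class StreamAction(Enum):
    RETRY = "retry"
    GIVE_UP = "give_up"


def action_for_terminal_frame(error_type: str, code: str) -> StreamAction:
    """Pick a follow-up action for a streaming terminal error frame."""
    # The granular code wins when it is one we recognize.
    if code == "upstream_mid_stream_failure":
        return StreamAction.RETRY    # upstream flaked
    if code == "stream_chunk_blocked":
        return StreamAction.GIVE_UP  # guardrail policy said no
    # Otherwise fall back to the coarser type category.
    if error_type == "provider_error":
        return StreamAction.RETRY
    return StreamAction.GIVE_UP
```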
