The gateway is designed to be a drop-in replacement for provider base URLs — every Python LLM SDK works by setting its base URL env var and using a LangWatch virtual key as the API key.
This page covers the standard setup and how to propagate trace IDs through the gateway so you don't double-count cost.
OpenAI Python SDK
Minimal setup
from openai import OpenAI

client = OpenAI(
    base_url="https://gateway.langwatch.ai/v1",
    api_key="lw_vk_live_01HZX9K3M...",
)

resp = client.chat.completions.create(
    model="gpt-5-mini",
    messages=[{"role": "user", "content": "Hi"}],
)
That’s the whole integration. Everything else (streaming, tools, vision, embeddings, images, audio) works unchanged.
Trace propagation
If your application is already traced with the LangWatch SDK, pass the current trace context so the gateway’s span nests inside yours — otherwise you’ll have two disconnected traces and double-counted cost in dashboards.
import langwatch
from openai import OpenAI

client = OpenAI(
    base_url="https://gateway.langwatch.ai/v1",
    api_key="lw_vk_live_...",
    default_headers=langwatch.get_gateway_headers(),  # injects traceparent + X-LangWatch-*
)

with langwatch.trace(name="my-agent-turn"):
    resp = client.chat.completions.create(
        model="gpt-5-mini",
        messages=[{"role": "user", "content": "Hi"}],
    )
langwatch.get_gateway_headers() returns a dict with:
- traceparent (W3C format) — set from the active LangWatch trace.
- X-LangWatch-Trace-Id — LangWatch-native trace id override.
- X-LangWatch-Parent-Span-Id — parent span id.
- X-LangWatch-Thread-Id — present if the active trace has a thread id.
The gateway reads these on each request and attaches its own span as a child of your trace. The LangWatch UI shows the LLM call nested under your agent span with no cost duplication.
get_gateway_headers() ships in LangWatch Python SDK ≥ v0.22.0 alongside the gateway GA. Check your installed version with pip show langwatch.
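For illustration, the returned dict looks roughly like the following. The header names come from the list above; all values here are fabricated examples, not real ids:

```python
# Illustrative shape of the dict returned by langwatch.get_gateway_headers().
# Values are made-up examples; real ids come from the active trace.
headers = {
    "traceparent": "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01",
    "X-LangWatch-Trace-Id": "4bf92f3577b34da6a3ce929d0e0e4736",   # 32-hex
    "X-LangWatch-Parent-Span-Id": "00f067aa0ba902b7",             # 16-hex
    "X-LangWatch-Thread-Id": "thread_abc123",  # only when the trace has a thread id
}
```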
Every gateway response carries these headers so clients can stitch the span back into their own trace without needing the LangWatch SDK:
| Header | Value |
|---|---|
| X-LangWatch-Trace-Id | 32-hex trace id. Matches the incoming traceparent trace id if one was supplied; otherwise a freshly minted trace id |
| X-LangWatch-Span-Id | 16-hex gateway span id |
| traceparent | W3C traceparent re-injected for downstream stitching — pass it to any further hop you call |
| X-LangWatch-Request-Id | ULID, gateway-scoped. Use this in support tickets |
resp = client.chat.completions.with_raw_response.create(
    model="gpt-5-mini",
    messages=[{"role": "user", "content": "Hi"}],
)

headers = resp.http_response.headers
print(f"trace_id = {headers['X-LangWatch-Trace-Id']}")
print(f"span_id  = {headers['X-LangWatch-Span-Id']}")

# Use the traceparent to stitch a downstream service:
# requests.post(next_service, headers={"traceparent": headers["traceparent"]}, ...)
Without the LangWatch SDK — raw traceparent
If you don’t use the LangWatch Python SDK but still want trace continuity (e.g. you’re traced via OpenTelemetry directly):
from opentelemetry.trace.propagation.tracecontext import TraceContextTextMapPropagator
from openai import OpenAI

def _traceparent_headers() -> dict[str, str]:
    carrier: dict[str, str] = {}
    TraceContextTextMapPropagator().inject(carrier)
    return carrier  # {"traceparent": "00-<tid>-<sid>-01"} when a span is active

client = OpenAI(
    base_url="https://gateway.langwatch.ai/v1",
    api_key="lw_vk_live_...",
    default_headers=_traceparent_headers(),
)
The gateway honours the standard W3C traceparent contract — any OTel-instrumented app already emits this; no LangWatch-specific code needed.
Per-call overrides
Every OpenAI SDK method accepts extra_headers={}:
resp = client.chat.completions.create(
    model="gpt-5-mini",
    messages=[{"role": "user", "content": "Hi"}],
    extra_headers={
        "X-LangWatch-Cache": "disable",  # cold run, ignore cache
        "X-LangWatch-Trace-Metadata": '{"tier":"free"}',  # attach metadata to the trace
    },
)
These layer on top of default_headers without replacing them.
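Conceptually the layering behaves like a plain dict merge where per-call headers win on key conflicts. This is a sketch of the semantics described above, not the SDK's actual implementation:

```python
def merged_headers(default_headers: dict[str, str],
                   extra_headers: dict[str, str]) -> dict[str, str]:
    # Start from the client-level defaults, then let per-call
    # extra_headers add new keys or override existing ones.
    return {**default_headers, **extra_headers}

defaults = {"traceparent": "00-...", "X-LangWatch-Cache": "enable"}
extras = {"X-LangWatch-Cache": "disable"}
print(merged_headers(defaults, extras))
# traceparent survives from defaults; X-LangWatch-Cache is overridden for this call
```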
Response inspection
Grab the X-LangWatch-Request-Id for support tickets or log correlation:
resp = client.chat.completions.with_raw_response.create(
    model="gpt-5-mini",
    messages=[{"role": "user", "content": "Hi"}],
)

request_id = resp.http_response.headers["X-LangWatch-Request-Id"]
print(f"LangWatch request id: {request_id}")
Anthropic Python SDK
Minimal setup
import anthropic

client = anthropic.Anthropic(
    base_url="https://gateway.langwatch.ai",
    api_key="lw_vk_live_...",  # the SDK sends this VK as x-api-key, which the gateway accepts
)

resp = client.messages.create(
    model="claude-haiku-4-5-20251001",
    max_tokens=64,
    messages=[{"role": "user", "content": "Hi"}],
)
Trace propagation (Anthropic)
client = anthropic.Anthropic(
    base_url="https://gateway.langwatch.ai",
    api_key="lw_vk_live_...",
    default_headers=langwatch.get_gateway_headers(),
)
Works identically to the OpenAI SDK.
LangChain / LangGraph
LangChain’s ChatOpenAI and ChatAnthropic accept a base_url or openai_api_base:
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    base_url="https://gateway.langwatch.ai/v1",
    api_key="lw_vk_live_...",
    default_headers=langwatch.get_gateway_headers(),
    model="gpt-5-mini",
)
For LangGraph agents, set the headers on the LLM instance you pass to the graph; they propagate to every call the agent makes through that model.
LlamaIndex
from llama_index.llms.openai import OpenAI

llm = OpenAI(
    api_base="https://gateway.langwatch.ai/v1",
    api_key="lw_vk_live_...",
    default_headers=langwatch.get_gateway_headers(),
    model="gpt-5-mini",
)
PydanticAI / OpenInference / other OTel-aware frameworks
Frameworks that already emit OpenTelemetry spans will set traceparent automatically on outbound HTTP requests if the opentelemetry-instrumentation-requests / httpx / aiohttp packages are installed. No LangWatch SDK involvement needed.
Self-hosted gateway
Replace the hostname:
client = OpenAI(
    base_url="https://langwatch-gateway.your-corp.internal/v1",
    api_key="lw_vk_live_...",
)
The rest of the setup is identical.
Troubleshooting
- 401 invalid_api_key — wrong VK, or VK revoked. Check the first 12 characters against the LangWatch UI.
- Cost double-counted — trace propagation not working. Verify that default_headers contains traceparent at request time (client._custom_headers on the OpenAI SDK).
- Anthropic auth header mismatch — the Python Anthropic SDK sets x-api-key automatically from api_key. The gateway accepts all three auth styles (Authorization: Bearer, x-api-key, api-key), so any of them works.
See API: Errors for the full error-code list.
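The first two symptoms above can often be caught with a preflight check before filing a ticket. A stdlib-only sketch; the lw_vk_ prefix is an assumption taken from the key examples on this page, so adjust it if your keys differ:

```python
def preflight(api_key: str, default_headers: dict[str, str]) -> list[str]:
    """Return a list of likely misconfigurations (empty list = looks fine).

    Assumes virtual keys start with "lw_vk_", as in the examples on this page.
    """
    problems: list[str] = []
    if not api_key.startswith("lw_vk_"):
        problems.append("api_key does not look like a LangWatch virtual key")
    if "traceparent" not in default_headers:
        problems.append(
            "no traceparent in default_headers: spans will not nest, "
            "cost may double-count"
        )
    return problems

# A provider key plus empty headers trips both checks:
print(preflight("sk-proj-123", {}))
```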