
The gateway is designed as a drop-in replacement for provider base URLs: every Python LLM SDK works by setting its base URL env var to the gateway and using a LangWatch virtual key as the API key. This page covers the standard setup and how to propagate trace IDs through the gateway so you don't double-count cost.
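As a sketch of the env-only setup: the OpenAI Python SDK reads OPENAI_BASE_URL and OPENAI_API_KEY at client construction, so pointing an existing app at the gateway can be done without touching application code (variable names below are the OpenAI SDK's, not LangWatch-specific):

```shell
# Point the OpenAI SDK at the gateway via environment variables only.
export OPENAI_BASE_URL="https://gateway.langwatch.ai/v1"
export OPENAI_API_KEY="lw_vk_live_..."  # your LangWatch virtual key
```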

OpenAI Python SDK

Minimal setup

from openai import OpenAI

client = OpenAI(
    base_url="https://gateway.langwatch.ai/v1",
    api_key="lw_vk_live_01HZX9K3M...",
)

resp = client.chat.completions.create(
    model="gpt-5-mini",
    messages=[{"role": "user", "content": "Hi"}],
)
That’s the whole integration. Everything else (streaming, tools, vision, embeddings, images, audio) works unchanged.

Trace propagation

If your application is already traced with the LangWatch SDK, pass the current trace context so the gateway’s span nests inside yours — otherwise you’ll have two disconnected traces and double-counted cost in dashboards.
import langwatch
from openai import OpenAI

client = OpenAI(
    base_url="https://gateway.langwatch.ai/v1",
    api_key="lw_vk_live_...",
    default_headers=langwatch.get_gateway_headers(),   # injects traceparent + X-LangWatch-*
)

with langwatch.trace(name="my-agent-turn"):
    resp = client.chat.completions.create(
        model="gpt-5-mini",
        messages=[{"role": "user", "content": "Hi"}],
    )
langwatch.get_gateway_headers() returns a dict with:
  • traceparent (W3C format) — set from the active LangWatch trace.
  • X-LangWatch-Trace-Id — LangWatch-native trace id override.
  • X-LangWatch-Parent-Span-Id — parent span.
  • X-LangWatch-Thread-Id — if the active trace has a thread id.
The gateway reads these on each request and attaches its own span as a child of your trace. The LangWatch UI shows the LLM call nested under your agent span with no cost duplication.
get_gateway_headers() ships in LangWatch Python SDK ≥ v0.22.0 alongside the gateway GA. Check your installed version with pip show langwatch.
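For reference, the returned dict has the shape below when a trace is active. The values here are made up for illustration; the keys are the ones listed above, and the traceparent embeds the same 32-hex trace id that X-LangWatch-Trace-Id carries:

```python
# Illustrative shape of langwatch.get_gateway_headers() output (hypothetical
# values) when called inside an active LangWatch trace:
headers = {
    "traceparent": "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01",
    "X-LangWatch-Trace-Id": "4bf92f3577b34da6a3ce929d0e0e4736",
    "X-LangWatch-Parent-Span-Id": "00f067aa0ba902b7",
    "X-LangWatch-Thread-Id": "thread_abc123",  # only present if the trace has one
}

# The W3C traceparent fields agree with the X-LangWatch-* headers:
_version, trace_id, parent_span_id, _flags = headers["traceparent"].split("-")
assert trace_id == headers["X-LangWatch-Trace-Id"]
assert parent_span_id == headers["X-LangWatch-Parent-Span-Id"]
```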

Response headers for correlation

Every gateway response carries these headers so clients can stitch the span back into their own trace without needing the LangWatch SDK:
| Header | Value |
| --- | --- |
| X-LangWatch-Trace-Id | 32-hex trace id. Matches the incoming traceparent trace id if one was supplied; otherwise a freshly minted trace id. |
| X-LangWatch-Span-Id | 16-hex gateway span id. |
| traceparent | W3C traceparent re-injected for downstream stitching; pass it to any further hop you call. |
| X-LangWatch-Request-Id | ULID, gateway-scoped. Use this in support tickets. |
resp = client.chat.completions.with_raw_response.create(
    model="gpt-5-mini",
    messages=[{"role": "user", "content": "Hi"}],
)
headers = resp.http_response.headers
print(f"trace_id = {headers['X-LangWatch-Trace-Id']}")
print(f"span_id  = {headers['X-LangWatch-Span-Id']}")
# Use the traceparent to stitch a downstream service:
# requests.post(next_service, headers={"traceparent": headers["traceparent"]}, ...)
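The traceparent value itself has a fixed four-field, hyphen-separated W3C layout (version, trace id, parent span id, flags). A small helper to pick it apart, hypothetical and not part of any SDK:

```python
def parse_traceparent(value: str) -> dict[str, str]:
    """Split a W3C traceparent ("00-<trace-id>-<span-id>-<flags>") into
    its four hex fields. Hypothetical helper, not part of any SDK."""
    version, trace_id, parent_id, flags = value.split("-")
    return {"version": version, "trace_id": trace_id,
            "parent_id": parent_id, "flags": flags}

parts = parse_traceparent("00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01")
assert len(parts["trace_id"]) == 32   # same width as X-LangWatch-Trace-Id
assert len(parts["parent_id"]) == 16  # same width as X-LangWatch-Span-Id
```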

Without the LangWatch SDK — raw traceparent

If you don’t use the LangWatch Python SDK but still want trace continuity (e.g. you’re traced via OpenTelemetry directly):
from opentelemetry import trace
from opentelemetry.trace.propagation.tracecontext import TraceContextTextMapPropagator

def _traceparent_headers() -> dict[str, str]:
    carrier: dict[str, str] = {}
    TraceContextTextMapPropagator().inject(carrier)
    return carrier  # {"traceparent": "00-<tid>-<sid>-01"} when a span is active

client = OpenAI(
    base_url="https://gateway.langwatch.ai/v1",
    api_key="lw_vk_live_...",
    default_headers=_traceparent_headers(),
)
The gateway honours the standard W3C traceparent contract — any OTel-instrumented app already emits this; no LangWatch-specific code needed.

Per-call overrides

Every OpenAI SDK method accepts extra_headers={}:
resp = client.chat.completions.create(
    model="gpt-5-mini",
    messages=[{"role": "user", "content": "Hi"}],
    extra_headers={
        "X-LangWatch-Cache": "disable",                # cold run, ignore cache
        "X-LangWatch-Trace-Metadata": '{"tier":"free"}',  # attach metadata to the trace
    },
)
These layer on top of default_headers without replacing them.
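Conceptually the merge behaves like a shallow dict update in which per-call values win on conflict (a sketch of the assumed semantics, not the SDK's actual internals):

```python
# How extra_headers combines with default_headers, conceptually:
default_headers = {
    "traceparent": "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01",
    "X-LangWatch-Thread-Id": "thread_abc123",
}
extra_headers = {"X-LangWatch-Cache": "disable"}

effective = {**default_headers, **extra_headers}
assert "traceparent" in effective                    # defaults survive...
assert effective["X-LangWatch-Cache"] == "disable"   # ...and per-call extras are added
```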

Response inspection

Grab the X-LangWatch-Request-Id for support tickets or log correlation:
resp = client.chat.completions.with_raw_response.create(
    model="gpt-5-mini",
    messages=[{"role": "user", "content": "Hi"}],
)
request_id = resp.http_response.headers["X-LangWatch-Request-Id"]
print(f"LangWatch request id: {request_id}")
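Standard ULIDs are 26 characters drawn from Crockford's base32 alphabet (no I, L, O, U), so if you validate ids before storing them, a simple shape check might look like this (the id value below is hypothetical):

```python
import re

# ULIDs: 26 chars from Crockford's base32 alphabet (excludes I, L, O, U).
ULID_RE = re.compile(r"[0-9A-HJKMNP-TV-Z]{26}")

request_id = "01HZX9K3M4ABCDEFGHJKMNPQRS"  # hypothetical example value
assert ULID_RE.fullmatch(request_id)
```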

Anthropic Python SDK

Minimal setup

import anthropic

client = anthropic.Anthropic(
    base_url="https://gateway.langwatch.ai",
    api_key="lw_vk_live_...",   # SDK accepts VK here; gateway also accepts via x-api-key
)

resp = client.messages.create(
    model="claude-haiku-4-5-20251001",
    max_tokens=64,
    messages=[{"role": "user", "content": "Hi"}],
)

Trace propagation (Anthropic)

client = anthropic.Anthropic(
    base_url="https://gateway.langwatch.ai",
    api_key="lw_vk_live_...",
    default_headers=langwatch.get_gateway_headers(),
)
Works identically to the OpenAI SDK.

LangChain / LangGraph

LangChain’s ChatOpenAI and ChatAnthropic accept a base_url or openai_api_base:
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    base_url="https://gateway.langwatch.ai/v1",
    api_key="lw_vk_live_...",
    default_headers=langwatch.get_gateway_headers(),
    model="gpt-5-mini",
)
For LangGraph agents, set the headers on the LLM node; they propagate to every call the agent makes.

LlamaIndex

from llama_index.llms.openai import OpenAI

llm = OpenAI(
    api_base="https://gateway.langwatch.ai/v1",
    api_key="lw_vk_live_...",
    default_headers=langwatch.get_gateway_headers(),
    model="gpt-5-mini",
)

PydanticAI / OpenInference / other OTel-aware frameworks

Frameworks that already emit OpenTelemetry spans will set traceparent automatically on outbound HTTP requests if the opentelemetry-instrumentation-requests / httpx / aiohttp packages are installed. No LangWatch SDK involvement needed.

Self-hosted gateway

Replace the hostname:
client = OpenAI(
    base_url="https://langwatch-gateway.your-corp.internal/v1",
    api_key="lw_vk_live_...",
)
The rest of the setup is identical.

Troubleshooting

  • 401 invalid_api_key — wrong VK or VK revoked. Check the first 12 characters against the LangWatch UI.
  • Cost double-counted — trace propagation not working. Verify default_headers contains traceparent at request time (client._custom_headers on the OpenAI SDK).
  • Anthropic auth header mismatch — the Python Anthropic SDK sets x-api-key automatically from api_key. The gateway accepts all three (Bearer, x-api-key, api-key), so either works.
See API: Errors for the full error-code list.