Claude Code

Claude Code is Anthropic’s official agentic CLI. It speaks the Anthropic Messages API and streams tool-call deltas with specific ordering that must be preserved end-to-end. The LangWatch AI Gateway’s /v1/messages endpoint is designed to be Claude-Code-compatible: tool-call deltas are forwarded byte-for-byte after the first chunk (see the streaming contract), cache_control blocks are passed through untouched (see caching passthrough — this is load-bearing because Claude Code uses prompt caching aggressively), and the /v1/models endpoint returns Anthropic-model shapes so Claude Code’s model picker works.

Setup

Assuming you’ve minted a VK — lw_vk_live_01HZX… — in the LangWatch UI:

export ANTHROPIC_BASE_URL="https://gateway.langwatch.ai"
export ANTHROPIC_AUTH_TOKEN="lw_vk_live_01HZX..."
claude

That’s it. claude boots, authenticates to the gateway, and every call is traced, budgeted, and governed.

Claude Code reads ANTHROPIC_AUTH_TOKEN if set; otherwise it falls back to ANTHROPIC_API_KEY. Either works with a LangWatch VK. Using AUTH_TOKEN is the upstream recommendation for non-first-party tokens.

Self-hosted gateway

Replace the hostname:

export ANTHROPIC_BASE_URL="https://langwatch-gateway.your-corp.internal"
export ANTHROPIC_AUTH_TOKEN="lw_vk_live_..."

The gateway’s /v1/messages path is the same whether on the LangWatch cloud or self-hosted.

Per-project or per-shell keys

For engineers juggling projects with different budgets or provider allow-lists, use direnv (or your shell-env tool of choice) with a .envrc in each repo:

# .envrc
export ANTHROPIC_AUTH_TOKEN="lw_vk_live_<project-specific-vk>"

Then every claude session inherits the project-scoped VK, its budget, and its guardrails.

Verifying the traffic flows

After one claude session:

Open LangWatch → Messages for the project the VK belongs to.
You should see one trace per Claude Code turn, grouped under the agent name.
Span attributes include langwatch.virtual_key_id, gen_ai.usage.cache_read.input_tokens (OTel GenAI semconv), and the full tool-call tree.

If traffic doesn’t appear:

Run claude --print "say hi" and look at the exit status + the X-LangWatch-Request-Id header. If the request fails, the gateway emits a diagnostic error envelope (not a provider-opaque one).
Check the VK’s allowed providers includes anthropic.
Check that the project’s Anthropic model provider credential is configured (Settings → Model Providers).

Governance recipes

Block shell-exec tools but allow everything else

Edit the VK’s policy_rules.tools.deny:

^shell\\..*

Now if a Claude Code session tries to invoke shell.exec, the gateway returns 403 tool_not_allowed and the session surfaces the error instead of executing the command. Everything else keeps working.

Monthly budget per engineer

Scope: principal, target: the engineer’s user id.
Window: month, limit: $200.
on_breach: block (hard cap) or warn (soft cap + Slack notification via webhook).

Each engineer’s Claude Code usage is metered against their personal cap; no one can outspend another.

Fallback to Bedrock on Anthropic outage

VK fallback chain: anthropic-primary → bedrock-us-east-1.
fallback.on: [5xx, timeout, rate_limit].
fallback.timeout_ms: 30000.

When Anthropic’s API is degraded, Claude Code sessions continue through Bedrock’s Anthropic models transparently — the engineer sees a tiny latency bump but no error.

Known good model aliases for Claude Code

Pin the VK’s model_aliases so Claude Code can use friendly names across providers:

{
  "claude-haiku-4-5-20251001": "anthropic/claude-haiku-4-5-20251001",
  "claude-sonnet-4-6":   "anthropic/claude-sonnet-4-6",
  "claude-opus-4-7":     "anthropic/claude-opus-4-7"
}

If the primary is Bedrock, point those aliases at the Bedrock equivalents and Claude Code won’t know the difference.

Limits and caveats

Pin to a dated model name. Bifrost’s model registry resolves concrete dated names like claude-haiku-4-5-20251001, not bare aliases like claude-haiku-4-5. A bare alias returns 504 provider is required because the gateway can’t resolve which provider to dispatch to. Pin via --model claude-haiku-4-5-20251001 on the command line OR via the VK’s model_aliases so callers can keep the friendly form.
Fallback during streaming sticks to the original provider once the first chunk has streamed, per the streaming contract. If the primary drops mid-stream, the session ends with an SSE error and Claude Code will retry a fresh turn — which may then land on the fallback provider.
Tool-call delta shape changes across provider — e.g. Anthropic vs Bedrock-Anthropic have subtle differences. The gateway normalises what it can via bifrost/core; complex tool flows should be smoke-tested when flipping primaries.
Prompt caching is provider-specific. Claude Code relies on Anthropic’s ephemeral cache blocks. If the VK fails over to a provider without prompt caching (e.g. a generic OpenAI-compatible), expect cost to jump until the primary comes back.
Claude 4.5 prompt caching is in beta. Cache_read may return 0 on claude-haiku-4-5-20251001 even with cache_control: ephemeral set correctly — this is account-level beta-gating, not a gateway issue. Sonnet 4.5 / Opus 4.5 prompt caching is GA and works end-to-end through the gateway’s /v1/messages raw-forward path.

Spawning Claude Code in tests / scripts

If you spawn claude programmatically (CI matrix, scenario tests, automation):

# Strip parent-session context to keep request body under provider edge limits
claude --print --bare --disable-slash-commands \
  --model claude-haiku-4-5-20251001 \
  --dangerously-skip-permissions \
  "your task here"

--bare skips hooks, LSP, plugin sync, attribution, auto-memory, background prefetches, keychain reads, and CLAUDE.md auto-discovery. Without it, the spawned claude inherits the parent session’s full skills + plugins as system blocks, and the request body can exceed Anthropic’s edge tolerance, returning HTML 4xx error pages that don’t decode cleanly. Combine with --disable-slash-commands for fully clean tests.

Get Started

SDK Integration

Coding CLI Integrations

Virtual Keys & Budgets

Providers

Features

API Reference

Self-Hosting

Cookbooks

Setup

Self-hosted gateway

Per-project or per-shell keys

Verifying the traffic flows

Governance recipes

Block shell-exec tools but allow everything else

Monthly budget per engineer

Fallback to Bedrock on Anthropic outage

Known good model aliases for Claude Code

Limits and caveats

Spawning Claude Code in tests / scripts

Get Started

SDK Integration

Coding CLI Integrations

Virtual Keys & Budgets

Providers

Features

API Reference

Self-Hosting

Cookbooks

Documentation Index

​Setup

​Self-hosted gateway

​Per-project or per-shell keys

​Verifying the traffic flows

​Governance recipes

​Block shell-exec tools but allow everything else

​Monthly budget per engineer

​Fallback to Bedrock on Anthropic outage

​Known good model aliases for Claude Code

​Limits and caveats

​Spawning Claude Code in tests / scripts

Setup

Self-hosted gateway

Per-project or per-shell keys

Verifying the traffic flows

Governance recipes

Block shell-exec tools but allow everything else

Monthly budget per engineer

Fallback to Bedrock on Anthropic outage

Known good model aliases for Claude Code

Limits and caveats

Spawning Claude Code in tests / scripts