Claude Code is Anthropic’s official agentic CLI. It speaks the Anthropic Messages API and streams tool-call deltas with specific ordering that must be preserved end-to-end. The LangWatch AI Gateway’s /v1/messages endpoint is designed to be Claude-Code-compatible: tool-call deltas are forwarded byte-for-byte after the first chunk (see the streaming contract), cache_control blocks are passed through untouched (see caching passthrough; this is load-bearing because Claude Code uses prompt caching aggressively), and the /v1/models endpoint returns Anthropic-model shapes so Claude Code’s model picker works.
Setup
Assuming you’ve minted a VK (lw_vk_live_01HZX…) in the LangWatch UI:
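With that key, pointing Claude Code at the gateway takes two environment variables. A minimal sketch; the base URL below is a placeholder, not a real endpoint, so substitute your gateway host:

```shell
# Route Claude Code through the LangWatch gateway.
# <your-gateway-host> is a placeholder: substitute your actual gateway host.
export ANTHROPIC_BASE_URL="https://<your-gateway-host>"
export ANTHROPIC_AUTH_TOKEN="lw_vk_live_01HZX…"   # the VK minted in the LangWatch UI
```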
Then claude boots, authenticates to the gateway, and every call is traced, budgeted, and governed.
Claude Code reads ANTHROPIC_AUTH_TOKEN if set; otherwise it falls back to ANTHROPIC_API_KEY. Either works with a LangWatch VK. Using ANTHROPIC_AUTH_TOKEN is the upstream recommendation for non-first-party tokens.

Self-hosted gateway

Replace the hostname: the /v1/messages path is the same whether you run on the LangWatch cloud or self-hosted.
Per-project or per-shell keys
For engineers juggling projects with different budgets or provider allow-lists, use direnv (or your shell-env tool of choice) with a .envrc in each repo:
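A minimal .envrc sketch; the base URL is a placeholder and the key value is illustrative:

```shell
# .envrc: project-scoped credentials for this repo (values are placeholders)
export ANTHROPIC_BASE_URL="https://<your-gateway-host>"
export ANTHROPIC_AUTH_TOKEN="lw_vk_live_01HZX…"   # this project's VK
```

Run direnv allow once per repo; any shell entered inside the repo then exports these variables.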
Each claude session then inherits the project-scoped VK, its budget, and its guardrails.
Verifying the traffic flows
After one claude session:
- Open LangWatch → Messages for the project the VK belongs to.
- You should see one trace per Claude Code turn, grouped under the agent name.
- Span attributes include langwatch.virtual_key_id, gen_ai.usage.cache_read.input_tokens (OTel GenAI semconv), and the full tool-call tree.
- Run claude --print "say hi" and look at the exit status plus the X-LangWatch-Request-Id header. If the request fails, the gateway emits a diagnostic error envelope (not a provider-opaque one).
- Check that the VK’s allowed providers include anthropic.
- Check that the project’s Anthropic model provider credential is configured (Settings → Model Providers).
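Outside Claude Code, the same path can be exercised with a raw request. A hedged sketch: the host is a placeholder, and the x-api-key header choice is an assumption (the gateway may also accept an Authorization: Bearer header):

```shell
# Hypothetical smoke test against the gateway's Anthropic-compatible endpoint.
# Dumps response headers so the X-LangWatch-Request-Id header is visible.
curl -sS "https://<your-gateway-host>/v1/messages" \
  -H "x-api-key: $ANTHROPIC_AUTH_TOKEN" \
  -H "anthropic-version: 2023-06-01" \
  -H "content-type: application/json" \
  -D - -o /dev/null \
  -d '{"model":"claude-haiku-4-5-20251001","max_tokens":16,"messages":[{"role":"user","content":"say hi"}]}' \
  | grep -i "x-langwatch-request-id"
```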
Governance recipes
Block shell-exec tools but allow everything else
Edit the VK’s policy_rules.tools.deny to include shell.exec. The gateway then returns 403 tool_not_allowed for that tool and the session surfaces the error instead of executing the command. Everything else keeps working.
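As a sketch, such a deny rule might look like the following; field names beyond policy_rules.tools.deny itself are assumptions, not a documented schema:

```json
{
  "policy_rules": {
    "tools": {
      "deny": ["shell.exec"]
    }
  }
}
```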
Monthly budget per engineer
- Scope: principal; target: the engineer’s user id.
- Window: month; limit: $200.
- on_breach: block (hard cap) or warn (soft cap plus a Slack notification via webhook).
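Concretely, the recipe above might serialise to something like this; the wrapping structure and key spellings are assumptions sketching the shape, and "user:alice" is a hypothetical target:

```json
{
  "budget": {
    "scope": "principal",
    "target": "user:alice",
    "window": "month",
    "limit_usd": 200,
    "on_breach": "block"
  }
}
```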
Fallback to Bedrock on Anthropic outage
- VK fallback chain: anthropic-primary → bedrock-us-east-1.
- fallback.on: [5xx, timeout, rate_limit].
- fallback.timeout_ms: 30000.
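As a sketch, those settings might combine into one fallback block; the keys come from the bullets above, but the wrapping structure is an assumption:

```json
{
  "fallback": {
    "chain": ["anthropic-primary", "bedrock-us-east-1"],
    "on": ["5xx", "timeout", "rate_limit"],
    "timeout_ms": 30000
  }
}
```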
Known good model aliases for Claude Code
Pin the VK’s model_aliases so Claude Code can use friendly names across providers:
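A sketch of such a mapping, using the dated Haiku name mentioned in the caveats below; the exact model_aliases schema is an assumption:

```json
{
  "model_aliases": {
    "claude-haiku-4-5": "claude-haiku-4-5-20251001"
  }
}
```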
Limits and caveats
- Pin to a dated model name. Bifrost’s model registry resolves concrete dated names like claude-haiku-4-5-20251001, not bare aliases like claude-haiku-4-5. A bare alias returns 504 provider is required because the gateway can’t resolve which provider to dispatch to. Pin via --model claude-haiku-4-5-20251001 on the command line, or via the VK’s model_aliases so callers can keep the friendly form.
- Fallback during streaming sticks to the original provider once the first chunk has streamed, per the streaming contract. If the primary drops mid-stream, the session ends with an SSE error and Claude Code retries a fresh turn, which may then land on the fallback provider.
- Tool-call delta shape changes across providers; Anthropic and Bedrock-Anthropic, for example, have subtle differences. The gateway normalises what it can via bifrost/core; smoke-test complex tool flows when flipping primaries.
- Prompt caching is provider-specific. Claude Code relies on Anthropic’s ephemeral cache blocks. If the VK fails over to a provider without prompt caching (e.g. a generic OpenAI-compatible one), expect cost to jump until the primary comes back.
- Claude 4.5 prompt caching is in beta. cache_read may return 0 on claude-haiku-4-5-20251001 even with cache_control: ephemeral set correctly; this is account-level beta-gating, not a gateway issue. Sonnet 4.5 / Opus 4.5 prompt caching is GA and works end-to-end through the gateway’s /v1/messages raw-forward path.
Spawning Claude Code in tests / scripts
If you spawn claude programmatically (CI matrix, scenario tests, automation):
The --bare flag skips hooks, LSP, plugin sync, attribution, auto-memory, background prefetches, keychain reads, and CLAUDE.md auto-discovery. Without it, the spawned claude inherits the parent session’s full skills and plugins as system blocks, and the request body can exceed Anthropic’s edge tolerance, returning HTML 4xx error pages that don’t decode cleanly. Combine with --disable-slash-commands for fully clean tests.
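Putting the flags together, a CI invocation might look like this; it assumes claude is on PATH and uses only the flags described above:

```shell
# Non-interactive, isolated turn for scripts and CI.
claude --bare --disable-slash-commands --print "say hi"
```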