- Mint a LangWatch virtual key (see Quickstart).
- Set the CLI’s base URL and API key to the gateway + VK.
At a glance
The langwatch CLI ships wrappers for the 4 most common coding CLIs. They auto-inject the right env vars from your governance config, so you don’t have to set them by hand. Native env-var setup still works everywhere if you prefer.
| CLI | Wrapper | Endpoint | Native env vars | Notes |
|---|---|---|---|---|
| Claude Code | langwatch claude | /v1/messages | ANTHROPIC_BASE_URL, ANTHROPIC_AUTH_TOKEN | Tool-call deltas preserved byte-for-byte. Pin to dated model name (e.g. claude-haiku-4-5-20251001). |
| Codex CLI | langwatch codex | /v1/responses | ~/.codex/config.toml model_providers, wire_api = "responses" | Codex 0.122+ requires wire_api = "responses" (chat dropped). Pin model to a Bifrost-registered name. |
| Gemini CLI | langwatch gemini | /v1beta/models/{model}:generateContent | GOOGLE_GEMINI_BASE_URL, GEMINI_API_KEY | Native Gemini API passthrough. Works with all Gemini models supported by your VK’s Vertex/Gemini credential. |
| Cursor | langwatch cursor * | /v1/chat/completions | Custom “OpenAI API base URL” in settings | * Cursor is a GUI app; its AI panel reads from settings, not env vars, so the wrapper is mostly cosmetic for terminal launches. The canonical setup is the in-app paste. Agent mode benefits most from budgets. |
| opencode | none | /v1/chat/completions (or /v1/messages) | Per-provider config in opencode.json | No wrapper today; set the per-provider config explicitly. Pin opencode 1.13.x, because 1.14.x has a known regression with custom providers. |
| Aider | none | /v1/chat/completions | OPENAI_API_BASE, OPENAI_API_KEY | No wrapper today; use the manual env-var path. Confirmed working with model aliases. |
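For the CLIs that read environment variables, the native setup in the table can be sketched as a small launcher. This is a minimal illustration, not the wrapper's actual implementation: the gateway URL, the LANGWATCH_VK variable, and the helper names are hypothetical, and the choice to append /v1 only for OpenAI-style clients is an assumption you should check against your CLI version.

```python
import os
import subprocess

# Hypothetical values: replace with your gateway address and virtual key.
GATEWAY_URL = "https://gateway.example.com"
VIRTUAL_KEY = os.environ.get("LANGWATCH_VK", "vk-lw-example")


def native_env(cli: str, gateway_url: str, vk: str) -> dict[str, str]:
    """Return the native env vars (names taken from the table) for one CLI."""
    base = gateway_url.rstrip("/")
    if cli == "claude":
        return {"ANTHROPIC_BASE_URL": base, "ANTHROPIC_AUTH_TOKEN": vk}
    if cli == "gemini":
        return {"GOOGLE_GEMINI_BASE_URL": base, "GEMINI_API_KEY": vk}
    if cli == "aider":
        # Assumption: OpenAI-style clients expect the /v1 suffix in the base URL.
        return {"OPENAI_API_BASE": f"{base}/v1", "OPENAI_API_KEY": vk}
    raise ValueError(f"no env-var mapping for {cli!r}")


def launch(cli: str, args: list[str]) -> int:
    """Run the CLI with the gateway env vars layered over the current environment."""
    env = {**os.environ, **native_env(cli, GATEWAY_URL, VIRTUAL_KEY)}
    return subprocess.run([cli, *args], env=env).returncode
```

Codex, Cursor, and opencode are left out on purpose, since the table says they read from config files or in-app settings rather than env vars.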
Why this matters for the enterprise
Before the gateway, governance of coding CLIs was a choice between:
- Ban them: kills productivity, drives shadow usage.
- Allow them with personal provider keys: no visibility, no cost control, leaked credentials in dotfiles and CI logs.
With the gateway in between, each CLI authenticates with a virtual key instead, which gives you:
- Cost. Each engineer’s CLI spend debits the org → team → project → principal budgets you’ve set.
- Visibility. Every CLI call shows up in LangWatch traces, scoped to the project the VK belongs to.
- Policy. policy_rules can deny shell-exec tools or untrusted MCPs at the gateway level, even if the CLI would otherwise enable them.
- Portability. An engineer on Claude Code and a co-worker on Codex hit the same gateway with different VKs but the same budget, so you don’t need to pick a winner.
- Revocation. Rotate or revoke a VK and the CLI stops working globally within 60 seconds. No more “which laptops still have the old key?”
Recommended VK layout for coding CLIs
A workable pattern used by several early customers:
- One personal-access VK per engineer ({engineer}-cli) bound to the engineer’s principal.
- Attach a principal-scoped monthly budget (e.g. $200/month for engineers, $1000/month for staff+). on_breach: block.
- policy_rules.tools.deny: ^shell\\.exec$, ^filesystem\\.write$ (or your org’s list).
- Fallback chain: Anthropic → OpenAI → Azure OpenAI. CLI traffic switches automatically on outage.
- cache.mode: respect, so Anthropic prompt caching keeps saving up to 90% on cached input tokens.
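The layout above can be sketched as a plain data structure, together with how the deny patterns would match tool names. The field names policy_rules.tools.deny, on_breach, and cache.mode come from the list; the rest of the schema, the lowercase provider identifiers, and the helper functions are assumptions for illustration, not the gateway's actual config format.

```python
import re

# Illustrative layout only: the nesting and the remaining key names are assumed.
VK_LAYOUT = {
    "name": "{engineer}-cli",
    "budget": {
        "scope": "principal",
        "window": "month",
        "limit_usd": 200,
        "on_breach": "block",
    },
    "policy_rules": {
        "tools": {"deny": [r"^shell\.exec$", r"^filesystem\.write$"]},
    },
    "fallback": ["anthropic", "openai", "azure_openai"],
    "cache": {"mode": "respect"},
}


def vk_name(engineer: str) -> str:
    """Fill the {engineer}-cli naming template for one engineer."""
    return VK_LAYOUT["name"].format(engineer=engineer)


def is_tool_denied(tool_name: str, layout: dict = VK_LAYOUT) -> bool:
    """Return True if any deny pattern matches the tool name."""
    patterns = layout["policy_rules"]["tools"]["deny"]
    return any(re.search(pattern, tool_name) for pattern in patterns)
```

Because the patterns are anchored with ^ and $ and escape the dot, shell.exec is denied while a similarly named tool such as shell_exec is not.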
Real-time feedback
Each CLI’s trace lands in LangWatch live. You can:
- Pin a filter “where langwatch.vk.tags contains cli” on the project dashboard.
- Page on-call if any engineer crosses 80% of their monthly personal cap.
- Run an offline eval comparing Claude Code vs Codex quality on tickets of a given type.
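The 80% paging rule comes down to simple arithmetic on spend versus cap. A minimal sketch, assuming you already have per-engineer spend figures from your own reporting (the function names and input shape are hypothetical):

```python
def budget_usage(spent_usd: float, cap_usd: float) -> float:
    """Fraction of the monthly cap already spent."""
    if cap_usd <= 0:
        raise ValueError("cap must be positive")
    return spent_usd / cap_usd


def should_page(spent_usd: float, cap_usd: float, threshold: float = 0.80) -> bool:
    """True once spend reaches the paging threshold (80% by default)."""
    return budget_usage(spent_usd, cap_usd) >= threshold


def engineers_to_page(spend_by_engineer: dict[str, float], cap_usd: float) -> list[str]:
    """Sorted names of engineers at or above the threshold for a shared cap."""
    return sorted(
        name for name, spent in spend_by_engineer.items() if should_page(spent, cap_usd)
    )
```

With a $200/month cap, paging starts at $160 of spend.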
Verified smoke output
Lane A ran the gateway locally against pnpm dev on 2026-04-19 to confirm the response shape CLIs will see. The transcripts are pinned here so integrators can diff their actual output against known-good output.
Start the gateway pointed at a running LangWatch control plane on :5560:
/healthz: always 200 once the process is up
X-Langwatch-Gateway-Version is set from the binary’s main.Version build-arg. Production deploys carry the commit SHA, so operators can answer “which pod served this” straight from the response.
/v1/models and /v1/chat/completions with no auth: 401 OpenAI-compat envelope
The gateway accepts both Authorization: Bearer vk-lw-... (OpenAI SDK, Claude Code, Cursor, Aider) and x-api-key (Anthropic SDK). Either works against either endpoint.
Traceparent, X-Langwatch-Span-Id, and X-Langwatch-Trace-Id are present even on unauthenticated 401s, so probe abuse and misconfigured CLIs can be observed without inspecting access logs.
/startupz and /readyz behavior at cold boot
/startupz and /readyz go to 200 as soon as the gateway has finished its startup initialisers and (if configured) the network-check probe has succeeded. They do NOT block on the auth cache observing a VK: a cold pod with no traffic and a fresh control-plane install with zero VKs will still go ready, and auth resolution happens on demand at request time.
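A deploy script or smoke test can wait on these two probes before sending traffic. The sketch below polls both endpoints until they return 200 or a deadline passes; the function names, timeouts, and the injectable fetch parameter are illustrative choices, not part of the gateway.

```python
import time
import urllib.error
import urllib.request


def http_status(url: str) -> int:
    """Return the HTTP status for a GET, or 0 if the server is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=2) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code
    except (urllib.error.URLError, OSError):
        return 0


def wait_ready(base_url: str, timeout_s: float = 30.0, interval_s: float = 0.5,
               fetch=http_status) -> bool:
    """Poll /startupz and /readyz until both return 200 or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while True:
        if all(fetch(f"{base_url}{path}") == 200 for path in ("/startupz", "/readyz")):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
```

Since readiness does not depend on any VK being cached, a True result only means the process is up and initialised; the first authenticated request still resolves its VK on demand.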