

The gateway delegates provider-specific dispatch to bifrost/core, which LangWatch embeds as a Go library. This page lists the providers the gateway can talk to today and the path to configuring each.

Configure a provider credential once, use from many VKs

Every provider credential lives in the LangWatch Model Providers table (under Settings → Model Providers). It's the same surface that powers evaluators and the playground — there is no separate "gateway providers" store. To use a credential in the gateway, create a GatewayProviderCredential binding that references it and layers on gateway-only settings (rate limits, rotation, extra headers). Each VK then binds to one or more GatewayProviderCredentials as its ordered primary + fallback chain.
ModelProvider (raw credential)
  └── GatewayProviderCredential (gateway-specific binding)
       └── VirtualKey.providers[] (ordered primary + fallback)
This layering means: rotating the raw OpenAI key rotates it everywhere; changing a rate limit or extra header only affects gateway traffic; binding a VK to a different provider is a one-click edit.

Supported providers (v1)

| Provider | Routes supported | Auth methods | Caching passthrough |
| --- | --- | --- | --- |
| OpenAI | chat / responses / embeddings / images / audio / moderations | API key | auto-prefix |
| Anthropic | messages / chat (translated) / streaming | API key | cache_control blocks |
| Azure OpenAI | chat / responses / embeddings / images / audio | API key / AAD | auto-prefix |
| AWS Bedrock | messages / chat / embeddings (Titan) | AWS SigV4 / IRSA | cachePoint |
| Google Vertex AI | messages / chat | GCP ADC / SA JSON | implicit context cache |
| Google Gemini | chat | API key | implicit context cache |
| Custom OpenAI-compatible | chat / embeddings (depends on upstream) | Bearer / custom header | opaque |
Behind the scenes bifrost/core also supports Groq, Cohere, Mistral, Ollama, vLLM, SGLang, Perplexity, ElevenLabs, and others. These will light up in the UI as LangWatch ships the per-provider configuration forms.

Picking a primary + fallback

Most VKs end up with 1-2 fallback providers. Guidelines:
  • Anthropic-first with Bedrock-Anthropic fallback. Same models on both sides; Bedrock is a warm backup during Anthropic direct outages.
  • OpenAI-first with Anthropic fallback. Different model families but Claude Haiku can serve gpt-5-mini traffic acceptably for coding tasks.
  • Azure-first with OpenAI direct fallback. Keeps traffic inside Azure data-residency boundaries most of the time but tolerates regional Azure outages.
See Fallback Chains for trigger semantics.
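The ordered primary + fallback behaviour amounts to trying each binding in turn. A minimal sketch, assuming the gateway advances down the chain on a retryable failure — the names (`call_provider`, `RetryableError`) are illustrative, not the real API:

```python
class RetryableError(Exception):
    """Stand-in for an outage, timeout, or rate-limit response."""

def complete(providers, request):
    """Try each provider binding in order; raise only if all fail."""
    last_err = None
    for p in providers:
        try:
            return call_provider(p, request)
        except RetryableError as e:
            last_err = e  # primary failed retryably — fall through to next binding
    raise last_err

# Toy upstream: Anthropic-direct is down, Bedrock-Anthropic is the warm backup.
def call_provider(p, request):
    if p == "anthropic-direct":
        raise RetryableError("anthropic outage")
    return f"handled by {p}"

print(complete(["anthropic-direct", "bedrock-anthropic"], {"model": "claude"}))
# → handled by bedrock-anthropic
```

Note the chain only advances on retryable failures; the Fallback Chains page defines which responses actually trigger the hand-off.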

Multi-region / multi-account

Each provider binding pins a region (Azure endpoint, Bedrock region, Vertex project). If an organisation spans multiple regions, create one GatewayProviderCredential per region with a naming convention (e.g. openai-eu, openai-us) and use VK model_aliases to route gpt-5-mini-eu vs gpt-5-mini-us to the right credential.
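The alias-based routing can be sketched as a lookup table. The alias names and the openai-eu / openai-us convention follow the text above; the dict shape is illustrative, not the real VK model_aliases schema:

```python
# Hypothetical model_aliases mapping: request alias → (credential, upstream model).
model_aliases = {
    "gpt-5-mini-eu": {"credential": "openai-eu", "model": "gpt-5-mini"},
    "gpt-5-mini-us": {"credential": "openai-us", "model": "gpt-5-mini"},
}

def resolve(alias: str) -> tuple[str, str]:
    """Map the model name on an incoming request onto a region-pinned credential."""
    entry = model_aliases[alias]
    return entry["credential"], entry["model"]

assert resolve("gpt-5-mini-eu") == ("openai-eu", "gpt-5-mini")
```

Callers keep sending a single model string; the region choice is encoded in the alias rather than in per-request configuration.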