AWS Bedrock is the enterprise-favourite path to Anthropic (and several other model families), with VPC-native security, data-residency guarantees, and AWS IAM governance. The gateway uses bifrost/core under the hood, which handles Bedrock’s idiosyncratic authentication, cross-region inference profiles, and model-id quirks.

Documentation Index
Fetch the complete documentation index at: https://langwatch.ai/docs/llms.txt
Use this file to discover all available pages before exploring further.
Configure the provider credential
Under Settings → Model Providers:
- Add provider → AWS Bedrock.
- Choose authentication mode:
  - Access keys — paste an AWS Access Key ID + Secret Access Key + optional session token.
  - Instance profile / IRSA — leave credentials empty; the gateway process uses its pod-level IAM role.
  - Bedrock API key — new Bedrock-first auth mode (preview).
- Pick a default region (e.g. `us-east-1`). Requests can override the region per-call via the model id.
- (Optional) Provide a cross-region inference profile ARN for automatic regional failover on capacity issues.
- Save.
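For self-hosted deployments that prefer environment-based credentials over pasting keys into the UI (whether the gateway reads these is an assumption about your deployment; the variable names themselves are the standard AWS SDK credential chain), a minimal sketch:

```shell
# Standard AWS SDK environment variables, picked up by any process that
# resolves credentials through the default provider chain.
export AWS_ACCESS_KEY_ID="AKIA..."        # placeholder access key id
export AWS_SECRET_ACCESS_KEY="..."        # placeholder secret
export AWS_SESSION_TOKEN="..."            # optional, only for temporary credentials
export AWS_REGION="us-east-1"             # should match where your Bedrock quota lives
```

With instance-profile / IRSA mode, none of these are needed; the SDK chain falls through to the pod-level IAM role automatically.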
Model id format
Bedrock model ids include the provider prefix and optionally a region tag:
- `anthropic.claude-haiku-4-5-20251001` — Anthropic Claude Haiku 4.5.
- `anthropic.claude-sonnet-4-6` — Anthropic Claude Sonnet 4.6.
- `anthropic.claude-opus-4-7` — Anthropic Claude Opus 4.7.
- `amazon.titan-text-express-v1` — Amazon Titan.
- `meta.llama3-1-70b-instruct-v1:0` — Meta Llama.
- `us.anthropic.claude-haiku-4-5-20251001` — region-prefixed (cross-region inference profile).
Use `model_aliases` to expose friendly names to clients.
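A sketch of what that could look like (hypothetical config shape — check your gateway’s actual schema):

```yaml
model_aliases:
  claude-sonnet: anthropic.claude-sonnet-4-6
  claude-haiku: us.anthropic.claude-haiku-4-5-20251001   # routed via cross-region inference profile
```

Clients then send `"model": "claude-sonnet"` and the gateway resolves it to the full Bedrock model id.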
Supported endpoints
- `POST /v1/messages` — Anthropic-shape requests, dispatched to Bedrock’s Converse API for Anthropic models.
- `POST /v1/chat/completions` — translated to Bedrock Converse / InvokeModel depending on the model family.
The embeddings endpoint (`POST /v1/embeddings`) supports Titan Embeddings via Bedrock as well.
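As an illustrative sketch (the prompt text is made up; the body shape follows the Anthropic Messages API), a minimal `POST /v1/messages` request body targeting a Bedrock-hosted Claude model:

```python
import json

# Anthropic-shape request body. The gateway routes it to Bedrock's Converse
# API because the model id carries the "anthropic." provider prefix.
body = {
    "model": "anthropic.claude-sonnet-4-6",  # Bedrock model id, see the list above
    "max_tokens": 256,
    "messages": [
        {"role": "user", "content": "Summarise our Q3 incident report."},
    ],
}

# What an HTTP client would serialise and send as the request body.
payload = json.dumps(body)
```

Swapping the model id for a region-prefixed one (e.g. `us.anthropic.claude-haiku-4-5-20251001`) is all it takes to dispatch via a cross-region inference profile.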
`cache_control` passthrough
Anthropic-on-Bedrock supports prompt caching via the Converse API’s `cachePoint` markers. The gateway forwards `cache_control` blocks, and bifrost/core translates them into `cachePoint` objects. Cache-read and cache-write token counts flow back the same way as with direct Anthropic.
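The translation can be pictured with a toy function (illustrative only, not bifrost/core’s actual code; the `cachePoint` shape follows the Converse API schema):

```python
def to_converse_content(blocks):
    """Toy sketch: Anthropic content blocks carrying cache_control become
    Converse content followed by a cachePoint marker at the cache boundary."""
    out = []
    for block in blocks:
        block = dict(block)                      # don't mutate the caller's data
        cache = block.pop("cache_control", None)  # e.g. {"type": "ephemeral"}
        out.append(block)
        if cache is not None:
            # Converse marks the cache boundary with a cachePoint object.
            out.append({"cachePoint": {"type": "default"}})
    return out

anthropic_content = [
    {"type": "text", "text": "Long, reusable system preamble",
     "cache_control": {"type": "ephemeral"}},
    {"type": "text", "text": "User question"},
]
converse_content = to_converse_content(anthropic_content)
```

Everything up to the `cachePoint` is eligible for caching; subsequent blocks are not.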
Cross-region inference profiles
Bedrock supports inference profiles that route to multiple regions automatically. Configure the ModelProvider’s default region and the inference profile ARN; bifrost/core dispatches to the profile and Bedrock handles regional routing. Useful for high-availability setups without configuring an explicit fallback chain at the VK level.

Known quirks
- Region-coupled quotas — Bedrock throughput is provisioned per-region, per-model. A model that works in `us-east-1` may 429 or 403 in `eu-west-1`. Set the VK’s default region to match where your quota lives.
- Model access enablement — Bedrock requires manual model-access opt-in per account + region (console: Bedrock → Model access). A 403 with the body “you don’t have access to the model” means the AWS account needs to enable that model.
- IAM permissions — at minimum `bedrock:InvokeModel`, `bedrock:InvokeModelWithResponseStream`, and `bedrock:Converse` + `bedrock:ConverseStream`. Inference profiles also need `bedrock:InvokeModel` on the profile ARN.
- Streaming format — Bedrock’s event stream is AWS-specific (`application/vnd.amazon.eventstream`); bifrost/core normalises it to SSE for the client. The gateway’s byte-for-byte invariant applies to the normalised SSE, not the upstream AWS format.
- Cold-start latency — Bedrock frequently has higher cold latency than direct Anthropic. Use `/readyz` startup probes generously on self-hosted deployments if Bedrock is the primary provider.
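A minimal identity policy covering the permissions listed above might look like the following sketch (the region, account id, and resource ARNs are placeholders; scope them to your actual models and profiles):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "bedrock:InvokeModel",
        "bedrock:InvokeModelWithResponseStream",
        "bedrock:Converse",
        "bedrock:ConverseStream"
      ],
      "Resource": [
        "arn:aws:bedrock:us-east-1::foundation-model/*",
        "arn:aws:bedrock:us-east-1:123456789012:inference-profile/*"
      ]
    }
  ]
}
```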