Azure OpenAI is enterprise OpenAI with data residency, AAD integration, and SLA guarantees, but with a deployment-based model-id convention that breaks most OpenAI SDKs out of the box. The gateway handles Azure's quirks via bifrost/core, so any OpenAI-shaped client just works.
Configure the provider credential
Under Settings → Model Providers:
- Add provider → Azure OpenAI.
- Paste:
  - Endpoint — e.g. https://my-resource.openai.azure.com.
  - API key — Azure OpenAI key (hex string).
  - API version — e.g. 2024-08-01-preview (pin explicitly; don't rely on "latest").
- (Optional) Default deployment name — if clients send a bare model name, the gateway maps it to this deployment.
- Save.
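As a minimal sketch of the values above (the dict keys are illustrative, not the gateway's actual schema), with a sanity check that the endpoint looks like an Azure OpenAI resource and the api-version is pinned:

```python
from urllib.parse import urlparse

# Illustrative provider credential (field names are hypothetical):
provider = {
    "type": "azure_openai",
    "endpoint": "https://my-resource.openai.azure.com",
    "api_key": "<azure-openai-key>",
    "api_version": "2024-08-01-preview",           # pin; don't rely on "latest"
    "default_deployment": "my-gpt-4o-deployment",  # optional fallback
}

def validate(p: dict) -> bool:
    """Reject non-HTTPS endpoints, non-Azure hosts, and unpinned api-versions."""
    u = urlparse(p["endpoint"])
    return (
        u.scheme == "https"
        and u.netloc.endswith(".openai.azure.com")
        and p["api_version"] not in ("", "latest")  # must be an explicit version
    )

print(validate(provider))  # -> True
```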
Deployment-based model routing
Azure OpenAI binds models to deployments, which have customer-chosen names: gpt-4o in the OpenAI SDK needs to become my-gpt-4o-deployment at Azure, a mapping step. The gateway handles this via VK model_aliases: a client sending model: "gpt-4o" reaches the correct Azure deployment without knowing Azure exists. This is a key value-prop of aliases — the VK owner maps generic model names to their Azure infrastructure, and application code never changes.
Supported endpoints
Identical to OpenAI direct: /v1/chat/completions, /v1/responses, /v1/embeddings, /v1/images/generations, /v1/audio/*, /v1/moderations. The gateway rewrites these internally to Azure's deployment-scoped paths (/openai/deployments/{deployment}/chat/completions?api-version=...).
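A sketch of that rewrite, assuming Azure's documented URL convention (the helper name is hypothetical):

```python
def to_azure_url(endpoint: str, deployment: str, openai_path: str, api_version: str) -> str:
    """Map an OpenAI-style /v1/... path to Azure's deployment-scoped URL."""
    suffix = openai_path.removeprefix("/v1/")   # e.g. "chat/completions"
    return (
        f"{endpoint}/openai/deployments/{deployment}/{suffix}"
        f"?api-version={api_version}"           # Azure requires this query param
    )

print(to_azure_url(
    "https://my-resource.openai.azure.com",
    "my-gpt-4o-deployment",
    "/v1/chat/completions",
    "2024-08-01-preview",
))
# -> https://my-resource.openai.azure.com/openai/deployments/my-gpt-4o-deployment/chat/completions?api-version=2024-08-01-preview
```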
Auth headers
OpenAI uses Authorization: Bearer; Azure uses api-key: <hex>. Bifrost/core swaps headers automatically based on the ModelProvider type. The client-facing VK auth is always Authorization: Bearer lw_vk_live_… (or one of the alternates — see Virtual Keys).
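The swap amounts to choosing a header shape per provider type; a minimal sketch of the assumed logic:

```python
def upstream_auth_header(provider_type: str, secret: str) -> dict:
    """Pick the auth header the upstream provider expects.

    The client-facing side is unaffected: callers always send the gateway a
    Bearer VK; only the upstream hop changes shape.
    """
    if provider_type == "azure_openai":
        return {"api-key": secret}                 # Azure's convention
    return {"Authorization": f"Bearer {secret}"}   # OpenAI's convention

print(upstream_auth_header("azure_openai", "abc123"))  # -> {'api-key': 'abc123'}
```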
AAD / Managed Identity
For a self-hosted gateway in Azure (AKS, App Service), you can omit the API key and use AAD auth:
- Assign the gateway's managed identity the "Cognitive Services OpenAI User" role on the Azure OpenAI resource.
- Set the ModelProvider's API key to __use_aad__ (sentinel).
- The gateway uses DefaultAzureCredential to fetch tokens.
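The sentinel check can be sketched like this (the __use_aad__ value is from the steps above; the token-fetching callable is a stand-in for DefaultAzureCredential, which in a real deployment would come from the azure-identity library):

```python
AAD_SENTINEL = "__use_aad__"

def auth_header_for(api_key: str, fetch_aad_token) -> dict:
    """Key auth by default; Bearer with an AAD token when the sentinel is set."""
    if api_key == AAD_SENTINEL:
        # Stand-in for DefaultAzureCredential().get_token(
        #     "https://cognitiveservices.azure.com/.default")
        return {"Authorization": f"Bearer {fetch_aad_token()}"}
    return {"api-key": api_key}

# Stubbed token fetcher for illustration:
print(auth_header_for(AAD_SENTINEL, lambda: "<aad-token>"))
```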
Known quirks
- API version drift. Azure changes API versions frequently; 2024-02-01 and 2024-08-01-preview have different request/response shapes. Pin the version on the ModelProvider; don't rely on the newest.
- Content filtering. Azure runs its own content filter before and after the model. A request blocked by Azure's filter returns 400 with a content_filter_result body; the gateway surfaces this as provider_error with the filter metadata in the OTel trace.
- Streaming SSE format. Identical to OpenAI direct — data: {...}\n\n frames terminated by data: [DONE]. Byte-for-byte passthrough applies.
- Region naming. Azure regions use dash-case (eastus, westeurope) vs AWS's dot-case. The gateway does not enforce a region; it's baked into the Endpoint URL.
- Rate limits. Azure has per-deployment quota (PTU- or TPM-based). A 429 shows Retry-After like OpenAI; treat it the same way (fallback trigger or client retry).
- Unsupported features. Some OpenAI features (e.g. /v1/audio/speech TTS) are not yet available on Azure OpenAI depending on region.
Bifrost/core handles the ceremony
The key selling point of the bifrost/core library: it hides the Azure idiosyncrasies (deployment URL construction, api-key vs Bearer auth, the api-version query param, AAD vs key auth, regional endpoint detection). Users of the gateway see a plain OpenAI-shaped API.