OpenAI-compatible embeddings. Works against OpenAI, Azure OpenAI, Bedrock (Titan), Vertex, Gemini, and custom OpenAI-compatible providers that implement /v1/embeddings.
Request
input accepts a single string or an array of strings. Token pricing is per input token, summed across the array.
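A minimal sketch of building such a request with the standard library. The gateway URL and the API-key environment variable are assumptions for illustration, not documented names:

```python
import json
import os
import urllib.request

# Hypothetical gateway address; substitute your deployment's URL.
GATEWAY_URL = "http://localhost:8080/v1/embeddings"

def build_embedding_request(model: str, inputs) -> urllib.request.Request:
    """Build an OpenAI-shaped embeddings request.

    `inputs` may be a single string or a list of strings, since the
    endpoint accepts both forms.
    """
    body = {"model": model, "input": inputs}
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {os.environ.get('OPENAI_API_KEY', '')}",
    }
    return urllib.request.Request(
        GATEWAY_URL,
        data=json.dumps(body).encode(),
        headers=headers,
        method="POST",
    )

# Sending the request (requires a running gateway):
# with urllib.request.urlopen(build_embedding_request(
#         "text-embedding-3-small", ["first doc", "second doc"])) as resp:
#     vectors = [d["embedding"] for d in json.load(resp)["data"]]
```

Passing an array is how per-input-token pricing accumulates: each element is tokenized and billed, and the response carries one embedding per element in the same order.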
Response
OpenAI-shaped, with additional LangWatch headers.
Provider compatibility
| Provider | Supported | Models |
|---|---|---|
| OpenAI | ✅ | text-embedding-3-small, text-embedding-3-large, text-embedding-ada-002 |
| Azure OpenAI | ✅ | Same as OpenAI (via deployment names) |
| Bedrock | ✅ Titan | amazon.titan-embed-text-v2:0 |
| Vertex AI | ✅ Gemini embeddings | textembedding-gecko, gemini-embedding-001 |
| Gemini (AI Studio) | ✅ | text-embedding-004, gemini-embedding-001 |
| Anthropic | ❌ | No embeddings endpoint |
| Custom OpenAI-compatible | varies | depends on upstream |
Fallback and caching
Embeddings are subject to the same fallback rules as chat (see Fallback Chains). Embedding models are less prone to provider outages than chat models, but the chain is there if needed. Caching: embedding outputs are deterministic. Identical (model, input) tuples produce identical vectors, so aggressive semantic caching is feasible. V1 does not enable gateway-level caching for embeddings; use a local cache in your application layer.
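Because identical (model, input) pairs yield identical vectors, an application-layer cache can be a simple exact-match memo keyed per (model, text). A minimal sketch; `embed` here is a stand-in for whatever client call you use, not a gateway API:

```python
from typing import Callable, Dict, List, Tuple

def cached_embedder(
    embed: Callable[[str, List[str]], List[List[float]]],
) -> Callable[[str, List[str]], List[List[float]]]:
    """Wrap an embedding call with an in-process exact-match cache.

    `embed(model, texts)` must return one vector per input text, in order.
    """
    cache: Dict[Tuple[str, str], List[float]] = {}

    def embed_cached(model: str, texts: List[str]) -> List[List[float]]:
        # Only send texts we haven't already embedded under this model.
        misses = [t for t in texts if (model, t) not in cache]
        if misses:
            for text, vector in zip(misses, embed(model, misses)):
                cache[(model, text)] = vector
        # Determinism guarantees cached vectors match fresh ones.
        return [cache[(model, t)] for t in texts]

    return embed_cached
```

For long-lived processes you would bound this cache (for example with an LRU policy) and key on the exact model identifier, since different models or model versions produce different vector spaces.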