OpenAI-compatible embeddings. Works against OpenAI, Azure OpenAI, Bedrock (Titan), Vertex, Gemini, and Custom OpenAI-compatible providers that implement /v1/embeddings.

Request

POST /v1/embeddings
Authorization: Bearer vk-lw-<ULID>
Content-Type: application/json

{
  "model": "text-embedding-3-small",
  "input": ["Hello", "World"]
}

input accepts a single string or an array of strings. Pricing is per input token, with token counts summed across every string in the array.
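
As a rough sketch, the request above could be assembled in Python using only the standard library. The host https://gateway.example.com and the helper name build_embeddings_request are placeholders for illustration, not part of the documented API; substitute your own gateway URL and virtual key.

```python
import json
import urllib.request


def build_embeddings_request(base_url, api_key, model, inputs):
    """Build (but do not send) a POST /v1/embeddings request.

    `inputs` may be a single string or a list of strings.
    """
    body = json.dumps({"model": model, "input": inputs}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/embeddings",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


# Hypothetical host and key; replace with real values before sending.
req = build_embeddings_request(
    "https://gateway.example.com",
    "vk-lw-<ULID>",
    "text-embedding-3-small",
    ["Hello", "World"],
)
# To send the request: urllib.request.urlopen(req)
```

Building the request separately from sending it keeps the example inspectable without network access; passing a plain string as inputs instead of a list produces the single-string form of the payload.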

Response

The response body follows the OpenAI embeddings shape. The gateway also returns these LangWatch headers:
X-LangWatch-Request-Id: grq_01HZX9K3M...
X-LangWatch-Provider: openai
X-LangWatch-Model: text-embedding-3-small

{
  "object": "list",
  "data": [
    { "object": "embedding", "index": 0, "embedding": [0.012, -0.033, ...] },
    { "object": "embedding", "index": 1, "embedding": [0.089, 0.041, ...] }
  ],
  "model": "text-embedding-3-small",
  "usage": { "prompt_tokens": 2, "total_tokens": 2 }
}
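
A minimal sketch of reading such a response in Python follows. The sample body mirrors the shape above with the vectors shortened to two illustrative values, and the helper name extract_vectors is hypothetical. Sorting by index is a defensive assumption so that each vector lines up with its input string regardless of the order in which items arrive.

```python
import json

# Sample body in the response shape shown above; vectors are shortened
# to two values each and listed out of order to exercise the sort.
sample_body = """
{
  "object": "list",
  "data": [
    {"object": "embedding", "index": 1, "embedding": [0.089, 0.041]},
    {"object": "embedding", "index": 0, "embedding": [0.012, -0.033]}
  ],
  "model": "text-embedding-3-small",
  "usage": {"prompt_tokens": 2, "total_tokens": 2}
}
"""


def extract_vectors(body_text):
    """Return (vectors ordered by input index, prompt token count)."""
    payload = json.loads(body_text)
    items = sorted(payload["data"], key=lambda item: item["index"])
    vectors = [item["embedding"] for item in items]
    return vectors, payload["usage"]["prompt_tokens"]


vectors, prompt_tokens = extract_vectors(sample_body)
```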

Provider compatibility

| Provider | Supported | Models |
| --- | --- | --- |
| OpenAI | ✅ | text-embedding-3-small, text-embedding-3-large, text-embedding-ada-002 |
| Azure OpenAI | ✅ | Same as OpenAI (via deployment names) |
| Bedrock | ✅ Titan | amazon.titan-embed-text-v2:0 |
| Vertex AI | ✅ Gemini embeddings | textembedding-gecko, gemini-embedding-001 |
| Gemini (AI Studio) | ✅ | text-embedding-004, gemini-embedding-001 |
| Anthropic | ❌ | No embeddings endpoint |
| Custom OpenAI-compatible | varies | depends on upstream |

Fallback and caching

Embedding requests follow the same fallback rules as chat (see Fallback Chains). Embedding models are less prone to provider outages than chat models, but the chain is available if needed.

Caching: embeddings are deterministic, so identical (model, input) tuples produce identical vectors, and aggressive caching keyed on the exact tuple is feasible. V1 does not enable gateway-level caching for embeddings; use a local cache in your application layer.
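
One way an application-layer cache might look is sketched below, assuming an in-memory dictionary is sufficient. The class name EmbeddingCache and the embed_fn callback are hypothetical; embed_fn stands in for whatever function actually calls /v1/embeddings and is replaced here by a fake so the example runs offline.

```python
import hashlib
import json


class EmbeddingCache:
    """In-memory cache keyed on the exact (model, input) pair."""

    def __init__(self, embed_fn):
        # embed_fn(model, list_of_strings) -> list of vectors, same order.
        self._embed_fn = embed_fn
        self._store = {}

    @staticmethod
    def _key(model, text):
        raw = json.dumps([model, text], ensure_ascii=False).encode("utf-8")
        return hashlib.sha256(raw).hexdigest()

    def embed(self, model, texts):
        keys = [self._key(model, t) for t in texts]
        # Collect each uncached text once, preserving first-seen order.
        missing = {}
        for key, text in zip(keys, texts):
            if key not in self._store and key not in missing:
                missing[key] = text
        if missing:
            vectors = self._embed_fn(model, list(missing.values()))
            for key, vector in zip(missing, vectors):
                self._store[key] = vector
        return [self._store[k] for k in keys]


calls = []  # records each batch sent to the (fake) upstream


def fake_embed(model, texts):
    # Stand-in for a real gateway call: one-dimensional "vector" per text.
    calls.append(list(texts))
    return [[float(len(t))] for t in texts]


cache = EmbeddingCache(fake_embed)
first = cache.embed("text-embedding-3-small", ["Hello", "World", "Hello"])
second = cache.embed("text-embedding-3-small", ["World", "Hi"])
```

Only uncached strings are sent upstream, and duplicates within one batch are sent once, which also reduces the input tokens billed. A production cache would likely need eviction and persistence, which this sketch omits.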

Errors

See API: Errors.