Configure the provider credential
Under Settings → Model Providers:- Add provider → Google Gemini.
- Paste the AI Studio API key (starts with a long alphanumeric string).
- Save.
Supported models
gemini-2.5-flash, fast + cheap. Default recommendation for coding CLIs on a Gemini-first VK.gemini-2.5-pro, bigger, more capable.gemini-2.0-flash, prior generation.gemini-2.0-pro, prior generation.- Embedding models:
text-embedding-004,gemini-embedding-001.
model_aliases:
Supported endpoints
POST /v1/chat/completions, OpenAI-shape dispatched to Gemini’sgenerateContent,streamGenerateContentendpoint.POST /v1/embeddings, for Gemini embedding models.
/v1/messages equivalent; Anthropic-shape clients should use a VK that has Anthropic or Bedrock as primary.
Context caching
Gemini has an implicit context cache for prompts > 32k tokens on most models. The gateway forwards requests untouched; caching behaviour is entirely upstream-managed. For explicit caching, Gemini offers a separatecaches.create API that the gateway does not orchestrate in v1, but clients can call Gemini directly to create a cache and reference it in subsequent gateway calls.
Known quirks
- Safety filters on by default. Gemini applies
safetySettingsthresholds that returnfinish_reason: SAFETYwhen content crosses the threshold. Bifrost/core surfaces this asprovider_errorwith the safety metadata in the OTel trace. - Rate limits. AI Studio’s free tier has aggressive rate limits; paid tier is unlocked via billing-attached Google account. A 429 triggers fallback if configured.
- Tool/function calling. Gemini has its own function-calling format that bifrost/core translates from OpenAI-shape tools. Tool-call streaming deltas are less mature on Gemini, expect byte-level differences vs OpenAI.
- Streaming chunks size. Gemini’s streaming emits larger chunks than OpenAI, fewer, bigger SSE frames. Gateway passes these through unchanged.
- No
systemrole. Gemini usessystemInstructionat the top level of the request. OpenAI-SDK clients that send asystemrole message get it rewritten by the translator.
AI Studio vs Vertex
Google has two paths to Gemini:- AI Studio (this page), quickest, API-key auth, pay-as-you-go on a consumer-ish billing model.
- Vertex AI: GCP-native, IAM-governed, data-residency, enterprise SLAs. See Vertex.