If you’re building a SaaS product on top of LLMs, and you want to let your customers bring their own budget without giving them direct provider credentials, this pattern is for you. You become the upstream provider; the LangWatch AI Gateway becomes your provisioning + enforcement + audit layer.
The shape
- Each end-user has their own virtual key (VK). One per customer or per seat; your call.
- You set per-customer budgets. Enforced at the gateway. When a customer hits their cap, the gateway returns 402 budget_exceeded. Your code doesn’t need to track spend.
- All audit + trace data lands in your LangWatch project. End-users never see LangWatch.
- End-users can’t escape your policy. Policy rules, model allowlists, cache rules all attach to the VK.
Step 1 — Model the tenancy
Decide your scope granularity:

| Model | VK per | Budget scope | Use case |
|---|---|---|---|
| Per customer | one VK per tenant | principal or virtual_key | Most SaaS apps, one account = one VK |
| Per seat | one VK per user within tenant | principal | Per-seat billing, per-user rate limits |
| Per team | one VK per sub-org | team-scope budgets | Customers have their own teams/projects |
Step 2 — Provision the provider binding once
Your LangWatch project owns a single provider binding per upstream (OpenAI, Anthropic). End-users don’t see these. Keep the resulting $GPC_ID in your backend config.
Step 3 — Provision a VK when a customer signs up
Run this from server-side code (a Node.js backend, for example), so your backend API token never reaches the customer.
Step 4 — Attach a per-customer budget
Tie the budget to the VK. Budget changes reach the gateway through the /changes feed: no restart, no customer impact.
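Steps 3 and 4 can be sketched together for a Node.js backend. This is a minimal illustration, not the official client. The virtual-keys path, the use of POST on /api/gateway/v1/budgets, the Transport type, and every payload field other than principal_user_id, models_allowed and on_breach are assumptions; check the Management REST API page for the real shapes.

```typescript
// Hypothetical transport: an authenticated HTTP call made with your backend
// API token. Injected so the sketch does not depend on any particular client.
type Transport = (method: "POST", path: string, body: unknown) => Promise<unknown>;

interface ApiRequest {
  method: "POST";
  path: string;
  body: Record<string, unknown>;
}

interface Customer {
  id: string;              // your own customer ID
  monthlyCapUsd: number;   // spend cap for this customer's plan
  modelsAllowed: string[]; // model allowlist for this customer
}

// Build the VK-creation request. The path and the `name` field are assumed.
function vkRequest(c: Customer): ApiRequest {
  return {
    method: "POST",
    path: "/api/gateway/v1/virtual-keys", // assumed path
    body: {
      name: `customer-${c.id}`,
      principal_user_id: c.id, // set at creation time so traces are attributed
      models_allowed: c.modelsAllowed,
    },
  };
}

// Build the budget request scoped to that VK, blocking on breach.
// Field names other than on_breach are assumed.
function budgetRequest(vkId: string, c: Customer): ApiRequest {
  return {
    method: "POST",
    path: "/api/gateway/v1/budgets",
    body: {
      scope: "virtual_key",
      scope_id: vkId,
      window: "month",
      limit_usd: c.monthlyCapUsd,
      on_breach: "BLOCK",
    },
  };
}

// Signup flow: create the VK first, then attach its budget.
async function provisionCustomer(
  send: Transport,
  c: Customer,
): Promise<{ vkId: string; secret: string }> {
  const vkReq = vkRequest(c);
  // Assumed response shape: the new VK's ID plus its secret.
  const vk = (await send(vkReq.method, vkReq.path, vkReq.body)) as { id: string; secret: string };
  const bReq = budgetRequest(vk.id, c);
  await send(bReq.method, bReq.path, bReq.body);
  return { vkId: vk.id, secret: vk.secret };
}
```

Keeping the request builders separate from the transport makes the payloads easy to unit-test without touching the network.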
Step 5 — End-user makes a call
Your customer’s app calls the gateway directly with the VK you gave them. The gateway then:
- Authenticates the VK.
- Checks the customer’s budget.
- Applies your models_allowed and policy_rules.
- Dispatches to OpenAI using your provider credential.
- Meters the cost against the customer’s budget.
- Emits a LangWatch trace into your project, tagged with principal_user_id = customerId.
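From the customer’s side, the call might be assembled as follows. This sketch assumes the gateway accepts OpenAI-style chat completion requests at /v1/chat/completions and takes the VK as a bearer token; neither is confirmed on this page, and the base URL is a placeholder.

```typescript
interface GatewayCall {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

// Build the HTTP request for a single-turn chat call through the gateway.
// The path and the bearer-token scheme are assumptions, not documented facts.
function chatRequest(
  gatewayBaseUrl: string,
  virtualKey: string,
  model: string,
  prompt: string,
): GatewayCall {
  const base = gatewayBaseUrl.replace(/\/+$/, ""); // drop trailing slashes
  return {
    url: `${base}/v1/chat/completions`,
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${virtualKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        model,
        messages: [{ role: "user", content: prompt }],
      }),
    },
  };
}
```

The returned url and init can be passed straight to fetch(url, init). The customer only ever holds the VK; your provider credential stays on the gateway.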
Step 6 — Bill the customer
You have two options, matching two mental models:
(a) Bill from your pricing table
You define a markup over what the gateway shows you spent. Use /api/gateway/v1/budgets or the gateway_budget_ledger ClickHouse view to read per-VK spend.
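The markup arithmetic for option (a) can be sketched like this. The VkSpend shape is hypothetical; the real response from the budgets endpoint or the ledger view may use different field names and units, so convert to integer micro-dollars first to avoid floating-point drift.

```typescript
// Hypothetical per-VK spend record, in integer micro-dollars (1 USD = 1,000,000).
interface VkSpend {
  vkId: string;
  costMicrosUsd: number;
}

// Apply a markup expressed in basis points (2000 bps = 20%).
function billedMicros(costMicrosUsd: number, markupBps: number): number {
  return Math.round((costMicrosUsd * (10000 + markupBps)) / 10000);
}

// Turn raw gateway spend into invoice lines, one per VK.
function invoiceLines(
  spend: VkSpend[],
  markupBps: number,
): { vkId: string; billedMicrosUsd: number }[] {
  return spend.map((s) => ({
    vkId: s.vkId,
    billedMicrosUsd: billedMicros(s.costMicrosUsd, markupBps),
  }));
}
```

For example, a customer whose VK spent 12.50 USD at a 20% markup is billed 15.00 USD.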
(b) Passthrough billing
For passthrough, your UI shows the same numbers you see in LangWatch + a per-customer markup or flat management fee. The BudgetLedger row is per-request and includes cost, so you can stream it into your reporting.
Handling over-limit customers
When on_breach: BLOCK is set, the gateway returns 402 budget_exceeded to your customer. Handle it in your app in one of two ways:
- Show an upgrade CTA in the end-user’s UI.
- Fall back to a free-tier response (“you’ve hit your monthly cap; upgrade for unlimited”).
Detect the breach by checking error.type.
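A breach handler might look like the sketch below. It assumes the error body has the shape { error: { type: "budget_exceeded" } }; only the status code, the budget_exceeded name and the error.type field come from this page, so the nesting is an assumption.

```typescript
// Safely read error.type from an unknown response body (assumed shape).
function errorType(body: unknown): string | undefined {
  if (typeof body !== "object" || body === null) return undefined;
  const err = (body as { error?: unknown }).error;
  if (typeof err !== "object" || err === null) return undefined;
  const t = (err as { type?: unknown }).type;
  return typeof t === "string" ? t : undefined;
}

function isBudgetExceeded(status: number, body: unknown): boolean {
  return status === 402 && errorType(body) === "budget_exceeded";
}

type Reply =
  | { kind: "completion"; body: unknown }
  | { kind: "upgrade"; message: string };

// Map a gateway response to what the end-user should see.
function toReply(status: number, body: unknown): Reply {
  if (isBudgetExceeded(status, body)) {
    // Free-tier fallback instead of an unhandled exception.
    return { kind: "upgrade", message: "You've hit your monthly cap; upgrade for unlimited." };
  }
  if (status >= 400) {
    // Any other failure is surfaced to the caller unchanged.
    throw new Error(`gateway error ${status}: ${errorType(body) ?? "unknown"}`);
  }
  return { kind: "completion", body };
}
```

Returning a tagged value rather than throwing on 402 keeps the budget breach on the normal rendering path, where the UI can show the upgrade CTA.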
Rotation & revocation
Customer resets their API key in your UI → you call langwatch.rotate(vkId), get a new secret, send it to the customer. The old secret stops working within the gateway’s cache TTL (~60 s).
Customer cancels their subscription → langwatch.revoke(vkId). The next request returns 403 virtual_key_revoked. If you want a grace period, schedule the revoke job for 24–48 h after cancellation instead.
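The grace-period variant can be sketched as a small scheduled job. The VkAdmin interface below is modelled on the langwatch.rotate and langwatch.revoke calls named above; their return shapes, and the Cancellation record, are assumptions for illustration.

```typescript
// Assumed shape of the rotate/revoke helpers; return types are hypothetical.
interface VkAdmin {
  rotate(vkId: string): Promise<{ secret: string }>;
  revoke(vkId: string): Promise<void>;
}

interface Cancellation {
  vkId: string;
  cancelledAt: Date;
}

const HOUR_MS = 60 * 60 * 1000;

// When the VK should actually be revoked, given a grace period in hours.
function revokeAt(cancelledAt: Date, graceHours: number): Date {
  return new Date(cancelledAt.getTime() + graceHours * HOUR_MS);
}

// Cancellations whose grace period has elapsed at `now`.
function dueForRevocation(pending: Cancellation[], now: Date, graceHours: number): Cancellation[] {
  return pending.filter((p) => revokeAt(p.cancelledAt, graceHours).getTime() <= now.getTime());
}

// Periodic job body: revoke every VK that is due, return the revoked IDs.
async function revokeDue(
  admin: VkAdmin,
  pending: Cancellation[],
  now: Date,
  graceHours = 24,
): Promise<string[]> {
  const revoked: string[] = [];
  for (const p of dueForRevocation(pending, now, graceHours)) {
    await admin.revoke(p.vkId);
    revoked.push(p.vkId);
  }
  return revoked;
}

// Rotation: fetch the new secret and hand it to the customer.
async function rotateForCustomer(
  admin: VkAdmin,
  vkId: string,
  deliver: (secret: string) => Promise<void>,
): Promise<void> {
  const { secret } = await admin.rotate(vkId);
  await deliver(secret);
}
```

Passing now in explicitly keeps the due-date logic deterministic and testable, independent of the scheduler that runs the job.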
Audit — who did what
Every write through your backend token is audited with actor = "svc_<your_project_id>", action, target, and metadata. Filter the audit log on resource_type = "virtualKey" to get a per-customer provisioning history. The log is available under /settings/audit-log in the LangWatch UI.
If you need SIEM export: the Postgres AuditLog table is queryable for gateway-shape rows (filter on targetKind IN ('virtual_key', 'budget', 'provider_binding', 'cache_rule')). Set up a daily pg_dump-then-ship pipeline into your SIEM’s ingestion path. There is no public REST audit-export endpoint in v1 — see Audit log → Querying programmatically for the supported paths (UI CSV download, direct SQL).
Gotchas
- Never let the customer see your backend API token. It has virtualKeys:create; they’d provision more VKs charged to you.
- Rotate VKs when an end-user leaves your customer’s org. Otherwise the ex-user keeps spend access until the budget resets.
- Set principal_user_id at VK creation time, not later. Audit attribution is based on this; filling it in after-the-fact doesn’t retroactively re-tag old traces.
- Test the 402 path in your app before go-live. Many apps have unhandled exceptions on budget breach and crash the user’s flow.
- Budgets scoped to virtual_key are the right level for per-customer enforcement. Scoping to principal works too but traces get messier because principal_user_id values from different customers can collide if you’re not careful with namespacing.
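One way to avoid the collision described in the last point is to prefix every principal_user_id with the tenant it belongs to. This is a minimal sketch; the function name, the t:/u: prefixes and the colon separator are illustrative choices, not something LangWatch prescribes.

```typescript
// Build a tenant-namespaced principal ID. Percent-encoding each part means a
// ":" inside a raw ID can never be mistaken for the separator.
function principalUserId(tenantId: string, userId?: string): string {
  const clean = (s: string): string => encodeURIComponent(s.trim());
  const tenantPart = `t:${clean(tenantId)}`;
  return userId === undefined ? tenantPart : `${tenantPart}:u:${clean(userId)}`;
}
```

With this scheme, user 42 at tenant acme and user 42 at tenant globex get different principal IDs, so principal-scoped budgets and traces stay separate.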
Rate limits at your gateway level
If you want to throttle a specific customer without moving them to a different plan, set a rate limit on their VK. Requests over the limit receive 429 rate_limit_exceeded. Combine with a short-window budget (hour, minute) for finer control.
See also
- Management REST API — every endpoint used above.
- Virtual Keys — VK lifecycle semantics.
- Budgets — hierarchical scope logic.
- Security — how backend tokens are isolated from VK secrets.
- CI smoke-test cookbook — validate the full flow in CI.