The AI Gateway is a customer-facing HTTPS endpoint. Clients (OpenAI SDKs, Claude Code, Cursor, curl) resolve a hostname, open a TLS connection, and send chat-completion requests. This page covers the SaaS posture (gateway.langwatch.ai) and the self-hoster checklist to point your own hostname at a private deployment.
SaaS: dedicated NLB + ACM cert + Cloudflare DNS-only
LangWatch SaaS exposes the gateway at gateway.langwatch.ai on a dedicated AWS Network Load Balancer with its own ACM-managed certificate, separate from the classic ELB that serves app.langwatch.ai. The two hostnames do not share a load balancer.
- NLB instead of nginx-ingress + ELB. The gateway has a different traffic shape than the LangWatch app: long-lived SSE streams, layer-4 passthrough, sub-ms latency budget. NLB suits this; the app’s ELB+nginx posture would buffer SSE without the unbuffered annotations and add a layer-7 hop on every request.
- ACM cert instead of cert-manager. The cert is provisioned via AWS Certificate Manager and validated by DNS-01 (CNAME records published in Cloudflare). cert-manager runs on the app cluster but the gateway uses the AWS-native path because TLS terminates on the NLB itself.
- Cloudflare in DNS-only mode (the gray cloud, NOT proxied). Critical — see Cloudflare proxy mode is wrong for the gateway below.
Self-hosted: point your own hostname at the gateway
Pick a hostname under a domain you control (e.g. gateway.acme.internal or gateway.acme.com) and pick the posture that matches your platform:
- AWS NLB + ACM + Cloudflare — recommended for AWS, matches SaaS.
- nginx-ingress + cert-manager — what the chart renders by default; works on any cluster with an nginx-ingress controller.
- AWS ALB — alternative for shops standardised on AWS Load Balancer Controller.
- GCP / Azure / on-prem — same shape, different controller.
- An external DNS record — gateway.acme.com must resolve to the load balancer in front of the gateway.
- A valid TLS certificate — issued by ACM (NLB/ALB termination), by cert-manager (nginx-ingress termination), or your own PEM.
- A Service or Ingress resource that exposes the gateway pod on :443 — the chart renders this from ingress.enabled (nginx posture) or from a Service-level override (NLB+ACM posture).
nginx-ingress + cert-manager (default chart)
The umbrella chart’s default posture lives at charts/gateway/values.yaml:
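A minimal sketch of what that block can look like — the key names (ingress.enabled, ingress.className, ingress.host, ingress.tls.secretName) follow common Helm chart conventions and the options referenced on this page, not a verbatim copy of the chart:

```yaml
# charts/gateway/values.yaml — illustrative sketch, not the literal chart contents
ingress:
  enabled: true
  className: nginx
  host: gateway.acme.com            # replace with your hostname
  annotations:
    # Load-bearing: disable buffering so SSE token deltas stream through nginx.
    nginx.ingress.kubernetes.io/proxy-buffering: "off"
    nginx.ingress.kubernetes.io/proxy-read-timeout: "3600"
    nginx.ingress.kubernetes.io/proxy-send-timeout: "3600"
  tls:
    secretName: gateway-tls
```

The three nginx.ingress.kubernetes.io annotations are the documented nginx-ingress keys for unbuffered, long-lived responses.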
proxy-* annotations are load-bearing — they unlock SSE streaming through nginx. Removing them will cause streamed /v1/chat/completions responses to buffer until the full body arrives, which breaks incremental rendering in every OpenAI SDK.
cert-manager with a DNS-01 Let’s Encrypt issuer is the zero-touch path for the cert:
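A hedged example of such an issuer, assuming Cloudflare-managed DNS and a Secret named cloudflare-api-token holding a scoped API token (both names chosen for illustration):

```yaml
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: ops@acme.com                    # placeholder contact address
    privateKeySecretRef:
      name: letsencrypt-prod-account-key
    solvers:
      - dns01:
          cloudflare:
            apiTokenSecretRef:
              name: cloudflare-api-token   # assumed Secret with a DNS-edit token
              key: api-token
```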
The cert-manager.io/cluster-issuer: letsencrypt-prod annotation on the Ingress picks it up — cert-manager observes the Ingress, requests the cert, and stores it in gateway-tls (or whatever ingress.tls.secretName you set).
DNS for this posture: a CNAME from gateway.acme.com to your nginx-ingress LoadBalancer (typically <hash>.elb.<region>.amazonaws.com on EKS, an external IP on bare metal). Cloudflare can be in proxied (orange) or DNS-only (gray) mode here — Cloudflare proxying is compatible with cert-manager-issued certs because both Cloudflare’s edge cert and the origin cert are publicly-signed; no 526. Privacy + latency trade-offs from the proxy-mode-is-wrong discussion still apply, so DNS-only is still the recommendation for the AI Gateway.
AWS NLB + ACM + Cloudflare (recommended for AWS self-hosts)
This matches the SaaS posture: TLS terminates on a dedicated AWS NLB with an ACM cert, Cloudflare resolves DNS only. Best for AWS self-hosts that want sub-ms TLS handshakes and layer-4 passthrough.
- Provision the ACM certificate (request it before the LoadBalancer Service so the cert ARN is ready when the controller provisions the NLB):
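A sketch of the request, assuming the AWS CLI v2 and a placeholder domain and region (the region must match where the NLB will live):

```shell
# Request a DNS-validated certificate for the gateway hostname.
aws acm request-certificate \
  --domain-name gateway.acme.com \
  --validation-method DNS \
  --region us-east-1 \
  --query CertificateArn --output text
# Prints the certificate ARN; keep it for the Service annotations.
```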
Then describe the cert to find the validation CNAME records you need to publish in Cloudflare. It returns a Name and Value per domain — add both as a CNAME record in Cloudflare (proxy: DNS-only). ACM polls these continuously; status flips from PENDING_VALIDATION → ISSUED once DNS propagates. Typical: 1-5 min, worst case ~30 min. ⚠️ No manual force-revalidate API. If you wait 30+ min and ACM is still pending, verify the CNAME with dig CNAME _<token>.gateway.acme.com +short — the answer must match the value in aws acm describe-certificate. If dig resolves correctly, ACM will pick it up shortly; if not, fix the Cloudflare record. To watch status:
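For example, with a placeholder ARN (the --query path is the standard shape of the ACM describe-certificate response):

```shell
ARN=arn:aws:acm:us-east-1:123456789012:certificate/00000000-0000-0000-0000-000000000000  # placeholder

# Print the validation CNAME Name/Value pairs to publish in Cloudflare.
aws acm describe-certificate --certificate-arn "$ARN" \
  --query 'Certificate.DomainValidationOptions[].ResourceRecord.[Name,Value]' \
  --output text

# Block until the cert flips to ISSUED (the CLI's built-in waiter polls for you).
aws acm wait certificate-validated --certificate-arn "$ARN"
```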
- Configure the gateway Service as LoadBalancer with NLB + ACM annotations. This is a Service-level config (not the Ingress); set it via the chart’s Service block or apply it directly. The AWS Load Balancer Controller sees these annotations and provisions an NLB with the ACM cert attached. Wait for the NLB hostname to appear; provisioning typically takes 2-5 min after the cert flips to ISSUED.
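A sketch of the Service, assuming the AWS Load Balancer Controller is installed; the annotation keys are the controller’s documented service.beta.kubernetes.io/* set, while the names, selector, ARN, and container port are placeholders:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: gateway
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: external
    service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: ip
    service.beta.kubernetes.io/aws-load-balancer-scheme: internet-facing
    # TLS terminates on the NLB with the ACM cert requested earlier.
    service.beta.kubernetes.io/aws-load-balancer-ssl-cert: arn:aws:acm:us-east-1:123456789012:certificate/00000000-0000-0000-0000-000000000000
    service.beta.kubernetes.io/aws-load-balancer-ssl-ports: "443"
spec:
  type: LoadBalancer
  selector:
    app: gateway            # placeholder; match your gateway pod labels
  ports:
    - name: https
      port: 443
      targetPort: 8080      # placeholder container port
```

Watch for the hostname with kubectl get svc gateway -w — the EXTERNAL-IP column flips to an *.elb.&lt;region&gt;.amazonaws.com name once the NLB is up.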
- Add a CNAME in Cloudflare pointing gateway.acme.com → the NLB hostname. Proxy status: DNS-only (gray cloud, NOT orange) — see the next section for why. Do not use an A record — NLB underlying IPs rotate, and any pinned IP will fail within days.
- Verify end-to-end: /healthz returns {"status":"ok","version":"..."} when the pod is live.
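For example, with a placeholder hostname:

```shell
# Health check through the public hostname (exercises DNS, TLS, NLB, and pod).
curl -sS https://gateway.acme.com/healthz

# Confirm the presented cert is the expected ACM-issued one.
curl -sSv https://gateway.acme.com/healthz 2>&1 | grep -E 'subject:|issuer:'
```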
Cloudflare proxy mode is wrong for the gateway
When the gateway TLS terminates on an AWS NLB (the SaaS + AWS-self-host posture), Cloudflare must stay in DNS-only mode (gray cloud). Four independent reasons, any one of which is enough:
- Origin auth complexity. Cloudflare Full (Strict) does accept any publicly-trusted CA at the origin, so an ACM cert on the NLB will validate without a 526 by default — but turning the proxy on opts you into Cloudflare’s origin-auth surface. The day someone enables Authenticated Origin Pulls or expects mTLS between Cloudflare and the NLB, you’ll need to install Cloudflare-issued origin certs (or the Cloudflare CA bundle) on the NLB just to keep the gateway reachable. That’s operational fragility for zero benefit on a data plane Cloudflare shouldn’t be inspecting in the first place.
- Latency. Cloudflare proxy adds ~10-30 ms per request even for cached zero-config setups, on top of the ~50 ms Cloudflare edge → AWS region path. The AI Gateway aims for sub-ms overhead beyond bifrost; a proxy hop that adds tens of milliseconds to every request blows that budget many times over and defeats the design.
- SSE / streaming buffering. Cloudflare’s proxy buffers small response chunks for compression and bot-detection scoring. SSE responses with single-token deltas (every OpenAI-shape stream) can stall by hundreds of milliseconds at the edge before being released to the client. The result: streaming feels broken even when it’s working.
- Privacy. With Cloudflare proxying, the edge sees every customer’s plaintext LLM prompt and completion (TLS terminates at Cloudflare). For enterprise customers under data-residency or PII constraints, that’s a non-starter — the whole reason to run a gateway is to keep prompt content within your own boundary.
And the proxy buys nothing in return: gateway responses are uncacheable (Cache-Control: no-store), DDoS protection lives at the AWS Shield layer for NLB, and WAF rules belong on the application — not on the LLM data path.
AWS ALB (alternative)
If you use the AWS Load Balancer Controller instead of nginx-ingress, swap the ingress class and annotations. Set alb.ingress.kubernetes.io/load-balancer-attributes: idle_timeout.timeout_seconds=3600 or streaming completions will 408 after a minute. This is a common foot-gun — nginx-ingress is the safer default for AI Gateway deployments.
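A sketch of the ALB posture, using the controller’s documented alb.ingress.kubernetes.io annotations; the hostname, Service name, and ARN are placeholders:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: gateway
  annotations:
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/target-type: ip
    alb.ingress.kubernetes.io/certificate-arn: arn:aws:acm:us-east-1:123456789012:certificate/00000000-0000-0000-0000-000000000000
    # Without this, the ALB's default 60 s idle timeout kills long SSE streams.
    alb.ingress.kubernetes.io/load-balancer-attributes: idle_timeout.timeout_seconds=3600
spec:
  ingressClassName: alb
  rules:
    - host: gateway.acme.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: gateway        # placeholder Service name
                port:
                  number: 443
```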
GCP / Azure / on-prem
Same shape, different controller:
- GCP GKE: ingress.className: gce + a ManagedCertificate resource. No per-request idle timeout limit, but response streaming is capped at 30 s unless you use a BackendConfig with timeoutSec: 3600.
- Azure AKS: ingress.className: webapprouting.kubernetes.azure.com — similar pattern, cert-manager compatible.
- On-prem bare metal: any ingress controller that supports Kubernetes’ standard Ingress resource works. nginx-ingress, Traefik, and HAProxy are all tested.
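The GKE timeout bump above can be sketched with the documented cloud.google.com/v1 BackendConfig API (resource names, selector, and port are placeholders):

```yaml
apiVersion: cloud.google.com/v1
kind: BackendConfig
metadata:
  name: gateway-backendconfig
spec:
  # Raise the backend response timeout so hour-long SSE streams survive.
  timeoutSec: 3600
---
apiVersion: v1
kind: Service
metadata:
  name: gateway
  annotations:
    # Attach the BackendConfig to the Service's serving port.
    cloud.google.com/backend-config: '{"default": "gateway-backendconfig"}'
spec:
  selector:
    app: gateway          # placeholder pod labels
  ports:
    - port: 443
      targetPort: 8080    # placeholder container port
```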
Why a separate hostname instead of a path prefix?
The chart defaults to gateway.acme.com (separate hostname) rather than acme.com/gateway/v1 (shared hostname, path prefix). Three reasons:
- No rewrite middleware. A path prefix would require nginx.ingress.kubernetes.io/rewrite-target to strip /gateway before forwarding — doable, but it breaks OpenAI SDK compatibility because SDKs build request URLs by joining base_url + "/v1/chat/completions", and base_url = "https://app.acme.com/gateway" makes SDKs double-path.
- Observability and rate limits isolate cleanly. You can pin Grafana dashboards and WAF rules to gateway.* without chasing path prefixes.
- TLS cert lifetimes don’t cross-contaminate. A bad deploy on the app side doesn’t invalidate the gateway’s cert.
If you must share a hostname anyway, set ingress.host to match your app hostname and add the rewrite annotation. Not recommended unless you have a concrete reason.
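For completeness, a hedged sketch of the not-recommended shared-hostname shape, using the standard nginx-ingress capture-group rewrite pattern (hostname and Service name are placeholders):

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: gateway-path-prefix
  annotations:
    # Strips the /gateway prefix before forwarding. Note this is exactly what
    # breaks SDK base_url joining, per the reasons above.
    nginx.ingress.kubernetes.io/rewrite-target: /$2
spec:
  ingressClassName: nginx
  rules:
    - host: app.acme.com
      http:
        paths:
          - path: /gateway(/|$)(.*)
            pathType: ImplementationSpecific
            backend:
              service:
                name: gateway        # placeholder Service name
                port:
                  number: 443
```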
Cloudflare-specific notes
- Proxy mode (orange cloud) is wrong for the AI Gateway. See Cloudflare proxy mode is wrong for the gateway. Always use DNS-only (gray cloud).
- Bot Fight Mode can false-positive on Claude Code and OpenAI SDK traffic patterns (steady POST cadence from a single IP). Even in DNS-only mode, BFM can fire on the zone level — disable on the gateway hostname or the whole zone if the gateway lives in a dedicated zone.
- Cache rules: gateway responses should not be cached. The default “respect origin headers” is correct — the gateway sets Cache-Control: no-store on every response. Don’t add a Page Rule that forces caching, and don’t enable Cloudflare’s “Cache Everything” preset.
- Universal SSL stays on automatically when the record is in the zone, but is unused in DNS-only mode (TLS terminates at the origin’s cert, not Cloudflare’s). Leave it on — it has no effect when the record is gray-cloud.