gateway.langwatch.ai) and the self-hoster checklist to point your own hostname at a private deployment.
SaaS: dedicated NLB + ACM cert + Cloudflare DNS-only
LangWatch SaaS exposes the gateway at gateway.langwatch.ai on a dedicated AWS Network Load Balancer with its own ACM-managed certificate, separate from the classic ELB that serves app.langwatch.ai. The two hostnames do not share a load balancer.
- NLB instead of nginx-ingress + ELB. The gateway has a different traffic shape from the LangWatch app: long-lived SSE streams, layer-4 passthrough, and a sub-ms latency budget. An NLB suits this; the app’s ELB + nginx posture would buffer SSE unless the unbuffered annotations were set, and would add a layer-7 hop to every request.
- ACM cert instead of cert-manager. The cert is provisioned via AWS Certificate Manager and validated through DNS (CNAME records published in Cloudflare). cert-manager runs on the app cluster, but the gateway uses the AWS-native path because TLS terminates on the NLB itself.
- Cloudflare in DNS-only mode (the gray cloud, NOT proxied). This is critical; see “Cloudflare proxy mode is wrong for the gateway” below.
Self-hosted: point your own hostname at the gateway
Pick a hostname under a domain you control (e.g. gateway.acme.internal or gateway.acme.com), then choose the posture that matches your platform:
- AWS NLB + ACM + Cloudflare: recommended for AWS, matches SaaS.
- nginx-ingress + cert-manager: what the chart renders by default; works on any cluster with an nginx-ingress controller.
- AWS ALB: alternative for shops standardised on AWS Load Balancer Controller.
- GCP, Azure, on-prem: same shape, different controller.
Whichever posture you choose, you need:
- An external DNS record: gateway.acme.com must resolve to the load balancer in front of the gateway.
- A valid TLS certificate: issued by ACM (NLB/ALB termination), by cert-manager (nginx-ingress termination), or supplied as your own PEM.
- A Service or Ingress resource that exposes the gateway pod on :443. The chart renders this from ingress.enabled (nginx posture) or from a Service-level override (NLB+ACM posture).
nginx-ingress + cert-manager (default chart)
The umbrella chart’s default posture lives at charts/gateway/values.yaml:
The proxy-* annotations are load-bearing: they unlock SSE streaming through nginx. Removing them causes streamed /v1/chat/completions responses to buffer until the full body arrives, which breaks incremental rendering in every OpenAI SDK.
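One rough way to tell whether a proxy is buffering a stream is to look at when the chunks reach the client. The sketch below is only a heuristic under stated assumptions: the function name `looks_buffered` and the 50 ms `burst_window` threshold are illustrative choices, not part of the gateway or the chart.

```python
from typing import Sequence


def looks_buffered(arrival_times: Sequence[float], burst_window: float = 0.05) -> bool:
    """Heuristic: return True when every chunk of a stream arrived in one burst.

    arrival_times holds the client-side receive time (in seconds) of each chunk.
    A buffered stream shows up as many chunks delivered almost at once, so the
    spread between the first and last chunk is tiny.
    burst_window is an illustrative threshold (an assumption), not a documented value.
    """
    if len(arrival_times) < 2:
        # A single chunk says nothing about incremental delivery.
        return False
    return (max(arrival_times) - min(arrival_times)) <= burst_window
```

To use it, record `time.monotonic()` each time the client reads an SSE line from a long completion, then pass the list in. A genuinely streamed response spreads its chunks over the generation time; a buffered one delivers them together at the end.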
cert-manager with a DNS-01 Let’s Encrypt issuer is the zero-touch path for the cert:
The cert-manager.io/cluster-issuer: letsencrypt-prod annotation on the Ingress picks it up: cert-manager observes the Ingress, requests the cert, and stores it in gateway-tls (or whatever ingress.tls.secretName you set).
DNS for this posture: a CNAME from gateway.acme.com to your nginx-ingress LoadBalancer hostname (typically <hash>.elb.<region>.amazonaws.com on EKS), or an A record when the LoadBalancer exposes an external IP, as on bare metal. Cloudflare can be in proxied (orange) or DNS-only (gray) mode here. Proxying is compatible with cert-manager-issued certs because both Cloudflare’s edge cert and the origin cert are publicly signed, so there is no 526 error. The privacy and latency trade-offs described in “Cloudflare proxy mode is wrong for the gateway” still apply, so DNS-only remains the recommendation for the AI Gateway.
AWS NLB + ACM + Cloudflare (recommended for AWS self-hosts)
This matches the SaaS posture: TLS terminates on a dedicated AWS NLB with an ACM cert, and Cloudflare only resolves DNS. Best for AWS self-hosts that want a sub-ms TLS handshake and layer-4 passthrough.
- Provision the ACM certificate. Request it before creating the LoadBalancer Service so that the cert ARN is ready when the controller provisions the NLB. Then describe the cert with aws acm describe-certificate to find the validation CNAME records you need to publish in Cloudflare. The output contains a Name and a Value per domain; publish each pair as a CNAME record in Cloudflare (proxy: DNS-only). ACM polls these records continuously, and the status flips from PENDING_VALIDATION to ISSUED once DNS propagates, typically in 1-5 min and at worst ~30 min. ⚠️ There is no manual force-revalidate API. If ACM is still pending after 30+ min, verify the CNAME with dig CNAME _<token>.gateway.acme.com +short; the answer must match the value in the aws acm describe-certificate output. If dig resolves correctly, ACM will pick it up shortly; if not, fix the Cloudflare record. To watch the status, re-run aws acm describe-certificate until it reports ISSUED.
- Configure the gateway Service as type LoadBalancer with NLB + ACM annotations. This is Service-level config (not the Ingress); set it via the chart’s Service block or apply it directly. The AWS Load Balancer Controller sees these annotations and provisions an NLB with the ACM cert attached. Wait for the NLB hostname to appear on the Service; provisioning typically takes 2-5 min after the cert flips to ISSUED.
- Add a CNAME in Cloudflare pointing gateway.acme.com → the NLB hostname. Proxy status: DNS-only (gray cloud, NOT orange); see the next section for why. Do not use an A record: the NLB’s underlying IPs are not guaranteed to stay fixed, and a pinned IP can stop working without warning.
- Verify end-to-end: /healthz returns {"status":"ok","version":"..."} when the pod is live.
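The two checks in the steps above (does the published CNAME match what ACM expects, and does /healthz report ok) can be sketched as small helpers. This is a minimal illustration, not tooling shipped with the gateway: the function names `cname_matches` and `is_healthy` are hypothetical, and the inputs are assumed to be the raw `dig +short` answer, the Value from `aws acm describe-certificate`, and the raw /healthz response body.

```python
import json


def cname_matches(dig_answer: str, expected_value: str) -> bool:
    """Compare a `dig +short` answer with the validation Value reported by ACM.

    DNS names are case-insensitive and may or may not carry a trailing dot,
    so both sides are normalised before comparing. An empty answer (record
    not published yet) never matches.
    """
    def normalise(name: str) -> str:
        return name.strip().rstrip(".").lower()

    return bool(dig_answer.strip()) and normalise(dig_answer) == normalise(expected_value)


def is_healthy(healthz_body: str) -> bool:
    """Return True when a /healthz body parses as JSON with status == "ok"."""
    try:
        payload = json.loads(healthz_body)
    except json.JSONDecodeError:
        # An HTML error page from a load balancer lands here.
        return False
    return isinstance(payload, dict) and payload.get("status") == "ok"
```

If `cname_matches` returns False while ACM is still pending, the Cloudflare record is the thing to fix. If `is_healthy` returns False on a body that is not JSON at all, the request probably never reached the gateway pod.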
Cloudflare proxy mode is wrong for the gateway
When the gateway’s TLS terminates on an AWS NLB (the SaaS and AWS self-host posture), Cloudflare must stay in DNS-only mode (gray cloud). There are four independent reasons, any one of which is enough:
- Origin auth complexity. Cloudflare Full (Strict) does accept any publicly trusted CA at the origin, so an ACM cert on the NLB validates without a 526 by default. But turning the proxy on opts you into Cloudflare’s origin-auth surface. The day someone enables Authenticated Origin Pulls or expects mTLS between Cloudflare and the NLB, you’ll need to install Cloudflare-issued origin certs (or the Cloudflare CA bundle) on the NLB just to keep the gateway reachable. That’s operational fragility for zero benefit on a data plane Cloudflare shouldn’t be inspecting in the first place.
- Latency. The Cloudflare proxy adds ~10-30 ms per request even for cached, zero-config setups, on top of the ~50 ms Cloudflare edge → AWS region path. The AI Gateway aims for sub-ms overhead beyond bifrost; adding tens of milliseconds to every request defeats the design.
- SSE and streaming buffering. Cloudflare’s proxy buffers small response chunks for compression and bot-detection scoring. SSE responses with single-token deltas (every OpenAI-shape stream) can stall by hundreds of milliseconds at the edge before being released to the client. The result: streaming feels broken even when it’s working.
- Privacy. With Cloudflare proxying, the edge sees every customer’s plaintext LLM prompt and completion (TLS terminates at Cloudflare). For enterprise customers under data-residency or PII constraints, that’s a non-starter: the whole reason to run a gateway is to keep prompt content within your own boundary.
DNS-only mode gives up little in return: caching does not apply (the gateway sets Cache-Control: no-store), DDoS protection lives at the AWS Shield layer for NLB, and WAF rules belong on the application, not on the LLM data path.
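A quick way to confirm that a hostname really is in DNS-only mode is to inspect the response headers: responses that pass through Cloudflare’s proxy carry a CF-Ray header and usually report Server: cloudflare. The helper below is a small sketch of that check; the function name is hypothetical and the headers are assumed to come from any HTTP client as a name-to-value mapping.

```python
from typing import Mapping


def served_via_cloudflare_proxy(response_headers: Mapping[str, str]) -> bool:
    """Return True when the headers show the response passed through Cloudflare's proxy.

    Proxied responses carry a CF-Ray header and typically "Server: cloudflare".
    Header names are compared case-insensitively.
    """
    lowered = {name.lower(): value for name, value in response_headers.items()}
    return "cf-ray" in lowered or lowered.get("server", "").strip().lower() == "cloudflare"
```

For a gray-cloud record, a request to the gateway’s /healthz should make this return False, because the response comes straight from the origin.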
AWS ALB (alternative)
If you use the AWS Load Balancer Controller instead of nginx-ingress, swap the ingress class and annotations. Be sure to set alb.ingress.kubernetes.io/load-balancer-attributes: idle_timeout.timeout_seconds=3600, or streaming completions will 408 after a minute (the ALB’s default idle timeout is 60 seconds). This is a common foot-gun; nginx-ingress is the safer default for AI Gateway deployments.
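Because the idle timeout is easy to drop when annotations are edited, it can be worth checking it in CI. The sketch below parses the value of the load-balancer-attributes annotation, which is a comma-separated list of key=value pairs. The function name is hypothetical, and the fallback of 60 assumes the ALB default applies when the key is absent.

```python
def alb_idle_timeout_seconds(attributes: str, default: int = 60) -> int:
    """Extract idle_timeout.timeout_seconds from a load-balancer-attributes value.

    The annotation value is a comma-separated list of key=value pairs.
    `default` is the ALB's default idle timeout (60 seconds), returned when
    the key is missing from the annotation.
    """
    for pair in attributes.split(","):
        key, _, value = pair.strip().partition("=")
        if key == "idle_timeout.timeout_seconds" and value:
            return int(value)
    return default
```

A pipeline check could then fail the build when the returned value is below the longest streaming completion you expect to serve, for example anything under 3600.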
GCP, Azure, on-prem
Same shape, different controller:
- GCP GKE: ingress.className: gce + a ManagedCertificate resource. There is no per-request idle timeout limit, but response streaming is capped at 30 s unless you use a BackendConfig with timeoutSec: 3600.
- Azure AKS: ingress.className: webapprouting.kubernetes.azure.com; similar pattern, cert-manager compatible.
- On-prem bare metal: any ingress controller that supports Kubernetes’ standard Ingress resource works. nginx-ingress, Traefik, and HAProxy are all tested.
Why a separate hostname instead of a path prefix?
The chart defaults to gateway.acme.com (separate hostname) rather than acme.com/gateway/v1 (shared hostname, path prefix). Three reasons:
- No rewrite middleware. A path prefix would require nginx.ingress.kubernetes.io/rewrite-target to strip /gateway before forwarding. That is doable, but it breaks OpenAI SDK compatibility: SDKs build request URLs by joining base_url + "/v1/chat/completions", and base_url = "https://app.acme.com/gateway" makes SDKs double-path.
- Observability and rate limits isolate cleanly. You can pin Grafana dashboards and WAF rules to gateway.* without chasing path prefixes.
- TLS cert lifetimes don’t cross-contaminate. A bad deploy on the app side doesn’t invalidate the gateway’s cert.
If you need the path-prefix form anyway, set ingress.host to match your app hostname and add the rewrite annotation. This is not recommended unless you have a concrete reason.
Cloudflare-specific notes
- Proxy mode (orange cloud) is wrong for the AI Gateway. See Cloudflare proxy mode is wrong for the gateway. Always use DNS-only (gray cloud).
- Bot Fight Mode can false-positive on Claude Code and OpenAI SDK traffic patterns (steady POST cadence from a single IP). Even in DNS-only mode, BFM can fire at the zone level; disable it for the gateway hostname, or for the whole zone if the gateway lives in a dedicated zone.
- Cache rules: gateway responses should not be cached. The default “respect origin headers” is correct, because the gateway sets Cache-Control: no-store on every response. Don’t add a Page Rule that forces caching, and don’t enable Cloudflare’s “Cache Everything” preset.
- Universal SSL stays on automatically when the record is in the zone, but is unused in DNS-only mode (TLS terminates at the origin’s cert, not Cloudflare’s). Leave it on; it has no effect when the record is gray-cloud.
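To confirm the caching behaviour described in the notes above, a smoke test can assert that a gateway response carries the no-store directive. This is a minimal sketch with a hypothetical function name; it takes the raw Cache-Control header value from whatever HTTP client you use.

```python
def forbids_caching(cache_control: str) -> bool:
    """Return True when a Cache-Control header value contains the no-store directive.

    Directives are comma-separated and case-insensitive.
    """
    directives = [directive.strip().lower() for directive in cache_control.split(",")]
    return "no-store" in directives
```

If this returns False for a gateway response, something between the client and the gateway is rewriting headers, which is worth investigating before any caching rule is touched.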