Pairs with: AI Gateway → Observability. The gateway is itself an ingestion source (its spans land in the same trace store); IngestionSources extend that substrate to platforms whose API key path you don’t own (Workato, Copilot Studio, OpenAI/Anthropic compliance APIs). Also pairs with Audit log: S3 audit log feeds (Workato, Copilot Studio, Anthropic Compliance) become OTel-shaped events through this same pipeline, so a single audit log query at `/settings/audit-log` covers in-platform mutations and third-party-tool activity uniformly.

`otel_generic` + `thirty_days` retention are the Apache 2.0 floor; multi-source + extended retention require Enterprise. Self-hosted Apache 2.0 deployments can wire one OTel receiver at the default 30-day retention; the additional source types on this page (Workato, S3, Copilot Studio, OpenAI, Anthropic compliance APIs) and the `one_year` and `seven_years` retention classes ship under the Enterprise license. Non-enterprise organizations see an upgrade card on `/settings/governance/ingestion-sources`. See Open-core licensing for the full split.

Ingested events land in the same `recorded_spans` + `log_records` tables that power the rest of LangWatch, distinguished by the `langwatch.origin.kind = "ingestion_source"` origin metadata stamped at the receiver edge. Same trace viewer, same compliance + RBAC + retention machinery, surfaced in the governance dashboard, the per-source detail page, and the `langwatch ingest tail` CLI.
Governance ingestion is for third-party AI platforms your org runs. If you want to send your own application’s LLM traces to LangWatch, use `/api/otel/v1/traces` with your project API key; that’s the right home for traces from systems you own. See Choosing the right OTel endpoint for the full comparison.

This page describes each `sourceType` honestly: what’s wired up today, what’s still envelope-only, and what’s a configuration contract awaiting an adapter.
State of each receiver: at a glance
| Source type | Delivery | OTLP shape | Storage | State |
|---|---|---|---|---|
| `otel_generic` | Push (HTTP/OTLP) | Spans | `recorded_spans` | Production-ready |
| `claude_cowork` | Push (HTTP/OTLP) | Spans | `recorded_spans` | Receiver works, Cowork-specific attribute mapping pending |
| `workato` | Webhook → OTLP logs | Logs | `log_records` | Receiver works, deeper audit-shape parser pending |
| `s3_custom` | S3 replay + callback webhook | Logs | `log_records` | Callback-mode works, S3 puller + DSL parser pending |
| `copilot_studio` | Pull (Purview Audit API) | Logs | `log_records` | Setup-contract-only: Azure AD app config persisted, puller worker pending |
| `openai_compliance` | Pull (Enterprise Compliance JSONL) | Logs | `log_records` | Setup-contract-only: S3 + role config persisted, puller worker pending |
| `claude_compliance` | Pull (Anthropic Compliance API) | Logs | `log_records` | Setup-contract-only: workspace API key persisted, puller worker pending |

Receiver works = the source flips to `active` and the unified store has the event with origin metadata stamped, but deeper per-platform attribute extraction beyond the default OTLP/JSON shape is pending. Drill-down still works, but column displays may show generic labels until the platform-specific adapter ships.
Setup-contract-only = the source can be created via the admin composer and persists the per-platform fields the eventual puller worker will use, but no traffic flows yet. Creating a source mints the credential and persists the config, but `langwatch ingest health` will show zeros until the worker ships.
Why two OTLP shapes (spans vs logs)
Different upstreams emit different event shapes:

- Span-shaped sources (`otel_generic`, `claude_cowork`) emit native OTLP spans with parent-child relationships, durations, and span kind. They benefit from drill-down in the trace viewer (showing the multi-step agent activity tree).
- Flat-event sources (`workato`, `s3_custom`, `copilot_studio`, `openai_compliance`, `claude_compliance`) emit one event = one row, with no span tree. Forcing them into the span shape would require a synthetic `traceId`, `spanId`, and duration that carry no information. They land naturally as OTLP `log_records` and drill into the log detail pane.

Both shapes pass through the same parser (`parseOtlpBody.ts`) and the same trace pipeline downstream. Both are queryable from the same governance dashboard, both pull into the OCSF v1.1 read API, and both honour the per-origin retention class.
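
As an illustration of the split above, here is a minimal Python sketch of routing a source type to its OTLP shape and storage table. The mapping is copied from the table on this page; the names `OtlpShape`, `SOURCE_SHAPES`, and `storage_table_for` are hypothetical and are not part of LangWatch's codebase.

```python
from enum import Enum


class OtlpShape(Enum):
    SPANS = "spans"
    LOGS = "logs"


# Source type -> OTLP shape, copied from the "State of each receiver" table.
SOURCE_SHAPES = {
    "otel_generic": OtlpShape.SPANS,
    "claude_cowork": OtlpShape.SPANS,
    "workato": OtlpShape.LOGS,
    "s3_custom": OtlpShape.LOGS,
    "copilot_studio": OtlpShape.LOGS,
    "openai_compliance": OtlpShape.LOGS,
    "claude_compliance": OtlpShape.LOGS,
}

# OTLP shape -> storage table, also from the table on this page.
STORAGE_TABLES = {
    OtlpShape.SPANS: "recorded_spans",
    OtlpShape.LOGS: "log_records",
}


def storage_table_for(source_type: str) -> str:
    """Return the table a source type's events land in (hypothetical helper)."""
    try:
        shape = SOURCE_SHAPES[source_type]
    except KeyError:
        raise ValueError(f"unknown source type: {source_type!r}") from None
    return STORAGE_TABLES[shape]
```

Keeping the shape as an explicit enum, rather than inferring it from the payload, mirrors the idea that the shape is a property of the source type and not of an individual event.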
How a source becomes “active”
Independent of source type, the lifecycle is:

- Admin opens `/settings/governance/ingestion-sources`, picks a source type, fills the per-type config form (including the retention class), and clicks Create.
- Backend mints a `lw_is_<base64url>` ingest secret, hashes it, and persists the source with status `awaiting_first_event`. The secret is shown once in a one-time-reveal modal and is never returned by the list/get endpoints again.
- Admin pastes the secret + per-source endpoint into the upstream platform (OTLP exporter URL, webhook destination, Azure app secret, S3 bucket policy, etc.).
- First event arrives at the receiver → receiver lazy-ensures the hidden Governance Project for the org (if it does not already exist), stamps the `langwatch.origin.*` + `langwatch.governance.retention_class` attributes, and hands off to the existing trace pipeline. Source status flips to `active` and `lastEventAt` updates.
- Web detail page polls every 10s and CLI tail polls every 3s in `--follow` mode; both see the flip and the new event on their next poll.

Setup-contract-only sources stay in `awaiting_first_event` until the adapter ships.
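
The mint-and-flip steps of the lifecycle can be sketched in Python as follows. This is a sketch under stated assumptions, not LangWatch's implementation: the page only says the secret is hashed, so SHA-256 is an assumption, as is the 32-byte secret length; `IngestionSource`, `create_source`, and `record_event` are hypothetical names.

```python
import hashlib
import secrets
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional, Tuple

SECRET_PREFIX = "lw_is_"


@dataclass
class IngestionSource:
    source_id: str
    source_type: str
    retention_class: str
    secret_hash: str  # only the hash is persisted, never the secret itself
    status: str = "awaiting_first_event"
    last_event_at: Optional[datetime] = None


def hash_secret(secret: str) -> str:
    # Assumption: SHA-256. The page does not name the hash function.
    return hashlib.sha256(secret.encode("utf-8")).hexdigest()


def create_source(
    source_id: str, source_type: str, retention_class: str
) -> Tuple[IngestionSource, str]:
    """Mint a secret and return (persisted record, one-time plaintext secret)."""
    # token_urlsafe yields unpadded base64url text, matching lw_is_<base64url>.
    secret = SECRET_PREFIX + secrets.token_urlsafe(32)
    source = IngestionSource(
        source_id=source_id,
        source_type=source_type,
        retention_class=retention_class,
        secret_hash=hash_secret(secret),
    )
    return source, secret


def record_event(source: IngestionSource, now: Optional[datetime] = None) -> None:
    """Flip the source to active on its first event and refresh last_event_at."""
    source.status = "active"
    source.last_event_at = now or datetime.now(timezone.utc)
```

Returning the plaintext secret only from `create_source` reflects the one-time-reveal behaviour: after creation, nothing in the stored record can reproduce it.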
Auth contract: same shape across every receiver
All push-mode + webhook receivers use `Authorization: Bearer lw_is_<secret>` (the same shape the gateway uses for its `vk-lw-` virtual keys, just a different prefix). A mismatch between the `:sourceId` in the path and the source ID resolved from the secret returns `401 unauthorized` without leaking which one exists. The 24-hour rotation grace window is documented in the IngestionSource lifecycle architecture.
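
A minimal Python sketch of that check, assuming secrets are stored as SHA-256 hashes keyed to a source ID (the hash function and the lookup structure are assumptions; `authenticate` is a hypothetical helper, not LangWatch's API). Every failure path returns the same 401, so a caller cannot tell a bad secret from a mismatched path.

```python
import hashlib
import hmac
from typing import Dict, Optional

BEARER = "Bearer "
SECRET_PREFIX = "lw_is_"


def hash_secret(secret: str) -> str:
    # Assumption: SHA-256. The page does not name the hash function.
    return hashlib.sha256(secret.encode("utf-8")).hexdigest()


def authenticate(
    authorization_header: Optional[str],
    path_source_id: str,
    secret_hash_to_source_id: Dict[str, str],
) -> int:
    """Return 200 if the bearer secret resolves to the source in the path, else 401."""
    if not authorization_header or not authorization_header.startswith(
        BEARER + SECRET_PREFIX
    ):
        return 401
    secret = authorization_header[len(BEARER):]
    resolved_id = secret_hash_to_source_id.get(hash_secret(secret))
    if resolved_id is None:
        return 401
    # Same status for a path mismatch, so the response does not reveal
    # whether the secret or the :sourceId was the valid half.
    if not hmac.compare_digest(resolved_id.encode(), path_source_id.encode()):
        return 401
    return 200
```

For example, with `store = {hash_secret("lw_is_abc"): "src_1"}`, calling `authenticate("Bearer lw_is_abc", "src_1", store)` yields 200, while the same secret against `src_2` yields 401.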
Verification: same loop for every active source
Once a source is active, verify with the governance CLI debug helpers (`langwatch ingest list`, `health`, and `tail`). `--json` mode is contract-stable against `api.governance.eventsForSource` and `api.ingestionSources.healthMetrics`.
Origin metadata: what every event carries
Every governance-ingested span or log_record is stamped at the receiver edge with reserved attribute namespaces (rejected if found in user-supplied OTLP):

| Attribute | Purpose |
|---|---|
| `langwatch.origin.kind` | Always `"ingestion_source"` for governance traffic; the discriminator |
| `langwatch.ingestion_source.id` | Source identity (matches the `lw_is_*` Bearer’s resolved IngestionSource) |
| `langwatch.ingestion_source.organization_id` | Org tenancy for cross-source roll-ups |
| `langwatch.ingestion_source.source_type` | Source type (`otel_generic`, `workato`, etc.) for filtering |
| `langwatch.governance.retention_class` | `thirty_days`, `one_year`, or `seven_years`; drives ClickHouse TTL |

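
The stamping step can be sketched in Python as below. The attribute keys come from the table above; treating the three `langwatch.*` prefixes as the full reserved set, and rejecting by raising an error rather than silently dropping the attribute, are both assumptions. `stamp_origin` and `ReservedAttributeError` are hypothetical names.

```python
from typing import Dict

# Assumption: the reserved namespaces are exactly the prefixes in the table.
RESERVED_PREFIXES = (
    "langwatch.origin.",
    "langwatch.ingestion_source.",
    "langwatch.governance.",
)
RETENTION_CLASSES = {"thirty_days", "one_year", "seven_years"}


class ReservedAttributeError(ValueError):
    """Raised when user-supplied OTLP carries a reserved attribute."""


def stamp_origin(
    user_attributes: Dict[str, str],
    source_id: str,
    organization_id: str,
    source_type: str,
    retention_class: str,
) -> Dict[str, str]:
    """Return a copy of the attributes with governance origin metadata added."""
    for key in user_attributes:
        if key.startswith(RESERVED_PREFIXES):
            raise ReservedAttributeError(f"reserved attribute in payload: {key}")
    if retention_class not in RETENTION_CLASSES:
        raise ValueError(f"unknown retention class: {retention_class!r}")
    stamped = dict(user_attributes)  # do not mutate the caller's dict
    stamped.update(
        {
            "langwatch.origin.kind": "ingestion_source",
            "langwatch.ingestion_source.id": source_id,
            "langwatch.ingestion_source.organization_id": organization_id,
            "langwatch.ingestion_source.source_type": source_type,
            "langwatch.governance.retention_class": retention_class,
        }
    )
    return stamped
```

Checking for reserved keys before stamping is what keeps an upstream from spoofing its own origin or retention class.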
Adapter roadmap
The order in which deeper per-platform adapters land depends on customer pull. The current priority order:

- Cowork attribute extraction: Cowork pushes OTLP today; the receiver accepts it, but spans aren’t mapped to Cowork’s `tool_use` taxonomy yet. Smallest gap to close.
- Workato audit-shape parser: Workato has a stable JSON envelope for job-completed events; the default mapper retains the raw envelope; deeper Actor, Action, Target extraction is the follow-up.
- Microsoft Copilot Studio (Purview Audit) puller: needs an Azure AD app + cron worker; the config is persisted but the worker isn’t.
- OpenAI, Anthropic Enterprise Compliance pullers: S3 + Anthropic compliance API respectively; config persisted, pullers pending.
- `s3_custom` DSL parser: generic-shape parser for homegrown audit logs; the customer-facing DSL design is in flight.

governance/ingestion/adapter-roadmap and we’ll sequence accordingly.
Cross-references
- Trace vs governance ingestion: picking the right OTel URL
- Compliance architecture: how the unified substrate underwrites SOC 2, ISO 27001, EU AI Act, GDPR, HIPAA-most-uses
- Per-origin retention: `thirty_days`, `one_year`, `seven_years` classes
- OCSF, SIEM export: pulling governance events into Splunk, Datadog Security, etc.
- Governance CLI debug: `langwatch ingest list/health/tail` commands