Pairs with: AI Gateway → Observability. The gateway is itself an ingestion source (its spans land in the same trace store); IngestionSources extend that substrate to platforms whose API key path you don’t own (Workato, Copilot Studio, OpenAI/Anthropic compliance APIs). Also pairs with Audit log: S3 audit log feeds (Workato, Copilot Studio, Anthropic Compliance) become OTel-shaped events through this same pipeline, so a single audit log query at /settings/audit-log covers in-platform mutations and third-party-tool activity uniformly.
otel_generic + thirty_days retention are the Apache 2.0 floor; multi-source + extended retention require Enterprise. Self-hosted Apache 2.0 deployments can wire one OTel receiver at the default 30-day retention; the additional source types on this page (Workato, S3, Copilot Studio, and the OpenAI and Anthropic compliance APIs) and the one_year and seven_years retention classes ship under the Enterprise license. Non-enterprise organizations see an upgrade card on /settings/governance/ingestion-sources. See Open-core licensing for the full split.
LangWatch governance ingest accepts events from upstream agent and audit-log platforms and lands them in the same recorded_spans + log_records tables that power the rest of LangWatch, distinguished by langwatch.origin.kind = "ingestion_source" origin metadata stamped at the receiver edge. Same trace viewer, same compliance + RBAC + retention machinery, surfaced in the governance dashboard, the per-source detail page, and the langwatch ingest tail CLI.
Governance ingestion is for third-party AI platforms your org runs. If you want to send your own application’s LLM traces to LangWatch, use /api/otel/v1/traces with your project API key; that’s the right home for traces from systems you own. See Choosing the right OTel endpoint for the full comparison.
This section documents each supported sourceType honestly: what’s wired up today, what’s still envelope-only, and what’s a configuration contract awaiting an adapter.

State of each receiver: at a glance

| Source type | Delivery | OTLP shape | Storage | State |
| --- | --- | --- | --- | --- |
| otel_generic | Push (HTTP/OTLP) | Spans | recorded_spans | Production-ready |
| claude_cowork | Push (HTTP/OTLP) | Spans | recorded_spans | Receiver works; Cowork-specific attribute mapping pending |
| workato | Webhook → OTLP logs | Logs | log_records | Receiver works; deeper audit-shape parser pending |
| s3_custom | S3 replay + callback webhook | Logs | log_records | Callback-mode works; S3 puller + DSL parser pending |
| copilot_studio | Pull (Purview Audit API) | Logs | log_records | Setup-contract-only: Azure AD app config persisted, puller worker pending |
| openai_compliance | Pull (Enterprise Compliance JSONL) | Logs | log_records | Setup-contract-only: S3 + role config persisted, puller worker pending |
| claude_compliance | Pull (Anthropic Compliance API) | Logs | log_records | Setup-contract-only: workspace API key persisted, puller worker pending |
The states mean:
  • Production-ready: the receiver accepts traffic and lands it in the unified store; the web detail page and CLI tail render the full payload with cost and token deltas; drill-down opens the trace viewer (span-shape) or the log-detail pane (log-shape).
  • Receiver works: the endpoint accepts and acknowledges traffic, the source flips to active, and the unified store has the event with origin metadata stamped, but per-platform attribute extraction beyond the default OTLP/JSON shape is pending. Drill-down still works, but columns may show generic labels until the platform-specific adapter ships.
  • Setup-contract-only: the source can be created via the admin composer, which mints the credential and persists the per-platform fields the eventual puller worker will use, but no traffic flows yet. langwatch ingest health will show zeros until the worker ships.

Why two OTLP shapes (spans vs logs)

Different upstreams emit different event shapes:
  • Span-shaped sources (otel_generic, claude_cowork) emit native OTLP spans with parent-child relationships, durations, and span kind. They benefit from drill-down in the trace viewer (showing the multi-step agent activity tree).
  • Flat-event sources (workato, s3_custom, copilot_studio, openai_compliance, claude_compliance) emit one event = one row, with no span tree. Forcing them into the span shape would require synthetic traceId, spanId, and duration values that carry no information. They land naturally as OTLP log_records and drill into the log detail pane.
One internal pipeline either way: both shapes pass through the same hardened OTLP parser (parseOtlpBody.ts) and the same trace pipeline downstream. Both are queryable from the same governance dashboard, both pull into the OCSF v1.1 read API, both honour the per-origin retention class.
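To make the difference between the two shapes concrete, here is a minimal sketch of the two OTLP/JSON bodies, built with the Python standard library only. The envelope keys (resourceSpans, scopeSpans, resourceLogs, scopeLogs, logRecords) follow the public OTLP/JSON encoding; the helper names and the event fields are hypothetical and are not taken from the LangWatch codebase.

```python
import json
import secrets
import time


def make_span_payload(name: str, duration_ms: int) -> dict:
    """Build a minimal OTLP/JSON trace export body with a single span."""
    start_ns = time.time_ns()
    end_ns = start_ns + duration_ms * 1_000_000
    span = {
        "traceId": secrets.token_hex(16),  # 16 bytes -> 32 hex characters
        "spanId": secrets.token_hex(8),    # 8 bytes -> 16 hex characters
        "name": name,
        "kind": 1,  # SPAN_KIND_INTERNAL
        # OTLP/JSON encodes 64-bit integers as strings.
        "startTimeUnixNano": str(start_ns),
        "endTimeUnixNano": str(end_ns),
    }
    return {"resourceSpans": [{"scopeSpans": [{"spans": [span]}]}]}


def make_log_payload(event: dict) -> dict:
    """Build a minimal OTLP/JSON logs export body with a single flat event."""
    record = {
        "timeUnixNano": str(time.time_ns()),
        # The raw upstream envelope travels as the log body (illustrative choice).
        "body": {"stringValue": json.dumps(event)},
    }
    return {"resourceLogs": [{"scopeLogs": [{"logRecords": [record]}]}]}
```

The span body needs a trace ID, a span ID, and two timestamps before it is meaningful, while the log body needs only a timestamp and the event itself, which is why flat audit events fit the log shape without synthetic fields.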

How a source becomes “active”

Independent of source type, the lifecycle is:
  1. Admin opens /settings/governance/ingestion-sources, picks a source type, fills the per-type config form (including the retention class), clicks Create.
  2. Backend mints a lw_is_<base64url> ingest secret, hashes it, and persists the source with status awaiting_first_event. The secret is shown once in a one-time-reveal modal; it’s never returned by the list/get endpoints again.
  3. Admin pastes the secret + per-source endpoint into the upstream platform (OTLP exporter URL, webhook destination, Azure app secret, S3 bucket policy, etc.).
  4. First event arrives at the receiver → receiver lazy-ensures the hidden Governance Project for the org (creating it if it does not exist yet), stamps langwatch.origin.* + langwatch.governance.retention_class attributes, and hands off to the existing trace pipeline. Source status flips to active + lastEventAt updates.
  5. The web detail page polls every 10s and the CLI tail polls every 3s in --follow mode; both see the flip + the new event on their next poll.
For source types where the puller worker is not yet wired (the three “setup-contract-only” rows above), step 4 won’t fire; the source stays at awaiting_first_event until the adapter ships.
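The lifecycle above can be sketched as a small state model. This is an illustration under stated assumptions, not the backend's implementation: SHA-256 as the hash, 32 random bytes for the secret, and the IngestionSource class with its field names are all hypothetical. Only the lw_is_ prefix, the base64url encoding, and the two status values come from this page.

```python
import base64
import hashlib
import secrets
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional, Tuple


def mint_ingest_secret() -> Tuple[str, str]:
    """Return (plaintext_secret, stored_hash); only the hash would be persisted."""
    raw = secrets.token_bytes(32)  # assumed length
    token = base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")
    secret = f"lw_is_{token}"
    # SHA-256 is an assumption; the page only says the secret is hashed.
    return secret, hashlib.sha256(secret.encode("utf-8")).hexdigest()


@dataclass
class IngestionSource:
    source_id: str
    secret_hash: str
    status: str = "awaiting_first_event"
    last_event_at: Optional[datetime] = None

    def record_event(self, now: Optional[datetime] = None) -> None:
        """Step 4: any received event marks the source active and updates the timestamp."""
        self.status = "active"
        self.last_event_at = now or datetime.now(timezone.utc)
```

A setup-contract-only source never reaches record_event, so in this model it would stay at awaiting_first_event with last_event_at unset.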

Auth contract: same shape across every receiver

All push-mode + webhook receivers use Authorization: Bearer lw_is_<secret> (the same shape the gateway uses for its vk-lw- virtual keys, just a different prefix). A request whose :sourceId path parameter does not match the source ID resolved from the secret returns 401 unauthorized, without revealing which of the two exists. The 24-hour rotation grace window is documented in the IngestionSource lifecycle architecture.
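As a rough sketch of the check described above, the function below resolves the bearer secret to a source ID and compares it with the path's source ID, returning the same 401 for every failure. The function name, the in-memory lookup table, and the SHA-256 hashing are assumptions made for illustration; the real receiver may be structured differently.

```python
import hashlib
import hmac
from typing import Mapping, Optional

BEARER = "Bearer "
SECRET_PREFIX = "lw_is_"


def authenticate(
    path_source_id: str,
    authorization: Optional[str],
    sources_by_hash: Mapping[str, str],
) -> int:
    """Return 200 if the bearer secret belongs to the source in the path, else 401."""
    if not authorization or not authorization.startswith(BEARER + SECRET_PREFIX):
        return 401
    secret = authorization[len(BEARER):]
    # Hypothetical lookup: hash of the secret -> source ID.
    digest = hashlib.sha256(secret.encode("utf-8")).hexdigest()
    resolved = sources_by_hash.get(digest)
    # Unknown secret and wrong-source secret share one exit, so the
    # response does not reveal which of the two cases occurred.
    if resolved is None or not hmac.compare_digest(
        resolved.encode("utf-8"), path_source_id.encode("utf-8")
    ):
        return 401
    return 200
```

Keeping a single failure path is what prevents a caller from probing which source IDs or secrets exist.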

Verification: same loop for every active source

Once a source is active, verify with the governance CLI debug helpers:
```bash
# What does the org see?
langwatch ingest list

# How healthy is one source?
langwatch ingest health <sourceId>

# What events landed in real time?
langwatch ingest tail <sourceId> --follow
```
The CLI hits the same backend the web detail page hits, byte-for-byte (--json mode is contract-stable against api.governance.eventsForSource and api.ingestionSources.healthMetrics).

Origin metadata: what every event carries

Every governance-ingested span or log_record is stamped at the receiver edge with attributes from reserved namespaces (rejected if found in user-supplied OTLP):
| Attribute | Purpose |
| --- | --- |
| langwatch.origin.kind | Always "ingestion_source" for governance traffic; the discriminator |
| langwatch.ingestion_source.id | Source identity (matches the lw_is_* Bearer’s resolved IngestionSource) |
| langwatch.ingestion_source.organization_id | Org tenancy for cross-source roll-ups |
| langwatch.ingestion_source.source_type | Source type (otel_generic, workato, etc.) for filtering |
| langwatch.governance.retention_class | thirty_days, one_year, or seven_years; drives ClickHouse TTL |
These appear as read-only system metadata in the trace viewer and the log detail pane; users cannot supply or edit them.
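The reject-then-stamp behaviour can be illustrated with a short sketch. The reserved prefixes are inferred from the attribute names listed above, and raising an error is only one reading of "rejected" (the receiver could equally drop the request with an HTTP error); stamp_origin and ReservedAttributeError are hypothetical names.

```python
# Prefixes inferred from the attribute table; treat them as an assumption.
RESERVED_PREFIXES = (
    "langwatch.origin.",
    "langwatch.ingestion_source.",
    "langwatch.governance.",
)
RETENTION_CLASSES = {"thirty_days", "one_year", "seven_years"}


class ReservedAttributeError(ValueError):
    """Raised when user-supplied attributes use a reserved namespace."""


def stamp_origin(
    user_attrs: dict,
    *,
    source_id: str,
    organization_id: str,
    source_type: str,
    retention_class: str,
) -> dict:
    """Reject reserved keys in user attributes, then add the origin metadata."""
    offending = sorted(k for k in user_attrs if k.startswith(RESERVED_PREFIXES))
    if offending:
        raise ReservedAttributeError(f"reserved attributes supplied: {offending}")
    if retention_class not in RETENTION_CLASSES:
        raise ValueError(f"unknown retention class: {retention_class}")
    return {
        **user_attrs,
        "langwatch.origin.kind": "ingestion_source",
        "langwatch.ingestion_source.id": source_id,
        "langwatch.ingestion_source.organization_id": organization_id,
        "langwatch.ingestion_source.source_type": source_type,
        "langwatch.governance.retention_class": retention_class,
    }
```

Checking for reserved keys before stamping means an upstream platform cannot spoof another source's identity or pick its own retention class.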

Adapter roadmap

The order in which deeper per-platform adapters land depends on customer pull. The current priority order:
  1. Cowork attribute extraction: Cowork pushes OTLP today; the receiver accepts it, but spans aren’t mapped to Cowork’s tool_use taxonomy yet. Smallest gap to close.
  2. Workato audit-shape parser: Workato has a stable JSON envelope for job-completed events; the default mapper retains the raw envelope, and deeper Actor, Action, and Target extraction is the follow-up.
  3. Microsoft Copilot Studio (Purview Audit) puller: needs an Azure AD app + cron worker; the config is persisted but the worker isn’t.
  4. OpenAI and Anthropic Enterprise Compliance pullers: S3 and the Anthropic Compliance API respectively; config persisted, pullers pending.
  5. s3_custom DSL parser: a generic-shape parser for homegrown audit logs; the customer-facing DSL design is in flight.
If you need an unwired adapter, file an issue tagged governance/ingestion/adapter-roadmap and we’ll sequence accordingly.

Cross-references