⚠️ Experimental: OpenClaw’s diagnostics-otel plugin is under active development. The instrumentation spec is stabilizing but may change. Follow the discussion at PR #11100 for the latest.
OpenClaw is an open-source AI agent framework that runs as a personal assistant or production copilot, handling tasks such as code reviews, monitoring, and incident response. Its built-in diagnostics-otel plugin emits traces that follow the OpenTelemetry GenAI Semantic Conventions, which LangWatch ingests natively.

Setup

1

Get your LangWatch API Key

Go to app.langwatch.ai/authorize to create your account and project, then grab your API key.
2

Configure OpenClaw

Add the following to your ~/.openclaw/openclaw.json:
{
  "diagnostics": {
    "enabled": true,
    "otel": {
      "enabled": true,
      "endpoint": "https://app.langwatch.ai/api/otel",
      "traces": true,
      "metrics": true,
      "headers": {
        "X-Auth-Token": "sk-lw-YOUR_API_KEY_HERE"
      },
      "serviceName": "my-clawdbot",
      "sampleRate": 1,
      "captureContent": true
    }
  },
  "plugins": {
    "allow": ["diagnostics-otel"],
    "entries": {
      "diagnostics-otel": {
        "enabled": true
      }
    }
  }
}
If you already have an openclaw.json, just merge the diagnostics and plugins sections into your existing config.
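The merge can also be scripted. The following is a minimal sketch, assuming the config file is strict JSON with no comments. The helper names `deep_merge` and `merge_into_file` and the merge policy (nested objects merged recursively, lists unioned, other values overwritten) are illustrative assumptions, not part of OpenClaw.

```python
import json
from pathlib import Path


def deep_merge(base: dict, extra: dict) -> dict:
    """Return a new dict with `extra` merged into `base`.

    Nested dicts are merged recursively, lists are unioned (order kept),
    and any other value from `extra` overwrites the one in `base`.
    """
    merged = dict(base)
    for key, value in extra.items():
        current = merged.get(key)
        if isinstance(current, dict) and isinstance(value, dict):
            merged[key] = deep_merge(current, value)
        elif isinstance(current, list) and isinstance(value, list):
            # Keep existing entries (e.g. other allowed plugins) and append new ones.
            merged[key] = current + [v for v in value if v not in current]
        else:
            merged[key] = value
    return merged


# A trimmed copy of the config above; add the remaining otel options as needed.
OTEL_FRAGMENT = {
    "diagnostics": {"enabled": True, "otel": {"enabled": True, "traces": True}},
    "plugins": {
        "allow": ["diagnostics-otel"],
        "entries": {"diagnostics-otel": {"enabled": True}},
    },
}


def merge_into_file(path: Path, fragment: dict) -> None:
    """Merge `fragment` into the JSON file at `path`, creating it if missing."""
    existing = json.loads(path.read_text()) if path.exists() else {}
    path.write_text(json.dumps(deep_merge(existing, fragment), indent=2) + "\n")
```

Calling `merge_into_file(Path.home() / ".openclaw" / "openclaw.json", OTEL_FRAGMENT)` would apply it to the default location. Back up the file first, since the script rewrites it in place.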
3

Restart the gateway

openclaw gateway restart
4

Test the integration

Send a message to your agent and check the LangWatch dashboard. Each agent turn produces a trace with the full execution tree.

What You Get

Each agent turn produces a span tree like this:
run (root span)
├── chat claude-opus-4-6        (LLM inference)
│   ├── tool.Read               (tool execution)
│   └── tool.Edit               (tool execution)
├── chat claude-opus-4-6        (follow-up LLM call)
│   └── tool.exec               (tool execution)
└── chat claude-opus-4-6        (final LLM call)
LLM spans are children of the run. Tool spans are children of the LLM call that invoked them. Each span includes:
  • Model info: which model was requested and which was used
  • Token usage: prompt tokens, completion tokens, cache read/write breakdown
  • Latency: duration of each LLM call and tool execution
  • Cost: calculated from token usage
  • Content: full input/output messages when captureContent is enabled
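Exported spans are flat records linked by parent IDs, and the tree is rebuilt from those links. The sketch below shows that reconstruction for the example turn above. The `Span` record with `span_id`, `parent_id`, and `name` is a simplified, hypothetical shape for illustration, not the OTLP wire format or a LangWatch API.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Span:
    span_id: str
    parent_id: Optional[str]  # None marks the root span
    name: str


def render_tree(spans: list[Span]) -> str:
    """Render spans as an indented tree, two spaces per nesting level."""
    children = defaultdict(list)
    for span in spans:
        children[span.parent_id].append(span)

    lines: list[str] = []

    def walk(span: Span, depth: int) -> None:
        lines.append("  " * depth + span.name)
        for child in children[span.span_id]:
            walk(child, depth + 1)

    for root in children[None]:
        walk(root, 0)
    return "\n".join(lines)


# The example turn: tool spans point at the LLM call that invoked them.
SPANS = [
    Span("1", None, "run"),
    Span("2", "1", "chat claude-opus-4-6"),
    Span("3", "2", "tool.Read"),
    Span("4", "2", "tool.Edit"),
    Span("5", "1", "chat claude-opus-4-6"),
    Span("6", "5", "tool.exec"),
    Span("7", "1", "chat claude-opus-4-6"),
]

if __name__ == "__main__":
    print(render_tree(SPANS))
```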

Content Capture

The captureContent flag controls whether message content is included in traces. When enabled, traces include:
  • gen_ai.input.messages: the full prompt sent to the model
  • gen_ai.output.messages: the model’s response
  • gen_ai.system_instructions: the system prompt
  • gen_ai.request.tools: tool definitions available to the model
  • Tool input/output on execution spans
When disabled, you still get the full trace structure, token counts, latency, and model metadata, but no message content. This is useful if you want observability without logging sensitive conversations.
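To make the difference concrete, here is a small sketch of which attributes remain on an LLM span when content capture is off. It only models the effect described above. The function name `visible_attributes` and the sample values are hypothetical, and the tool input/output attributes are left out because their names are not listed here.

```python
# Content-bearing attributes named in the list above.
CONTENT_ATTRIBUTES = frozenset({
    "gen_ai.input.messages",
    "gen_ai.output.messages",
    "gen_ai.system_instructions",
    "gen_ai.request.tools",
})


def visible_attributes(attributes: dict, capture_content: bool) -> dict:
    """Return the attributes a trace would carry for the given flag value."""
    if capture_content:
        return dict(attributes)
    return {k: v for k, v in attributes.items() if k not in CONTENT_ATTRIBUTES}


# Made-up sample span attributes for illustration.
EXAMPLE = {
    "gen_ai.request.model": "claude-opus-4-6",
    "gen_ai.usage.input_tokens": 1200,
    "gen_ai.usage.output_tokens": 85,
    "gen_ai.input.messages": "[...]",
    "gen_ai.output.messages": "[...]",
}

if __name__ == "__main__":
    # Prints only the model and token-usage keys.
    print(sorted(visible_attributes(EXAMPLE, capture_content=False)))
```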

Configuration Reference

Option                            Description                           Default
diagnostics.otel.enabled          Turn OTEL export on/off               false
diagnostics.otel.endpoint         OTLP endpoint URL                     -
diagnostics.otel.traces           Export traces                         true
diagnostics.otel.metrics          Export metrics                        true
diagnostics.otel.logs             Export logs                           false
diagnostics.otel.headers          Auth headers (include your API key)   {}
diagnostics.otel.serviceName      Service name in traces                "openclaw"
diagnostics.otel.sampleRate       Sampling rate (0.0–1.0)               1.0
diagnostics.otel.captureContent   Include message content in traces     false
For high-volume deployments, consider reducing sampleRate to control costs. A rate of 0.1 samples 10% of traces.
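As an illustration of what a sampling rate means in practice, the sketch below makes a deterministic keep/drop decision from the trace ID, in the spirit of OpenTelemetry's ratio-based samplers. Whether the plugin samples exactly this way is an assumption, and `should_sample` is a hypothetical helper, not a plugin API.

```python
import secrets

_LOW_64_BITS = (1 << 64) - 1


def should_sample(trace_id: int, sample_rate: float) -> bool:
    """Keep a trace when the low 64 bits of its ID fall below rate * 2**64.

    The decision depends only on the trace ID, so every span in the
    same trace gets the same answer.
    """
    if not 0.0 <= sample_rate <= 1.0:
        raise ValueError("sample_rate must be between 0.0 and 1.0")
    bound = int(sample_rate * (1 << 64))
    return (trace_id & _LOW_64_BITS) < bound


if __name__ == "__main__":
    # Random 128-bit IDs stand in for real trace IDs.
    ids = [secrets.randbits(128) for _ in range(10_000)]
    kept = sum(should_sample(i, 0.1) for i in ids)
    print(f"kept {kept} of {len(ids)} traces")  # close to 10% on average
```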

GenAI Semantic Conventions

The plugin emits traces compliant with the OTEL GenAI semantic conventions:
Attribute                                  Description
gen_ai.operation.name                      "chat" for LLM inference spans
gen_ai.system                              Provider identifier (e.g. "anthropic", "openai")
gen_ai.request.model                       Model requested
gen_ai.response.model                      Model actually used
gen_ai.usage.input_tokens                  Total input tokens (including cached)
gen_ai.usage.output_tokens                 Completion tokens
gen_ai.usage.cache_read_input_tokens       Tokens served from cache
gen_ai.usage.cache_creation_input_tokens   Tokens written to cache
LLM spans use SPAN_KIND_CLIENT per the GenAI spec (outbound RPCs to model providers).
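The usage attributes are enough to estimate cost per LLM span. The sketch below shows one way to do that arithmetic. It assumes that gen_ai.usage.input_tokens includes both cache-read and cache-creation tokens, and the per-token prices are made-up placeholders. `Pricing` and `estimate_cost` are hypothetical names, and this is not necessarily how LangWatch computes cost.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """USD per one million tokens. Values passed in below are placeholders."""
    input: float
    output: float
    cache_read: float
    cache_write: float


def estimate_cost(attrs: dict, pricing: Pricing) -> float:
    """Estimate the USD cost of one LLM span from its usage attributes."""
    total_input = attrs.get("gen_ai.usage.input_tokens", 0)
    output = attrs.get("gen_ai.usage.output_tokens", 0)
    cache_read = attrs.get("gen_ai.usage.cache_read_input_tokens", 0)
    cache_write = attrs.get("gen_ai.usage.cache_creation_input_tokens", 0)

    # Assumption: the total input count includes both cached categories,
    # so subtract them to get tokens billed at the regular input price.
    uncached = max(total_input - cache_read - cache_write, 0)

    cost_per_million = (
        uncached * pricing.input
        + cache_read * pricing.cache_read
        + cache_write * pricing.cache_write
        + output * pricing.output
    )
    return cost_per_million / 1_000_000


if __name__ == "__main__":
    placeholder_prices = Pricing(input=2.0, output=8.0, cache_read=0.5, cache_write=4.0)
    span_attrs = {
        "gen_ai.usage.input_tokens": 12_000,
        "gen_ai.usage.output_tokens": 800,
        "gen_ai.usage.cache_read_input_tokens": 9_000,
    }
    print(f"${estimate_cost(span_attrs, placeholder_prices):.4f}")
```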
For more information, check out the OpenClaw documentation and the diagnostics-otel discussion.