Voice Agents — Getting Started
Test voice agents end-to-end with the same scenario.run() API you use for text.
Scenario connects to YOUR agent over its native transport, drives a conversation
with a voice user simulator, and judges the result.
Install
pip install langwatch-scenario
Required env vars
| Variable | Required | Purpose |
|---|---|---|
| OPENAI_API_KEY | Yes | OpenAIRealtimeAgentAdapter + UserSimulatorAgent TTS + JudgeAgent LLM |
Set OPENAI_API_KEY in python/.env or export it in your shell before running.
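If you want a run to fail fast when the key is missing, a small preflight check is one option. The sketch below uses only the standard library; `missing_env_vars` is a hypothetical helper name and is not part of Scenario.

```python
import os
from typing import Iterable, Mapping, Optional


def missing_env_vars(
    required: Iterable[str],
    environ: Optional[Mapping[str, str]] = None,
) -> list[str]:
    """Return the names in `required` that are unset or empty.

    `environ` defaults to os.environ; pass a plain dict to check other
    settings sources (hypothetical helper, not a Scenario API).
    """
    env = os.environ if environ is None else environ
    return [name for name in required if not env.get(name)]
```

Calling `missing_env_vars(["OPENAI_API_KEY"])` before `scenario.run()` returns an empty list when the key is set, so the caller can exit with a clear message otherwise.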
Your first voice scenario
The snippet below uses OpenAIRealtimeAgentAdapter — the OpenAI Realtime API acts
as both the Scenario adapter and the agent under test. It is the lowest-friction
starting point: only OPENAI_API_KEY is required.
# Source: https://github.com/langwatch/scenario/blob/main/python/examples/voice/getting_started.py
"""
Getting Started: Scenario voice agents (OpenAI Realtime path).

What this proves:
    Scenario can drive a voice conversation end-to-end against an
    OpenAI Realtime agent. OpenAIRealtimeAgentAdapter is BOTH the
    scenario.run() adapter AND the agent under test. Only requires
    OPENAI_API_KEY.

Real users:
    Replace the OpenAIRealtimeAgentAdapter with the adapter that
    matches your stack (PipecatAgentAdapter for a Pipecat bot,
    TwilioAgentAdapter for a Twilio number, ElevenLabsAgentAdapter
    for a hosted ElevenLabs agent). See docs/voice/choosing-an-adapter.md.

How to run:
    cd python
    uv run examples/voice/getting_started.py

Required env vars:
    OPENAI_API_KEY - for OpenAIRealtimeAgentAdapter + JudgeAgent LLM

See also:
    docs/docs/pages/voice/getting-started.mdx - rendered docs page
    specs/voice-agents.feature - full behavioral contract
"""
import asyncio
import os
import sys
from pathlib import Path

try:
    from dotenv import load_dotenv

    load_dotenv(Path(__file__).resolve().parent.parent.parent / ".env")
except ImportError:
    # python-dotenv is optional; OPENAI_API_KEY may already be in the shell.
    pass

if not os.environ.get("OPENAI_API_KEY"):
    sys.exit("Error: OPENAI_API_KEY required.")

import scenario  # noqa: E402
from scenario.config.voice_models import OPENAI_REALTIME_MODEL  # noqa: E402
from scenario.types import AgentRole  # noqa: E402

scenario.configure(default_model="openai/gpt-4.1-mini")


async def main() -> scenario.ScenarioResult:
    """Run the getting-started voice scenario. Returns the ScenarioResult."""
    result = await scenario.run(
        name="voice_getting_started",
        description=(
            "A caller asks the agent a simple question. "
            "The agent responds helpfully."
        ),
        agents=[
            scenario.OpenAIRealtimeAgentAdapter(
                model=OPENAI_REALTIME_MODEL,
                voice="alloy",
                instructions="You are a helpful assistant. Keep responses brief.",
                role=AgentRole.AGENT,
            ),
            scenario.UserSimulatorAgent(voice="openai/nova"),
            scenario.JudgeAgent(
                criteria=[
                    "The agent responded helpfully to the user's question",
                    "The agent and user exchanged real audio turns",
                ]
            ),
        ],
        script=[
            scenario.user("Hi, can you help me?"),
            scenario.agent(),
            scenario.judge(),
        ],
    )
    print(f"success: {result.success}")
    print(f"verdict: {result.reasoning}")
    return result


if __name__ == "__main__":
    sys.exit(0 if asyncio.run(main()).success else 1)

Save this as getting_started.py and run it:
cd python
uv run examples/voice/getting_started.py
You should see:
success: True
verdict: The agent responded helpfully to the user's question ...
Using a different adapter
Replace OpenAIRealtimeAgentAdapter with the adapter that matches your stack:
- Pipecat bot → PipecatAgentAdapter(url="ws://localhost:8765/stream", ...)
- Twilio number → TwilioAgentAdapter(account_sid=..., auth_token=..., phone_number=...)
- ElevenLabs hosted agent → ElevenLabsAgentAdapter(agent_id=..., api_key=...)
See How to choose an adapter for constructor signatures and worked examples for each.
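When swapping adapters, it can be handy to confirm your settings cover the keyword arguments shown above before constructing anything. The following is a minimal sketch under stated assumptions: `ADAPTER_PARAMS` and `missing_adapter_params` are hypothetical names, and the table lists only the parameters mentioned on this page. The real constructors may accept or require more, so check the adapter guide for the actual signatures.

```python
from typing import Mapping

# Keyword arguments copied from the adapter list on this page.
# Assumption: this is not the full constructor signature of any adapter.
ADAPTER_PARAMS: dict[str, tuple[str, ...]] = {
    "PipecatAgentAdapter": ("url",),
    "TwilioAgentAdapter": ("account_sid", "auth_token", "phone_number"),
    "ElevenLabsAgentAdapter": ("agent_id", "api_key"),
}


def missing_adapter_params(
    adapter_name: str,
    settings: Mapping[str, object],
) -> list[str]:
    """Return the listed parameters that `settings` leaves unset or empty."""
    try:
        expected = ADAPTER_PARAMS[adapter_name]
    except KeyError:
        raise ValueError(f"Unknown adapter: {adapter_name}") from None
    return [name for name in expected if not settings.get(name)]
```

For example, passing Twilio settings without a phone number returns `["phone_number"]`, which you can surface as a configuration error before any call is placed.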
What's next
- How to choose an adapter — pick the adapter for your stack (Twilio, Pipecat, ElevenLabs, Gemini Live)
- Capability matrix — per-adapter feature support table
- Voice examples on GitHub — runnable demos per adapter and use case
- Voice agents feature file — full behavioral contract
