What is Scenario?

Combine Scenario with evaluations to ensure comprehensive agent quality across three levels:
- Level 1: Unit Tests. Traditional unit and integration tests that verify agent tools work correctly from a software perspective.
- Level 2: Evaluations and Optimization. Measure the performance of individual non-deterministic components, such as maximizing RAG accuracy with evaluations or approximating human preference with GRPO.
- Level 3: Agent Simulations. End-to-end testing across different scenarios and edge cases, ensuring the complete agent achieves more than the sum of its parts.
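To make the distinction between levels concrete, here is a minimal, framework-free sketch in plain Python. All names (`add_to_cart`, `shopping_agent`, `simulate`) are illustrative, not Scenario's API: a Level 1 unit test checks one tool in isolation, while a Level 3 simulation drives a multi-turn conversation and judges the end result of the whole exchange.

```python
# Illustrative only: plain Python, no Scenario API. Contrasts a
# Level 1 unit test with a Level 3 end-to-end simulation.

def add_to_cart(cart, item):
    """A deterministic agent tool."""
    return cart + [item]

# Level 1: unit-test the tool in isolation.
assert add_to_cart([], "apple") == ["apple"]

def shopping_agent(history, cart):
    """Toy agent: adds any mentioned fruit to the cart and confirms."""
    last = history[-1]["content"]
    for fruit in ("apple", "banana"):
        if fruit in last:
            cart[:] = add_to_cart(cart, fruit)
            return f"Added {fruit} to your cart."
    return "What would you like to buy?"

def simulate(agent, user_turns, cart):
    """Level 3: drive a scripted multi-turn conversation end to end."""
    history = []
    for turn in user_turns:
        history.append({"role": "user", "content": turn})
        history.append({"role": "assistant", "content": agent(history, cart)})
    return history

cart = []
transcript = simulate(shopping_agent, ["hi", "I want an apple"], cart)

# Judge the whole exchange, not a single call: the final state must
# reflect the full conversation.
assert cart == ["apple"]
assert transcript[-1]["content"] == "Added apple to your cart."
```

A real simulation would replace the scripted user turns with a simulated user and the final assertions with an LLM judge, but the shape is the same: the test owns the conversation loop and evaluates the outcome end to end.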
Scenario enables you to modify agent prompts, tools, and structure without regressions. The framework works with all AI agent frameworks and does not require datasets.
Getting Started
If you are new to Scenario, you can start by writing your first scenario, then learn how to integrate your agent and dive deeper into the core concepts.
- Getting Started: Your First Scenario
- Agent Integration: Integrate your agent with Scenario
- Core Concepts: Learn the core concepts and capabilities
Why Scenario?
Scenario is the most advanced and flexible agent testing framework; its agnostic design makes it simple to learn and use. Here are some of the key features:
- Test real agent behavior by simulating users in different scenarios and edge cases
- Evaluate and judge at any point of the conversation, with powerful multi-turn control
- Combine it with any LLM eval framework or custom evals, agnostic by design
- Integrate your agent by implementing just one call() method
- Available in Python, TypeScript and Go
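The single-method integration pattern can be sketched in plain Python. This is a hypothetical illustration of the idea, not Scenario's actual interface: the class and method names (`AgentAdapter`, `call`) and the message format are assumptions. The point is that the framework owns the conversation loop and only asks your adapter to turn a message history into a reply.

```python
# Hypothetical sketch of the one-method integration pattern.
# AgentAdapter and the messages format are illustrative assumptions,
# not Scenario's exact API.

class AgentAdapter:
    """Base class: implement call() to plug any agent into the simulator."""
    def call(self, messages: list[dict]) -> str:
        raise NotImplementedError

class EchoAgent(AgentAdapter):
    """Toy agent that replies to the latest user message."""
    def call(self, messages: list[dict]) -> str:
        last_user = next(m for m in reversed(messages) if m["role"] == "user")
        return f"You said: {last_user['content']}"

# The simulator would invoke the adapter on each turn; here we call it once.
agent = EchoAgent()
reply = agent.call([{"role": "user", "content": "hello"}])
```

Because the adapter only depends on the message history, the same pattern wraps an HTTP call to a deployed agent, a local LangGraph graph, or anything in between.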
Learn more about simulation-based testing principles in Simulation-Based Testing.
Explore how to write effective scenarios, review agent integrations for LangGraph, CrewAI, and Pydantic AI, or check out the testing guides for tool calling and fixtures.
