Audio Effects
This recipe shows how to inject background noise, codec degradation, and other audio impairments into a voice scenario. Effects are applied to the user simulator's audio after TTS synthesis — they are never baked into the TTS cache.
Pattern
Pass a list of effect functions to the user simulator's audio_effects /
audioEffects. Effects compose left-to-right: the output of each effect is the
input to the next.
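Left-to-right composition amounts to a fold over the effect list. The following is a minimal sketch of that idea, not the SDK's implementation: it assumes each effect is a plain bytes-to-bytes function, and `apply_effects`, `tag_a`, and `tag_b` are hypothetical names used only for illustration.

```python
from functools import reduce
from typing import Callable, Sequence

# Assumption: an effect takes audio bytes and returns audio bytes.
Effect = Callable[[bytes], bytes]


def apply_effects(audio: bytes, effects: Sequence[Effect]) -> bytes:
    """Run effects left-to-right: each effect's output feeds the next one."""
    return reduce(lambda data, effect: effect(data), effects, audio)


# Hypothetical stand-in effects that append a marker so the order is visible.
def tag_a(data: bytes) -> bytes:
    return data + b"A"


def tag_b(data: bytes) -> bytes:
    return data + b"B"


result = apply_effects(b"", [tag_a, tag_b])  # tag_a runs first, then tag_b
```

Order matters for real effects too: mixing in noise before a phone-line filter band-limits the noise along with the speech, while the reverse order leaves the noise at full bandwidth.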
import scenario

scenario.UserSimulatorAgent(
    voice="openai/nova",
    # Applied in order: cafe noise is mixed in first, then the phone-line filter.
    audio_effects=[
        scenario.effects.background_noise("cafe", volume=0.4),
        scenario.effects.phone_quality(),
    ],
)
Available effects
Effect names are snake_case on scenario.effects (Python) and camelCase on
voice.effects (TypeScript). The one non-obvious rename: static →
static_ in TypeScript (static is a reserved word).
| Python (scenario.effects) | TypeScript (voice.effects) | Description |
|---|---|---|
| background_noise(preset_or_path, volume) | backgroundNoise(presetOrPath, volume?) | Overlay ambient noise. Built-in presets: "cafe", "street", "office", "airport". Pass a .wav path for custom noise. |
| static(intensity) | static_(intensity?) | White-noise static at the given fraction of full scale. |
| multiple_voices() | multipleVoices(backgroundAudio?) | Mix with a babble sample to simulate background conversation. |
| phone_quality() | phoneQuality() | Bandpass 300 Hz–3.4 kHz + mild compression, mimicking a phone line. |
| low_quality(bitrate) | lowQuality(bitrate?) | Downsample to bitrate Hz and back, introducing aliasing and quantisation artefacts. |
| packet_loss(probability, chunk_ms) | packetLoss(probability?, chunkMs?) | Zero out random windows at the given probability. |
| echo(delay_ms, decay) | echo(delayMs?, decay?) | Overlay a delayed, attenuated copy of the signal. |
| robotic() | robotic() | Ring-modulate the signal with a 100 Hz carrier for a robotic timbre. |
| breaking_up() | breakingUp() | Frequent 100 ms dropouts simulating a losing-signal scenario. |
| custom(fn) | custom(fn) | Bring your own (bytes) => bytes operating on PCM16 @ 24 kHz mono. |
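As an illustration of the custom(fn) row, here is a hedged sketch of a bytes-to-bytes function that silences random fixed-size windows, similar in spirit to the packet_loss description. It is not the SDK's packet_loss implementation. `make_dropout` and its `seed` parameter are hypothetical, and the sketch assumes the function receives a whole utterance as PCM16 mono at 24 kHz.

```python
import random

SAMPLE_RATE_HZ = 24_000  # from the table: PCM16 @ 24 kHz mono
BYTES_PER_SAMPLE = 2     # 16-bit samples


def make_dropout(probability: float, chunk_ms: int, seed: int = 0):
    """Build a bytes -> bytes effect that zeroes random windows of chunk_ms."""
    chunk_bytes = SAMPLE_RATE_HZ * chunk_ms // 1000 * BYTES_PER_SAMPLE
    if chunk_bytes <= 0:
        raise ValueError("chunk_ms is too small to cover a single sample")
    rng = random.Random(seed)  # seeded so a test run is reproducible

    def effect(pcm: bytes) -> bytes:
        out = bytearray(pcm)
        for start in range(0, len(out), chunk_bytes):
            if rng.random() < probability:
                end = min(start + chunk_bytes, len(out))
                out[start:end] = bytes(end - start)  # zero samples are silence
        return bytes(out)

    return effect


one_second = b"\x01\x00" * SAMPLE_RATE_HZ  # 1 s of constant non-zero samples
degraded = make_dropout(probability=0.5, chunk_ms=20)(one_second)

# Per the table, a function like this would be wrapped with
# scenario.effects.custom(fn) before being added to audio_effects.
```

Because the window size is always a multiple of two bytes, a dropout never splits a 16-bit sample in half.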
The TypeScript SDK also ships prosody effects not in the Python table above:
lowVolume(factor?), highVolume(factor?), speakingFast(factor?),
speakingSlow(factor?).
Per-step overrides
The simulator-level audio_effects / audioEffects list applies to every user
turn in the scenario.
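The relationship between this simulator-level list and the one-turn overrides covered next can be pictured with a toy model. This is only a sketch: `TurnEffects` is a hypothetical class, not part of either SDK, and it assumes an override replaces the defaults for that turn rather than adding to them.

```python
from typing import Callable, List, Optional, Sequence

Effect = Callable[[bytes], bytes]


class TurnEffects:
    """Toy model: simulator-level defaults plus an optional one-turn override."""

    def __init__(self, defaults: Sequence[Effect]) -> None:
        self.defaults: List[Effect] = list(defaults)

    def effects_for_turn(
        self, override: Optional[Sequence[Effect]] = None
    ) -> List[Effect]:
        # Assumption: an override replaces the defaults for this turn only.
        # The stored defaults are never mutated, so the next turn sees them again.
        return list(override) if override is not None else list(self.defaults)


# Hypothetical stand-in effects.
def noise(pcm: bytes) -> bytes:
    return pcm + b"N"


def quiet(pcm: bytes) -> bytes:
    return pcm + b"Q"


sim = TurnEffects([noise])
turn1 = sim.effects_for_turn()                  # defaults
turn2 = sim.effects_for_turn(override=[quiet])  # one-shot override
turn3 = sim.effects_for_turn()                  # defaults resume
```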
Both SDKs also let you override effects (and voice style) for a single turn
by passing options to scenario.user(...). The override applies to that turn
only, then the simulator's defaults resume. In TypeScript:
scenario.user("Hello?", { audioEffects: [voice.effects.lowVolume(0.3)] });
scenario.user("I'm really upset!", { voiceStyle: "angry" });
Python's scenario.user() accepts voice_style and audio_effects with the same
one-shot semantics:
scenario.user("Hello?", audio_effects=[scenario.effects.low_volume(0.3)])
scenario.user("I'm really upset!", voice_style="angry")
Worked example
Python:
- angry_customer.py: applies background_noise("cafe", 0.4) and phone_quality()
  to simulate an angry caller in a noisy cafe on a poor phone connection.
  JudgeAgent verifies that the agent handles the emotional tone and the noise
  robustly.
- background_handoff.py: uses background_noise("cafe", 0.5) to simulate
  overheard conversation during a handoff; verifies that the agent does not
  respond to the background audio as if it were user speech.
TypeScript:
- angry-customer.test.ts: the TypeScript counterpart, with
  backgroundNoise("cafe", 0.4) and phoneQuality() on the user simulator; the
  judge verifies emotional tone handling.
- background-handoff.test.ts: background noise during a handoff scenario;
  verifies that the agent does not treat background audio as user speech.
See also
- Capability matrix — adapter support for audio processing
- Interruption recipe — barge-in and VAD interaction with noisy audio
- Observability recipe — capture raw audio chunks for offline analysis
