Audio Effects

This recipe shows how to inject background noise, codec degradation, and other audio impairments into a voice scenario. Effects are applied to the user simulator's audio after TTS synthesis — they are never baked into the TTS cache.

Pattern

Pass a list of effect functions to the user simulator's audio_effects (Python) or audioEffects (TypeScript) parameter. Effects compose left to right: the output of each effect is the input to the next.

python
scenario.UserSimulatorAgent(
    voice="openai/nova",
    audio_effects=[
        scenario.effects.background_noise("cafe", volume=0.4),
        scenario.effects.phone_quality(),
    ],
)
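
The left-to-right composition described above can be pictured as a fold over the effect list. This is a minimal sketch and not the SDK's implementation: apply_effects and the two toy effects are hypothetical names, and it assumes each effect is a plain bytes-to-bytes callable, as described for custom(fn).

```python
from functools import reduce
from typing import Callable, Sequence

# Assumption: an effect is any callable that maps audio bytes to audio bytes.
AudioEffect = Callable[[bytes], bytes]


def apply_effects(pcm: bytes, effects: Sequence[AudioEffect]) -> bytes:
    """Run each effect in list order, feeding each output into the next."""
    return reduce(lambda audio, effect: effect(audio), effects, pcm)


# Two toy effects (hypothetical) that make the ordering visible.
def append_a(pcm: bytes) -> bytes:
    return pcm + b"A"


def append_b(pcm: bytes) -> bytes:
    return pcm + b"B"


# append_a runs first, then append_b sees its output.
result = apply_effects(b"", [append_a, append_b])
```

Order matters for real effects too: in the snippet above, the cafe noise is added first, so the phone-quality filter then shapes both the voice and the noise.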

Available effects

Effect names are snake_case on scenario.effects (Python) and camelCase on voice.effects (TypeScript). The one non-obvious rename: static → static_ in TypeScript (static is a reserved word).

| Python (scenario.effects) | TypeScript (voice.effects) | Description |
| --- | --- | --- |
| background_noise(preset_or_path, volume) | backgroundNoise(presetOrPath, volume?) | Overlay ambient noise. Built-in presets: "cafe", "street", "office", "airport". Pass a .wav path for custom noise. |
| static(intensity) | static_(intensity?) | White-noise static at the given fraction of full scale. |
| multiple_voices() | multipleVoices(backgroundAudio?) | Mix with a babble sample to simulate background conversation. |
| phone_quality() | phoneQuality() | Bandpass 300Hz–3.4kHz + mild compression, mimicking a phone line. |
| low_quality(bitrate) | lowQuality(bitrate?) | Downsample to bitrate Hz and back, producing aliasing and quantisation artefacts. |
| packet_loss(probability, chunk_ms) | packetLoss(probability?, chunkMs?) | Zero out random windows at the given probability. |
| echo(delay_ms, decay) | echo(delayMs?, decay?) | Overlay a delayed, attenuated copy of the signal. |
| robotic() | robotic() | Ring-modulate the signal with a 100Hz carrier for a robotic timbre. |
| breaking_up() | breakingUp() | Frequent 100ms dropouts simulating a losing-signal scenario. |
| custom(fn) | custom(fn) | Bring your own (bytes) => bytes operating on PCM16 @ 24kHz mono. |
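
As an illustration of the custom(fn) contract, here is a minimal sketch of a bytes-to-bytes effect that scales the volume. The helper name make_gain is hypothetical, and the code assumes the PCM16 samples are signed, little-endian, and mono; check the byte order your pipeline actually uses before relying on it.

```python
import struct
from typing import Callable


def make_gain(factor: float) -> Callable[[bytes], bytes]:
    """Build an effect that multiplies every sample by `factor` (hypothetical helper)."""

    def gain(pcm: bytes) -> bytes:
        # Assumption: signed 16-bit little-endian samples, 2 bytes each.
        count = len(pcm) // 2
        samples = struct.unpack(f"<{count}h", pcm[: count * 2])
        # Clamp to the int16 range so loud input clips instead of overflowing.
        scaled = [max(-32768, min(32767, int(s * factor))) for s in samples]
        return struct.pack(f"<{count}h", *scaled)

    return gain


half_volume = make_gain(0.5)
quiet = half_volume(struct.pack("<3h", 1000, -2000, 0))

# Assumed usage with the SDK, based on the custom(fn) row above:
#   audio_effects=[scenario.effects.custom(half_volume)]
```

Because the function only sees raw bytes, it has to make its own assumptions about sample width and rate. At 24kHz mono PCM16, one second of audio is 48,000 bytes.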

The TypeScript SDK also ships prosody effects that are not listed in the table above: lowVolume(factor?), highVolume(factor?), speakingFast(factor?), speakingSlow(factor?).

Per-step overrides

The simulator-level audio_effects / audioEffects list applies to every user turn in the scenario.

You can also override effects (and voice style) for a single turn by passing options to scenario.user(...). The override applies to that turn only, then the simulator's defaults resume. In TypeScript:

scenario.user("Hello?", { audioEffects: [voice.effects.lowVolume(0.3)] });
scenario.user("I'm really upset!", { voiceStyle: "angry" });

Python supports the same per-step overrides: scenario.user() accepts voice_style and audio_effects keyword arguments with the same one-shot semantics as TypeScript:

scenario.user("Hello?", audio_effects=[scenario.effects.low_volume(0.3)])
scenario.user("I'm really upset!", voice_style="angry")
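
The one-shot semantics can be pictured with a small sketch. This is not the SDK's code: effects_for_turn is a hypothetical helper, and it assumes a per-step list replaces the simulator-level list for that turn rather than being appended to it.

```python
from typing import Callable, List, Optional, Sequence

AudioEffect = Callable[[bytes], bytes]


def effects_for_turn(
    defaults: Sequence[AudioEffect],
    override: Optional[Sequence[AudioEffect]] = None,
) -> List[AudioEffect]:
    """Pick the effect list for one user turn (hypothetical helper).

    Assumption: an override replaces the defaults for this turn only;
    the defaults themselves are never modified.
    """
    return list(defaults) if override is None else list(override)


def noisy(pcm: bytes) -> bytes:  # stand-in for a simulator-level effect
    return pcm


def quiet(pcm: bytes) -> bytes:  # stand-in for a per-step effect
    return pcm


defaults = [noisy]
turn_1 = effects_for_turn(defaults)           # no override: defaults apply
turn_2 = effects_for_turn(defaults, [quiet])  # override for this turn only
turn_3 = effects_for_turn(defaults)           # defaults resume
```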

Worked examples

Python:

angry_customer.py: applies background_noise("cafe", 0.4) and phone_quality() to simulate an angry caller in a noisy cafe on a poor phone connection. JudgeAgent verifies that the agent handles the emotional tone and the noise robustly.

background_handoff.py: uses background_noise("cafe", 0.5) to simulate overheard conversation during a handoff, and verifies that the agent does not respond to the background audio as if it were user speech.

TypeScript:

angry-customer.test.ts: applies backgroundNoise("cafe", 0.4) and phoneQuality() to the user simulator; the judge verifies emotional tone handling.

background-handoff.test.ts: adds background noise during a handoff scenario and verifies that the agent does not treat background audio as user speech.

See also