> ## Documentation Index
> Fetch the complete documentation index at: https://langwatch.ai/docs/llms.txt
> Use this file to discover all available pages before exploring further.

# Guardrails Overview

> Block or modify harmful LLM responses in real time to enforce safety and policy constraints.

<Tip>
  **Let your agent set this up.** [Copy the evaluations prompt](/skills/code-prompts#set-up-evaluations) into your coding agent to get started automatically.
</Tip>

Guardrails are evaluators that run in real time and **act** on the results: blocking, modifying, or rejecting responses that violate your safety or policy rules. Unlike [monitors](/evaluations/online-evaluation/overview), which only measure and alert, guardrails actively prevent harmful content from reaching users.

## Guardrails vs Monitors

| Guardrails                           | Monitors                              |
| ------------------------------------ | ------------------------------------- |
| **Block** harmful content            | **Measure** quality metrics           |
| Run **synchronously** during request | Run **asynchronously** after response |
| Return errors or safe responses      | Feed dashboards and alerts            |
| Add latency to requests              | No impact on response time            |
| For **enforcement**                  | For **observability**                 |

<Info>
  Use guardrails when you need to **prevent** something from happening. Use monitors when you need to **observe** what's happening.
</Info>

## Common Guardrail Use Cases

| Use Case                   | Evaluator                 | Action                   |
| -------------------------- | ------------------------- | ------------------------ |
| Block jailbreak attempts   | Azure Jailbreak Detection | Reject input             |
| Prevent PII exposure       | Presidio PII Detection    | Block or redact response |
| Enforce content policy     | OpenAI Moderation         | Return safe response     |
| Block competitor mentions  | Competitor Blocklist      | Modify or reject         |
| Ensure valid output format | Valid Format Evaluator    | Retry or reject          |
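
Each "Action" is a small piece of control flow in your application. For example, the "Retry or reject" action for structured output can be a bounded retry loop around your generation call. A minimal sketch, assuming the `evaluate` call shown in the Quick Example below; the evaluator slug and helper name here are illustrative:

```python theme={null}
import langwatch

def generate_structured_answer(user_input: str, max_attempts: int = 2) -> str:
    """Retry generation while the output guardrail rejects the format."""
    for _ in range(max_attempts):
        response = call_llm(user_input)  # your own LLM call
        format_check = langwatch.evaluation.evaluate(
            "langwatch/valid_format",  # illustrative slug; use the evaluator you configured
            name="Valid Format",
            as_guardrail=True,
            data={"output": response},
        )
        if format_check.passed:
            return response
    # Reject after exhausting the retry budget
    return "Sorry, I couldn't produce a correctly formatted answer. Please try again."
```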

## How Guardrails Work

```
User Input → Guardrail Check → [Pass] → LLM → Response → Guardrail Check → [Pass] → User
                    ↓                                           ↓
               [Fail] → Return Error                     [Fail] → Return Safe Response
```

Guardrails can run at two points:

1. **Input guardrails** - Check user input before calling your LLM
2. **Output guardrails** - Check LLM response before sending to user

## Getting Started

<CardGroup cols={2}>
  <Card title="Code Integration" description="Add guardrails to your application with a few lines of code." icon="code" href="/evaluations/guardrails/code-integration" />

  <Card title="Available Evaluators" description="Browse evaluators that work well as guardrails." icon="list" href="/evaluations/evaluators/list" />
</CardGroup>

## Quick Example

<Tabs>
  <Tab title="Python">
    ```python theme={null}
    import langwatch

    @langwatch.trace()
    def my_chatbot(user_input):
        # Input guardrail - check for jailbreak attempts
        jailbreak_check = langwatch.evaluation.evaluate(
            "azure/jailbreak",
            name="Jailbreak Detection",
            as_guardrail=True,
            data={"input": user_input},
        )
        
        if not jailbreak_check.passed:
            return "I'm sorry, I can't help with that request."
        
        # Generate response
        response = call_llm(user_input)
        
        # Output guardrail - check for PII
        pii_check = langwatch.evaluation.evaluate(
            "presidio/pii_detection",
            name="PII Check",
            as_guardrail=True,
            data={"output": response},
        )
        
        if not pii_check.passed:
            return "I apologize, but I cannot share that information."
        
        return response
    ```
  </Tab>

  <Tab title="TypeScript">
    ```typescript theme={null}
    import { LangWatch } from "langwatch";

    const langwatch = new LangWatch();

    async function myChatbot(userInput: string): Promise<string> {
      // Input guardrail - check for jailbreak attempts
      const jailbreakCheck = await langwatch.evaluations.evaluate("azure/jailbreak", {
        name: "Jailbreak Detection",
        asGuardrail: true,
        data: { input: userInput },
      });
      
      if (!jailbreakCheck.passed) {
        return "I'm sorry, I can't help with that request.";
      }
      
      // Generate response
      const response = await callLLM(userInput);
      
      // Output guardrail - check for PII
      const piiCheck = await langwatch.evaluations.evaluate("presidio/pii_detection", {
        name: "PII Check",
        asGuardrail: true,
        data: { output: response },
      });
      
      if (!piiCheck.passed) {
        return "I apologize, but I cannot share that information.";
      }
      
      return response;
    }
    ```
  </Tab>
</Tabs>

## Best Practices

### 1. Layer your guardrails

Use multiple guardrails for defense in depth:

```python theme={null}
import langwatch

# Layer 1: Block malicious input
jailbreak = langwatch.evaluation.evaluate(
    "azure/jailbreak", as_guardrail=True, data={"input": user_input}
)

# Layer 2: Content moderation
moderation = langwatch.evaluation.evaluate(
    "openai/moderation", as_guardrail=True, data={"input": user_input}
)

# Layer 3: Check output before sending
pii = langwatch.evaluation.evaluate(
    "presidio/pii_detection", as_guardrail=True, data={"output": response}
)
```

### 2. Provide helpful error messages

Don't just block; guide users toward acceptable behavior:

```python theme={null}
if not guardrail.passed:
    if guardrail.details:
        return f"I can't help with that because: {guardrail.details}"
    return "I'm not able to assist with that request. Could you rephrase?"
```

### 3. Log guardrail triggers

Track when guardrails fire for monitoring and improvement:

```python theme={null}
if not guardrail.passed:
    langwatch.get_current_trace().update(
        metadata={"guardrail_triggered": guardrail.name}
    )
```

### 4. Consider latency

Guardrails add latency. For time-sensitive applications:

* Use fast evaluators (regex, blocklists) for input checks
* Save heavier evaluators (LLM-based) for output checks
* Run multiple guardrails in parallel when possible (see the sketch below)
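
As a sketch of the last point, the synchronous `evaluate` call from the Quick Example can be fanned out over a thread pool. The helper name and the specific pairing of evaluators below are illustrative:

```python theme={null}
from concurrent.futures import ThreadPoolExecutor

import langwatch

def input_guardrails_pass(user_input: str) -> bool:
    """Run the input guardrails concurrently; pass only if every check passes."""
    checks = [
        ("azure/jailbreak", "Jailbreak Detection"),
        ("openai/moderation", "Content Moderation"),
    ]
    with ThreadPoolExecutor(max_workers=len(checks)) as executor:
        results = executor.map(
            lambda check: langwatch.evaluation.evaluate(
                check[0],
                name=check[1],
                as_guardrail=True,
                data={"input": user_input},
            ),
            checks,
        )
        return all(result.passed for result in results)
```

When this helper returns `False`, return an error message to the user, exactly as in the Quick Example above.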

## Recommended Evaluators for Guardrails

| Evaluator                 | Best For                     | Latency   |
| ------------------------- | ---------------------------- | --------- |
| Azure Jailbreak Detection | Blocking prompt injection    | Fast      |
| Azure Prompt Shield       | Blocking prompt attacks      | Fast      |
| Presidio PII Detection    | Blocking PII exposure        | Fast      |
| OpenAI Moderation         | Content policy enforcement   | Fast      |
| Competitor Blocklist      | Blocking competitor mentions | Very Fast |
| Valid Format              | Ensuring structured output   | Very Fast |
| LLM-as-Judge Boolean      | Custom policy checks         | Slower    |

## Next Steps

<CardGroup cols={2}>
  <Card title="Code Integration" description="Detailed guide to implementing guardrails in your code." icon="code" href="/evaluations/guardrails/code-integration" />

  <Card title="Evaluators List" description="Browse all available evaluators." icon="list" href="/evaluations/evaluators/list" />

  <Card title="Online Evaluation" description="Set up monitors for observability." icon="chart-line" href="/evaluations/online-evaluation/overview" />

  <Card title="Python Integration" description="Full Python SDK documentation." icon="python" href="/integration/python/guide" />
</CardGroup>
