# Evaluations Overview

> Ensure quality and safety for your LLM applications with experiments, online evaluation, guardrails, and evaluators.

<Tip>
  **Let your agent set this up.** [Copy the evaluations prompt](/skills/code-prompts#set-up-evaluations) into your coding agent to get started automatically.
</Tip>

LangWatch provides comprehensive evaluation tools for your LLM applications. Whether you're evaluating before deployment or monitoring in production, we have you covered.

## The Agent Evaluation Lifecycle

```
BUILD → TEST → DEPLOY → MONITOR
         ↓              ↓
    Experiments    Online Evaluation
         ↓              ↓
    CI/CD Gate      Guardrails
```

## Core Concepts

<CardGroup cols={2}>
  <Card title="Experiments" description="Batch test your prompts, models, and agents on datasets before deploying to production." icon="flask" href="/evaluations/experiments/overview" />

  <Card title="Online Evaluation" description="Continuously score and monitor your LLM's production traffic for quality and safety." icon="chart-line" href="/evaluations/online-evaluation/overview" />

  <Card title="Guardrails" description="Block or modify responses in real-time to enforce safety and policy constraints." icon="shield" href="/evaluations/guardrails/overview" />

  <Card title="Evaluators" description="Scoring functions that assess output quality - from built-in options to your custom configurations." icon="check-double" href="/evaluations/evaluators/overview" />
</CardGroup>

## When to Use What

| Use Case                                   | Solution                                                                                            |
| ------------------------------------------ | --------------------------------------------------------------------------------------------------- |
| Test prompt changes before deploying       | [Experiments](/evaluations/experiments/overview)                                                    |
| Compare different models or configurations | [Experiments](/evaluations/experiments/overview)                                                    |
| Run quality checks in CI/CD                | [Experiments CI/CD](/evaluations/experiments/ci-cd)                                                 |
| Monitor production quality over time       | [Online Evaluation](/evaluations/online-evaluation/overview)                                        |
| Block harmful or policy-violating content  | [Guardrails](/evaluations/guardrails/overview)                                                      |
| Get alerts when quality drops              | [Online Evaluation](/evaluations/online-evaluation/overview) + [Automations](/features/automations) |

## Quick Start

### 1. Run Your First Experiment

Test your LLM on a dataset using Experiments, either in the UI or from code:

<Tabs>
  <Tab title="Platform">
    Go to [Experiments](https://app.langwatch.ai/@project/evaluations) and click "New Experiment" to get started with the UI.
  </Tab>

  <Tab title="Python">
    ```python theme={null}
    import langwatch

    evaluation = langwatch.experiment.init("my-first-experiment")

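    # `dataset` is assumed to be a pandas DataFrame with an "input" column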
    for idx, row in evaluation.loop(dataset.iterrows()):
        response = my_llm(row["input"])
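        # Placeholder score; in a real run, compute it from `response` with an evaluator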
        evaluation.log("quality", index=idx, score=0.95)
    ```
  </Tab>

  <Tab title="TypeScript">
    ```typescript theme={null}
    import { LangWatch } from "langwatch";

    const langwatch = new LangWatch();
    const evaluation = await langwatch.experiments.init("my-first-experiment");

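    // `dataset` is assumed to be an array of items with an `input` field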
    await evaluation.run(dataset, async ({ item, index }) => {
      const response = await myLLM(item.input);
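      // Placeholder score; in a real run, derive it from `response`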
      evaluation.log("quality", { index, score: 0.95 });
    });
    ```
  </Tab>
</Tabs>
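
Both snippets assume the SDK is installed (`pip install langwatch` / `npm install langwatch`) and that `LANGWATCH_API_KEY` is set in your environment; logged scores appear under the experiment name in the LangWatch dashboard.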

### 2. Set Up Online Evaluation

Monitor your production traffic with evaluators that run on every trace:

1. Go to [Monitors](https://app.langwatch.ai/@project/evaluations)
2. Create a new monitor with "When a message arrives" trigger
3. Select evaluators (e.g., PII Detection, Faithfulness)
4. Enable monitoring
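
Monitors score traffic as it arrives, so your application needs to be sending traces first. A minimal sketch, assuming the Python SDK picks up `LANGWATCH_API_KEY` from the environment and reusing the `@langwatch.trace()` decorator from the guardrails example below (`handle_message` and `my_llm` are placeholders for your own code):

```python theme={null}
import langwatch

# Every call to this function produces a trace in LangWatch, which your
# "When a message arrives" monitors will pick up and score automatically.
@langwatch.trace()
def handle_message(user_input: str) -> str:
    return my_llm(user_input)  # placeholder for your existing LLM call
```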

### 3. Add Guardrails

Protect your users by blocking harmful content in real-time:

```python theme={null}
import langwatch

@langwatch.trace()
def my_llm_call(user_input):
    # Check input before processing
    guardrail = langwatch.evaluation.evaluate(
        "azure/jailbreak",
        name="Jailbreak Detection",
        as_guardrail=True,
        data={"input": user_input},
    )

    if not guardrail.passed:
        return "I can't help with that request."

    # Continue with normal processing...
```
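
The same pattern works on the way out. A hedged sketch of an output-side guardrail, assuming an output-safety evaluator slug such as `azure/content_safety`, that the evaluator accepts an `output` field, and a hypothetical `generate_response` helper for the actual LLM call:

```python theme={null}
import langwatch

@langwatch.trace()
def my_llm_call(user_input):
    # ... input guardrail as above ...
    response = generate_response(user_input)  # hypothetical helper

    # Check the generated output before it reaches the user
    output_guardrail = langwatch.evaluation.evaluate(
        "azure/content_safety",  # assumed evaluator slug
        name="Content Safety",
        as_guardrail=True,
        data={"output": response},
    )

    if not output_guardrail.passed:
        return "I can't share that response."

    return response
```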

## Supporting Resources

<CardGroup cols={2}>
  <Card title="Datasets" description="Create and manage test datasets for your experiments." icon="table" href="/datasets/overview" />

  <Card title="Annotations" description="Add human feedback and labels to improve quality." icon="pencil" href="/features/annotations" />
</CardGroup>
