Blog
News, insights, & more
Expert insights on AI agent testing, LLM evaluation, and more.

Article
Trace IDs in AI: LLM Observability and Distributed Tracing
Learn how trace IDs enable observability across LLM apps. Track prompts, tokens, latency, & costs across AI workflows.

Manouk
Aug 22, 2025

Article
The 6 context engineering challenges stopping AI from scaling in production
Discover 6 context engineering challenges blocking AI from scaling—and how LangWatch helps enterprises overcome them
Manouk
Aug 19, 2025

Article
LLMOps is the new DevOps, here’s what every developer must know
LLMOps is the new DevOps. Discover what every developer must know to manage, test, and scale AI applications with confidence
Manouk
Aug 18, 2025

Article
LLM observability: What is it and why it matters
What is LLM observability, and why is agent observability becoming critical for the future of AI systems?

Manouk Draisma
Aug 14, 2025

Article
Observability Framework Design for LLM Apps – The Complete LangWatch Guide
Understand observability framework architecture for LLM apps. Learn design principles and monitoring strategies

Manouk Draisma
Aug 1, 2025

Article
GPT-5 Release: From Benchmarks to production reality
OpenAI has released its newest flagship model, GPT-5. Start evaluating its performance in LangWatch, available now.

Manouk Draisma
Aug 8, 2025

Article
LLM-as-a-Judge: Using the Panel of Judges Approach to Approximate Human Preference
Discover how multiple LLM-as-a-judge evals create a panel system that matches human preference for subjective AI quality.

Rogerio Chaves
Aug 7, 2025

News
Top 4 Humanloop Alternatives in 2025
Looking for a Humanloop alternative? These are the top platforms for LLM evaluation, agent testing, and observability.

Manouk Draisma
Jul 18, 2025

Article
Why Agent Simulations are the new Unit Tests for AI
Learn why simulation is key to AI agent testing and how LangWatch Scenario brings scenario-based tests to your CI/CD.

Tahmid - AI researcher @ LangWatch
Jul 7, 2025

Article
Multilingual AI Agent Testing: Using Scenario to Simulate, Break, and Improve LLMs
Discover how Scenario enables bulletproof testing of multilingual LLM agents, ensuring your AI is production-ready

Andrew - Engineer @ LangWatch
Jun 20, 2025

Article
Real-time simulation visualization and debug mode
Watch simulated conversations play out in real-time with parallel execution, versioned runs, and interactive debugging.

Rogerio Chaves
Jun 27, 2025

Article
Scripted simulations, evaluations, and guardrails
Scripted simulations let you orchestrate how conversations unfold, when evaluations occur, and what custom logic runs.

Rogerio Chaves
Jun 26, 2025

Article
Test agents on Mastra, Agno, and 10+ other frameworks
Scenario is framework-agnostic, supporting any agent architecture through the AgentAdapter interface.

Rogerio Chaves
Jun 25, 2025

Article
LangSmith Alternatives: What to use if you need more security and control
Explore top LangSmith alternatives, including LangWatch.ai, the ideal platform for optimizing, evaluating, and monitoring.
Manouk
Jun 18, 2025

Article
Introducing simulation-based agent testing
Test your agents with scenarios directly in your codebase using Python and TypeScript.

Rogerio Chaves
Jun 24, 2025

Article
Why LangWatch Scenarios represents the future of AI agent testing
Agent simulations are the new unit tests. You shouldn’t ship agents without simulations.

Rogerio Chaves
Jun 24, 2025

Article
Best AI Agent Frameworks in 2025: Comparing LangGraph, DSPy, CrewAI, Agno, and More
Explore a detailed, developer-tested comparison of top AI agent frameworks in 2025, including LangGraph, DSPy, Agno and more.

Rogerio Chaves
Jun 21, 2025

Article
Customer Story: How Roojoom automates AI Agent Quality Control with LangWatch Scenario
Using LangWatch Scenario, the Roojoom product team built a daily automation to ship new AI features with confidence.

Manouk Draisma
Jun 25, 2026

Article
Intro to Scenario (Testing AI agents)
LLMs make it easy to build agent demos. But building reliable and policy-aware agents takes more than a good prompt.

Tahmid AI researcher @ LangWatch
Jun 13, 2025

Article
Agent Evaluation: Framework for Testing AI Agents
Create robust agent evaluation systems that catch AI agent bugs before production. Testing agent behavior and performance.

Tahmid AI researcher @ LangWatch
Jun 11, 2025

Article
Simulations from First Principles (How to test your agents)
A practical playbook that frames evals as a CI/CD pipeline.

Tahmid AI researcher @ LangWatch
Jun 12, 2025

Article
Simulation Based Eval Framework
AI agents: the real challenge is making sure they work reliably, accurately, and at scale.

Tahmid - AI research @LangWatch
Jun 6, 2025

Article
Introduction: The Real Issue isn’t RL
Why reinforcement learning in LLMs isn't broken; our eval methods are. Learn how we can unlock RL's potential in LLMs.

Tahmid AI Researcher @ LangWatch
May 30, 2025

Article
Simulations to Test My Agent
How I stopped evaluating AI Agents like robots and started testing them like humans (with simulations)

Tahmid, AI Researcher @ LangWatch
May 28, 2025

Article
Webinar recap: LLM Evaluations: Best Practices, LLM Eval types & real-world insights
This post breaks down the core components of LLM evaluation, from datasets to evaluators, and explores best practices

Manouk Draisma
Jun 26, 2025

Article
New Python SDK Brings Native OpenTelemetry to GenAI Observability
Python SDK 0.2 adds native OpenTelemetry for GenAI apps—zero-break upgrades, better tracing, and faster debugging

Alex Forbes-Reed
May 15, 2025

Article
April Product Recap: Selene Integration, Eval Wizard Upgrades, Prompt Studio & More
LangWatch Selene - Atla, LLM Evaluations, prompt versioning, structured output, OpenTelemetry SDK, LLMOps ISO certified

Manouk
May 5, 2025

Article
LLM Monitoring & Evaluation for Real-World Production Use
Key challenges teams face when putting LLM-powered apps into production, and why continuous monitoring and evaluation are essential

Manouk
May 5, 2025

Article
Systematically Improving RAG Agents
Improving RAG agents: build a basic system, create evaluation data, run experiments

Tahmid Tapadar
Apr 24, 2025

Article
Introducing the Evaluations Wizard: How to evaluate your LLM: Building an LLM evaluation framework that actually works
Learn how to effectively evaluate and test LLMs with LangWatch's new Evaluations Wizard. Improve your AI model performance

Rogerio
Apr 22, 2025

Article
Function Calling vs. MCP: Why You Need Both—and How LangWatch Makes It Click
What is MCP? What does MCP stand for? And what is Function Calling?

Manouk Draisma
Apr 18, 2025

Article
Why LLM Observability is Now Table Stakes
The start of LLMOps: DevOps for Generative AI

Manouk Draisma
Apr 18, 2025

Article
LangWatch vs. LangSmith vs. Braintrust vs. Langfuse: Choosing the Best LLM Evaluation & Monitoring Tool in 2025
Compare LangWatch, LangSmith, Braintrust, and Langfuse in this 2025 guide to LLM evaluation and monitoring tools

Manouk Draisma
Apr 17, 2025

Article
Introducing Scenario: Use an Agent to Test Your Agent
Scenario is an automated testing library for LLM agents that simulates real user interactions end-to-end.

Rogerio Chaves
Apr 8, 2025

Article
LLM evaluations at Swis for Dutch government projects by LangWatch
How do we objectively know if the AI output is good? LLM evaluation reports & feedback loops

Manouk
Apr 3, 2025

Article
LangWatch and adesso join forces: Accelerating Secure LLM Adoption for Enterprises
LangWatch partners with adesso to support enterprise companies with LLMOps

Manouk
Mar 27, 2025

Article
Why Your AI Team Needs an AI PM (Quality) Lead
The best GenAI teams are now introducing a critical new role: the AI PM (Quality) Lead.

Manouk
Apr 2, 2025

Article
LLMOps Is Still About People: How to Build AI Teams That Don’t Implode
LLMs can do amazing things, but only if they understand context. That context lives in the heads of domain experts.

Manouk
Mar 25, 2025

Article
Practical LLM Evaluation Framework for AI Development Teams
Deploy an LLM evaluation framework that catches issues early. Reduce debugging time and improve AI quality.

Manouk
Mar 20, 2025

Article
Tackling LLM Hallucinations with LangWatch: Why Monitoring and Evaluation Matter
What are LLM hallucinations? What causes LLM hallucinations? How to monitor and evaluate LLM apps

Manouk
Apr 4, 2025

Article
What is Model Context Protocol (MCP)? And how's LangWatch involved?
The Model Context Protocol is a new standard that lets AI agents easily connect to external tools and data sources.

Manouk
Mar 16, 2025

Article
How PHWL.ai uses LLM Observability and Optimization to Improve AI Coaching with LangWatch
Improve your LLM performance with real-time observability and optimization

Manouk
Mar 14, 2025

Article
LangWatch.ai - Announcing - €1M funding round to bring the power of Evaluations and Auto-Optimizations to AI teams.
LangWatch: €1M pre-seed funding round led by Passion Capital, with great support from Volta Ventures and Antler.

Manouk
Feb 25, 2025

Article
OpenAI, Anthropic, DeepSeek and other LLM Providers keep dropping prices: Should you host your own model?
OpenAI, Anthropic, DeepSeek and other LLM Providers keep dropping prices: Should you host your own model?

Manouk
Feb 20, 2025

Article
7 Predictions for AI in 2025: CTO Rogerio Chaves's Perspective
AI is evolving at speed, and the landscape in 2025 will be shaped across agents, multimodal data, and model efficiency.

Rogerio
Jan 1, 2025

Article
Customer Stories: HolidayHero AI start-up <> LangWatch
LangWatch has been a part of HolidayHero's LLM production environment for over two months, overseeing thousands of guest chats

CEO of HolidayHero - edited by Manouk
Dec 20, 2024

Article
LangWatch Optimization Studio – Built for AI Engineers, by AI Engineers
LangWatch Optimization Studio – Built for AI Engineers, by AI Engineers

Rogerio
Dec 10, 2024

Article
The power of MIPROv2 (DSPy) in a Low-Code environment with LangWatch’s Optimization Studio
Want to leverage the power of DSPy’s MIPROv2 without diving into complex code? Enter LangWatch’s Optimization Studio

Manouk
Nov 10, 2024

Article
What is Prompt Optimization? An Introduction to DSPy and Optimization Studio
LangWatch’s Optimization Studio, a more precise, scientific and better approach to prompt optimization

Manouk
Nov 7, 2024

Article
Deploying an OpenAI RAG Application to AWS Elastic Beanstalk
This tutorial guides you through building chatbots using Retrieval Augmented Generation with OpenAI in Python using FastAPI

Zhenya
Jul 27, 2024

Article
The complete guide for TDD with LLMs
How can we test in a probabilistic environment? Test-driven development for LLMs

Rogerio - CTO
Jul 3, 2024

Article
Data Flywheel: Using your production data to build better LLM products
Data Flywheel: using your production data to build better LLM products

Rogerio - CTO
Jun 27, 2024

Article
How Algomo reduced AI hallucinations with LangWatch
How Algomo increased the quality of their AI app with LangWatch

Manouk
Jun 11, 2024

Article
The AI Team: Integrating User and Domain Expert Feedback to Enhance LLM-Powered Applications
Understand what the AI team is and what their roles are

Manouk
Jun 10, 2024

Article
Unit Testing Your LLM: The Power of Datasets
Understand how to leverage datasets for LLM unit testing

Rogerio Chaves - CTO
Jun 10, 2024

Article
Introducing DSPy Visualizer
DSPy and LangWatch: Log and track DSPy training sessions, evaluate performance, compare runs, and debug LLM pipelines.

Rogerio - CTO
Jun 3, 2024

Article
New Dutch Startup, LangWatch, brings much-needed quality control to GenAI
LangWatch, a new innovative Amsterdam-based startup: Meet the Team

Manouk
May 20, 2024

Article
How to build a RAG application from scratch with the least possible AI Hallucinations
Helping AI leaders create RAG chatbots with minimal hallucinations

Zhenya
May 14, 2024

Article
Safeguarding Your First LLM-Powered Innovation: Essential Practices for Security
The journey of launching your first LLM-powered product is filled with potential and challenges.

Manouk
May 13, 2024

Article
LLM Reliability with Retrieval-Augmented Generation
Retrieval-Augmented Generation continues to surge in popularity, with various methods available for implementing it successfully

Manouk
May 13, 2024

Article
What is User Analytics for LLMs, The Difference With Traditional Analytics, And Why is it Important?
Discover how User Analytics for LLMs can transform AI interactions, revealing user behavior

Manouk
May 10, 2024

Article
Unlocking the Potential of Large Language Models: LLMs Beyond the Hype
Successfully integrating LLMs into your business requires careful monitoring and evaluation of options

Manouk
May 8, 2024

Article
The 8 Types of LLM Hallucinations
Delve into the challenges of LLM hallucinations, explore their types, causes, and effective mitigation strategies

Manouk
May 6, 2024

Article
Navigating the Complexities of AI-Powered Products
Learn valuable insights from the frontlines of GenAI product development

Manouk
May 1, 2024

Article
5 Things You Must Consider Before Putting Your Chatbot Live in Production
Prevent AI chatbots from handling out-of-scope questions, being manipulated, and addressing sensitive topics

Manouk
May 1, 2024

Article
Understanding Hallucinations: What are they?
Explore how to minimize AI hallucinations in LLMs

Manouk
Apr 29, 2024

Article
Mastering the GenAI Wave: Strategies for Success in AI Adoption
Explore the race of generative AI

Manouk
Apr 18, 2024

Article
Successfully building an AI Startup in the current booming industry
Learn how AI start-ups can succeed by creating targeted generative AI solutions and effectively monitoring LLMs.

Manouk
Apr 18, 2024

Article
How Struck.build improved AI Performance with LangWatch
Struck.build + LangWatch = Improved AI Performance

Manouk
Apr 17, 2024

Article
Journey Through Innovation: The LLM Adventure
Dive into a customer's journey with LangWatch, revealing how to successfully integrate AI into your organization.

Manouk
Apr 8, 2024

Ship agents with confidence, not crossed fingers
Get up and running with LangWatch in as little as 5 minutes.
Platform
Integrations