Usage
After adding records to your dataset, created within the Datasets section of LangWatch, you can select that dataset for batch evaluation along with the evaluations you want to run. You can choose from the predefined evaluations or any custom evaluations you’ve set up in the Evaluation and Guardrails section of LangWatch.

Screenshot examples
In the screenshot below you can see the Datasets section in LangWatch; you can get your batch evaluation Python snippet by clicking the Batch Evaluation button.

The BatchEvaluation includes your chosen dataset and an array of selected evaluations to run against it.

Python snippet
The snippet includes a callback function that receives the original entry data, allowing you to run it against your own Large Language Model (LLM) and use the response to compare results within your evaluation process. Make sure you return the output, as some evaluations require it. As you create your code snippet in the evaluations tab, you’ll see indications of which evaluations need particular information. Use this guidance as a reference to get your workflow started.
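
As a rough illustration, a snippet typically follows the shape below: a BatchEvaluation tying your dataset and evaluations together, plus a callback you fill in with your own LLM call. This is a minimal sketch, not the exact generated code: the import path, the BatchEvaluation parameters, the `run()` method, the entry field name (`input`), and the evaluation identifier are assumptions here, and the snippet generated by the Batch Evaluation button is the source of truth for your dataset.

```python
from langwatch.batch_evaluation import BatchEvaluation, DatasetEntry  # assumed import path


def my_llm(prompt: str) -> str:
    # Placeholder for your own LLM call (e.g. OpenAI, Anthropic, a local model)
    return "model answer for: " + prompt


# The callback receives one dataset entry at a time.
# Run it against your own LLM and return the output,
# since some evaluations require the output to score the result.
def callback(entry: DatasetEntry):
    # "input" is an assumed column name; use the columns of your own dataset
    return my_llm(entry["input"])


# BatchEvaluation pairs your chosen dataset with the evaluations to run against it
evaluation = BatchEvaluation(
    dataset="my-dataset-slug",           # the dataset you created in LangWatch
    evaluations=["ragas/faithfulness"],  # example evaluation identifier
    callback=callback,
)

results = evaluation.run()
```

Since the snippet generated from the Batch Evaluation button already fills in the dataset and evaluation identifiers for you, prefer copying that snippet and only replacing the callback body with your own LLM call.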