This evaluator measures how consistent the generated answer is with the provided context. Higher scores indicate greater faithfulness to the context, which makes the score useful for detecting hallucinations.
POST /ragas/faithfulness/evaluate
import langwatch

df = langwatch.datasets.get_dataset("dataset-id").to_pandas()

experiment = langwatch.experiment.init("my-experiment")

for index, row in experiment.loop(df.iterrows()):
    # your execution code here; it should set `output` to the generated answer
    experiment.evaluate(
        "ragas/faithfulness",
        index=index,
        data={
            "output": output,
            "contexts": row["contexts"],
            "input": row["input"],
        },
        settings={},
    )
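To make the score described at the top of this page more concrete, here is a minimal sketch of how a faithfulness ratio can be computed. Ragas describes faithfulness as the share of claims in the answer that are supported by the retrieved context. In the real evaluator an LLM extracts and verifies the claims; in this sketch the verdicts are hand-labelled, and `ClaimVerdict` and `faithfulness_score` are hypothetical names used for illustration only, not part of the LangWatch or Ragas API.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ClaimVerdict:
    """Hypothetical container: one claim from the answer plus its verdict."""
    claim: str
    supported: bool  # True if the claim can be inferred from the contexts


def faithfulness_score(verdicts: List[ClaimVerdict]) -> Optional[float]:
    """Return the fraction of claims supported by the contexts (0.0 to 1.0)."""
    if not verdicts:
        # No claims were extracted, so this sketch treats the score as undefined.
        return None
    supported = sum(1 for v in verdicts if v.supported)
    return supported / len(verdicts)


# Hand-labelled example: one claim is backed by the context, one is not.
verdicts = [
    ClaimVerdict("Einstein was born in Germany.", True),
    ClaimVerdict("Einstein was born on 20 March 1879.", False),
]
score = faithfulness_score(verdicts)
print(score)
```

An answer whose claims are all supported scores 1.0 under this ratio, while an answer with no supported claims scores 0.0, which is why low scores are a useful hallucination signal.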