Open-source LLM evaluation with 14+ metrics and pytest integration
DeepEval is an open-source LLM evaluation framework with 14+ research-backed metrics, including hallucination, bias, and toxicity detection as well as task-specific evaluations. It integrates with pytest for CI/CD workflows, supports A/B testing of prompts, and offers a cloud dashboard for tracking evaluation results over time. It is designed to make LLM testing feel as natural as unit testing.
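As a rough illustration of the pytest integration, here is a minimal sketch in the style of DeepEval's documented quickstart. The specific input/output strings and the 0.7 threshold are placeholder assumptions, and exact class names may vary across DeepEval versions:

```python
import pytest
from deepeval import assert_test
from deepeval.metrics import AnswerRelevancyMetric
from deepeval.test_case import LLMTestCase

def test_answer_relevancy():
    # Wrap a single prompt/response pair as a test case
    # (placeholder strings for illustration).
    test_case = LLMTestCase(
        input="What is your return policy?",
        actual_output="You can return items within 30 days of purchase.",
    )
    # Score the response; threshold=0.7 is an assumed passing bar.
    metric = AnswerRelevancyMetric(threshold=0.7)
    # Fails the pytest run if the metric score falls below the threshold.
    assert_test(test_case, [metric])
```

Because each evaluation is an ordinary pytest test, the same file runs locally or in a CI pipeline with a plain `pytest` invocation.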