promptfoo
DevOps & Infrastructure
Test your prompts, agents, and RAGs. Red teaming/pentesting/vulnerability scanning for AI. Compare performance of GPT, Claude, Gemini, DeepSeek, and more. Simple declarative configs with command line and CI/CD integration. Used by OpenAI and Anthropic.
promptfoo is an open-source AI agent in the DevOps & Infrastructure category, maintained by promptfoo. It ranks #57 on The Agentic Leaderboard with an overall score of 83.8 / 100, scoring 100% on reliability and 69.2% on tool selection quality. The project has 22,008 GitHub stars.
- Rank: #57
- Score: 83.8
- Category: DevOps & Infrastructure
- Developer: promptfoo
- Reliability: 100%
- Tool Selection: 69.2%
- Avg Steps: 25
- Cost / Task: $0.05
- Latency: 500 ms
- GitHub Stars: 22k
- Mindshare: 83.2
Compared to other DevOps & Infrastructure agents
- #47 ollama: score 84.0
- #68 NemoClaw: score 83.4
- #76 mlflow: score 83.2