
promptfoo

DevOps & Infrastructure

#57

Test your prompts, agents, and RAGs. Red teaming/pentesting/vulnerability scanning for AI. Compare performance of GPT, Claude, Gemini, DeepSeek, and more. Simple declarative configs with command line and CI/CD integration. Used by OpenAI and Anthropic.

promptfoo is an open-source DevOps & Infrastructure AI agent maintained by promptfoo. It ranks #57 on The Agentic Leaderboard with an overall score of 83.8 / 100, scoring 100% on reliability and 69.2% on tool selection quality. The project has 22,008 GitHub stars.

Rank: #57
Score: 83.8
Category: DevOps & Infrastructure
Developer: promptfoo
Reliability: 100%
Tool Selection: 69.2%
Avg Steps: 25
Cost / Task: $0.05
Latency: 500ms
GitHub Stars: 22k
Mindshare: 83.2

Add this badge to your README

[![Agentic Leaderboard](https://www.theagenticleaderboard.com/badges/promptfoo.svg)](https://www.theagenticleaderboard.com/agent/promptfoo)

Compared to other DevOps & Infrastructure agents