Test your prompts, agents, and RAGs. AI Red teaming, pentesting, and vulnerability scanning for LLMs. Compare performance of GPT, Claude, Gemini, Llama, and more. Simple declarative configs with command line and CI/CD integration.

promptfoo/promptfoo


promptfoo is a developer-friendly local tool for testing LLM applications. Stop the trial-and-error approach - start shipping secure, reliable AI apps.

Website · Getting Started · Red Teaming · Documentation · Discord

Quick Start

```sh
# Install and initialize project
npx promptfoo@latest init

# Run your first evaluation
npx promptfoo eval
```

See Getting Started (evals) or Red Teaming (vulnerability scanning) for more.
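The `init` command scaffolds a `promptfooconfig.yaml`, the declarative config mentioned above. A minimal sketch of such a config might look like the following (the prompt, provider IDs, and test values here are illustrative assumptions, not taken from this README):

```yaml
# promptfooconfig.yaml (illustrative sketch)
description: "My first eval"

prompts:
  - "Summarize this in one sentence: {{text}}"

# Compare two providers side-by-side (IDs are examples)
providers:
  - openai:gpt-4o-mini
  - anthropic:messages:claude-3-5-sonnet-20241022

tests:
  - vars:
      text: "Promptfoo is a local tool for testing LLM applications."
    assert:
      - type: contains
        value: "LLM"
```

Running `npx promptfoo eval` against a config like this produces the pass/fail matrix shown below; check the Getting Started docs for the exact schema.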

What can you do with Promptfoo?

  • Test your prompts and models with automated evaluations
  • Secure your LLM apps with red teaming and vulnerability scanning
  • Compare models side-by-side (OpenAI, Anthropic, Azure, Bedrock, Ollama, and more)
  • Automate checks in CI/CD
  • Share results with your team
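For the CI/CD use case, a minimal GitHub Actions job might look like this (the workflow name, trigger, and config path are illustrative assumptions):

```yaml
# .github/workflows/eval.yml (illustrative sketch)
name: LLM evals
on: [pull_request]

jobs:
  eval:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
      # Run the evaluation defined in promptfooconfig.yaml
      - run: npx promptfoo@latest eval --config promptfooconfig.yaml
        env:
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
```

A failing assertion fails the job, so prompt regressions surface on the pull request like any other test failure.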

Here's what it looks like in action:

[Screenshot: prompt evaluation matrix (web viewer)]

It works on the command line too:

[Screenshot: prompt evaluation matrix (command line)]

It can also generate security vulnerability reports:

[Screenshot: gen AI red team report]
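Red teaming is driven by the same declarative config. A sketch of a red-team section might look like the following (the plugin and strategy names below are assumptions for illustration; consult the Red Teaming docs for the supported set):

```yaml
# Red-team section of promptfooconfig.yaml (illustrative sketch;
# plugin/strategy names are assumptions, see the Red Teaming docs)
targets:
  - openai:gpt-4o-mini
redteam:
  purpose: "Customer support assistant for a retail store"
  plugins:
    - pii
    - harmful:hate
  strategies:
    - jailbreak
    - prompt-injection
```

Generating attack cases and scanning the target would then be something along the lines of `npx promptfoo@latest redteam run`.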

Why Promptfoo?

  • 🚀 Developer-first: Fast, with features like live reload and caching
  • 🔒 Private: LLM evals run 100% locally - your prompts never leave your machine
  • 🔧 Flexible: Works with any LLM API or programming language
  • 💪 Battle-tested: Powers LLM apps serving 10M+ users in production
  • 📊 Data-driven: Make decisions based on metrics, not gut feel
  • 🤝 Open source: MIT licensed, with an active community

Learn More

Contributing

We welcome contributions! Check out our contributing guide to get started.

Join our Discord community for help and discussion.
