prompt-testing
Here are 23 public repositories matching this topic...
Test your prompts, agents, and RAG pipelines. AI red teaming, pentesting, and vulnerability scanning for LLMs. Compare the performance of GPT, Claude, Gemini, Llama, and more. Simple declarative configs with command-line and CI/CD integration.
- Updated Dec 18, 2025 - TypeScript
Agentic LLM Vulnerability Scanner / AI red teaming kit 🧪
- Updated Nov 30, 2025 - Python
LLM Reasoning and Generation Benchmark. Evaluate LLMs in complex scenarios systematically.
- Updated May 25, 2025 - TypeScript
Prompture is an API-first library for requesting structured JSON (or any other structured output) from LLMs, validating it against a schema, and running comparative tests between models. A generic sketch of this request-then-validate pattern follows this entry.
- Updated Nov 22, 2025 - Python
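The core pattern behind a library like Prompture, ask a model for JSON and then validate the reply against a schema, can be sketched in a few lines of plain Python. This is an illustration of the general pattern, not Prompture's actual API: the `call_llm` helper and the example schema are assumptions you would replace with your own provider call and schema.

```python
import json

import jsonschema  # pip install jsonschema

# Example schema the model's output must satisfy (illustrative only).
PRODUCT_SCHEMA = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "price": {"type": "number"},
    },
    "required": ["name", "price"],
}


def call_llm(prompt: str) -> str:
    """Hypothetical stand-in for any LLM client call that returns raw text."""
    raise NotImplementedError("plug in your provider SDK here")


def get_validated_json(prompt: str, schema: dict) -> dict:
    """Ask the model for JSON and fail loudly if the reply does not match the schema."""
    raw = call_llm(prompt + "\nRespond with JSON only.")
    data = json.loads(raw)  # raises ValueError on malformed JSON
    jsonschema.validate(instance=data, schema=schema)  # raises ValidationError on mismatch
    return data
```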
Test, compare, and optimize your AI prompts in minutes
- Updated Aug 13, 2025 - JavaScript
The prompt engineering, prompt management, and prompt evaluation tool for TypeScript, JavaScript, and Node.js.
- Updated Nov 15, 2025 - TypeScript
EvalWise is a developer-friendly platform for LLM evaluation and red teaming that helps test AI models for safety, compliance, and performance issues.
- Updated Nov 20, 2025 - Python
LLM Prompt Test helps you test Large Language Model (LLM) prompts to ensure they consistently meet your expectations.
- Updated May 22, 2024 - TypeScript
Community plugin for using Promptfoo with Genkit.
- Updated Jan 3, 2025 - TypeScript
prompt-evaluator is an open-source toolkit for evaluating, testing, and comparing LLM prompts. It provides a GUI-driven workflow for running prompt tests, tracking token usage, visualizing results, and ensuring reliability across providers such as OpenAI, Claude, and Gemini.
- Updated Dec 4, 2025 - TypeScript
An open-source AI prompt engineering playground with live code execution. Test OpenAI & Claude prompts, execute JavaScript, and iterate in real time.
- Updated Nov 8, 2025 - TypeScript
Sample project demonstrating how to use Promptfoo, a test framework for evaluating the output of generative AI models.
- Updated Sep 10, 2024
A pytest-based framework for testing multi-agent AI systems. It provides a flexible and extensible platform for complex multi-agent simulations and supports integrations such as LiteLLM, CrewAI, and LangChain. A generic pytest sketch of this style of test follows this entry.
- Updated Sep 24, 2025 - TypeScript
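Testing agents with pytest usually means running the agent on a task and asserting properties of the result rather than exact strings. The sketch below illustrates that idea only; it is not the framework's actual API, and `run_agent` with its canned answers is an assumption standing in for a real agent stack.

```python
import pytest


def run_agent(task: str) -> dict:
    """Hypothetical agent runner; a real setup would call your agent stack (LiteLLM, CrewAI, ...)."""
    canned = {"France": "Paris", "Japan": "Tokyo"}  # stub so the test is self-contained
    answer = next((city for country, city in canned.items() if country in task), "unknown")
    return {"status": "done", "answer": answer}


@pytest.mark.parametrize(
    "task,expected",
    [
        ("What is the capital of France?", "Paris"),
        ("What is the capital of Japan?", "Tokyo"),
    ],
)
def test_agent_answers_factual_questions(task, expected):
    result = run_agent(task)
    assert result["status"] == "done"
    # Behavioural assertion rather than an exact string match.
    assert expected.lower() in result["answer"].lower()
```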
A collection of prompts that I use on a day-to-day basis for work and leisure.
- Updated Sep 9, 2024
Run 1,000 LLM evaluations in 10 minutes. Test prompts across Claude, GPT-4, and Gemini with parallel execution, real-time cost tracking, and beautiful visualizations. Open source. A minimal sketch of the parallel-execution idea follows this entry.
- Updated Dec 12, 2025 - Python
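The throughput claim above comes from running many prompt/model combinations concurrently instead of one at a time. The asyncio sketch below illustrates that idea only; `evaluate_one`, the model names, and the token counts are assumptions, not that project's code.

```python
import asyncio

MODELS = ["claude", "gpt-4", "gemini"]  # assumed model identifiers
PROMPTS = [f"Summarize document {i}" for i in range(10)]


async def evaluate_one(model: str, prompt: str) -> dict:
    """Hypothetical single evaluation; a real version would call the provider's async SDK."""
    await asyncio.sleep(0.1)  # stand-in for network latency
    return {"model": model, "prompt": prompt, "tokens": 42}


async def main() -> None:
    # Launch every (model, prompt) pair concurrently and gather the results.
    tasks = [evaluate_one(m, p) for m in MODELS for p in PROMPTS]
    results = await asyncio.gather(*tasks)
    total_tokens = sum(r["tokens"] for r in results)
    print(f"{len(results)} evaluations, {total_tokens} tokens used")


if __name__ == "__main__":
    asyncio.run(main())
```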
Visual prompt engineering platform for creating, testing, and versioning LLM prompts across multiple providers (OpenAI, Anthropic, Mistral, Gemini).
- Updated Nov 5, 2025 - TypeScript
Quickstart guide for using Promptfoo to evaluate LLM prompts via the CLI or Colab.
- Updated Nov 23, 2025
An AI RAG evaluation project using Ragas. Includes RAG metrics (precision, recall, faithfulness), retrieval diagnostics, and prompt-testing examples for fintech/banking LLM systems. Designed as an AI QA specialist portfolio project. A minimal Ragas evaluation sketch follows this entry.
- Updated Nov 17, 2025 - Python
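Ragas scores a RAG pipeline per sample from the question, the retrieved contexts, the generated answer, and a reference answer. The sketch below assumes a ragas release around 0.1.x and an OpenAI API key for the LLM-backed metrics; exact metric names and dataset column names have shifted between versions, so treat it as a starting point rather than a definitive recipe.

```python
from datasets import Dataset  # pip install datasets ragas

from ragas import evaluate
from ragas.metrics import context_precision, context_recall, faithfulness

# One toy banking-style sample; real evaluations use many rows.
data = {
    "question": ["What is the overdraft fee?"],
    "answer": ["The overdraft fee is $35 per transaction."],
    "contexts": [["Overdraft fees are $35 per transaction, capped at three per day."]],
    "ground_truth": ["$35 per transaction."],
}

dataset = Dataset.from_dict(data)
result = evaluate(dataset, metrics=[context_precision, context_recall, faithfulness])
print(result)  # per-metric scores between 0 and 1
```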
Sample implementation demonstrating how to use Firebase Genkit with Promptfoo.
- Updated Sep 11, 2024 - TypeScript