prompt-evaluation
Here are 12 public repositories matching this topic...
My personal prompt library for various LLMs, plus scripts and tools. Suitable for models from DeepSeek, OpenAI, Anthropic (Claude), Meta, Mistral, Google, xAI (Grok), and others.
- Updated Mar 18, 2025 - Python
The prompt engineering, prompt management, and prompt evaluation tool for Python.
- Updated Sep 17, 2024 - Python
Official implementation of "GLaPE: Gold Label-agnostic Prompt Evaluation and Optimization for Large Language Models" (stay tuned; more will be updated).
- Updated Feb 6, 2024 - Python
The prompt engineering, prompt management, and prompt evaluation tool for TypeScript, JavaScript, and Node.js.
- Updated Sep 14, 2024 - TypeScript
The prompt engineering, prompt management, and prompt evaluation tool for Ruby.
- Updated Jun 16, 2024
The prompt engineering, prompt management, and prompt evaluation tool for Java.
- Updated Jun 16, 2024
Runs two simple test prompts against five Anthropic models and visually compares their speed, capability, and cost (a minimal timing-and-cost sketch follows below).
- Updated Feb 20, 2025 - Jupyter Notebook
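For reference, this kind of head-to-head timing and cost comparison can be sketched with the official anthropic Python SDK. The model IDs, prompts, and per-token prices below are illustrative assumptions, not values taken from the notebook above.

```python
# A minimal sketch of comparing Anthropic models on latency and cost,
# assuming the official `anthropic` Python SDK. Model IDs and prices
# here are illustrative assumptions; substitute current published rates.
import time
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# Assumed USD prices per million input/output tokens.
MODELS = {
    "claude-3-5-sonnet-20241022": {"in": 3.00, "out": 15.00},
    "claude-3-5-haiku-20241022": {"in": 0.80, "out": 4.00},
}
PROMPTS = [
    "Summarize the theory of relativity in one sentence.",
    "Write a haiku about unit tests.",
]

for model, price in MODELS.items():
    for prompt in PROMPTS:
        start = time.perf_counter()
        msg = client.messages.create(
            model=model,
            max_tokens=256,
            messages=[{"role": "user", "content": prompt}],
        )
        elapsed = time.perf_counter() - start
        # Cost = tokens used, scaled by the assumed price per million tokens.
        cost = (msg.usage.input_tokens * price["in"]
                + msg.usage.output_tokens * price["out"]) / 1_000_000
        print(f"{model}: {elapsed:.2f}s, ${cost:.5f}  {prompt[:30]!r}")
```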
An AI-driven system that automatically generates, evaluates, and ranks prompts using Monte Carlo sampling and an Elo rating system, aimed at enterprise-grade Retrieval-Augmented Generation (RAG) pipelines (see the Elo-update sketch after this entry).
- Updated Aug 1, 2024 - Jupyter Notebook
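To make the ranking mechanism concrete, here is a minimal sketch of a Monte-Carlo Elo tournament over prompts. It is not the repository's code; `judge` is a hypothetical callback standing in for however the system scores one prompt's output against another's.

```python
# A minimal sketch of Monte-Carlo Elo ranking for prompts, under the
# assumption that some external `judge(a, b)` returns 1.0 if prompt a's
# output is better, 0.0 if worse, and 0.5 for a tie.
import random

K = 32  # standard Elo update factor

def expected(r_a: float, r_b: float) -> float:
    """Expected score of A against B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400))

def update(r_a: float, r_b: float, score_a: float) -> tuple[float, float]:
    """Fold A's observed score into both ratings."""
    e_a = expected(r_a, r_b)
    return r_a + K * (score_a - e_a), r_b + K * ((1 - score_a) - (1 - e_a))

def rank_prompts(prompts, judge, rounds=200):
    """Monte-Carlo tournament: repeatedly pit two random prompts against
    each other and update their Elo ratings from the judged outcome."""
    ratings = {p: 1000.0 for p in prompts}
    for _ in range(rounds):
        a, b = random.sample(prompts, 2)
        ratings[a], ratings[b] = update(ratings[a], ratings[b], judge(a, b))
    return sorted(ratings.items(), key=lambda kv: -kv[1])
```

Each random pairing nudges the winner's rating up and the loser's down, so after enough rounds the ratings approximate a global ranking without ever comparing a prompt's output against a gold label.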
- Updated Jan 1, 2025 - C#
The prompt engineering, prompt management, and prompt evaluation tool for Kotlin.
- Updated Jun 16, 2024
The prompt engineering, prompt management, and prompt evaluation tool for C# and .NET.
- Updated Jun 16, 2024
A small collection of prompts stored in a repository for running controlled experiments that compare and benchmark different LLMs on defined use cases.
- Updated Dec 4, 2024 - Python