Pinned
- openai/mle-bench (Public): MLE-bench is a benchmark for measuring how well AI agents perform at machine learning engineering.
- openai/evals (Public): Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.
- GPTrue-or-False (Public): 📝🔍 A browser extension that displays the GPT-2 Log Probability of selected text
- dlml-tutorial (Public): 🤓 A tutorial on the Discretized Logistic Mixture Likelihood (DLML)
Python 8