METR is a research nonprofit that works on assessing whether cutting-edge AI systems could pose catastrophic risks to society.
We build the science of accurately assessing risks, so that humanity is informed before developing transformative AI systems.
Read more about our work here.
- Vivaria
- Public Task Suite
- RE-Bench Task Suite
- Some of our open-source agents can be found at github.com/poking-agents
Popular repositories
- public-tasks (Public)
- eval-analysis-public (Public): Public repository containing METR's DVC pipeline for eval data analysis
- task-template (Public template)
Repositories
Showing 10 of 28 repositories
- inspect_k8s_sandbox (Public, forked from UKGovernmentBEIS/inspect_k8s_sandbox): A Kubernetes sandbox environment for use with inspect_ai
- autonomy-evals-guide (Public)
- task-protected-scoring (Public)
- uplift_clone_hypothesis (Public, forked from HypothesisWorks/hypothesis): Hypothesis is a powerful, flexible, and easy to use library for property-based testing.
- hcast-public (Public)
- public-tasks (Public)
- task-assets (Public)