AI & ML interests
Building breakthrough AI to solve the world's biggest problems.
Organization Card
Reward Bench 2
Datasets, spaces, and models for the Reward Bench 2 benchmark and paper!
allenai/reward-bench-2
Viewer•Updated•1.87k•1.63k•18
Running
382
Reward Bench Leaderboard
📐
Display and filter model evaluation results
allenai/reward-bench-2-results
Preview•Updated•349•1
allenai/Llama-3.1-70B-Instruct-RM-RB2
Text Classification•Updated•193
OLMo 2
Artifacts for the OLMo 2 release.
spaces 10
pinned
Running
382
Reward Bench Leaderboard
📐
Display and filter model evaluation results
pinned
Running
87
Zebra Logic Bench
🦓
Render a leaderboard for model evaluation
pinned
Running
3
SUPER Leaderboard
🤖
pinned
Runtime error
2
HREF Leaderboard
📐
pinned
Running
52
ZeroEval Leaderboard
📊
Embed and use ZeroEval for evaluation tasks
pinned
Runtime error
22
BaseChat by URIAL (Chat with base, untuned LLMs)
💬
Chat with advanced language models
models 766

allenai/ACE2-ERA5
Updated•4

allenai/ACE2-EAMv3
Updated•1

allenai/ACE-climSST-EAMv2
Updated

allenai/Molmo-72B-0924
Image-Text-to-Text•73B•Updated•2.22k•285

allenai/Molmo-7B-O-0924
Image-Text-to-Text•8B•Updated•6.23k•161

allenai/olmOCR-7B-0225-preview-FP8
Image-Text-to-Text•8B•Updated•932•6

allenai/Llama-3.1-Tulu-3-405B-DPO
Text Generation•Updated•68•6

allenai/Llama-3.1-Tulu-3-70B-DPO
Text Generation•71B•Updated•2.02k•9

allenai/Llama-3.1-Tulu-3-8B-DPO
Text Generation•8B•Updated•7.66k•26

allenai/OLMo-2-1124-7B-RM
Text Generation•Updated•224•3
datasets 242
allenai/olmo-mix-1124
Viewer•Updated•99.1M•28.6k•66
allenai/big-reasoning-traces
Viewer•Updated•677k•219•7
allenai/omega-transformative
Viewer•Updated•7.2k•6
allenai/omega-compositional
Viewer•Updated•14.3k•1
allenai/omega-explorative
Viewer•Updated•52.2k•5
allenai/reward-bench-2-results
Preview•Updated•349•1
allenai/IF_sft_data_verified
Viewer•Updated•31.8k•27•4
allenai/IF_multi_constraints_upto5_no_lang
Viewer•Updated•95.4k•38•2
allenai/DataDecide-ppl-results
Viewer•Updated•22.7k•133•2
allenai/ruler_data
Updated•242