1 parent fc3c77e, commit 88a15ba
2025/week-50/AI Model Hallucination Scores.csv
@@ -0,0 +1,18 @@
+Model,Accuracy Index (Higher is better),Hallucination Index (Lower is better),Type,Owner
+Claude 4.1 Opus,36%,48%,Proprietary,Anthropic
+Claude 4.5 Sonnet,31%,48%,Proprietary,Anthropic
+DeepSeek R1 0528,29%,83%,Open Weights,DeepSeek
+DeepSeek V3.1 Terminus,27%,74%,Open Weights,DeepSeek
+DeepSeek V3.2-Exp,27%,80%,Open Weights,DeepSeek
+Gemini 2.5 Flash (Sep),27%,88%,Proprietary,Google
+Gemini 2.5 Pro,37%,89%,Proprietary,Google
+GPT-5 (high),39%,81%,Proprietary,OpenAI
+gpt-oss-120B (high),20%,90%,Open Weights,OpenAI
+gpt-oss-20B (high),15%,93%,Open Weights,OpenAI
+Grok 4,39%,64%,Proprietary,xAI
+Grok 4 Fast,22%,67%,Proprietary,xAI
+Kimi K2 0905,24%,69%,Open Weights,Moonshot
+Llama 4 Maverick,23%,88%,Open Weights,Meta
+Llama Nemotron Super 498 v 1.5,16%,76%,Open Weights,Meta
+Magistral Medium 1.2,20%,60%,Proprietary,Mistral AI
+Qwen 3 25B A22 B507,22%,90%,Open Weights,Alibaba
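
Both index columns are stored as percent strings such as `48%`, so they need converting before they can be sorted or compared. The following is a minimal sketch of one way to do that, not part of the commit. The helper names `pct` and `load_scores` are hypothetical, and `SAMPLE` holds four rows copied from the file rather than the full data set.

```python
import csv
import io

# Four rows copied verbatim from the CSV in this commit (the full file has 17 data rows).
SAMPLE = """\
Model,Accuracy Index (Higher is better),Hallucination Index (Lower is better),Type,Owner
Claude 4.1 Opus,36%,48%,Proprietary,Anthropic
Gemini 2.5 Pro,37%,89%,Proprietary,Google
GPT-5 (high),39%,81%,Proprietary,OpenAI
Grok 4,39%,64%,Proprietary,xAI
"""


def pct(text):
    """Convert a percent string such as '48%' to the integer 48."""
    return int(text.strip().rstrip("%"))


def load_scores(handle):
    """Read the CSV from a file-like object into a list of dicts with numeric scores."""
    rows = []
    for row in csv.DictReader(handle):
        rows.append({
            "model": row["Model"],
            "accuracy": pct(row["Accuracy Index (Higher is better)"]),
            "hallucination": pct(row["Hallucination Index (Lower is better)"]),
            "type": row["Type"],
            "owner": row["Owner"],
        })
    return rows


scores = load_scores(io.StringIO(SAMPLE))

# Lower hallucination index is better, so sort ascending.
ranked = sorted(scores, key=lambda r: r["hallucination"])
for r in ranked:
    print(f'{r["model"]}: accuracy {r["accuracy"]}%, hallucination {r["hallucination"]}%')
```

To run this against the committed file instead of the sample, pass an open file handle to `load_scores`, opened with `newline=""` as the `csv` module recommends.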