Gemini 3 Pro
Best for complex tasks and bringing creative concepts to life
Learn, plan, and build like never before with Gemini 3 Pro’s incredible reasoning powers
Our most intelligent model yet
Partner with a pro
With state-of-the-art reasoning and multimodal capabilities
Get started
Build with Gemini 3
Hands-on
Explore what you can do with Gemini 3 Pro
Performance
Gemini 3 is state-of-the-art across a wide range of benchmarks
Our most intelligent model yet sets a new bar for AI model performance.
| Benchmark | Notes | Gemini 3 Pro | Gemini 2.5 Pro | Claude Sonnet 4.5 | GPT-5.1 |
|---|---|---|---|---|---|
| Academic reasoning: Humanity's Last Exam | No tools | 37.5% | 21.6% | 13.7% | 26.5% |
| Humanity's Last Exam | With search and code execution | 45.8% | — | — | — |
| Visual reasoning puzzles: ARC-AGI-2 | ARC Prize Verified | 31.1% | 4.9% | 13.6% | 17.6% |
| Scientific knowledge: GPQA Diamond | No tools | 91.9% | 86.4% | 83.4% | 88.1% |
| Mathematics: AIME 2025 | No tools | 95.0% | 88.0% | 87.0% | 94.0% |
| AIME 2025 | With code execution | 100.0% | — | 100.0% | — |
| Challenging math contest problems: MathArena Apex | — | 23.4% | 0.5% | 1.6% | 1.0% |
| Multimodal understanding and reasoning: MMMU-Pro | — | 81.0% | 68.0% | 68.0% | 76.0% |
| Screen understanding: ScreenSpot-Pro | — | 72.7% | 11.4% | 36.2% | 3.5% |
| Information synthesis from complex charts: CharXiv Reasoning | — | 81.4% | 69.6% | 68.5% | 69.5% |
| OCR: OmniDocBench 1.5 | Overall edit distance, lower is better | 0.115 | 0.145 | 0.145 | 0.147 |
| Knowledge acquisition from videos: Video-MMMU | — | 87.6% | 83.6% | 77.8% | 80.4% |
| Competitive coding problems: LiveCodeBench Pro | Elo rating, higher is better | 2,439 | 1,775 | 1,418 | 2,243 |
| Agentic terminal coding: Terminal-Bench 2.0 | Terminus-2 agent | 54.2% | 32.6% | 42.8% | 47.6% |
| Agentic coding: SWE-Bench Verified | Single attempt | 76.2% | 59.6% | 77.2% | 76.3% |
| Agentic tool use: τ2-bench | — | 85.4% | 54.9% | 84.7% | 80.2% |
| Long-horizon agentic tasks: Vending-Bench 2 | Net worth (mean), higher is better | $5,478.16 | $573.64 | $3,838.74 | $1,473.43 |
| Held-out internal grounding, parametric, multimodal, and search retrieval benchmarks: FACTS Benchmark Suite | — | 70.5% | 63.4% | 50.4% | 50.8% |
| Parametric knowledge: SimpleQA Verified | — | 72.1% | 54.5% | 29.3% | 34.9% |
| Multilingual Q&A: MMMLU | — | 91.8% | 89.5% | 89.1% | 91.0% |
| Commonsense reasoning across 100 languages and cultures: Global PIQA | — | 93.4% | 91.5% | 90.1% | 90.9% |
| Long-context performance: MRCR v2 (8-needle) | 128k (average) | 77.0% | 58.0% | 47.1% | 61.6% |
| MRCR v2 (8-needle) | 1M (pointwise) | 26.3% | 16.4% | not supported | not supported |
For details on our evaluation methodology, please see deepmind.google/models/evals-methodology/gemini-3-pro
Model information
- Name: Gemini 3 Pro
- Status: Preview
- Input: Text, Image, Video, Audio
- Output: Text
- Input tokens: 1M
- Output tokens: 64k
- Knowledge cutoff: January 2025
- Tool use: Function calling, Structured output, Search as a tool, Code execution
- Best for: Agentic tasks, Advanced coding, Long context understanding, Multimodal understanding, Algorithmic development
- Availability: Gemini App, Google Cloud / Vertex AI, Google AI Studio, Gemini API, Google AI Mode, Google Antigravity
- Documentation: View developer docs
- Model card: View model card
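The spec sheet above lists availability through the Gemini API with text, image, video, and audio input. As a minimal sketch of what a request looks like, the snippet below builds the JSON body for the REST `generateContent` endpoint with a single text turn; the model id `gemini-3-pro-preview` is an assumption for the preview release, so check the developer docs for the exact string before use.

```python
import json

# Assumed model id for the preview release; verify against the developer docs.
MODEL = "gemini-3-pro-preview"
ENDPOINT = (
    "https://generativelanguage.googleapis.com/v1beta/models/"
    f"{MODEL}:generateContent"
)

# Minimal generateContent request body: one user turn containing one text part,
# with an output cap well under the model's 64k output-token limit.
body = {
    "contents": [
        {
            "role": "user",
            "parts": [{"text": "Explain long-context retrieval in two sentences."}],
        }
    ],
    "generationConfig": {"maxOutputTokens": 1024},
}

print(ENDPOINT)
print(json.dumps(body, indent=2))
```

In practice you would POST this body to the endpoint with an `x-goog-api-key` header, or use one of the official client SDKs, which wrap the same request shape.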