Google's Gemini 3 Pro Crushes Benchmarks: 91.9% GPQA Diamond, 95% AIME, 31.1% ARC-AGI-2
Google's Gemini 3 Pro sets new records across reasoning, math, and science. Deep Think mode hits 93.8% on GPQA Diamond and 45.1% on ARC-AGI-2. Full benchmark data and evaluation results are now available.