AI Flash

Google DeepMind's 'AI Co-Mathematician' Outperforms GPT-5.5 Pro, Solves Decades-Old Problem

May 9, 2026
Google DeepMind has unveiled the AI Co-Mathematician, a multi-agent interactive research workbench designed to collaborate with human mathematicians on open-ended research problems. The system has set a new state of the art by scoring 47.9% (solving 23 of 48 problems) on FrontierMath Tier 4, the most challenging research-level mathematics benchmark available. This performance significantly surpasses the previous record of 39.6%, held by GPT-5.5 Pro.
Notably, the AI Co-Mathematician is built on the Gemini 3.1 Pro foundation model, which on its own scores only 19% on the same benchmark. The dramatic performance increase is attributed to a sophisticated multi-agent architecture. A top-level "Project Coordinator" agent decomposes complex research tasks into parallel workstreams, delegating them to specialized sub-agents for literature review, code execution, and logical inference. Proposed proofs are then subjected to a rigorous review process by multiple "Reviewer Agents" before final submission. This demonstrates that, for top-tier mathematical reasoning, advanced system orchestration can yield greater capability gains than simply upgrading the underlying model.
The benchmark evaluation was conducted in a blind test by Epoch AI, where DeepMind's team had no access to the questions. Each problem was allotted a 48-hour runtime. In addition to topping the leaderboard, the system successfully solved three problems that had stumped all previous AI models.
The system is designed to function more as a creative research partner than a simple tool. In a real-world application, group theory expert Marc Lackenby used the AI Co-Mathematician to solve a long-standing open conjecture from the Kourovka Notebook. The system's initial proof strategy was flagged as "flawed" by its own internal review agents. However, Lackenby recognized a clever insight within the rejected approach, manually filled the logical gap, and completed the proof, showcasing the power of human-AI collaboration.
Currently, the AI Co-Mathematician is in a limited early-access phase, available to a small group of mathematicians for testing.