Anthropic Revealed a Possible Path to AGI

Anthropic may have just published the closest thing we've seen to a roadmap for AGI. Not because they announced a new model. But because they explained how AI could eventually learn to improve AI.

The Anthropic Institute · In Charts

When AI Builds Itself

Anthropic is handing more of its own engineering and research to AI. The data says the loop is already starting to close.

80%+
of Anthropic's code now written by Claude (May 2026)
more code merged per engineer vs 2024
~4 mo
how fast AI's task length now doubles
~14 B
code commits GitHub is on pace for in 2026

01 The idea

Recursive self-improvement

An AI capable enough to design and train the next, better version of itself. Anthropic says we are not there yet — and that it isn't inevitable. But the gap is closing from every direction at once, and the charts below are why they think it could arrive sooner than institutions are ready for.

02 How fast it's moving

Two years ago, minutes of work. Now, a full day.

Task length AI can finish on its own

log scale · METR / Anthropic

4 min (Mar 2024, Claude 3) · 90 min (Mar 2025, Sonnet 3.7) · 12 hr (Mar 2026, Opus 4.6)

By April 2026, Mythos Preview could work 16+ hours — the upper edge of what METR can even measure.
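
The "~4 months" doubling figure can be sanity-checked against the chart's own data points. What follows is a minimal sketch, assuming the three values above are read off the chart exactly and that growth between any two points is smoothly exponential. The function name `doubling_time_months` is illustrative and does not come from the source.

```python
import math

def doubling_time_months(start_minutes, end_minutes, months_elapsed):
    """Months per doubling, assuming steady exponential growth between two points."""
    doublings = math.log2(end_minutes / start_minutes)
    return months_elapsed / doublings

# Data points as quoted in the chart (task length converted to minutes).
recent = doubling_time_months(90, 12 * 60, 12)   # Mar 2025 -> Mar 2026
overall = doubling_time_months(4, 12 * 60, 24)   # Mar 2024 -> Mar 2026

print(f"last 12 months: {recent:.1f} months per doubling")
print(f"full 24 months: {overall:.1f} months per doubling")
```

Under these assumptions the most recent year works out to 4 months per doubling (90 minutes to 12 hours is three doublings), in line with the headline figure, while the full two-year span comes out closer to 3.2 months.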

Speedup on a fixed code-optimization test

vs the starting code · same correctness checks

skilled human ≈ 4× · ~3× (May 2025, Opus 4) · ~52× (Apr 2026, Mythos Preview)

On this narrow task, Claude went from helpful to superhuman in under a year.

03 Inside the lab

Humans now direct and review. Claude writes.

Share of Anthropic's merged code written by Claude

before Claude Code → today

<5% (early 2025) · 80%+ (May 2026)

Leadership puts the all-in figure (scripts, experiments) at 90%+.

Success on the hardest, open-ended tasks

no clear spec · Claude Code sessions

26% (Nov 2025) · 76% (May 2026)

Up 50 points in six months — on problems with no obvious answer.

Can the model pick a better next step than the human?

% of real research detours where its move was judged better

even with human = 50% · 51% (Opus 4.5, Nov '25) · 64% (Mythos, Apr '26)

A stress test on detours where the human's choice had room to improve.

Three facts from the same period

 

~4 years
of human work delivered in one push: 800+ fixes that cut a class of API errors a thousandfold (April 2026).
⅓ of bugs
behind past production incidents would have been caught by an automated Claude review before shipping.
4× output
the median researcher's self-reported gain vs working with no AI at all (poll of 130 staff).

04 When AI runs the research

Handed an open safety problem, agents designed every experiment themselves.

How much of the target gap got closed

weak-supervises-strong project · % of floor-to-ceiling gap recovered

23% (2 researchers, ~1 week) · 97% (agents, 800 hrs, ~$18k compute)

Humans still chose the problem and the scoring rubric — but the agents ran everything in between.
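
The "% of floor-to-ceiling gap recovered" metric can be written down directly. This is a minimal sketch, assuming the floor is the score of the weak supervisor and the ceiling is the score the strong model reaches with ideal supervision (a common reading of a weak-supervises-strong setup, not something the article spells out). The scores below are made-up placeholders chosen to illustrate the formula; they are not figures from the article.

```python
def gap_recovered(score, floor, ceiling):
    """Fraction of the floor-to-ceiling gap closed by `score`."""
    if ceiling <= floor:
        raise ValueError("ceiling must be above floor")
    return (score - floor) / (ceiling - floor)

# Hypothetical benchmark scores, for illustration only.
floor, ceiling = 60.0, 80.0
human_team = gap_recovered(64.6, floor, ceiling)
agents = gap_recovered(79.4, floor, ceiling)

print(f"human team: {human_team:.0%} of the gap recovered")
print(f"agents:     {agents:.0%} of the gap recovered")
```

The point of normalizing this way is that a result is judged against how much room there was to improve, not against the raw score.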

"The future is now."

An Anthropic researcher, on getting a week's worth of results back from Claude in a day or two — with, in their words, pretty minimal help.

05 The wider reality

It's not just Anthropic, and not just code.

1 B → 14 B
GitHub commits: ~1 billion in all of 2025, now on pace for ~14 billion in 2026. The platform is "pushing incredibly hard" just to keep up.
10,000+
high- and critical-severity software vulnerabilities found across major systems by Mythos Preview in its first weeks (Project Glasswing).
2 saturated
SWE-bench (real bug fixes) and CORE-Bench (reproducing research) both climbed from near-zero to near-perfect inside two years.

06 Three ways it could go

What happens next

I · The curve bends (considered unlikely)

Exponentials turn into S-curves. Compute, energy, or chips become the ceiling, but today's tools still spread widely.

II · Compounding gains (the likely path)

Development is largely automated; humans still set direction. Huge productivity, and real risks of misuse.

III · It builds itself (hardest to predict)

AI gains the ingenuity to design its own successors. Pace is set by compute; humans move to oversight.

One caution runs through all three — Amdahl's law: speeding up one part of a process just shifts the bottleneck to whatever hasn't sped up. A lab can run at the speed of compute, but drug trials, elections, and trust still move at human pace.
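
Amdahl's law makes that caution concrete. Here is a minimal sketch, assuming a process in which a fraction `p` of the total time is sped up by a factor `s` while the rest is unchanged. The 80% share and the speedup factors are illustrative inputs, not estimates from the article.

```python
def amdahl_speedup(p, s):
    """Overall speedup when a fraction p of the work runs s times faster."""
    if not 0.0 <= p <= 1.0 or s <= 0:
        raise ValueError("need 0 <= p <= 1 and s > 0")
    return 1.0 / ((1.0 - p) + p / s)

# Illustrative only: suppose 80% of a pipeline can be accelerated.
for s in (2, 10, 100, 1_000_000):
    print(f"part sped up {s}x -> whole process {amdahl_speedup(0.8, s):.2f}x")
```

However large `s` gets, the overall speedup stays below 1 / (1 - p), which is 5× in this example: the 20% that still moves at its old pace sets the ceiling.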

07 The ask

The world should have the option to slow down

Anthropic's position: safety research and institutions need a way to keep pace. The hard part isn't the will to stop — it's verification. Training runs are easier to hide than missile silos, and whoever quietly continues while others pause could inherit the lead.

A pause by one lab only changes who's ahead. A real one needs several frontier labs, in several countries, stopping under the same conditions — and able to check each other. The window to work that out is now.

Distilled from “When AI Builds Itself” — Marina Favaro & Jack Clark, The Anthropic Institute. Charts redrawn from the article's figures; values as of May–June 2026.
Source: anthropic.com/institute/recursive-self-improvement
