Anthropic Revealed a Possible Path to AGI

Anthropic may have just published the closest thing we've seen to a roadmap for AGI. Not because it announced a new model, but because it explained how AI could eventually learn to improve AI.

Akshay Seth

20 Jun 2026 • 7 min read

The Anthropic Institute · In Charts

When AI Builds Itself

Anthropic is handing more of its own engineering and research to AI. The data says the loop is already starting to close.

80%+

of Anthropic's code now written by Claude (May 2026)

8×

more code merged per engineer vs 2024

~4 mo

how fast AI's task length now doubles

~14 B

code commits GitHub is on pace for in 2026

01 The idea

Recursive self-improvement

An AI capable enough to design and train the next, better version of itself. Anthropic says we are not there yet — and that it isn't inevitable. But the gap is closing from every direction at once, and the charts below are why they think it could arrive sooner than institutions are ready for.

02 How fast it's moving

Two years ago, minutes of work. Now, a full day.

Task length AI can finish on its own

log scale · METR / Anthropic

By April 2026, Mythos Preview could work 16+ hours — the upper edge of what METR can even measure.
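
To make the doubling-time figure concrete, here is a minimal Python sketch of the arithmetic. It assumes a constant 4-month doubling time over the whole two years, which the source does not claim (it only says task length "now" doubles about every 4 months), and the 15-minute starting point is a hypothetical value chosen for illustration, not a number from the charts.

```python
def growth_factor(elapsed_months: float, doubling_months: float) -> float:
    """Multiplicative growth after elapsed_months at a fixed doubling time."""
    return 2 ** (elapsed_months / doubling_months)


def projected_task_minutes(start_minutes: float,
                           elapsed_months: float,
                           doubling_months: float = 4.0) -> float:
    """Task length after elapsed_months, starting from start_minutes."""
    return start_minutes * growth_factor(elapsed_months, doubling_months)


# Two years at a 4-month doubling time is six doublings.
factor = growth_factor(24, 4)

# Hypothetical starting point: a 15-minute task two years ago.
hours_now = projected_task_minutes(15, 24) / 60

print(f"growth over 24 months: {factor:.0f}x")
print(f"15-minute task becomes: {hours_now:.0f} hours")
```

Under these assumptions, six doublings give a 64-fold increase, which is how "minutes of work" can turn into roughly a full day in two years.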

Speedup on a fixed code-optimization test

vs the starting code · same correctness checks

On this narrow task, Claude went from helpful to superhuman in under a year.

03 Inside the lab

Humans now direct and review. Claude writes.

Share of Anthropic's merged code written by Claude

before Claude Code → today

Leadership puts the all-in figure (scripts, experiments) at 90%+.

Success on the hardest, open-ended tasks

no clear spec · Claude Code sessions

Up 50 points in six months — on problems with no obvious answer.

Can the model pick a better next step than the human?

% of real research detours where its move was judged better

A stress test on detours where the human's choice had room to improve.

Three facts from the same period

~4 years

of human work delivered in one push: 800+ fixes that cut a class of API errors a thousandfold (April 2026).

⅓ of bugs

behind past production incidents would have been caught by an automated Claude review before shipping.

4× output

the median researcher's self-reported gain vs working with no AI at all (poll of 130 staff).

04 When AI runs the research

Handed an open safety problem, agents designed every experiment themselves.

How much of the target gap got closed

weak-supervises-strong project · % of floor-to-ceiling gap recovered

Humans still chose the problem and the scoring rubric — but the agents ran everything in between.
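
The chart's metric, "% of floor-to-ceiling gap recovered", can be sketched as a simple ratio. This is an illustrative reading, not the article's exact scoring code: it assumes the floor is the baseline score, the ceiling is the best achievable score, and the function name and all numbers below are hypothetical.

```python
def gap_recovered_pct(score: float, floor: float, ceiling: float) -> float:
    """Percent of the floor-to-ceiling gap that a given score recovers.

    0 means no better than the floor; 100 means the ceiling was reached.
    """
    if ceiling == floor:
        raise ValueError("floor and ceiling must differ")
    return 100.0 * (score - floor) / (ceiling - floor)


# Hypothetical numbers, for illustration only.
floor, ceiling = 60.0, 80.0   # baseline score and best-case score
agent_score = 75.0            # score reached by the agents' method

print(f"{gap_recovered_pct(agent_score, floor, ceiling):.0f}% of the gap recovered")
```

Normalizing this way makes results comparable across tasks whose raw scores sit on different scales.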

"The future is now."

An Anthropic researcher, on getting a week's worth of results back from Claude in a day or two — with, in their words, pretty minimal help.

05 The wider reality

It's not just Anthropic, and not just code.

1 B → 14 B

GitHub commits: ~1 billion in all of 2025, now on pace for ~14 billion in 2026. The platform is "pushing incredibly hard" just to keep up.

10,000+

high- and critical-severity software vulnerabilities found across major systems by Mythos Preview in its first weeks (Project Glasswing).

2 saturated

SWE-bench (real bug fixes) and CORE-Bench (reproducing research) both climbed from near-zero to near-perfect inside two years.

06 Three ways it could go

What happens next

The curve bends

Exponentials turn into S-curves. Compute, energy, or chips become the ceiling — but today's tools still spread widely.

Considered unlikely

Compounding gains

Development is largely automated; humans still set direction. Huge productivity — and real risks of misuse.

The likely path

It builds itself

AI gains the ingenuity to design its own successors. Pace is set by compute; humans move to oversight.

Hardest to predict

One caution runs through all three — Amdahl's law: speeding up one part of a process just shifts the bottleneck to whatever hasn't sped up. A lab can run at the speed of compute, but drug trials, elections, and trust still move at human pace.
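
Amdahl's law can be written as overall speedup = 1 / ((1 - p) + p / s), where p is the share of the process that gets faster and s is how much faster that share becomes. The short Python sketch below works through it. The 80% share and the speedup factors are hypothetical values picked for illustration, not figures from the article.

```python
def amdahl_speedup(accelerated_fraction: float, factor: float) -> float:
    """Overall speedup when only part of a process is accelerated.

    accelerated_fraction: share of total time that gets faster (0 to 1).
    factor: how many times faster that share becomes.
    """
    if not 0.0 <= accelerated_fraction <= 1.0:
        raise ValueError("accelerated_fraction must be between 0 and 1")
    if factor <= 0:
        raise ValueError("factor must be positive")
    remaining = 1.0 - accelerated_fraction
    return 1.0 / (remaining + accelerated_fraction / factor)


# Hypothetical case: 80% of a pipeline can be sped up.
for factor in (10, 100, 1_000_000):
    print(f"{factor:>9}x on 80% of the work -> "
          f"{amdahl_speedup(0.8, factor):.2f}x overall")
```

However large the factor gets, the overall speedup in this example stays below 5x, because the untouched 20% still takes as long as it always did. That is the bottleneck shift the caution describes.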

07 The ask

The world should have the option to slow down

Anthropic's position: safety research and institutions need a way to keep pace. The hard part isn't the will to stop — it's verification. Training runs are easier to hide than missile silos, and whoever quietly continues while others pause could inherit the lead.

A pause by one lab only changes who's ahead. A real one needs several frontier labs, in several countries, stopping under the same conditions — and able to check each other. The window to work that out is now.

Distilled from “When AI Builds Itself” — Marina Favaro & Jack Clark, The Anthropic Institute. Charts redrawn from the article's figures; values as of May–June 2026.
Source: anthropic.com/institute/recursive-self-improvement
