TL;DR

Donald Knuth, the 88-year-old creator of TeX and author of The Art of Computer Programming, published a paper called “Claude’s Cycles” after Anthropic’s Claude Opus 4.6 found a construction for an open graph theory problem he’d been stuck on for weeks. The AI arrived at it in 31 guided explorations over about an hour. Knuth then wrote the formal proof himself. His opening words: “Shock! Shock!” The paper pulled 635,000 views in hours and reignited the debate about what AI can actually contribute to mathematics.

Why This Story Matters

If a random AI benchmark went up by 3%, nobody would care. But Donald Knuth calling an AI result a “dramatic advance in automatic deduction and creative problem solving” is different. This is the person who literally wrote the book on algorithms — four volumes of it, spanning six decades. He’s been openly skeptical of large language models. In 2023, he dismissed ChatGPT as producing “plausible-sounding but incorrect” outputs.

So when Knuth opens a paper with “Shock! Shock!” and closes it with “Hats off to Claude!” — that’s a signal worth paying attention to.

The Problem: Hamiltonian Cycles in 3D Grids

Here’s the setup. Take a directed graph with m³ vertices arranged in a three-dimensional grid. Each vertex is labeled (i, j, k), where each coordinate runs from 0 to m−1. From every vertex, exactly three arcs leave, one per coordinate: each arc adds 1 to i, j, or k, wrapping around modulo m. The question: can you split all 3m³ arcs into exactly three Hamiltonian cycles, closed routes that visit every vertex exactly once before returning to the start?
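To make the setup concrete, here is a minimal Python sketch of the graph. It assumes that the three arcs leaving (i, j, k) each add 1 to one coordinate modulo m, which matches the increment rule described later in this article. The names `successors` and `build_digraph` are illustrative and do not come from the paper.

```python
from itertools import product

def successors(v, m):
    """Heads of the three arcs leaving v: bump i, j, or k by 1 mod m (assumed arc rule)."""
    i, j, k = v
    return [((i + 1) % m, j, k), (i, (j + 1) % m, k), (i, j, (k + 1) % m)]

def build_digraph(m):
    """Map every vertex of the m x m x m grid to its three successors."""
    return {v: successors(v, m) for v in product(range(m), repeat=3)}

graph = build_digraph(3)
num_arcs = sum(len(heads) for heads in graph.values())
print(len(graph), num_arcs)  # 27 vertices and 81 arcs for m = 3
```

A Hamiltonian cycle uses m³ arcs, so three arc-disjoint ones use exactly 3m³, which is every arc in the graph. That is why the question is about a decomposition and not just about finding three cycles.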

Knuth had solved the 3×3×3 case by hand. His colleague Filip Stappers had computationally verified solutions up to 16×16×16. But nobody had a general construction — a formula that works for any odd value of m. This was earmarked for a future volume of The Art of Computer Programming, and Knuth had been grinding on it for weeks without cracking the general case.

What Claude Actually Did

Stappers fed the exact problem statement to Claude Opus 4.6 and ran 31 guided explorations over roughly one hour.

Claude’s process was messy and iterative — the opposite of a clean mathematical proof. It tried linear formulas. It attempted brute-force searches. It built geometric frameworks, applied simulated annealing, hit dead ends, changed strategies, and kept going. At one point, it independently recognized the graph’s structure as a Cayley digraph and reformulated its approach accordingly. Nobody told it to do that.

On the 31st try, Claude identified a set of simple rules based on s = (i + j + k) mod m. Depending on the value of s, you increment either i, j, or k following specific conditions. Claude called it a “serpentine” pattern.

The construction turned out to correspond to the classical modular m-ary Gray code — a known structure in combinatorics. Claude didn’t know that. It derived the construction from scratch through the problem constraints alone.
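For readers who have not met it, the modular m-ary Gray code can be sketched in a few lines. The version below uses the standard digit-difference formula and checks that consecutive words, including the wrap-around from last to first, differ by adding 1 (mod m) to exactly one coordinate. It shows only the classical code, which yields a single Hamiltonian cycle; it does not reproduce the correspondence with Claude's construction, and the helper names are illustrative.

```python
def modular_gray(n, m, digits=3):
    """n-th word of the modular m-ary Gray code, most significant digit first."""
    a = [(n // m**p) % m for p in range(digits)]   # base-m digits, least significant first
    a.append(0)                                     # sentinel above the top digit
    g = [(a[p] - a[p + 1]) % m for p in range(digits)]
    return tuple(reversed(g))

def is_hamiltonian_cycle(words, m):
    """True if words cover all m**3 vertices once and each cyclic step bumps one coordinate."""
    if len(words) != m**3 or len(set(words)) != m**3:
        return False
    for u, v in zip(words, words[1:] + words[:1]):
        diffs = sorted((y - x) % m for x, y in zip(u, v))
        if diffs != [0, 0, 1]:
            return False
    return True

m = 3
code = [modular_gray(n, m) for n in range(m**3)]
print(code[:5])  # [(0, 0, 0), (0, 0, 1), (0, 0, 2), (0, 1, 2), (0, 1, 0)]
print(is_hamiltonian_cycle(code, m))  # True
```

One Gray-code cycle exists for every m, even or odd. The hard part of the problem is fitting three such cycles together so that they share no arc.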

Stappers tested it for every odd m up to 101. It worked every time.
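A check of this kind is easy to sketch. The article does not list the exact conditions, so the `directions` table below is an assumption: one rule of the described shape (keyed on s and on boundary values of i and j), not a quotation of Claude's output or of Knuth's paper. The checker does not take the rule on trust. It follows each of the three cycles from (0, 0, 0) and confirms that each one visits all m³ vertices. The sweep bound is kept small here for speed.

```python
def directions(i, j, k, m):
    """ASSUMED rule (not quoted from the paper): cycle c bumps coordinate d[c]."""
    s = (i + j + k) % m
    if s == 0:
        return "012" if j == m - 1 else "210"
    if s == m - 1:
        return "120" if i > 0 else "210"
    return "201" if i == m - 1 else "102"

def follow_cycle(m, c):
    """Walk cycle c from (0, 0, 0) until it returns there (or clearly never will)."""
    start = (0, 0, 0)
    v, visited = start, []
    while len(visited) <= m**3:
        visited.append(v)
        axis = int(directions(*v, m)[c])   # 0, 1, 2 means bump i, j, k
        w = list(v)
        w[axis] = (w[axis] + 1) % m
        v = tuple(w)
        if v == start:
            break
    return visited

def is_decomposition(m):
    """True if all three walks are Hamiltonian: m**3 distinct vertices each."""
    # Every value of `directions` is a permutation of "012", so the three
    # cycles never share an arc; only the Hamiltonian property needs checking.
    return all(
        len(walk) == m**3 and len(set(walk)) == m**3
        for walk in (follow_cycle(m, c) for c in range(3))
    )

results = {m: is_decomposition(m) for m in range(3, 16, 2)}
print(results)              # every odd m from 3 to 15 maps to True
print(is_decomposition(4))  # False: this rule breaks down for even m
```

Raising the upper bound of the `range` call to 102 would mirror the sweep up to m = 101, at the cost of a much longer run in pure Python. Passing such a sweep is evidence, not proof, which is exactly the gap the next section is about.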

What Claude Did Not Do

Claude did not prove anything. It found a pattern. Knuth proved it.

That distinction matters. Finding a construction that appears to work through guided search is non-trivial — but it’s a different activity from producing a rigorous mathematical proof. Knuth read Claude’s output, verified the construction, generalized it, and wrote the formal proof himself.

He also went further. By setting up an exact cover problem over the 11,502 Hamiltonian cycles in the 3×3×3 case, Knuth found 4,554 valid decompositions. Of those, 760 are generalizable — meaning 760 “Claude-like” constructions hold for all odd m > 1. Claude found one of 760. Knuth checked several others and “didn’t encounter any that were actually nicer.”

The Human in the Loop

This was not autonomous AI mathematics. Stappers provided continuous guidance throughout the session: prompting Claude to document intermediate results, redirecting when progress stalled, maintaining focus on the generalization goal. A session error even caused the loss of some earlier output, requiring partial restarts.

Think of it as a very capable graduate student who can explore dozens of approaches per hour but needs an advisor to keep the research on track. The advisor still matters.

What About Even Numbers?

The even case (m = 2, 4, 6, …) is still open in general. The case m = 2 was proved impossible back in 1982. Claude got stuck when pushed toward even m and made no meaningful progress. A different researcher used GPT-5.3 Codex to make progress on even cases for m ≥ 8, but no general construction exists.
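The m = 2 claim is small enough to check exhaustively. The sketch below, again assuming that each arc bumps one coordinate modulo m, enumerates every directed Hamiltonian cycle of the 2×2×2 graph by depth-first search and then looks for three of them that share no arc. It is a brute-force illustration with hypothetical helper names, not the 1982 argument, and it says nothing about larger even m, where this approach quickly becomes impractical.

```python
from itertools import combinations

def successors(v, m):
    """Heads of the three arcs leaving v (assumed arc rule: bump one coordinate mod m)."""
    i, j, k = v
    return [((i + 1) % m, j, k), (i, (j + 1) % m, k), (i, j, (k + 1) % m)]

def hamiltonian_cycles(m):
    """All directed Hamiltonian cycles, each returned as a frozenset of arcs."""
    n, start, found = m**3, (0, 0, 0), []

    def extend(path, visited):
        for w in successors(path[-1], m):
            if w == start and len(path) == n:
                # Close the cycle and record its arcs.
                found.append(frozenset(zip(path, path[1:] + [start])))
            elif w not in visited:
                visited.add(w)
                path.append(w)
                extend(path, visited)
                path.pop()
                visited.remove(w)

    extend([start], {start})
    return found

cycles = hamiltonian_cycles(2)
# Three pairwise arc-disjoint 8-arc cycles would cover all 24 arcs.
decompositions = [
    t for t in combinations(cycles, 3)
    if not (t[0] & t[1]) and not (t[0] & t[2]) and not (t[1] & t[2])
]
print(len(cycles), len(decompositions))  # 12 cycles, 0 decompositions
```

Fixing the start vertex means each directed cycle is found exactly once. The same idea, picking arc-disjoint cycles that together cover every arc, is what an exact cover formulation expresses for larger cases.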

If you want an open problem to throw at your favorite AI model, this is it.

Why Knuth’s Reaction Changes the Conversation

Knuth’s skepticism toward AI has been well-documented and well-reasoned. He hasn’t been dismissive in the way that some academics are — he’s engaged with the technology directly. He’s sent questions to ChatGPT, evaluated the outputs, and published his assessments. His conclusion until now: these models produce fluent text but don’t genuinely reason.

“Claude’s Cycles” is his first public acknowledgment that something has shifted. He wrote: “It seems that I’ll have to revise my opinions about ‘generative AI’ one of these days.” And then the Claude Shannon reference at the end — “Claude Shannon’s spirit is probably proud to know that his name is now being associated with such advances” — reads like genuine respect, not grudging concession.

When a skeptic this calibrated changes their position, the update should be larger than when an AI enthusiast confirms their prior.

What This Means for AI in Research

The “Claude’s Cycles” pattern — human poses problem, AI explores structures, human writes proof — is probably the template for AI-assisted mathematics going forward. It’s not the Hollywood version where the machine solves everything autonomously. It’s more like having a tireless collaborator who can test thousands of approaches while you sleep.

A few things become clear from this case:

AI is good at combinatorial exploration. Claude tested and discarded dozens of approaches in an hour. A human mathematician might try three or four in a day. The raw search throughput matters when the solution space is large enough.

AI still needs human direction. Without Stappers guiding the session, Claude would likely have wandered. The 31-exploration trajectory wasn’t random — it was steered.

AI doesn’t know what it’s rediscovering. Claude derived the m-ary Gray code from scratch without recognizing it. That’s both impressive (it found the right structure) and limiting (it couldn’t use existing knowledge about that structure to speed up the search).

Proof is still a human job. Claude’s construction worked empirically. Knuth’s proof made it mathematics. Those are different things, and AI is improving at finding patterns faster than it is improving at proving them.

The Lean 4 Follow-Up

One researcher has already started formalizing Knuth’s proof in Lean 4, the interactive theorem prover. The goal: prove in machine-checked logic that for every odd m ≥ 3, the Cayley digraph on m³ vertices decomposes into exactly three directed Hamiltonian cycles. If completed, this would create a fully verified pipeline — AI finds the construction, human writes the proof, machine verifies the proof. Each step catches different kinds of errors.

FAQ

Did Claude Opus 4.6 actually prove the theorem?

No. Claude found a construction that works — a pattern for decomposing the graph into Hamiltonian cycles. Donald Knuth wrote the mathematical proof showing why that construction is valid for all odd m. Finding a pattern and proving it are different contributions.

How long had Knuth been working on this problem?

Several weeks, by his own account. The problem was related to material earmarked for a future volume of The Art of Computer Programming. His colleague Filip Stappers had been running computational searches on it as well.

Can I read the paper?

Yes. “Claude’s Cycles” is available as a PDF on Knuth’s Stanford faculty page.

Is this the first time AI has solved an open math problem?

It’s not the first, but it’s the highest-profile case by a wide margin. What makes it stand out is Knuth’s stature, his prior skepticism, and the fact that he named the paper after the AI model. Previous AI math results (like DeepMind’s work on knot theory in 2021) were significant but didn’t carry the same cultural weight.

What model was used?

Claude Opus 4.6, Anthropic’s hybrid reasoning model released in February 2026. The session involved 31 guided explorations over approximately one hour.

Bottom Line

“Claude’s Cycles” isn’t proof that AI can replace mathematicians. It’s proof that AI can contribute to mathematics in ways that the best minds in the field now take seriously. When Donald Knuth — a person who has spent 60 years at the absolute frontier of computer science — says he needs to revise his opinions about generative AI, the rest of us should probably pay attention.

The construction Claude found was one of 760 valid approaches, and Knuth checked several of the others without finding one that was nicer. It required human guidance to discover and human expertise to prove. But it was correct, and Knuth hadn’t found it himself in weeks of trying. That’s the part that matters.