This is what takeoff feels like. Anthropic and OpenAI have been explicit about their intention to create an intelligence explosion, and employees at both companies have recently confirmed that their models are significantly accelerating their own development.
This week we’ll talk about what that means, considering the trajectory of future progress, our increasing inability to measure the capabilities and risks of the frontier models, and some ideas for how humanity can successfully navigate what is coming.
Top pick
On Recursive Self-Improvement
The intelligence explosion has begun: AI is meaningfully accelerating its own development. Dean Ball considers what’s happening now and where we’re headed soon.
America’s major frontier AI labs have begun automating large fractions of their research and engineering operations. The pace of this automation will grow during the course of 2026, and within a year or two the effective “workforces” of each frontier lab will grow from the single-digit thousands to tens of thousands, and then hundreds of thousands.[…]
Policymakers would be wise to take especially careful notice of this issue over the coming year or so. But they should also keep the hysterics to a minimum: yes, this really is a thing from science fiction that is happening before our eyes, but that does not mean we should behave theatrically, as an actor in a movie might. Instead, the challenge now is to deal with the legitimately sci-fi issues we face using the comparatively dull idioms of technocratic policymaking.
My writing
A Closer Look at the “Societies of Thought” Paper
A fascinating recent paper argues that reasoning models use internal dialogue to make better decisions. I look at what they found, how they found it, and what it does (and doesn’t) mean.
New releases
Claude Opus 4.6
Anthropic has released Claude Opus 4.6, with strong improvements in all the usual places. Plus, two very interesting new options (at premium prices): a 1 million token context window and a substantially faster version of the model.
GPT-5.3-Codex
OpenAI just released GPT-5.3-Codex, which looks to be a significant upgrade over 5.2 (released just two months ago). Related: I expect we’ll see ChatGPT 5.3 very soon, likely this week.
Opus 4.6, Codex 5.3, and the post-benchmark era
Nathan Lambert shares some thoughts after spending time with both Opus 4.6 and Codex 5.3. He still prefers Opus, but the gap has narrowed. My take: both models are excellent—if coding is important to you, you should try both and see which works best for you.
OpenAI Trusted Access for Cyber
All the big models have reached or are very close to reaching dangerous cybersecurity capability levels. With that comes a very hard, very important problem: how do you let people use the defensive capabilities of those models without enabling bad actors to leverage their offensive capabilities? OpenAI is rolling out Trusted Access for Cyber, a program that gives trusted users greater access to dual-use cyber capabilities. Seems like a great idea, but hard to execute well at scale.
Kimi K2.5
Moonshot AI has released Kimi K2.5—possibly the best open model available. Zvi takes a detailed look. There aren’t a lot of surprises here: it’s an excellent model, they’ve apparently put very little effort into safety, and Chinese open models continue to lag the frontier by 6–12 months. You could probably argue they’ve fallen a little further behind lately, but that’s very hard to quantify.
OpenAI Agent Builder
OpenAI describes Agent Builder as “a visual canvas for building multi-step agent workflows.” I haven’t yet had a chance to take it for a spin, but it sounds great for some workflows. (But see Minh Pham’s thoughts about the Bitter Lesson below).
Agents!
More thoughts on OpenClaw and security
Rahul Sood has further thoughts about the security implications of OpenClaw.
Zvi reports on OpenClaw
No surprises: it’s very cool, but not ready for prime time. If you’re gonna try it out for fun or learning, make sure your security game is top-notch.
Related: Zvi is running a weekly series on Claude Code. Well worth your time if you’re using it regularly.
Nicholas Carlini’s robots build a C compiler
Here’s a nice data point on the very impressive capabilities (and significant limitations) of coding agents. Nicholas Carlini uses $20,000 worth of tokens (good thing he works at Anthropic!) to have agents semi-autonomously build a 100,000-line C compiler that can compile the Linux kernel. It’s a very impressive achievement, and far beyond what most humans could have done in that time. But also: it’s not production-ready, and the agents can’t quite seem to get it there.
Best Practices for Claude Code
Anthropic’s Best Practices for Claude Code contains almost everything I’ve personally found useful from all the guides I’ve linked to over the last few weeks.
Most best practices are based on one constraint: Claude’s context window fills up fast, and performance degrades as it fills.
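To make that constraint concrete, here’s a minimal sketch (my own illustration, not anything from Anthropic’s guide; the window size, threshold, and characters-per-token heuristic are all assumptions) of the kind of budgeting those practices are working around: estimate how much of the window your accumulated context is using, and compact or start fresh before quality degrades.

```python
# Illustrative only: the window size, threshold, and 4-chars-per-token
# heuristic are assumptions, not Claude Code's actual internals.

CONTEXT_WINDOW_TOKENS = 200_000   # assumed window size for illustration
COMPACT_THRESHOLD = 0.7           # assumed point where quality starts to suffer

def estimate_tokens(text: str) -> int:
    """Rough token estimate: roughly 4 characters per token for English text."""
    return len(text) // 4

def should_compact(transcript: list[str]) -> bool:
    """Return True once the accumulated transcript nears the window limit."""
    used = sum(estimate_tokens(chunk) for chunk in transcript)
    return used / CONTEXT_WINDOW_TOKENS >= COMPACT_THRESHOLD

# Usage: check before each new request; if True, summarize the transcript
# (or start a fresh session) instead of piling on more context.
transcript = ["...earlier tool output...", "...file contents...", "...discussion..."]
if should_compact(transcript):
    print("Context nearly full: compact or start a new session.")
```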
Command line essentials
If you want to use Claude Code but are intimidated by having to use the command line (or want to better understand what your agent is doing), Ado has a nice guide to command line essentials for using agents.
Benchmarks, capabilities, and forecasts
AxiomProver
AxiomProver is back, this time with what they claim is “the first time an AI system has settled an unsolved research problem in theory-building math”.
How close is AI to taking my job?
We have a benchmark crisis: many existing benchmarks are saturated, and it’s hard and expensive to create new evaluations that challenge the frontier models. Epoch’s Anson Ho takes a different approach—instead of creating a formal new benchmark, he asked AI to tackle a couple of his recent work projects. Did they succeed? No, but the nature of their failure is informative.
Codex builds itself
OpenAI is also riding the recursive self-improvement rocket:
Codex now pretty much builds itself, with the help and supervision of a great team. The bottleneck has shifted to being how fast we can help and supervise the outcome.
A new math benchmark
The New York Times talks to a group of mathematicians who are putting together a new benchmark based on open questions in their current research ($).
Are we dead yet?
We are not prepared
Great post from Chris Painter that explains an increasingly serious challenge for AI safety:
My bio says I work on AGI preparedness, so I want to clarify:
We are not prepared.
Over the last year, dangerous capability evaluations have moved into a state where it’s difficult to find any Q&A benchmark that models don’t saturate.
AI manipulation
AI manipulation doesn’t get as much press as biosecurity or cyberwarfare, but there are good reasons to worry about AI manipulating humans. An AI with superhuman persuasion can enable authoritarian rule, cause social chaos, or simply take over the world. AI Policy Perspectives interviews Sasha Brown, Seliem El-Sayed, and Canfer Akbulut about their work studying AI manipulation. Lots of good thoughts about what AI manipulation is, why you should worry about it, and how to study it.
Jobs and the economy
What is the impact of AI on productivity?
How much does AI actually increase worker productivity? And are we seeing evidence of that in economic productivity statistics? Alex Imas looks at the evidence so far.
Here is the summary of the evidence thus far: we now have a growing body of micro studies showing real productivity gains from generative AI. However, the productivity impact of AI has yet to clearly show up in the aggregate data.
Strategy and politics
Three really good ideas from Forethought
Forethought has posted three really good thought pieces:
Design sketches for a more sensible world proposes some tools for improving humanity’s epistemic capability—and therefore, our ability to make good decisions.
The intelligence explosion convention proposes a governance strategy for navigating the beginning of the intelligence explosion, lightly inspired by the formation of the UN.
International AI projects should promote differential AI development argues for differential acceleration (d/acc, or differentially accelerating the development of safer AI technologies relative to more dangerous ones) in international AI projects.
There are lots of good ideas here, and they’re all worth reading. As written, however, I think they all have the same fatal flaw. As it is written in the ancient scrolls:
Everyone will not just
If your solution to some problem relies on “If everyone would just…” then you do not have a solution. Everyone is not going to just. At [no] time in the history of the universe has everyone just, and they’re not going to start now.
Figuring out what everyone should do is (relatively) easy. Figuring out how to get them to do it is the hard but vital part.
Industry news
High-Bandwidth Memory: The Critical Gaps in US Export Controls
High-bandwidth memory (HBM) is a critical part of AI computing hardware, but doesn’t get as much attention as the processors (GPUs) themselves. AI Frontiers explains how HBM works and looks at some critical gaps in US export controls.
Compute expenditures at US and Chinese AI companies
Epoch estimates the percentage of expenses that goes to compute at the big labs. It’s well over 50% in both the US and China.
Technical
As Rocks May Think
This sprawling beast of an essay by Eric Jang takes a thoughtful look at some recent major changes in model architecture and capabilities. Plus speculation about where AI is headed, and a status report on the author’s project to build an open source version of AlphaGo, and… there’s a whole lot here. Long and semi-technical, but very good.
Why Most Agent Harnesses Are Not Bitter Lesson Pilled
Minh Pham has thoughts on the implications of the Bitter Lesson for building agent harnesses:
In 2026 terms: if your “agent harness” primarily scales by adding more human-authored structure, it is probably fighting the Bitter Lesson.
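A rough way to picture the distinction (my own sketch, not Pham’s code; call_model and the tools are hypothetical placeholders): one harness scales by adding more human-authored stages, the other hands the model a goal and a generic tool loop and scales mainly with better models.

```python
# Hypothetical sketch of the two designs; call_model() and the tools are
# placeholders, not any real API.

def call_model(prompt: str) -> str:
    """Placeholder for an LLM call; returns a canned reply so the sketch runs."""
    return "DONE"

# 1) Structure-heavy harness: humans encode the workflow as fixed stages.
def scaffolded_agent(task: str) -> str:
    plan = call_model(f"Write a step-by-step plan for: {task}")
    code = call_model(f"Implement this plan:\n{plan}")
    review = call_model(f"Review this code for bugs:\n{code}")
    return call_model(f"Apply this review:\n{review}\n\nto this code:\n{code}")

# 2) Bitter-Lesson-leaning harness: a generic loop where the model decides
#    what to do next; improvement comes from better models, not more stages.
def minimal_agent(task: str, tools: dict, max_steps: int = 20) -> str:
    state = f"Goal: {task}"
    for _ in range(max_steps):
        action = call_model(f"{state}\nPick a tool from {list(tools)} or say DONE.")
        if action.strip() == "DONE":
            break
        state += "\n" + tools.get(action.strip(), lambda: "unknown tool")()
    return call_model(f"{state}\nSummarize the result.")
```

The Bitter Lesson’s bet is that the second design ages better as the underlying models improve.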
Rationality and coordination
We Are Confused, Maladapted Apes Who Need Enlightenment
Back in December, David Pinsof argued in an insightful but depressing essay that many of humanity’s less agreeable traits are in fact rational and adaptive:
While reflecting on these questions, you may reach an unpleasant conclusion: there’s nothing you can do. The world doesn’t want to be saved.
Dan Williams responded with an equally insightful essay, arguing that traits that might have been rational and adaptive in the ancestral environment are neither in the modern world, and defending the Enlightenment and classical liberalism:
You can’t understand much of humanity’s significant progress over the past several centuries—in life expectancy, living standards, wealth, health, infant mortality, freedom, political governance, and so on—without embracing this fundamental optimism of the Enlightenment.
And Pinsof responded with a really good piece that responds to Williams’ arguments while finding substantial common ground:
My thesis in A Big Misunderstanding has some boundaries and exceptions, as nearly every thesis does, and you’ve done a great job of articulating them here. We’re probably more aligned in our thinking than not, but there are nevertheless a few parts of your post I’d push back on
This is the way.
Bring back RSS
Finding myself going back to RSS/Atom feeds a lot more recently. There’s a lot more higher quality longform and a lot less slop intended to provoke. Any product that happens to look a bit different today but that has fundamentally the same incentive structures will eventually converge to the same black hole at the center of gravity well.
I agree: RSS is simply a better way of sharing information without the toxicity and walled gardens of social media. Coincidentally, all my writing is available on the free web, with RSS feeds.
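If you want to dip a toe back in, reading a feed programmatically is about this hard. A minimal sketch using the feedparser library (the feed URL is a placeholder; substitute the feeds you actually follow):

```python
# Minimal RSS/Atom reader sketch; requires `pip install feedparser`.
import feedparser

FEEDS = ["https://example.com/feed.xml"]  # placeholder URL, swap in your subscriptions

for url in FEEDS:
    feed = feedparser.parse(url)
    print(feed.feed.get("title", url))
    for entry in feed.entries[:5]:  # show the five newest items
        print(f"  - {entry.get('title', '(untitled)')}: {entry.get('link', '')}")
```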
Frivolity
How can I communicate better with my mom?
Anthropic would like to remind you that ads in AI could go really badly.
Nope. It’s the sort of bad idea that seems good to people who either don’t really understand the landscape, or are flailing and self-deluding because they so badly need to feel like they’re Doing Something about an actually intractable problem.
There are two main issues:
“Defenders” are basically everybody. Most of “everybody” won’t jump through hoops to get extra access (and definitely won’t try to get around restrictions). They have other things to do. Attackers, on the other hand, will jump through hoops (and may also find ways around restrictions). They’re not just trying to get some help to secure their project; this is their project.
And no, knowing people’s identities won’t help (not that the identities you get are necessarily valid anyway, but even if they were). At best it lets you assign blame after the fact, but in practice it usually won’t even do that. There’s no reliable way to connect “ChatGPT identified this bug to user X” with “unknown actors started exploiting this bug”. Very few bugs are actually that exclusive or hard to find. There’s even less chance of definitively saying “user X is not the one who started exploiting this bug”. Even user X reporting it through “normal channels” doesn’t prove much; it’s an obvious diversionary tactic, and you still get to exploit the bug during the disclosure lag.
“Respected security researchers”, “members of well-known security teams”, and “employees of responsible(TM) companies” are in many cases the same people as “illicit hackers”, “open-market sellers of vulnerabilities”, and “APT operators”. Both individuals and organizations routinely lead “double lives”. And there are huge political and opinion components to deciding who’s legitimate. That’s assuming you can authenticate people to begin with; you’re dealing with actors who specialize in circumventing that.
Oh, and by the way, if you tether yourself to the entrenched “responsible disclosure” system, as OpenAI seems to suggest they may be doing, you’re tethering yourself to a deeply corrupt system that probably on net reduces the security of actually deployed systems.
Really the only answer is to provide exactly the same capabilities to all users. And since motivated users will seek out paths to the highest possible capabilities, there’s a completely rational race to the bottom effect that results in everybody getting a lot of capability.
It’s really popular right now to play silly games with access to capabilities… and that’s not necessarily irrational, in the sense that OpenAI may get some “well, we did our best” blame-deflection cover out of this kind of thing. But it’s not going to actually fix the problem, which is that models are suddenly going to vastly increase access simultaneously to vulnerability knowledge and to the capabilities to exploit it, while doing much less to lower the barriers to agile defense. It has a really good chance of further disadvantaging defense. We’re just all going to have to buckle up.