Antoine Maier

Karma: 29

Antoine Maier 1 Jul 2026 12:56 UTC
3 points
0
on: A CERN for AI is a distraction; push for an IAEA instead
I’ve never looked closely at the question of a CERN for AI. I heard about it from afar, and then the topic mostly disappeared from my bubble, replaced by the idea of an IAEA for AI. I assumed that the initial intuition had simply been refined organically within the community, converging from CERN to IAEA for the reasons you mention in your post.
I imagine that what motivated you to write this post is that you fairly regularly hear people defending the idea of a CERN for AI. Do you know whether the three visions a), b) and c) of a CERN for AI that you describe are each defended by many people, and in what proportions?

Antoine Maier 9 Jun 2026 12:41 UTC
1 point
0
on: Trees are mostly made of air and a generalizable lesson for AI safety
In AI safety, this can be a serious problem.
I have had one-on-ones or interviewed dozens of students who want a career in AI safety. [...] But when you ask them why they care about AI safety they don’t provide a particularly coherent answer. So I get more specific: “Why should we think AI is an existential risk?” Again, incoherent answer.
I’d say the situation is even worse than that. I’ve recently had one-on-ones with researchers in AI safety governance/policy, and most of them didn’t seem to have engaged with the object-level arguments. I suspect the same holds for most technical AI safety researchers, but my sample is small.
I suspect it’s mainly this lack of object-level thinking about the problem that leads many orgs and researchers to a prioritization that seems miscalibrated to me. The majority focus on misuse risks rather than existential risks from loss-of-control scenarios, even though the expected impact of the latter is far greater.

Antoine Maier 15 Apr 2026 9:02 UTC
3 points
2
in reply to: Dave Orr’s comment on: Claude Mythos Preview: Analysis of Anthropic’s Public Announcement
Thanks for pointing that out; I’ve edited the post.
Since you’re working on safety at Anthropic, I would be interested to hear from you on two other points:
1. What motivated the removal of threat models related to radiological and nuclear weapons in the RSP v3.0 update?
2. What specific safeguards have been put in place to prevent chain-of-thought content from being included in reward computation again?