Charbel-Raphaël

Karma: 3,897

Charbel-Raphael Segerie

Executive Director of CeSIA, the French Center for AI Safety,
Founder of ML4Good.

https://crsegerie.com/

Charbel-Raphaël 20 Jul 2026 16:06 UTC
3 points
1
in reply to: Juan Ocampo’s comment on: The current bottleneck is political will, not research
Thanks a lot, Juan.
I think that the coupling between the public and decision-makers is loose and slow enough, on the timelines that matter here, that treating the two channels as roughly independent is a reasonable first approximation. A few reasons:
- The ~1,000 people who count can often be reached far faster by a roundtable or a direct conversation than by any amount of public-facing content, and those people don’t have time to follow podcasts or YouTube anyway
  - (In France, we catalyzed a 40-minute video on superintelligence that got 5M views, a very big deal for the French ecosystem. Some people inside the administration talked about it, but empirically our concrete policy wins came out of face-to-face conversations, not from that video, even though the video was honestly close to the best case you can hope for in public advocacy.)
- The public→policymaker transmission is empirically weak right now: US polls have shown majority support for AI regulation for a while, and the White House and political apparatus have largely not acted on it. What people say they want isn’t tracking what the apparatus does.
- The strongest moves have often come from getting one legislator to care a lot (e.g. Wiener with SB 1047 and then SB 53, or Bores with the RAISE Act) more than from broad public salience.
So I’d still hold that at first order the channels are separable.
There’s more public-facing content than there used to be, between Kurzgesagt, The Diary of a CEO (Roman Yampolskiy’s interview did 20M views, for example!), AI in context, AI Species, etc.
But I fully agree the public channel is real, and I don’t exclude that at some point there is a non-linear effect, a spark, after which the issue passes a threshold of attention and gets heavy media coverage, like a super Mythos moment. I don’t know.
I suspect that your framing questions would nonetheless translate across audiences, and this is the type of underexplored research I was pointing to in 4D. Maybe you could test your 3 frames empirically.

Charbel-Raphaël 20 Jul 2026 15:29 UTC
3 points
0
in reply to: clickyquack’s comment on: The current bottleneck is political will, not research
Thanks for this and for the list of questions!
I think a website might be hard to build for such a preparadigmatic field. Quality is as important as quantity and might be difficult to capture with crude KPIs, although a good website could probably integrate most of those dimensions.
Concretely, rather than one canonical site, I’d bet more on many focused posts like the one above, each examining one subdimension. I’ve added to my to-dos to publish a reading list at some point.
Happy to give feedback if you draft a one-pager.

Charbel-Raphaël 17 Jul 2026 6:30 UTC
LW: 2 AF: 1
0
AF
on: The current bottleneck is political will, not research
FAQ for busy people.
1. “So you’re saying safety research is useless?”
No, research can help! Section 4D lists the research I find most useful right now. Some research has been essential, like the agentic misalignment paper, which was used in most of our presentations to policymakers. I’d also like to see research that proves this post wrong! Change my mind!
2. “Are you claiming we know how to align superintelligence?”
Nope. But we’re not even doing the cheap measures that could help. In SaferAI’s 2024 risk-management ratings, even Anthropic scored only 35%. DNA synthesis screening still isn’t mandated anywhere. Per Buck Shlegeris, there’s “a list of 40 things, none of which seem that hard” that would help enormously and that companies lack the appetite to do. And importantly, implementing those measures is how we’d get the data on whether they’re enough. For example, incident reporting and safety cases, if done properly, could tell us when current methods start to fail. Right now we’re flying blind.
3. “Won’t the evidence speak for itself? Just wait for the warning shot.”
Agreed that’s probably the main objection, and the post spends a full section on it. This is counterintuitive, but a crisis only converts if the ground is prepared. Some people in the AI safety community had the intuition in the past that observing deceptive alignment would be an absolute shut-it-down moment. Then Anthropic published the alignment-faking paper, and within days experts were debating whether it counted, and the moment dissolved. When the people in charge of AI in a government don’t know what a jailbreak is, that tells you how the next warning shot will land.
4. “It’s hopeless anyway, the race is locked in.”
It’s not hopeless; a lot of this is learned helplessness. The wins are small, but they are starting to compound and come from various efforts. One person at ControlAI following a scalable playbook got a cross-party group of Canadian MPs on record and triggered parliamentary hearings on superintelligence risk! Nearly 40 members of US Congress (39 as of mid-2026, split 18 Republicans and 21 Democrats) have now publicly discussed AGI or loss of control, up from a handful in early 2023, and this number roughly doubles every 5.5 months. Out of ~400 alumni of ML4Good (a program I founded, discount accordingly), ~150 now work in the field, some at the EU AI Office and UK AISI, others at MATS. It’s a small miracle that we got competent government agencies, and this is largely a result of field-building work during the last decade.
Don’t forget that the IAEA, the International Atomic Energy Agency, was built in four years by people who had just finished bombing each other; Fable 5 proved that it is possible to get quick action if necessary.
The part of the field working in advocacy is comically small, two orders of magnitude smaller than the one that fought climate change, so marginal efforts with the playbook here are unusually effective. That is not the same as nothing working. I think this just indicates that we need more Dakka.
5. “I read the abstract. What do I get from the full 29 minutes?”
Data from inside the rooms across European and multilateral institutions, to give you a model to think holistically about the present situation. What the median ministerial meeting looks like. Sad trade-offs, such as the story of ~10 civil-society orgs that privately believe in x-risk co-signing a document that names no risk at all. Concrete problems in the field of AI governance, alongside an opinionated list of directions to alleviate the bottleneck in Section 4.
6. “I’m a technical researcher. What do I concretely do?”
I think that a large part of this is an ugh-field around politics. But this can be taught. I see more and more people with technical backgrounds shifting to advocacy organizations, and they often do very well and can be highly productive if they are on the right team.
- Direct engagement. For example, they can be paired with people with experience in institutional engagement who may have less technical knowledge of AI safety. If you’re ready to enter the rooms, start with the insider playbook The Invisible Side of AI Governance or ControlAI’s Direct Institutional Plan.
- If you want to stay technical, see the research directions in Section 4D.
- A cheap first step is to submit to the next open consultation from a government. Fewer than 1% of the 1,534 UN Global Dialogue submissions mention x-risks directly, so your submission, if you explain your worldview directly, would be unusually visible.
What links here?
- The current bottleneck is political will, not research by Charbel-Raphaël (11 Jul 2026 21:56 UTC; 280 points)

Charbel-Raphaël 17 Jul 2026 6:23 UTC
3 points
1
in reply to: Eigenbraid’s comment on: The current bottleneck is political will, not research
I think that, in general, more communication between think tanks focused on different risks, at higher bandwidth, and alliances between them where possible, would be positive.

Charbel-Raphaël 14 Jul 2026 9:14 UTC
2 points
0
on: Why AI Evaluation Regimes are bad
Great post

Charbel-Raphaël 14 Jul 2026 7:10 UTC
LW: 3 AF: 1
0
AF
in reply to: scasper’s comment on: The current bottleneck is political will, not research
I think that a large part of this is an ugh-field and learned helplessness around politics. But this can be taught. I see more and more people with technical backgrounds shifting to advocacy organizations, and they often do very well and can be highly productive if they are on the right team, especially when paired with people with experience in institutional engagement who may have less technical knowledge of AI safety.

Charbel-Raphaël 14 Jul 2026 6:29 UTC
5 points
0
in reply to: Jonas Hallgren’s comment on: The current bottleneck is political will, not research
Thanks Jonas, happy to see you in the comment section.
Scaling: I’m not claiming current methods scale to superintelligence. I grant they may not, and continual learning is the kind of thing that would probably break a good part of today’s evals. But a good part of risk management is precisely detecting when our present methods start to fail (to be concrete, you can run control evaluations, like this one).
Why feedback needs government: the instruments that generate the evidence needed to escalate (third-party evals, incident reporting, training transparency) are not really enforced without a mandate (the METR risk report is a heroic effort in this direction, but that’s mostly it). Labs really won’t volunteer the data that would justify constraining them, and they are even reluctant to create data that could later be used against them. This is why I think it would be a huge win to enforce best practices like third-party evals.
Friedman: partly agree, having ideas lying around matters. But when the crisis hits, the plans that get picked up are the ones with champions already in place and decision-makers who already trust the source. If you do no will-building at first, worse ideas win by default.

Charbel-Raphaël 14 Jul 2026 6:20 UTC
2 points
−1
in reply to: kvas_it’s comment on: The current bottleneck is political will, not research
Yes, we did participate in a seminar at the French Senate on “Alignment”, and the speakers’ interpretations were all over the place. Alignment means nothing to most people, and they interpret it however they want. So I’d be in favor of tabooing “alignment” and instead focusing on presenting very concrete experiments or warning shots that have already happened (like the agentic misalignment paper, or the Alibaba AI training run in which the AI allegedly started mining crypto).

Charbel-Raphaël 14 Jul 2026 5:50 UTC
2 points
0
in reply to: cloud’s comment on: The current bottleneck is political will, not research
Thanks for running an independent analysis.
Many of your quotes are not directly demonstrating x-risk concern per se.
Agreed, the keyword probe on takeover was too narrow. But loss of control and x-risk aren’t the same thing, and my “<1%” was about the latter.
I went back and read with Opus the 64 submissions our tool tags under “AI pursuing its own goals”, and only ~9 frame it in truly existential terms: extinction, human disempowerment, uncontrollable ASI (Queensland’s “disempowerment or extinction,” CIGI’s “extinction risk to humanity,” Stop AI’s “threat to the human species”). That’s ~0.6% of the 1,534, which is the “<1%” the post refers to.
We talk about this dynamic in the post; very few actors really raise this out loud and dare go to the end of the causal chain, even when they do so privately.
I will still update the relevant sections and add more caveats, so thanks :)
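The shares quoted above can be reproduced with a minimal sketch. The three counts come from this comment; the variable and function names are hypothetical, and the figure of 9 is the approximate count given above, not an exact tally.

```python
# Counts quoted in the comment above; all names here are illustrative only.
TOTAL_SUBMISSIONS = 1534     # UN Global Dialogue submissions analysed
TAGGED_OWN_GOALS = 64        # tagged "AI pursuing its own goals" by the tool
EXISTENTIAL_FRAMING = 9      # approximate count framed in existential terms


def share_percent(count: int, total: int) -> float:
    """Return `count` as a percentage of `total`."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * count / total


tagged_share = share_percent(TAGGED_OWN_GOALS, TOTAL_SUBMISSIONS)
existential_share = share_percent(EXISTENTIAL_FRAMING, TOTAL_SUBMISSIONS)

print(f"Tagged 'AI pursuing its own goals': {tagged_share:.1f}%")
print(f"Framed in existential terms: {existential_share:.1f}%")
```

The gap between the two shares is the distinction drawn above: submissions that mention loss of control are several times more common than those that go all the way to existential framing.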

Charbel-Raphaël 14 Jul 2026 5:17 UTC
LW: 2 AF: 1
0
AF
in reply to: ryan_greenblatt’s comment on: The current bottleneck is political will, not research
Thanks Ryan, I’ll update the section to your current numbers (Plan D → A = ~45%→~19% rather than ~45%→~7%).
You also say that a near-ideal Plan A is “roughly 2x lower risk,” so if I understand your view correctly, a large part of the gap is implementation quality. But for a plan that calls for an international agreement, my guess is that implementation quality is, in large part, a function of the depth of the political will behind it.

Charbel-Raphaël 13 Jul 2026 19:25 UTC
LW: 5 AF: 1
−2
AF
in reply to: Chris_Leong’s comment on: The current bottleneck is political will, not research
2 things:
1. The 80/20 isn’t a claim about sufficiency for superintelligence. I already grant we don’t know how to align a superintelligence, and Plan A might not be enough. The 80/20 claim says that for the risk at current and near-term capability levels, there’s already a long list of cheap measures we’re simply not doing. Buck’s “~40 things, none of which seem that hard” is the crux.
2. Even if you think best practices are insufficient, implementing them is how we’d find that out. This is the part I’d stress most. Transparency, evals, red-teaming, incident reporting, safety cases, these aren’t just mitigations: in the long run those are the instruments that generate the data on whether we’re on track or not. Right now we’re flying blind: we don’t even have the feedback loop that would tell us “the 80/20 isn’t enough, we need to escalate.”
Debating the ceiling while we’re this far from the floor is a bit of a luxury problem.
On the researcher-pivot point: agreed, and I’d only add that the proficiency bar for advocacy is lower than people assume (similar to Asya’s post that I reference), and the opportunities are abundant; this can be scaled.

Charbel-Raphaël 13 Jul 2026 13:15 UTC
5 points
0
in reply to: Mass_Driver’s comment on: The current bottleneck is political will, not research
Thanks! Fully agree with this new ladder :)

Charbel-Raphaël 12 Jul 2026 22:43 UTC
LW: 9 AF: 3
0
AF
on: AI 2040: Plan A
If we want to move from Plan D to Plan A, I believe the first step is to collectively agree on the problem. We are far from it, and there is a lot we can do. I wrote about this in this piece: The current bottleneck is political will, not research.

Charbel-Raphaël 1 Jul 2026 15:11 UTC
4 points
0
in reply to: Antoine Maier’s comment on: A CERN for AI is a distraction; push for an IAEA instead
The following reply was vibe-written with Claude.
I hear versions of this regularly, and the three visions aren’t defended in equal proportion. Almost nobody explicitly argues for (a); it’s mostly (b) and (c), usually without distinguishing them.
The Simon Institute recently catalogued 14 prominent proposals and 3 existing projects that have used the “CERN for AI” label, and their conclusion is basically mine — the label points at completely different things depending on who’s using it: a publicly funded counterweight to commercial labs; an international AI safety institute that tests models; a way to secure European compute competitiveness; or a joint project to actually build frontier models. Worth reading that piece directly.
Here’s who’s defending what, sorted by which version they’re actually asking for:
(a) Pause + merge / single global developer
The version I think would help safety — and the one that’s essentially absent from the mainstream because it requires saying “pause” out loud.
- A Narrow Path / MAGIC (Miotti, ControlAI; Hausenloy, Miotti & Dennis, 2023) — the only version honest about the pause. MAGIC would be the only institution permitted to develop advanced AI, enforced by a global moratorium on all other advanced development. This is basically (a), stated explicitly — which is why it’s coherent but treated as politically fringe. FLI even files it under “a ‘CERN for AI’”.
(b) A new lab that races to catch up to the frontier
The politically live default. Pitched as sovereignty or trustworthy AI, ends up as a new lab building models — and, absent a pause, structurally 2–3 generations behind.
- Bengio & the Blueprint for Multinational Advanced AI Development (Oxford Martin AIGI, 2026) — a new multinational developer, not a pause. Premise: the US controls ~75% of global AI compute, China 15%, the EU 5%, so mid-sized economies face insurmountable barriers to independent frontier development — hence a pooled effort. (I’m a co-author; it’s the serious, well-governed version of (b), but it’s still (b) — a new institution that builds models, not a merge of existing ones.)
- CFG — Building CERN for AI (Centre for Future Generations) — by far the most institutionally detailed, and the clearest case of my worry. It’s framed as trustworthy-AI research (c), but a pioneer institution for trustworthy general-purpose AI with a €35 billion, three-year compute budget and sovereign-model ambition is (b) — a frontier lab, whatever the label says. The CFG lineage overlaps with the Europe 2031 authors (Daan Juijn, Alex Petropoulos at Arq) — a coherent cluster pushing sovereignty-through-capacity.
- The Frontier AI Initiative — no longer a “wannabe”; it’s real now. France, Germany and the Commission launched it in November 2025 as a non-profit public-private effort for frontier AI development, billed as the best-funded non-profit frontier AI initiative in the world, with access to Europe’s giga-compute. ECFR explicitly frames middle-power collaboration like this as the possible foundation of the longstanding “CERN for AI” proposal. This is (b) with the mask off — and exactly the trajectory Europe 2031 warns about.
- von der Leyen / the Commission — proposed a “CERN for AI” in her July 2024 political guidelines, then rebranded existing programs (the Gigafactories) as “akin to” one — which even sympathetic observers say bears little resemblance to what CERN actually was. Officially framed as competitiveness, i.e. (b).
(c) A pure research / safety center (no frontier training)
Where the bottleneck is enforcement, not more research — so a center without teeth doesn’t shift incentives.
- aitreaty.org — squarely (c): asks for a collaborative AI safety laboratory akin to CERN, pooling resources for safety research, alongside global compute thresholds and a compliance commission. Note it pairs the CERN-for-safety with an IAEA-style body — which supports my sequencing point, not the standalone-CERN one.
Straddles / ambivalent
- Miles Brundage — genuinely ambivalent, and a strong card to play. He wrote a detailed sketch of a version he likes: pooling many countries’ and companies’ resources into a single, possibly decentralized, civilian development effort — i.e. (a)/(b). But elsewhere he lists “CERN for AI” among the AI-policy proposals he finds too vague to confidently judge. So even a proponent concedes the concept has low “policy readiness.”
- The long tail: Hassabis’s old “CERN for AGI” line, CAIRNE’s decade-old open letter and “moonshot” framing, and Gary Marcus’s international agency.
On your convergence instinct
I’d push back slightly. It hasn’t converged from CERN to IAEA — the two run in parallel for different audiences. IAEA dominates the safety/coordination bubble you’re in; CERN dominates the European industrial-policy bubble, where it’s better-funded and more politically live than ever (von der Leyen, CFG, the Frontier AI Initiative). From inside our bubble it looks settled; where the money and the Commission are, it very much isn’t. That’s why I still think it’s worth arguing against.
And this is also dramatized in the AI 2027 lineage — Europe 2031 (Arq Foundation) is explicitly modeled on it, and it stages precisely the failure mode I’m worried about: Europe doubling down on sovereignty but forgetting to build leverage.

Charbel-Raphaël 29 Jun 2026 6:12 UTC
4 points
2
in reply to: Charbel-Raphaël’s comment on: Existential AI safety needs an effective social movement. PauseAI is building it
Thinking more about this, maybe the most important critique of PauseAI is not their epistemology; it’s that, empirically, they have a low reputation: FLI’s superintelligence statement garnered much broader support, while PauseAI, with a similar statement, didn’t.
Another argument: the School for Moral Ambition (100k views on this video) or ControlAI (100k subscribers to their newsletter, if I remember correctly) seem to be able to get a lot of public support as well, and this might be more effective at getting more people to sign on to something?
So this is a weird phenomenon: people agree that we need to slow down or pause (here’s a survey we conducted at CeSIA to demonstrate this), but it seems they don’t want to associate with PauseAI?
Or maybe there is an implicit rule that advocacy organizations are lower-status than research organizations, and not asking for what you ultimately want in the name of your organization is what wins in the end?
Fascinating

Charbel-Raphaël 27 Jun 2026 6:11 UTC
11 points
0
in reply to: Ryan Meservey’s comment on: Trees are mostly made of air and a generalizable lesson for AI safety
An alternative to Bluedot’s course is https://ai-safety-atlas.com/, which does present the case for risks with the standard arguments made on LessWrong, and which I believe is the state-of-the-art end-to-end written explanation of catastrophic risks.
Note: I’m one of the authors.

Charbel-Raphaël 27 Jun 2026 6:08 UTC
4 points
−3
in reply to: Ryan Meservey’s comment on: Trees are mostly made of air and a generalizable lesson for AI safety
FYI, I stopped presenting the theoretical concepts myself to policymakers, because we have enough incidents and concrete experiments that make the case without needing the theoretical scaffolding.

Charbel-Raphaël 26 Jun 2026 18:48 UTC
9 points
2
in reply to: Charbel-Raphaël’s comment on: Existential AI safety needs an effective social movement. PauseAI is building it
Specifically, I agree with most of the points in the post. Here’s where I agree, and where I’m more uncertain (note that I read the text quickly, and I only read linearly the parts of interest to me; I know PauseAI quite well).
Agreements:
- “Existential AI safety needs a civic/social movement”: I think that this would really help ultimately because AI safety is far from consensual, saliency is really low, and prioritization of risks is bad among the public
- “The binding constraint is political will, not more ideas”. I will write a piece on the topic, I basically agree
- “Only PauseAI is building this infrastructure across continents”: mostly true, though ControlAI is also doing it. CeSIA is no longer doing this in France.
- “The ecosystem is a division of labour (Moyer’s four roles), and the roles compound.” Yes, see the first paragraph of this comment.
- “A widespread fatalism, the assumption that AI development cannot be stopped, stops people from acting.” + “we should be careful when treating the race narrative as descriptive; it is partly a self-fulfilling story.” Agreed, and the recent executive order should make everyone update on the topic.
- “Some corporate and state actors, and most of the general population, are oblivious to catastrophic AI risk. The bottleneck with regards to those actors is awareness of the issue and of pausing development as a viable solution.” yes
- “Labs underestimate the difficulty of aligning AGI/ASI” yes
- “Other corporate and state actors are just reckless as to the risk and are prepared to gamble everyone’s lives for a shot at winning the race” yes
- “Political leaders and civil societies are largely not AGI/ASI-aware.” Yes, this matches my experience of the field, and I have strong data to back it up.
- “The China-races-so-we-can’t-stop narrative is partly self-fulfilling and overstated” true
- “Funding PauseAI is among the highest-EV interventions in AI safety” Probably, though I need more time to think deeply about this, and I’m not a grantmaker.
Where I’m less sure:
- “The right terminal goal is a binding international pause treaty with verification/enforcement” --> I agree functionally, ideally a pause for at least a couple of years, or at minimum the capacity to pause. Though I think there are much more palatable ways to present this to policymakers than “pause.”
- “Building a movement powerful and aligned enough, on short timelines, is intractable.” --> I don’t know. It seems to me that whoever in the community is in contact with the White House has much more leverage, but this is probably a good long-term investment.
- Policymakers need strong enough incentives to act: maybe not. The Fable ban is a counterexample: this happened without any visible public constituency pushing for it, and I can’t see what the reputational incentive was
- “Public opinion is high-leverage on this decision”: Mixed opinion, I think that it is already possible to use surveys, like the one we did with French people, to be able to show the support of the population to policymakers. Depending on the policymakers, this can be a strong signal (for people in the legislative branch, and candidates for elections), or not that important (people in administrations)
- “The movement can grow fast without breaking”—I don’t know
Probably the most debated claim about PauseAI is “A mass movement will drift, polarise, or turn anti-AI, and end up net-negative.” On this I agree with the post’s objection: I think it is unavoidable that there will be a public movement on AI; if it’s not PauseAI, it will be another group, whether on labor, copyright, or another risk. PauseAI focuses on catastrophic risks, and that’s good.

Charbel-Raphaël 26 Jun 2026 18:47 UTC
17 points
7
on: Existential AI safety needs an effective social movement. PauseAI is building it
I think it would be a loss for the ecosystem if PauseAI died in 4 months, and I think PauseAI is well worth funding.

Specifically in France, PauseAI has contributed to the discourse in appreciable ways on a shoestring budget (most notably, Maxime made good contributions to the French YouTube discourse), and by pushing epistemically sound analysis that would have been difficult for CeSIA-like orgs to push (notably by publicly red-teaming “experts” such as Luc Julia, the second most influential skeptic in France on AI after LeCun, whose credibility on the topic has notably declined ^[1]). I believe that this insider-outsider complementarity is important in the ecosystem.
On YouTube, I often watch and appreciate Doom Debates.
Also, it is a particularly low-cost moment to stand up for a pause after the recent announcements from all 3 top AI companies and some of their employees (see Jurkovic’s comment).
1. ^ See for example this article and this video.

Charbel-Raphaël 21 Jun 2026 16:19 UTC
8 points
2
in reply to: Tom DAVID’s comment on: The Invisible Side of AI Governance
Thanks Tom.
It’s true that in some parts of the post, I blur two things: the insider/outsider distinction it’s actually about, and my particular CeSIA experience of this.
On attribution, I think the post is actually closer to your position than you seem to imply. The opening list is phrased as open questions, and the postscript concedes that outside pressure may have created the conditions that insiders leveraged. So “hold your ‘I don’t know’ more centrally” is roughly the stance I’m already taking. The only outcomes I attribute concretely to CeSIA are (or will be soon enough) publicly checkable: specific Hiroshima process amendments, the behaviors section of the UNESCO red-lines draft, the Hugo Décrypte video, the OpinionWay poll.
I suspect your main objection is that the post’s structure (big list of major outcomes, then “we were in some of those rooms”) flatters CeSIA’s actual share. If that’s the critique, it’s fair. To be clear, we were not in the vast majority of those rooms, and I’ve edited the post to make this clear. The list is there to show that the category of invisible work exists and is important.
On the side effects of playing the political game without experience: most of the post is precisely about trying not to do that (useful-assistant over bazooka). But yes, it matters to get advice from politically savvy people, and we do.
