Charbel-Raphaël

Karma: 2,452

Charbel-Raphael Segerie

https://crsegerie.github.io/

Living in Paris

Dissolving moral philosophy: from pain to meta-ethics

Charbel-Raphaël · 4 Aug 2025 20:20 UTC
5 points
3 comments · 2 min read · LW link

The bitter lesson of misuse detection

10 Jul 2025 14:50 UTC
34 points
6 comments · 7 min read · LW link

The 80/20 playbook for mitigating AI scheming in 2025

Charbel-Raphaël · 31 May 2025 21:17 UTC
39 points
2 comments · 4 min read · LW link

[Paper] Safety by Measurement: A Systematic Literature Review of AI Safety Evaluation Methods

19 May 2025 10:38 UTC
26 points
0 comments · 1 min read · LW link

Charbel-Raphaël’s Shortform

Charbel-Raphaël · 21 Apr 2025 20:49 UTC
6 points
7 comments · LW link

🇫🇷 Announcing CeSIA: The French Center for AI Safety

Charbel-Raphaël · 20 Dec 2024 14:17 UTC
94 points
2 comments · 8 min read · LW link

Are we dropping the ball on Recommendation AIs?

Charbel-Raphaël · 23 Oct 2024 17:48 UTC
49 points
17 comments · 6 min read · LW link

[Question] We might be dropping the ball on Autonomous Replication and Adaptation.

31 May 2024 13:49 UTC
63 points
30 comments · 4 min read · LW link

AI Safety Strategies Landscape

Charbel-Raphaël · 9 May 2024 17:33 UTC
34 points
1 comment · 42 min read · LW link

Constructability: Plainly-coded AGIs may be feasible in the near future

27 Apr 2024 16:04 UTC
91 points
15 comments · 13 min read · LW link

[Question] What convincing warning shot could help prevent extinction from AI?

13 Apr 2024 18:09 UTC
108 points
22 comments · 2 min read · LW link

My intellectual journey to (dis)solve the hard problem of consciousness

Charbel-Raphaël · 6 Apr 2024 9:32 UTC
48 points
44 comments · 30 min read · LW link

AI Safety 101 : Capabilities - Human Level AI, What? How? and When?

7 Mar 2024 17:29 UTC
46 points
8 comments · 54 min read · LW link

The case for training frontier AIs on Sumerian-only corpus

15 Jan 2024 16:40 UTC
130 points
16 comments · 3 min read · LW link

aisafety.info, the Table of Content

Charbel-Raphaël · 31 Dec 2023 13:57 UTC
23 points
1 comment · 11 min read · LW link

AI Safety 101 - Chapter 5.2 - Unrestricted Adversarial Training

Charbel-Raphaël · 31 Oct 2023 14:34 UTC
17 points
0 comments · 19 min read · LW link

AI Safety 101 - Chapter 5.1 - Debate

Charbel-Raphaël · 31 Oct 2023 14:29 UTC
15 points
0 comments · 13 min read · LW link

Charbel-Raphaël and Lucius discuss interpretability

30 Oct 2023 5:50 UTC
112 points
7 comments · 21 min read · LW link

Against Almost Every Theory of Impact of Interpretability

Charbel-Raphaël · 17 Aug 2023 18:44 UTC
331 points
92 comments · 26 min read · LW link · 2 reviews

AIS 101: Task decomposition for scalable oversight

Charbel-Raphaël · 25 Jul 2023 13:34 UTC
35 points
0 comments · 19 min read · LW link
(docs.google.com)