Simon Lermen

Karma: 2,243

Substack: https://substack.com/@simonlermen

X/Twitter: @SimonLermenAI

Where does the race to automate AI research end?

Simon Lermen · 2 Jun 2026 17:21 UTC
16 points
0 comments · 1 min read · LW link
(simonlermen.substack.com)

Does Claude care about others the same way humans do?

Simon Lermen · 28 May 2026 18:41 UTC
26 points
24 comments · 4 min read · LW link

Anthropic’s focus on hyperstition

Simon Lermen · 11 May 2026 14:35 UTC
73 points
39 comments · 6 min read · LW link

What if superintelligence is just weak?

Simon Lermen · 26 Mar 2026 17:45 UTC
30 points
25 comments · 2 min read · LW link
(substack.com)

Large-Scale Online Deanonymization with LLMs

24 Feb 2026 17:02 UTC
69 points
5 comments · 4 min read · LW link
(simonlermen.substack.com)

AI can suddenly become dangerous despite gradual progress

Simon Lermen · 22 Jan 2026 16:47 UTC
15 points
0 comments · 4 min read · LW link
(simonlermen.substack.com)

On Owning Galaxies

Simon Lermen · 6 Jan 2026 18:16 UTC
154 points
62 comments · 3 min read · LW link
(simonlermen.substack.com)

Will We Get Alignment by Default? — with Adrià Garriga-Alonso

27 Nov 2025 19:19 UTC
50 points
3 comments · 1 min read · LW link
(simonlermen.substack.com)

Comment on Natural Emergent Misalignment Paper by Anthropic

Simon Lermen · 23 Nov 2025 4:21 UTC
21 points
0 comments · 4 min read · LW link

Jailbreaking AI models to Phish Elderly Victims

18 Nov 2025 23:17 UTC
17 points
0 comments · 2 min read · LW link
(simonlermen.substack.com)

AI 2025 - Last Shipmas

Simon Lermen · 17 Nov 2025 19:39 UTC
66 points
5 comments · 7 min read · LW link

Universal Basic Income in an AGI Future

Simon Lermen · 11 Nov 2025 2:26 UTC
21 points
1 comment · 2 min read · LW link
(simonlermen.substack.com)

Anthropic & Dario’s dream

Simon Lermen · 8 Nov 2025 1:19 UTC
55 points
1 comment · 5 min read · LW link

Comparative advantage & AI

Simon Lermen · 3 Nov 2025 21:50 UTC
120 points
28 comments · 4 min read · LW link

Model welfare and open source

Simon Lermen · 2 Nov 2025 2:29 UTC
15 points
1 comment · 5 min read · LW link

Simon Lermen’s Shortform

Simon Lermen · 6 Oct 2025 15:04 UTC
5 points
76 comments · 1 min read · LW link

Why I don’t believe Superalignment will work

Simon Lermen · 22 Sep 2025 17:10 UTC
47 points
6 comments · 5 min read · LW link

Human study on AI spear phishing campaigns

3 Jan 2025 15:11 UTC
81 points
8 comments · 5 min read · LW link

Current safety training techniques do not fully transfer to the agent setting

3 Nov 2024 19:24 UTC
162 points
9 comments · 5 min read · LW link

Deceptive agents can collude to hide dangerous features in SAEs

15 Jul 2024 17:07 UTC
33 points
2 comments · 7 min read · LW link