Axel Højmark

Karma: 156

AI Safety Researcher @ Apollo Research

Stress Testing Deliberative Alignment for Anti-Scheming Training

17 Sep 2025 16:59 UTC
122 points
2 comments · 1 min read · LW link
(antischeming.ai)

Forecasting Frontier Language Model Agent Capabilities

24 Feb 2025 16:51 UTC
35 points
0 comments · 5 min read · LW link
(www.apolloresearch.ai)

Analyzing DeepMind’s Probabilistic Methods for Evaluating Agent Capabilities

22 Jul 2024 16:17 UTC
69 points
0 comments · 16 min read · LW link