
Fabien Roger

Karma: 7,950

I am working on empirical AI safety.

Anonymous feedback form.

Narrow Secret Loyalty Dodges Black-Box Audits

22 Apr 2026 9:41 UTC
37 points
1 comment · 13 min read · LW link

How Unmonitored External Agents can Sabotage AI labs

9 Apr 2026 18:07 UTC
18 points
0 comments · 9 min read · LW link

Measuring and improving coding audit realism with deployment resources

23 Mar 2026 17:20 UTC
42 points
1 comment · 10 min read · LW link
(alignment.anthropic.com)