Jacob Pfau

Karma: 843

UK AISI Alignment Team and NYU PhD student

An alignment safety case sketch based on debate

8 May 2025 15:02 UTC
55 points
13 comments · 25 min read · LW link
(arxiv.org)

UK AISI’s Alignment Team: Research Agenda

7 May 2025 16:33 UTC
107 points
2 comments · 10 min read · LW link

Prospects for Alignment Automation: Interpretability Case Study

21 Mar 2025 14:05 UTC
32 points
5 comments · 8 min read · LW link