Bronson Schoen

Karma: 1,284

At Apollo Research working on scheming.

Papers:

Reproducing steering against evaluation awareness in a large open-weight model

10 Apr 2026 10:45 UTC
88 points
15 comments · 15 min read · LW link

A Toy Environment For Exploring Reasoning About Reward

25 Mar 2026 20:29 UTC
55 points
7 comments · 3 min read · LW link

Metagaming matters for training, evaluation, and oversight

18 Mar 2026 21:26 UTC
71 points
5 comments · 1 min read · LW link
(alignment.openai.com)

Stress Testing Deliberative Alignment for Anti-Scheming Training

17 Sep 2025 16:59 UTC
133 points
19 comments · 1 min read · LW link
(antischeming.ai)

Ablations for “Frontier Models are Capable of In-context Scheming”

17 Dec 2024 23:58 UTC
116 points
1 comment · 2 min read · LW link

Frontier Models are Capable of In-context Scheming

5 Dec 2024 22:11 UTC
211 points
24 comments · 7 min read · LW link