Alex Mallen

Karma: 1,451

Redwood Research

Are AIs more likely to pursue on-episode or beyond-episode reward?

12 Mar 2026 17:35 UTC
37 points
0 comments · 8 min read

The case for satiating cheaply-satisfied AI preferences

Alex Mallen · 10 Mar 2026 18:09 UTC
102 points
7 comments · 23 min read

Will reward-seekers respond to distant incentives?

Alex Mallen · 16 Feb 2026 19:35 UTC
50 points
1 comment · 10 min read

Fitness-Seekers: Generalizing the Reward-Seeking Threat Model

Alex Mallen · 29 Jan 2026 19:42 UTC
84 points
4 comments · 17 min read