
Scott Garrabrant

Karma: 5,477

[Question] Does Agent-like Behavior Imply Agent-like Architecture?

Scott Garrabrant
23 Aug 2019 2:01 UTC
45 points
7 comments · 1 min read · LW link

Intentional Bucket Errors

Scott Garrabrant
22 Aug 2019 20:02 UTC
61 points
6 comments · 3 min read · LW link

Risks from Learned Optimization: Conclusion and Related Work

7 Jun 2019 19:53 UTC
65 points
2 comments · 6 min read · LW link

Deceptive Alignment

5 Jun 2019 20:16 UTC
63 points
7 comments · 17 min read · LW link

The Inner Alignment Problem

4 Jun 2019 1:20 UTC
68 points
16 comments · 13 min read · LW link

Conditions for Mesa-Optimization

1 Jun 2019 20:52 UTC
59 points
37 comments · 12 min read · LW link

Risks from Learned Optimization: Introduction

31 May 2019 23:44 UTC
126 points
32 comments · 12 min read · LW link

Yes Requires the Possibility of No

Scott Garrabrant
17 May 2019 22:39 UTC
127 points
39 comments · 2 min read · LW link