peterbarnett

Karma: 1,322

Researcher at MIRI

EA and AI safety

https://peterbarnett.org/

Summary of AI Research Considerations for Human Existential Safety (ARCHES)

peterbarnett · 9 Dec 2020 23:28 UTC
11 points
0 comments · 13 min read · LW link

Does making unsteady incremental progress work?

peterbarnett · 5 Mar 2021 7:23 UTC
8 points
4 comments · 1 min read · LW link
(peterbarnett.org)

When Should the Fire Alarm Go Off: A model for optimal thresholds

peterbarnett · 28 Apr 2021 12:27 UTC
40 points
4 comments · 5 min read · LW link
(peterbarnett.org)

Understanding Gradient Hacking

peterbarnett · 10 Dec 2021 15:58 UTC
41 points
5 comments · 30 min read · LW link

Some motivations to gradient hack

peterbarnett · 17 Dec 2021 3:06 UTC
8 points
0 comments · 6 min read · LW link

[Question] What questions do you have about doing work on AI safety?

peterbarnett · 21 Dec 2021 16:36 UTC
13 points
8 comments · 1 min read · LW link

Alignment Problems All the Way Down

peterbarnett · 22 Jan 2022 0:19 UTC
26 points
7 comments · 11 min read · LW link

Thoughts on Dangerous Learned Optimization

peterbarnett · 19 Feb 2022 10:46 UTC
4 points
2 comments · 4 min read · LW link