Ben Cottier

Karma: 152

Working on modelling beliefs about AI risk in spare time

Modeling Risks From Learned Optimization

Ben Cottier, 12 Oct 2021 20:54 UTC
44 points
0 comments, 12 min read

[Question] Do mesa-optimizer risk arguments rely on the train-test paradigm?

Ben Cottier, 10 Sep 2020 15:36 UTC
12 points
7 comments, 1 min read

Ben Cottier’s Shortform

Ben Cottier, 12 May 2020 11:03 UTC
2 points
2 comments, 1 min read

Clarifying some key hypotheses in AI alignment

15 Aug 2019 21:29 UTC
77 points
12 comments, 9 min read