Gordon Seidoh Worley comments on Towards deconfusing values

Gordon Seidoh Worley 3 Feb 2020 18:47 UTC
LW: 2 AF: 1
I basically agree with this. In some ways, focusing too much on “values” acts as a barrier to deeper investigation of the mechanisms at work here. I think looking deeper is necessary, because I expect that optimizing against the value abstraction layer alone will result in Goodharting.