Basically because I think that amplification/recursion, as I currently understand it to be meant, is more trouble than it’s worth. It’s going to produce things that have high fitness according to the selection process applied, and in the limit those things are going to be bad.
On the other hand, you might see this as me claiming that “narrow reward modeling” includes a lot of important unsolved problems. HCH is well-specified enough that you can talk about doing it with current technology. But fulfilling the verbal description of narrow value learning requires some advances in modeling the real world (unless you literally treat the world as a POMDP and humans as Boltzmann-rational agents, in which case we’re back to bad computational properties and bad safety properties as well), which gives me the wiggle room to be hopeful.
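To make that parenthetical concrete, here’s a minimal sketch, entirely my own toy construction rather than anything from the proposals under discussion, of what the “POMDP world + Boltzmann-rational human” route amounts to, collapsed to a tabular MDP for brevity. The toy MDP, the hypothesis set, and all function names are illustrative assumptions. The thing to notice is the nested structure: every candidate reward function needs its own full planning pass before you can even score the human’s behavior, which is the computational complaint; and the likelihood model bakes in the assumption that all human deviation from optimality is noise, which is (part of) the safety complaint.

```python
# Toy sketch (my construction): Bayesian reward inference over a tabular
# MDP with a Boltzmann-rational human model, P(a|s) ∝ exp(beta * Q*(s,a)).
import numpy as np

def q_values(T, R, gamma=0.9, iters=200):
    """Q* via value iteration. T: [S, A, S'] transition probs, R: [S']."""
    S, A, _ = T.shape
    Q = np.zeros((S, A))
    for _ in range(iters):
        V = Q.max(axis=1)          # greedy backup
        Q = T @ (R + gamma * V)    # Q(s,a) = E_{s'}[R(s') + gamma * V(s')]
    return Q

def boltzmann_policy(Q, beta=2.0):
    """P(a|s) proportional to exp(beta * Q(s,a))."""
    logits = beta * Q
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(logits)
    return p / p.sum(axis=1, keepdims=True)

def posterior_over_rewards(reward_hypotheses, demos, T, beta=2.0):
    """Posterior over a discrete set of rewards given (s, a) demonstrations.
    Note the nested loop: one full planning problem *per hypothesis*."""
    log_post = np.zeros(len(reward_hypotheses))
    for i, R in enumerate(reward_hypotheses):
        pi = boltzmann_policy(q_values(T, R), beta)
        log_post[i] = sum(np.log(pi[s, a]) for s, a in demos)
    log_post -= log_post.max()
    post = np.exp(log_post)
    return post / post.sum()

# Tiny 3-state, 2-action MDP; action 0 drifts left, action 1 drifts right.
T = np.zeros((3, 2, 3))
T[:, 0, :] = [[0.9, 0.1, 0.0], [0.8, 0.1, 0.1], [0.1, 0.8, 0.1]]
T[:, 1, :] = [[0.1, 0.8, 0.1], [0.1, 0.1, 0.8], [0.0, 0.1, 0.9]]

hypotheses = [np.array([1.0, 0.0, 0.0]), np.array([0.0, 0.0, 1.0])]
demos = [(0, 1), (1, 1), (2, 1)]  # the human keeps moving right
print(posterior_over_rewards(hypotheses, demos, T))  # favors reward on state 2
```

Even this toy version re-solves the planning problem for every reward hypothesis; in a literal POMDP you would replace `q_values` with belief-state planning, which is dramatically harder still, and the Boltzmann likelihood would go on treating every systematic human bias as preference-revealing noise.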