This comment got linked a decade later, so I thought it worth stating my own thoughts on the question:
We can consider a reference class of CEV-seeking procedures; one (massively-underspecified, but that’s not the point) example is “emulate 1000 copies of Paul Christiano living together comfortably and immortally and discussing what the AI should do with the physical universe; once there’s a large supermajority in favor of an enactable plan (which can include further such delegated decisions), the AI does that”.
I agree that this is going to be chaotic, in the sense that even slightly different elements of this reference class might end up steering the AI to different basins of attraction.
I assert, however, that I’d consider it a pretty good outcome overall if the future of the world were determined by a genuinely random draw from this reference class, honestly instantiated. (Again with the massive underspecification, I know.)
CEV may be underdetermined and many-valued, but that doesn’t mean paperclipping is as good an answer as any.
Re: no basins, it would be a bad situation indeed if the vast majority of the reference class never ended up outputting an action plan, instead deferring and delegating forever. I don’t have cached thoughts about that.