It seems to me that a potential hidden assumption is whether AGI is the last invention humanity will ever need to make. In the standard Bostromian/Yudkowskian paradigms, we create the AGI, then the AGI becomes a singleton that determines the fate of the universe, and humans have no further input (so we’d better get it right). The emphasis of approval-directed agents, by contrast, is that we humans will continue to be the deciders; we’ll just have greatly augmented capability.
I don’t see those as incompatible. A singleton can take input from humans.