They also separately believe that by the time an AI reaches superintelligence, it will in fact have oriented itself around a particular goal and have something like a goal slot in its cognition—but at that point, it won’t let us touch it, so the problem becomes that we can’t put our own objective into that slot.
My guess is this is a bit stronger than what Nate believes. The corresponding quote (emphasis mine) is
Separately and independently, I believe that by the time an AI has fully completed the transition to hard superintelligence, it will have ironed out a bunch of the wrinkles and will be oriented around a particular goal [...]
and I wouldn’t be surprised myself if, by the time an AI is superhuman at basically all tasks, it is still as incoherent as humans are, especially if it uses more inference compute than a human brain.