If, on my first Internet search, I had found Yudkowsky defining the “Orthogonality Thesis”, then I probably would have used that definition instead. But I didn’t, so here we are.
Maybe a less homunculusy way to explain what I’m getting at is that an embedded world-optimizer must optimize simultaneously toward two distinct objectives: toward a correct world model and toward an optimized world. This applies a constraint to the Orthogonality Thesis, because the world model is embedded in the world itself.
But you can just have the world model as an instrumental subgoal. If you want to do difficult thing Z, then you want to have a better model of the parts of Z, and the things that have causal input to Z, and so on. This motivates having a better world model. You don’t need a separate goal, unless you’re calling all subgoals “separate goals”.
Obviously this doesn’t work as stated, because you need a world model to start with, one that can already support the inference that “if I learn about Z and its parts, then I can do Z better”.