Any agent that seeks X as an instrumental goal, with, say, Y as a terminal goal, can easily be outcompeted by an agent that seeks X as a terminal goal.
You offered a lot of arguments for why this is true for humans, but I’m less certain it holds for AIs.
Suppose the first AI devotes 100% of its computation to achieving X, while the second devotes 90% of its computation to achieving X and 10% to monitoring whether achieving X is still helpful for achieving Y. All else equal, the first AI is more likely to win. But all else is not necessarily equal: if, for example, the second AI possessed 20% more computational resources than the first, I’d expect it to win even though it seeks X only as an instrumental goal.
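To make the arithmetic concrete, here is a toy back-of-the-envelope sketch with the numbers above (the specific figures are illustrative; nothing in the argument depends on them):

```python
# Toy arithmetic for the comparison above (illustrative numbers only).
base = 100.0                        # agent 1's total compute, in arbitrary units

agent1_on_X = 1.00 * base           # terminal-X agent: all compute goes to X
agent2_total = 1.20 * base          # instrumental-X agent has 20% more compute
agent2_on_X = 0.90 * agent2_total   # 10% is reserved for checking X still serves Y

print(agent1_on_X, agent2_on_X)     # 100.0 108.0 -> agent 2 still out-computes agent 1 on X
```

So a 10% monitoring overhead is swamped by a 20% resource advantage: the instrumentally motivated agent applies 108 units of compute to X against the terminal agent’s 100.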
Thank you for the correction. Thinking about it, I think that is true even of humans, in a certain sense. I would guess that the ability to hold several goal-nodes in one’s mind scales with g and/or working memory capacity. Someone who is very smart and tolerant of ambiguity can aim for a very complex goal while simultaneously performing well on the mundane day-to-day tasks they need to accomplish, even when those tasks bear seemingly no resemblance to the original goal.
It seems to be a skill that requires “buckets”: https://www.lesswrong.com/posts/EEv9JeuY5xfuDDSgF/flinching-away-from-truth-is-often-about-protecting-the
So, in both humans and computers, I would guess this ability requires certain cognitive or computational resources, and I maintain my original claim provided those resources are controlled for.
Additionally, there may exist sets of goals such that pursuing them together makes one more likely to achieve all of them than pursuing any one (or any proper subset) alone. (To put it differently, working on several things at once can give you ideas for each of them that you wouldn’t have had while working on only one, or a subset, of them.)
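A toy numerical illustration of that kind of synergy (the probabilities are purely hypothetical, chosen only to show the shape of the effect):

```python
# Hypothetical success probabilities, assumed purely for illustration.
p_goal_alone = 0.5   # chance of achieving a goal pursued in isolation
p_goal_joint = 0.6   # assumed per-goal chance when both goals are pursued together:
                     # attention is split, but cross-pollinated ideas more than offset it

p_both_joint = p_goal_joint ** 2   # 0.36: chance of achieving both under joint pursuit
# Pursuing only one goal leaves the other at probability ~0, so under these
# (assumed) numbers joint pursuit beats pursuing any proper subset.
print(p_both_joint)
```

Whether the cross-pollination bonus actually outweighs the cost of split attention is an empirical question; the point is only that such goal sets can exist.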