An initial AI isn’t necessarily a utility maximizer of this very sophisticated form (with a utility function defined in terms of a robust model of the world from a 3rd person perspective), building such a thing is a further challenge beyond making AI
If someone designs an AI with a sensory utility function, taking control of its sensory channel is just optimizing for its utility function; it’s “wireheading” from the designer’s perspective if they expected to be able to ensure that the preferred inputs could only be obtained by performing assigned tasks
A utility-maximizer could have reason to modify or even eliminate its own utility function for a variety of reasons, especially when interacting with powerful agents and when its internals are at least partially transparent
Precisely, thank you! I hate arguing such points. Just because you can say something in English does not make it an utility function in the mathematical sense. Furthermore, just because in English it sounds like modification of utility function, does not mean that it is mathematically a modification of utility function. Real-world intentionality seem to be a separate problem from making a system that would figure out how to solve problems (mathematically defined problems), and likely, a very hard problem (in the sense of being very difficult to mathematically define).
Real-world intentionality seem to be a separate problem from making a system that would figure out how to solve problems (mathematically defined problems), and likely, a very hard problem (in the sense of being very difficult to mathematically define).
I think I disagree with you, depending on what you mean here. Limited “intentionality” (as in Dennett’s intentional stance) shows up as soon as you have a system that selects the best of several actions using prediction algorithms and an evaluation function: a chess engine like Rybka in the context of a game can be modeled well as selecting good moves. That intentionality is limited because the system has a tightly constrained set of actions and only evaluates consequences using a very limited model of the world, but these things can be scaled up. Robust problem-solving and prediction algorithms capable of solving arbitrary problems would be terribly hard, but intentionality would not be much of a further problem. On the other hand if we talk about very narrowly defined problems then systems capable of doing well on those will not be able to address the very economically and scientifically important mass of ill-specified problems.
Also, the separability of action and analysis is limited: Rybka can evaluate opening moves, looking ahead a fair ways, but it cannot provide a comprehensive strategy to win a game (carrying on to the end) without the later moves. You could put a “human in the loop” who would use Rybka to evaluate particular moves, and then make the actual move, but at the cost of adding a bottleneck (humans are slow, cannot follow thousands or millions of decisions at once). The more experimentation and interactive learning are important, the less viable the detached analytical algorithm.
It is true that an AI’s utility function-accomplishment-accessing methods can be circumvented. But having an AI that circumvents it’s own utility function, would be evidence towards poor utility function design.
Also, eliminating your own utility function is a perfectly valid move if it leads to fullfillment of the current utility function. That is the principle in the above statement: Every planned course of action is evaluated against it’s current utility function, if removing the construct that constitutes the utility function is an action that has high utility, then it is a valid course of action.
Now if an AI’s utility function is not properly designed it will of course self modify to satisfy it. If that involves putting a blue colour filter in front of your eyes that is a perfectly valid course of action.
But having an AI that circumvents it’s own utility function, would be evidence towards poor utility function design.
By circumvent, do you mean something like “wireheading”, i.e. some specious satisfaction of the utility function that involves behavior that is both unexpected and undesirable, or do you also include modifications to the utility function? The former meaning would make your statement a tautology, and the latter would make it highly non-trivial.
This is wrong in several ways.
An initial AI isn’t necessarily a utility maximizer of this very sophisticated form (with a utility function defined in terms of a robust model of the world from a 3rd person perspective), building such a thing is a further challenge beyond making AI
If someone designs an AI with a sensory utility function, taking control of its sensory channel is just optimizing for its utility function; it’s “wireheading” from the designer’s perspective if they expected to be able to ensure that the preferred inputs could only be obtained by performing assigned tasks
A utility-maximizer could have reason to modify or even eliminate its own utility function for a variety of reasons, especially when interacting with powerful agents and when its internals are at least partially transparent
Precisely, thank you! I hate arguing such points. Just because you can say something in English does not make it an utility function in the mathematical sense. Furthermore, just because in English it sounds like modification of utility function, does not mean that it is mathematically a modification of utility function. Real-world intentionality seem to be a separate problem from making a system that would figure out how to solve problems (mathematically defined problems), and likely, a very hard problem (in the sense of being very difficult to mathematically define).
I think I disagree with you, depending on what you mean here. Limited “intentionality” (as in Dennett’s intentional stance) shows up as soon as you have a system that selects the best of several actions using prediction algorithms and an evaluation function: a chess engine like Rybka in the context of a game can be modeled well as selecting good moves. That intentionality is limited because the system has a tightly constrained set of actions and only evaluates consequences using a very limited model of the world, but these things can be scaled up. Robust problem-solving and prediction algorithms capable of solving arbitrary problems would be terribly hard, but intentionality would not be much of a further problem. On the other hand if we talk about very narrowly defined problems then systems capable of doing well on those will not be able to address the very economically and scientifically important mass of ill-specified problems.
Also, the separability of action and analysis is limited: Rybka can evaluate opening moves, looking ahead a fair ways, but it cannot provide a comprehensive strategy to win a game (carrying on to the end) without the later moves. You could put a “human in the loop” who would use Rybka to evaluate particular moves, and then make the actual move, but at the cost of adding a bottleneck (humans are slow, cannot follow thousands or millions of decisions at once). The more experimentation and interactive learning are important, the less viable the detached analytical algorithm.
It is true that an AI’s utility function-accomplishment-accessing methods can be circumvented. But having an AI that circumvents it’s own utility function, would be evidence towards poor utility function design.
Also, eliminating your own utility function is a perfectly valid move if it leads to fullfillment of the current utility function. That is the principle in the above statement: Every planned course of action is evaluated against it’s current utility function, if removing the construct that constitutes the utility function is an action that has high utility, then it is a valid course of action.
Now if an AI’s utility function is not properly designed it will of course self modify to satisfy it. If that involves putting a blue colour filter in front of your eyes that is a perfectly valid course of action.
By circumvent, do you mean something like “wireheading”, i.e. some specious satisfaction of the utility function that involves behavior that is both unexpected and undesirable, or do you also include modifications to the utility function? The former meaning would make your statement a tautology, and the latter would make it highly non-trivial.
I mean it in the tautological sense. I try to refrain from stating highly-non trivial things without extensive explanations.