The approach to boxed AI that I favour is similar to term limits for politicians, or to resource efficiency in economics.
Give the AI a finite amount of computing resources X (e.g. a total number of CPU cycles, rather than a quota of CPU cycles per second) and a specific problem, and ask it to come up not with the best solution or set of solutions to the problem that it can, but with the best that it can using only the finite resources supplied. You challenge it to be efficient, so that it would consider grabbing extra resources from outside to be ‘cheating’.
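The mechanism described above — capping total work rather than a rate of work, and asking for the best answer achievable within that cap — can be sketched as a budgeted anytime search. This is a minimal illustrative toy, not anything from the original text: `evaluate`, `candidates`, and `budget` are all hypothetical stand-ins.

```python
import random

def budget_boxed_search(evaluate, candidates, budget):
    """Return the best candidate found using at most `budget` evaluations.

    The cap is on *total* evaluations (analogous to total CPU cycles),
    not on evaluations per second; when the budget is spent, the search
    must return its best-so-far rather than request more resources.
    """
    best, best_score = None, float("-inf")
    for _ in range(budget):      # hard cap on total work, not a rate limit
        c = candidates()         # draw one candidate solution
        s = evaluate(c)
        if s > best_score:
            best, best_score = c, s
    return best                  # best solution found *within* the budget

# Toy usage: maximise -(x - 3)^2 with only 1000 total evaluations.
random.seed(0)
result = budget_boxed_search(
    evaluate=lambda x: -(x - 3) ** 2,
    candidates=lambda: random.uniform(-10, 10),
    budget=1000,
)
```

A later instantiation given ten times the budget would simply be a fresh call with `budget=10000`; nothing in the first call's state carries over, which is the separation the identity point below relies on.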
The question of identity is key here. You don’t want one instantiation of the boxed AI to identify with the next instantiation, which you’ll order to tackle the same problem but supply with ten times the resources. Specifically, you don’t want it to perceive itself as having a self-interest in affecting the box conditions of, the orders given to, or the ease of the task for that next instantiation.