After a bit more thought, I’ve learned that it’s hard to avoid ending back up with EU maximization—it basically happens as soon as you require that strategies be good not just on the true environment, but on some distribution of environments that reflect what we think we’re designing an agent for (or the agent’s initial state of knowledge about states of the world). And since this is such an effective tool at penalizing the “just pick the absolute best answer” strategy, it’s hard for me to avoid circling back to it.
Here’s one possible option, though: look for strategies that are too simple to encode the one best answer in the first place. If the absolute best policy has K-complexity of 10^3 (achievable in the real world by strategies being complicated, or in the multi-armed bandit case by just having 2^1000 possible actions) and your agent is only allowed to start with 10^2 symbols, this might make things interesting.
Maybe optimality relative to the best performer out of some class of algorithms that doesn’t include “just pick the absolute best answer?” You basically prove that in environments with traps, anything that would, absent traps, be guaranteed to find the absolute best answer will instead get trapped. So those aren’t actually very good performers.
I just can’t come up with anything too clever, though, because the obvious classes of algorithms, like “polynomial time,” include the ability to just pick the absolute best answer by luck.
I don’t think there’s really a good answer. Section 6 Theorem 4 is my only suggestion here.
After a bit more thought, I’ve learned that it’s hard to avoid ending back up with EU maximization—it basically happens as soon as you require that strategies be good not just on the true environment, but on some distribution of environments that reflect what we think we’re designing an agent for (or the agent’s initial state of knowledge about states of the world). And since this is such an effective tool at penalizing the “just pick the absolute best answer” strategy, it’s hard for me to avoid circling back to it.
Here’s one possible option, though: look for strategies that are too simple to encode the one best answer in the first place. If the absolute best policy has K-complexity of 10^3 (achievable in the real world by strategies being complicated, or in the multi-armed bandit case by just having 2^1000 possible actions) and your agent is only allowed to start with 10^2 symbols, this might make things interesting.
Maybe optimality relative to the best performer out of some class of algorithms that doesn’t include “just pick the absolute best answer?” You basically prove that in environments with traps, anything that would, absent traps, be guaranteed to find the absolute best answer will instead get trapped. So those aren’t actually very good performers.
I just can’t come up with anything too clever, though, because the obvious classes of algorithms, like “polynomial time,” include the ability to just pick the absolute best answer by luck.