You’re talking about runtime optimizations. Those are fine. You’re totally allowed to run some meta-analysis, figure out you’re spending more time on goal-tree updating than the updates gain you in utility, and scale that process down in frequency, or even make it dependent on how much CPU time you need for time-critical ops at a given moment. Agents with bounded computational resources will never have enough CPU time to compute provably optimal actions in any case (the problem is uncomputable), so how much you spend on computation before you draw the line and act on your best guess is always a tradeoff you need to make. This doesn’t mean your ideal top-level goals (the ones you’re trying to implement as best you can) can’t be maximizing goals.
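To make that concrete, here is a rough Python sketch of the two knobs I mean. Everything in it is made up for illustration: `next_update_interval`, `best_guess`, the doubling/halving rule and the `budget` parameter are my own assumptions, not anyone's actual agent design. The point is only that the utility estimate stays a plain "more is better" function while the amount of computation spent on it is capped.

```python
def next_update_interval(interval, update_cost, utility_gained, max_interval=64):
    """Hypothetical meta-level rule for how often to update the goal tree.

    If the last update cost more than it gained, update half as often;
    otherwise update twice as often (never more often than every step).
    """
    if update_cost > utility_gained:
        return min(interval * 2, max_interval)
    return max(interval // 2, 1)


def best_guess(actions, estimate_utility, budget):
    """Evaluate at most `budget` candidate actions and return the best one seen.

    The goal (maximize estimate_utility) is unchanged; only the search is bounded.
    """
    best_action, best_value = None, float("-inf")
    for action in actions[:budget]:
        value = estimate_utility(action)
        if value > best_value:
            best_action, best_value = action, value
    return best_action
```

With a small `budget` you act on a worse guess, with a large one you spend more compute. Either way, the thing being maximized is the same function.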
Approach #2: May want more goals
For this to work, you’d still need to specify exactly how that algorithm works: how it tells good new goals from bad ones. Once you do, this turns into yet another optimization problem you can install as a (or the only) final goal, and have it produce subgoals as you continue to evaluate it.
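As a toy illustration of that last step (a sketch only; `adopt_subgoals` and `goal_quality` are hypothetical names, and a real criterion would be vastly harder to write down): once you have some fixed way of scoring candidate goals, "pick up new goals" is just maximizing that score.

```python
import heapq


def adopt_subgoals(candidate_goals, goal_quality, k):
    """Return the k candidate goals that the fixed criterion rates highest.

    `goal_quality` stands in for whatever rule tells good new goals from
    bad ones; it plays the role of the single final goal.
    """
    return heapq.nlargest(k, candidate_goals, key=goal_quality)
```

Re-running this as new candidates show up gives you changing subgoals, while the criterion itself never changes.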
Approach #3: Derive goals?
I may not have understood this at all, but are you talking about something like CEV? In that case, the details of what should be done in the end do depend on fine details of the environment which the AI would have to read out and (possibly expensively) evaluate before going into full optimization mode. That doesn’t mean you can’t just encode the algorithm of how to decide what to ultimately do as the goal, though.
Approach #4: Humans are hard.
You’re right; it is difficult! Especially so if you want it to avoid wireheading (of the humans, not itself) and brainwashing, keep society working indefinitely, and not accidentally squash even a few important values. It’s also known as the FAI content problem. That said, I think solving it is still our best bet when choosing what goals to actually give our first potentially powerful AI.