additionally, optimizing for one factor makes the other factors less “visible”, especially in the short term … so a tendency to “improve things slightly” instead of truly optimizing was probably strongly selected for, in all the cases where the biorobot’s value function is only a proxy for a true reward that is unknowable up front
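a toy sketch of the point above, not a model of any real organism: everything here is an assumption made for illustration. the proxy value function sees only one factor `x`, while the hypothetical "true" reward also carries a hidden cost that grows as `x` is pushed harder (the `0.1 * x * x` term is arbitrary). under those assumptions, maximizing the proxy outright does worse on the true reward than taking one small step.

```python
# Toy illustration only: the functional forms and constants are assumptions,
# chosen so that the hidden cost is small near the start and large at the extreme.

def proxy_value(x: float) -> float:
    """What the agent can see: more of the single factor x looks better."""
    return x


def true_reward(x: float) -> float:
    """What actually matters (assumed): x minus a hidden cost that grows with x."""
    return x - 0.1 * x * x


def optimize_fully(x: float, upper: float = 10.0) -> float:
    """Push the visible factor to its maximum, since the proxy is increasing in x."""
    return upper


def improve_slightly(x: float, step: float = 1.0, upper: float = 10.0) -> float:
    """Take one small step in the direction the proxy says is better."""
    return min(x + step, upper)


start = 1.0
full = optimize_fully(start)
slight = improve_slightly(start)

for label, x in [("start", start), ("slight", slight), ("full", full)]:
    print(f"{label:>6}: x={x:4.1f}  proxy={proxy_value(x):5.2f}  true={true_reward(x):5.2f}")
```

the proxy ranks full optimization highest, but with this assumed hidden cost the true reward ranks it lowest, which is the sense in which a proxy-driven optimizer "stops seeing" the other factors.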