AGI is not clairvoyant. It WILL get things wrong and accidentally produce outcomes which do not comport with its values.
Corrigibility is a valid line of research, but even if you had an extremely corrigible system, it would still risk making mistakes.
AGI should be cautious whether or not it is corrigible. It could make a mistake because of bad values, because it has no off-switch, OR simply because it cannot predict all the outcomes of its actions.
Actually, caution is a separate question from corrigibility.