I’m just saying: here’s a major problem with this approach; let’s put it aside for now.
Are you penalizing the AI for the predictable consequences of it existing, rather than just the actions it takes?
We are penalizing the master AI for the predictable consequences of the existence of the particular disciple AI it chooses to make.
I’m also not sure the reduced impact intuitions hold for any narrow AIs whose task is to somehow combat existential risk.
No, they don’t hold. We could hook it up to something like utility indifference, but most likely reduced impact AI would be an interim stage on the way to friendly AI.
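
For concreteness, here is one minimal way the penalty under discussion could be formalized; this is an illustrative sketch with hypothetical notation ($u$, $d$, $\lambda$, $w$), not a construction either speaker has committed to. If the master AI considers creating disciple $D$, score it by

\[
U'(D) = \mathbb{E}[u \mid \text{create } D] \;-\; \lambda \, \mathbb{E}\big[\, d(w_D, w_{\neg D}) \,\big],
\]

where $w_D$ and $w_{\neg D}$ are the predicted worlds with and without $D$, $d$ is some distance on world-states, and $\lambda$ weights the impact penalty. Note that the penalty attaches to the predictable consequences of $D$’s existence rather than to $D$’s individual actions, and that a disciple whose task is to combat existential risk would deliberately make $d(w_D, w_{\neg D})$ large, which is why the reduced impact intuitions fail in that case.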