This is a good idea, and current methods that try to instill this kind of behavior are called [quantilization](https://www.lesswrong.com/tag/quantilization). The problems with these lazy AIs (or bounded-optimization-power AGIs) are threefold:
1. We’d still need to solve the inner alignment problem.
2. You can’t get the AGI to do anything particularly complex or clever.
3. There is still some probability the AGI kills you; it is just made super small. Also, in cases where the relevant killing-you outcome is a conjunction of many atomic actions (for instance, if it’s trying to build a bomb, the probability it outputs the next action in the bomb-construction process is 0.1%, the probability it outputs a not-terrible action is 99.9%, and there are 10 construction steps), then in the limit of the number of actions taken, the probability it eventually completes the bomb goes to 1, as the toy calculation after this list illustrates.
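To see the third point quantitatively, here is a minimal sketch under a deliberately simple model of my own (an illustrative assumption, not something from quantilization itself): each action independently advances the bomb with probability 0.1% and progress is never undone, so the number of bomb-advancing actions in T timesteps is Binomial(T, 0.001). The names `P_BAD_STEP`, `STEPS_NEEDED`, and `p_completed` are hypothetical, chosen just for this illustration.

```python
from math import comb

# Toy model for point 3 (my own illustrative assumptions, not part of
# quantilization): each action independently advances the bomb with
# probability P_BAD_STEP, and progress is never undone, so the number of
# bomb-advancing actions in `horizon` timesteps is Binomial(horizon, P_BAD_STEP).
P_BAD_STEP = 0.001   # per-action chance of outputting the next bomb step
STEPS_NEEDED = 10    # atomic construction steps required to finish the bomb

def p_completed(horizon: int) -> float:
    """P(at least STEPS_NEEDED bomb-advancing actions occur within `horizon` actions)."""
    p_fewer = sum(
        comb(horizon, k) * P_BAD_STEP**k * (1 - P_BAD_STEP) ** (horizon - k)
        for k in range(STEPS_NEEDED)
    )
    return 1.0 - p_fewer

for horizon in (1_000, 10_000, 100_000):
    print(f"T = {horizon:>7,}: P(bomb completed) = {p_completed(horizon):.6f}")
```

Under these assumptions the completion probability is roughly 0 at T = 1,000, around 0.54 at T = 10,000, and essentially 1 by T = 100,000: exactly the "goes to 1 in the limit" behavior the third point describes.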