The goal of developing FAI is to reduce its existential threat to near 0% by mathematically proving the stability and desirability of its preferences. That's fine, but it reminds me of zero-risk bias.
How do you think designing and recommending containment systems for AGIs will lower existential risks? Compare with condoms.
The stakes are so high in the FAI problem that it’s worth it to get very close to 0 risk. I’m not even sure the FAI program can get us comfortably close to 0 risk: An AI won’t start acting Friendly until CEV has been partially computed, so we’d probably want to first handcraft an approximation to CEV without the use of AGI; there are a number of ways that could go wrong.
In contrast, AGI containment seems almost completely worthless as an existential-risk reducer. If mere humans can break out of a crude AI box, it stands to reason that a self-improving AGI that is capable of outwitting us could break out of any human-designed box.
P(extinction-event) ≈ P(realized-other-extinction-threat) + P(hand-coded-CEV/FAI-goes-terribly-wrong) + P(AGI-goes-FOOM)
P(AGI-goes-FOOM) ≈ 1 − ∏_j [ P(development-team-j-will-not-create-AGI-before-FAI-is-developed) + (1 − P(development-team-j-will-not-create-AGI-before-FAI-is-developed)) · P(development-team-j-can-stop-AGI-before-FOOM) ]
So the strategy is to convince every development team that, no matter what precautions they take, P(development-team-j-can-stop-AGI-before-FOOM) ≈ 0. Publishing recommendations for AGI containment would instead suggest that P(development-team-j-can-stop-AGI-before-FOOM) can be made sufficiently high, thereby lowering P(development-team-j-will-not-create-AGI-before-FAI-is-developed). Given overconfidence bias, it is plausible that the latter will increase P(AGI-goes-FOOM).
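To make the direction of that effect concrete, here is a minimal sketch of the product formula above, in Python, using purely illustrative probabilities I have made up for this example. It compares a world where every team is convinced containment is hopeless (so most hold off) against one where containment recommendations make teams overconfident enough to proceed while their boxes still barely work.

```python
# Illustrative sketch of the P(AGI-goes-FOOM) product formula above.
# All numbers are made up; only the direction of the comparison matters.

def p_agi_goes_foom(teams):
    """teams: list of (p_no_agi_before_fai, p_can_stop_before_foom) per team."""
    p_no_foom = 1.0
    for p_no_agi, p_can_stop in teams:
        # A team contributes no FOOM if it either never builds AGI before FAI,
        # or builds one but manages to stop it before it FOOMs.
        p_no_foom *= p_no_agi + (1.0 - p_no_agi) * p_can_stop
    return 1.0 - p_no_foom

# Scenario A: every team believes containment is hopeless, so most hold off.
cautious = [(0.95, 0.0)] * 10
# Scenario B: containment recommendations make more teams proceed, while the
# boxes still barely work against a self-improving AGI.
overconfident = [(0.70, 0.05)] * 10

print(f"P(FOOM), cautious teams:      {p_agi_goes_foom(cautious):.2f}")
print(f"P(FOOM), overconfident teams: {p_agi_goes_foom(overconfident):.2f}")
```

With these made-up numbers, P(AGI-goes-FOOM) rises from roughly 0.40 to roughly 0.97: convincing teams that containment works mostly moves probability mass from the "do not build yet" term into the "build and hope the box holds" term.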
I withdraw my suggestion.
No—expected value is important. If many successful FAI scenarios could result in negative value, then zero value (universal extinction) would be better.
We should put some thought into whether a negative-value universe is plausible, and what it would look like.
Excellent point. The goal of FAI should be to increase expected value, not to minimize risk.