The problem is that you not only have to hit hard and first, you have to prevent any possible retaliation, because hitting means you run the risk of being hit yourself. Are you telling me that you can conceive of different ways to derail humanity, but you can’t imagine a machine concluding that the risk is too high to play that game?
I can certainly imagine a machine concluding that the risk is too high to want to play that game. And I can imagine other reasons a machine might decide not to end humanity. That is why I wind up at maybe instead of definitely (i.e. p(doom) < 99%).
But that ultimately becomes a question of the machine’s goals, motivation, understanding, agency and risk tolerance. I think that there is a wide distribution of these and therefore an unknown but significant chance that the AGI decides not to destroy humanity.
That is very different from the question of whether the AGI could achieve the destruction of humanity. If the AGI couldn’t destroy humanity in practice, p(doom) would be close to 0.
In other words, I think the AGI could kill humanity but may choose not to. Earlier you seemed to think the AGI couldn’t, but now you seem to think it might be able to and may simply choose not to.