As one of many mitigations, it's better than nothing, but it is not sufficient: it does not solve the underlying problems and could easily be subverted by a superintelligence.
What do you think of the argument here (if you read that far) for building this into progenitor models? As I also clarify in the text, the idea does not apply to an SI.
Let's look at this from today's vantage point. Current AI is unable to hide all of its deception from us. It seems reasonable to me that the switch would trigger before a stealth SI could be deployed.