“If” is the question, not “how long”. And I think we’d stand a pretty good chance of handling a proof object in a secure way, assuming we have a secure digital transmission channel etc.
But the original scope of the thought experiment was assuming that we want to verify the proof. Wei Dai said:
Surely most humans would be too dumb to understand such a proof? And even if you could understand it, how does the AI convince you that it doesn’t contain a deliberate flaw that you aren’t smart enough to find? Or even better, you can just refuse to look at the proof.
I was responding only to the first question, independently of the others. If your point is that we shouldn’t attempt to verify an AI’s precommitment proof, I agree.
I’m getting more confused. To me, the statements “Humans are too dumb to understand the proof” and “Humans can understand the proof given unlimited time”, where ‘understand’ is qualified to include the ability to properly map the proof to the AI’s capabilities, are equivalent.
My point is not that we shouldn’t attempt to verify the AI’s proof for any external reasons—my point is that there is no useful information to be gained from the attempt.