It seems to me that many disagreements regarding whether the world can be made robust against a superintelligent attack (e.g., the recent exchange here) are downstream of different people taking on a mathematician’s vs. a hacker’s mindset.
I’m seeing a very different crux to these debates. Most people are not interested in the absolute odds, but rather how to make the world safer against this scenario—the odds ratios under different interventions. And a key intervention type would be the application of the mathematician’s mindset.
The linked post cites a ChatGPT conversation which claims that the number of bugs per 1,000 lines of code has declined by orders of magnitude, which (if you read the transcript) is precisely due to the use of modern provable frameworks.
It is worth quoting this conclusion in full.
> Defense technologies should be more of the “armor the sheep” flavor, less of the “hunt down all the wolves” flavor. Discussions about the vulnerable world hypothesis often assume that the only solution is a hegemon maintaining universal surveillance to prevent any potential threats from emerging. But in a non-hegemonic world, this is not a workable approach (see also: security dilemma), and indeed top-down mechanisms of defense could easily be subverted by a powerful AI and turned into its offense. Hence, a larger share of the defense instead needs to happen by doing the hard work to make the world less vulnerable.
So this reads to me like rejecting the hacker mindset, in favor of a systems engineering approach. Breaking things is useful only to the extent you formalize the root cause, and your systems are legible enough to integrate those lessons.
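To make the contrast concrete, here is a minimal sketch (with hypothetical function names, not from the linked post) of what "formalizing the root cause" can look like in practice: rather than patching the one crashing input a fuzzer found, you state the intended invariant explicitly and check it over the whole (bounded) input space.

```python
def clamp(value: int, lo: int, hi: int) -> int:
    """Clamp value into the inclusive range [lo, hi]."""
    if lo > hi:
        raise ValueError("empty range")
    return max(lo, min(value, hi))

def check_clamp_invariant() -> None:
    # The "mathematician's" move: the bug report becomes a formal property,
    # lo <= clamp(v, lo, hi) <= hi, checked exhaustively on a small bounded
    # domain -- a lightweight stand-in for an actual machine-checked proof.
    for lo in range(-5, 6):
        for hi in range(lo, 6):
            for v in range(-10, 11):
                r = clamp(v, lo, hi)
                assert lo <= r <= hi
                if lo <= v <= hi:
                    assert r == v  # clamp is the identity inside the range

check_clamp_invariant()
```

The point is the legibility: once the property is written down, every future change to `clamp` can be re-verified against it, so the lesson from the original break is integrated into the system rather than living in one tester's head.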