Importantly, if there are multiple misaligned superintelligences, and no aligned superintelligence, it seems likely that they will be motivated and able to coordinate with each other to overthrow humanity and divide the spoils.
This seems non-obvious to me (or at least not a slam dunk, which is really what I think). It may be easier for misaligned AI 1 to strike a deal with humanity under which it uses humans’ resources to defeat AIs 2 and 3 in exchange for, say, 80% of the lightcone (as opposed to splitting it three ways with the other AIs).
I’m not actually sure how well this applies in the exact situation Daniel describes (I’d need to think more), but it definitely seems plausible under a bunch of scenarios with multiple misaligned ASIs.
Unaugmented humanity can’t be a signatory to a non-fake deal with a superintelligence, because we can’t enforce it or verify its validity. Any such “deal” would end with the superintelligence backstabbing us once we’re no longer useful. See more here.
A possible counter-proposal is to request, as part of the deal, that the superintelligence provide us with tools we can use to verify that it will comply with the deal, or tools that bind it to the deal. That also won’t work: any tools it provides will be poisoned in some manner, guaranteed not to actually work.
Yes, even if we request that those tools be, e.g., mathematically verifiable. They would just be optimized to exploit bugs in our proof-verifiers, or bugs in human minds that would cause us to predictably and systematically misunderstand what the tools actually do, etc. See more here.
I agree it’s not a slam dunk.
It does seem unlikely to me that humanity would credibly offer large fractions of all future resources. (So I wouldn’t put it in a scenario forecast meant to represent one of my top few most likely scenarios.)