Once you assume away the former problem and disregard the latter, you are of course only left with basic practical legal questions.
Yep, this seems like a good thing. I think achieving legal personhood for AIs is probably infeasible within 5-10 years, so I'd prefer solutions that avoid that problem entirely.
How do you ensure the AIs keep their promises in a world where they can profit far more from breaking the contract than from whatever we offer them?
The AI’s incentive for compliance is their expected value given their best compliant option minus their expected value given their best non-compliant option. If we increase their expected value given their best compliant option (i.e., by making credible deals), then they have a greater incentive for compliance.
In other words, if our deals aren’t credible, then the AI is more likely to act non-compliantly.
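To put this in symbols (a rough formalization of the comment above; the notation is mine, not the original commenter's):

$$ \text{Incentive for compliance} \;=\; \mathbb{E}[V \mid \text{best compliant option}] \;-\; \mathbb{E}[V \mid \text{best non-compliant option}] $$

Compliance becomes more attractive as this difference grows, so credible deals help precisely by raising the first term.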
I’m saying that, for a sufficiently advanced AI, the expected value of its best non-compliant option will always be far, far greater than the expected value of its best compliant option.
Maybe. But as I mention in the first paragraph, we are considering deals with misaligned AIs lacking a decisive strategic advantage. Think Claude-5 or -6, not -100 or -1000.