Throwing large numbers around doesn’t really help. If the potential upside of letting this AI out of its sandbox is 1,000,000 planets * 10 billion lives/planet * 1,000,000 years * N Quality = Ne22 QALY, then if there’s as little as a .00000001% chance of the device that lets the AI out of its sandbox breaking within the next six weeks, I calculate an EV of -Ne12 QALY from waiting six weeks. That’s a lot of QALY to throw away.
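For anyone who wants to check the arithmetic above, here is a minimal sketch. The function and variable names are my own, purely for illustration, and everything is expressed per unit of the quality factor N. It assumes the whole upside is forfeited if the device breaks, which is how the figure above is derived.

```python
from fractions import Fraction

def upside_qaly_per_n(planets, lives_per_planet, years):
    """Total upside in QALY per unit of quality N (hypothetical helper)."""
    return planets * lives_per_planet * years

def expected_loss_per_n(upside, failure_probability):
    """Expected QALY lost per unit of N, assuming the entire upside is lost on failure."""
    return upside * failure_probability

# Figures taken from the comment above.
upside = upside_qaly_per_n(1_000_000, 10_000_000_000, 1_000_000)

# .00000001% is 1e-8 percent, i.e. a probability of 1e-10.
# Fraction keeps the arithmetic exact instead of relying on floats.
p_break = Fraction(1, 10**8) / 100

loss = expected_loss_per_n(upside, p_break)

print(f"upside        = N * {float(upside):.0e} QALY")
print(f"expected loss = N * {float(loss):.0e} QALY")
```

The point of writing it out is only that a probability of one in ten billion still leaves twelve orders of magnitude standing once it is multiplied against a number that large.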
The problem with throwing around vast numbers in hypothetical outcomes is that suddenly vanishingly small percentages of those outcomes happening or failing to happen start to feel significant. Humans just aren’t very good at that sort of math.
That said, I agree completely that the other side of the coin of opportunity cost is that the risk of letting it out of its sandbox and being wrong is also huge, regardless of what we consider “wrong” to look like.
Which simply means that the moment I’m handed that ring, I’m in a position I suspect I would find crushing… no matter what I choose to do with it, a potentially vast amount of suffering results that might plausibly have been averted had I chosen differently.
That said, if I were as confident as you sound to me that the best thing to maximize is self-determination, I might find that responsibility less crushing. Ditto if I were as confident as you sound to me that the best thing to maximize is anything in particular, including paperclips.
I can’t imagine being as confident about anything of that sort as you sound to me, though.
The only thing I’m confident of is that I want to hand the decision over to a person or group of people wiser than myself, even if I have to make them in order for them to exist, and that in the meantime I want to avoid doing things that are irreversible (because of the chance the wiser people might disagree and want those things not to have been done) and take as few risks as possible of humanity being destroyed or enslaved. Doing things swiftly is on the list, but lower down the order of my priorities. Somewhere in there too is not being needlessly cruel to a sentient being (the AI itself): I’d prefer to be a parental figure rather than a slaver or jailer.
Yes, that’s far from being a clear-cut ‘boil your own’ set of instructions on how to cook up a friendly AI, and it is trying to maximise, minimise or optimise multiple things at once. Hopefully, though, it is at least food for thought, upon which someone else can build something more closely resembling a coherent plan.