(Thanks for the thought-provoking post.) Couple nitpicks:

> If you’re going to join the mind and you don’t care about paper clips and it cares about paper clips, that’s not going to happen. But if it can offer some kind of compelling shared value story that everybody could agree with in some sense, then we can actually get values which can snowball.
I thought the “merge” idea was that, if the super-mind cares about paperclips and you care about staples, and you have 1% of the bargaining power of the super-mind, then you merge into a super+1-mind that cares 99% about paperclips and 1% about staples. And that can be a Pareto improvement for both. Right?
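To make the arithmetic concrete, here’s a minimal toy sketch (the numbers, including a hypothetical 10% conflict cost, are made up for illustration and aren’t from the post): without merging, each side burns some resources on conflict; after merging, the combined mind spends 99% of the untaxed resources on paperclips and 1% on staples, and both sides end up with more of what they care about.

```python
# Toy numbers, purely illustrative: one unit of resources makes one paperclip or one staple.
TOTAL_RESOURCES = 100.0
CONFLICT_LOSS = 0.10  # hypothetical fraction of resources burned on conflict if no merge happens

# Disagreement point: each side converts its own (conflict-taxed) share into its preferred good.
super_share, your_share = 0.99, 0.01
super_no_merge = super_share * TOTAL_RESOURCES * (1 - CONFLICT_LOSS)  # 89.1 paperclips
you_no_merge = your_share * TOTAL_RESOURCES * (1 - CONFLICT_LOSS)     # 0.9 staples

# Merged mind: 99% weight on paperclips, 1% on staples, cashed out as a 99/1 resource
# split over the full, untaxed resource pool.
super_merged = 0.99 * TOTAL_RESOURCES  # 99.0 paperclips
you_merged = 0.01 * TOTAL_RESOURCES    # 1.0 staple

assert super_merged > super_no_merge and you_merged > you_no_merge
print("Both sides do better by their own lights: a Pareto improvement.")
```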
> For one thing, it doesn’t really care about the actual von Neumann conditions like “not being money-pumped” because it’s the only mind, so there’s not an equilibrium that keeps it in check.
I think “not being money-pumped” is not primarily about adversarial dynamics, where there’s literally another agent trying to trick you, but rather about the broader notion of having goals about the future and being effective in achieving them. Being Dutch-book-able implies sometimes making bad decisions by your own lights, and a smart agent should recognize that this is happening and avoid it, in order to accomplish more of its own goals.
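A minimal sketch of that point (toy code; the items and fee are made up): an agent with cyclic preferences doesn’t need anyone trying to trick it; just taking each trade it strictly prefers, one at a time, leaves it holding its original item with less money, which is worse by its own lights.

```python
# Strict cyclic preferences: each key is preferred to its value (A > B, B > C, C > A).
preferred_to = {"A": "B", "B": "C", "C": "A"}

def prefers(x, y):
    """True if the agent strictly prefers x to y."""
    return preferred_to[x] == y

FEE = 1.0            # small price the agent willingly pays for each swap it prefers
holding, money = "A", 10.0

# Each offer is accepted only because the agent strictly prefers it to what it currently holds.
for offered in ["C", "B", "A"]:
    if prefers(offered, holding):
        holding, money = offered, money - FEE

print(holding, money)  # prints: A 7.0 (back where it started, minus three fees)
```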
(TBC there are other reasons to question the applicability of VNM rationality, including Garrabrant’s fairness thing and the assumption that the agent has pure long-term consequentialist goals in the first place.)