2) If there’s not much evidence supporting that intuition, how should I change my actions?
If the value is high enough, it is time to shut up and do the impossible. But the importance of having these details solved is not high enough to warrant that kind of desperation. On the other hand, the likelihood doesn’t qualify as ‘impossible’ either. We multiply instead. To do this it will be necessary to answer two further questions:
3) When I actually encounter one of these scenarios, what am I (or an agent I identify with) going to do? The universe doesn’t let us opt out of making a decision just because it’s impossible to make a correct one. It is physically impossible to do nothing. The aspect of the wave function representing me is going to change whether I like it or not.
4) How can I avoid getting into the unsolvable game theoretic scenarios? Can I:

a) Gain the power to take control of the whole cake and divide it however I damn well please?

b) Overpower the prison guard and release myself and my friend?
These are not just trite dismissals. Crude as it may seem, gaining power for yourself really is the best way to handle many game theory decisions: prevention!
Note that there is one particular instance of ‘cake division’ that needs a solution: if you gain power by creating an FAI, there is a cake that must be divided. The problem is not the same, but you nevertheless need a solution that you can successfully implement without anybody else killing you before you press the button. You must choose a preference aggregation method that actually works, which can run into similar difficulties, and democracy is ruled out. This isn’t something I have seen any inspiring ideas on, and it is conspicuously absent from every ‘CEV’-based solution I’ve encountered.
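To make the ‘we multiply instead’ step above concrete, here is a toy expected-value calculation. Every number is an invented placeholder, nothing in the comment pins them down; the point is only that the product of value and likelihood, not either factor alone, should set the effort level.

```python
# Toy expected-value calculation for "we multiply instead".
# All numbers below are invented placeholders, not claims from the thread.

value_of_solution = 1000.0  # assumed utility of having these details solved
p_scenario = 0.01           # assumed chance of actually facing such a scenario
p_solvable = 0.2            # assumed chance that more work yields a usable answer

expected_value = value_of_solution * p_scenario * p_solvable
cost_of_desperate_effort = 5.0  # assumed cost of "shut up and do the impossible"

print(expected_value)                            # 2.0
print(expected_value > cost_of_desperate_effort) # False: moderate effort, not desperation
```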
I’m not sure if your question 3) sheds any light on the problem. Let’s replace “solving decision theory” with “solving the halting problem”. It’s a provably impossible task: there’s no algorithm that always gives the right result, and there’s no algorithm that beats all other algorithms. What will I do if asked to solve the halting problem for some random Turing machine? Not sure… I’ll probably use some dirty heuristics, even though I know they sometimes fail and there exist other heuristics that dominate mine. Shutting up and doing the impossible ain’t gonna help because in this case the impossible really is impossible.
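As a concrete sketch of such a dirty heuristic (my own illustration, not anything the comment specifies): simulate the program for a bounded number of steps and guess ‘never halts’ if the budget runs out.

```python
# A "dirty heuristic" for the halting problem: simulate for a bounded
# number of steps and guess "never halts" if the budget runs out.
# The names and the step budget are my own illustrative choices.

def halts_heuristic(program, max_steps=10_000):
    """Guess whether `program` (a zero-argument generator function that
    yields once per step) halts. True is definite; False is only a guess
    and is sometimes wrong."""
    gen = program()
    for _ in range(max_steps):
        try:
            next(gen)
        except StopIteration:
            return True          # the program actually halted
    return False                 # budget exhausted: guess "runs forever"

def count_down_from(n):
    def program():
        k = n
        while k > 0:
            k -= 1
            yield
    return program

print(halts_heuristic(count_down_from(100)))    # True (correct)
print(halts_heuristic(count_down_from(10**9)))  # False (wrong: it does halt)
```

A heuristic with a larger max_steps weakly dominates this one, since it correctly classifies more halting programs and never turns a correct answer into a wrong one; but no finite budget is correct for every program, which matches the point above.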
Regarding question 4), if the UDT worldview is right and you are actually a bunch of indistinguishable copies of yourself spread out all over mathematics, then the AIs built by these copies will face a coordination problem. If you code the AI wrong, these copies may wage war among themselves and lose utility as a result. I got really freaked out when Wei pointed out that possibility, but now it seems quite obvious to me.
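A minimal payoff sketch of that coordination failure, with made-up numbers: two copies of the same AI meet over the same territory, and because they run the same code they necessarily make the same choice. An AI coded on the assumption that it is alone fights, and two such copies do strictly worse than two copies coded to share.

```python
# Toy payoff table for two identical AIs contesting the same territory.
# The numbers are invented; only the ordering matters:
# mutual sharing beats mutual fighting.

PAYOFFS = {  # (row_action, col_action) -> (row_utility, col_utility)
    ("share", "share"): (3, 3),
    ("share", "fight"): (0, 4),
    ("fight", "share"): (4, 0),
    ("fight", "fight"): (1, 1),
}

def outcome(policy):
    """Both AIs run the same code, so both play the same action."""
    action = policy()
    return PAYOFFS[(action, action)]

lone_wolf = lambda: "fight"    # coded as if alone and fending for itself
coordinated = lambda: "share"  # coded knowing its copies are out there

print(outcome(lone_wolf))      # (1, 1): war between copies, utility lost
print(outcome(coordinated))    # (3, 3)
```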
If you code the AI wrong, these copies may wage war among themselves and lose utility as a result. I got really freaked out when Wei pointed out that possibility, but now it seems quite obvious to me.
Excuse me for being dense, but how would these AIs go about waging war on each other if they are in causally distinct universes? I’m sure there’s some clever way, but I can’t see what it is.
I don’t understand precisely enough what “causally distinct” means, but anyway the AIs don’t have to be causally distinct. If our universe is spatially infinite (which currently seems likely, but not certain), it contains infinitely many copies of you and any AIs that you build. If you code the AI wrong (e.g. using the assumption that it’s alone and must fend for itself), its copies will eventually start fighting for territory.
Isn’t it much more likely to encounter many other, non-copy AIs prior to meeting itself?

If you code the AI wrong, it can end up fighting these non-copy AIs too, even though they may be similar enough to ours to make acausal cooperation possible.
Unless they’re far enough apart, and inflation is strong enough, that their future light-cones never intersect. I thought you were going to talk about them using resources on acausal blackmail instead.
Also, I was traveling in May, so I just discovered this post. Have your thoughts changed since then?
Nope, I didn’t get any new ideas since May. :-(

Causally distinct isn’t a technical term, I just made it up on the spot. Basically, I was imagining the different AIs as existing in different Everett branches or Tegmark universes or hypothetical scenarios or something like that. I hadn’t considered the possibility of multiple AIs in the same universe.
I’m not sure if your question 3) sheds any light on the problem.
It certainly (and obviously) sheds light on the problem of “how should I change my actions?”.
If you make the question one of practical action, then practical actions and their consequences are critical. I need to know (or at least have a guess, given a certain level of resource expenditure) what I am going to do in such situations and what the expected outcome will be. This influences how important the solution is to find, and also the expected value of spending more time creating better ‘dirty heuristics’.