Ok, fair enough. (I had misunderstood you on that particular point, sorry.)
I meant that the alieved probability is small in absolute terms, not that it is small compared to the payoff. That’s why I mentioned the “stick to the mainline probability” heuristic. I really do believe that there are many people who, if they alieved that they (or a group effort they could join) could change the probability of a 10^80-sized future by 10%, would really care; but who do not alieve that the probability is large enough to even register, as a probability; and whose brains will not attempt to multiply a not-even-registering probability with a humongous payoff. (By “alieving a probability” I simply mean processing the scenario the way one’s brain processes things it assigns that amount of credence, not a conscious statement about percentages.)
This is meant as a statement about people’s actual reasoning processes, not about what would be reasonable (though I did think that you didn’t feel that multiplying a very small success probability by a very large payoff was a good reason to donate to MIRI; in any case, it seems to me that the more important unreasonableness is requesting mountains of evidence before alieving a non-vanishing probability for weird-sounding things).
[ETA: I find it hard to put a number on the not-even-registering probability the sort of person I have in mind might actually alieve, but I think a fair comparison is, say, the “LHC will create black holes” thing—I think people will tend to process both in a similar way, and this does not mean that they would shrug it off if somebody counterfactually actually did drop a mountain of evidence about either possibility on their head.]
Realistic amounts of difference in epistemics + the “humans best stick to the mainline probability” heuristic seem enough (where by “realistic” I mean “of the degree actually found in the world”). I.e., I honestly believe that there are many people out there who would care a hell of a lot about the fate of future galaxies if they alieved that they had any non-vanishing chance of significantly influencing that fate (and of choosing the intervention that influences it in the desired direction).
Thanks!
(1) I agree with the grandparent.
(2) Yes, of course. But I feel that there’s enough evidence to assign very low probability to AGI not being inventable if humanity survives, but not enough evidence to assign very low probability to it being very hard and taking very long; eyeballing it, it might well take thousands of years of no AGI before I’d even consider AGI-is-impossible seriously (assuming that no other evidence crops up that AGI is impossible, besides humanity having no clue how to do it; conditioning on impossible AGI, I would expect such evidence to crop up earlier). Eliezer might put less weight on the tail of the time-to-AGI distribution, and may have a correspondingly shorter time before considering impossible AGI seriously.
If we have had von Neumann-level AGI for a while and still have no idea how to make a more efficient AGI, my update towards “superintelligence is impossible” would be very much quicker than the update towards “AGI is impossible” in the above scenario, I think. [ETA: Of course I still expect you could run it faster than a biological human, but I can conceive of a scenario where it’s within a few orders of magnitude of a von Neumann WBE, the remaining difference coming from the emulation overhead and from inefficiencies in the human brain that the AGI doesn’t have but that don’t lead to super-large improvements.]
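To make the kind of update I have in mind concrete, here is a toy Bayesian sketch; every number and the timeline distribution are made up for illustration, nothing here is a real estimate. The point is just that if the “AGI is possible” hypothesis already puts a lot of weight on long timelines, then “no AGI yet after T years” pushes up the probability of impossibility only very slowly.

```python
# Toy illustration (all numbers made up): posterior probability of
# "AGI is impossible" after observing "no AGI for T years", when the
# "AGI is possible" hypothesis spreads its timeline log-uniformly
# between 10 and 10,000 years.
import math

p_impossible = 0.05  # made-up prior

def p_no_agi_by(T):
    """P(no AGI by year T | AGI is possible), log-uniform timeline."""
    lo, hi = 10.0, 10_000.0
    if T <= lo:
        return 1.0
    if T >= hi:
        return 0.0
    return (math.log(hi) - math.log(T)) / (math.log(hi) - math.log(lo))

for T in (50, 200, 1000, 5000):
    numerator = p_impossible * 1.0  # "impossible" predicts no AGI with certainty
    denominator = numerator + (1 - p_impossible) * p_no_agi_by(T)
    print(T, round(numerator / denominator, 3))  # ~0.06, 0.09, 0.14, 0.34
```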
I would love to go, but it would be difficult for me to afford, and I’ll be busy Aug 3-9 and 11-16. So alas, all I can say is that if you set up a financial aid fund and the event was outside those date ranges, I would apply :-)
Much as I love the idea of this and would like it to work for me, unfortunately as far as I can tell my brain simply treats “magical reality fluid” the same way as it would something bland like “degree of reality”.
Though come to think of it, I’m not actually sure whether or not I’ve really been saying the magical part to myself all this time. I’ll try to make sure I don’t leave it out in the future, and see whether it makes a difference.
In the UDT case, the set of outcomes is finite (well, or at least the set of equivalence classes of outcomes under the preference relation is finite) and the utility functions don’t have any particular properties, so every possible preference relation the model can treat at all can be represented by a utility function!
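For concreteness, here is the standard construction (my addition, not anything specific to the UDT formalization), assuming the preference relation is a total preorder on the finite set of outcomes O: define

\[
u(x) = \#\{\, y \in O : y \prec x \,\}, \qquad \text{so that} \qquad x \preceq y \iff u(x) \le u(y).
\]

This is exactly what can fail for infinite outcome sets (e.g., lexicographic preferences have no real-valued representation), which is why the finiteness matters here.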
(I should note that this is not UDT as such we’re talking about here, but one particular formal way of implementing some of the ideas of UDT.)
Well—look, you can’t possibly fix your argument by reformulating BIP*, because your paper gives a correct mathematical proof that its version of BIP*, plus the “quadripartite disjunction”, are enough to imply the simulation argument! :-)
(For people who haven’t read the paper: the quadripartite disjunction is H1 = almost no human civilizations survive to posthuman OR H2 = almost no posthuman civilizations run ancestor simulations OR H3 = almost all humans live in a simulation OR SIM = I live in a simulation. To get the right intuition, note that this is logically equivalent to “If I live in the real world, i.e. if not SIM, then H1 or H2 or H3”.)
More formally, the argument in your paper shows that BIP* plus Cr(quadripartite disjunction) ~ 1 implies Cr(SIM) >~ 1 - Cr(H1 or H2).
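Roughly, the way I read that inference (a sketch, on the reading of BIP* where x ≈ 1 under H3, so that BIP* gives Cr(SIM | SIM or H3) ≈ 1):

\[
\mathrm{Cr}(\mathrm{SIM}) = \mathrm{Cr}(\mathrm{SIM} \mid \mathrm{SIM} \lor \mathrm{H3})\,\mathrm{Cr}(\mathrm{SIM} \lor \mathrm{H3}) \approx \mathrm{Cr}(\mathrm{SIM} \lor \mathrm{H3}) \ge \mathrm{Cr}(\text{quadripartite disjunction}) - \mathrm{Cr}(\mathrm{H1} \lor \mathrm{H2}) \approx 1 - \mathrm{Cr}(\mathrm{H1} \lor \mathrm{H2}),
\]

where the inequality just uses the fact that the quadripartite disjunction is (SIM or H3) or (H1 or H2).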
I think that (a) there’s a confusion about what the symbol Cr(.) means, and (b) what you’re really trying to do is to deny Bostrom’s original BIP.
Your credence symbol must be meant to already condition on all your information; recall that the question your paper is examining is whether we should accept that Cr(SIM) ~ 1, which is only an interesting question if this is supposed to take into account our current information. A conditional credence, like Cr(SIM | A), must thus mean: If I knew all I know now, plus the one additional fact that A is true, what would my credence be that I live in a simulation?
[ETA: I.e., the stuff we’re conditioning on is not supposed to represent our state of knowledge, it is hypothetical propositions we’re taking into account in addition to our knowledge! The reason I’m interested in the BIP* from the paper is not that I consider (A or B) a good representation of our state of knowledge (in which case Cr(SIM | A or B) would simply be equal to Cr(SIM)); rather, the reason is that the argument in your paper shows that together with the quadripartite disjunction it is sufficient to give the simulation argument.]
So Bostrom’s BIP, which reads Cr(SIM | f_sim = x) = x, means that given all your current information, if you knew the one additional fact that the fraction of humans that live in a simulation is x, then your credence in yourself living in a simulation would be x. If you want to argue that the simulation argument fails even if our current evidence supports the quadripartite disjunction, because the fact that we observe what we observe gives us additional information that we need to take into account, then you need to argue that BIP is false. I can see ways in which you could try to do this: For example, you could question whether the particle accelerators of simulated humans would reliably work in accordance with quantum mechanics, and if one doesn’t believe this, then we have additional information suggesting we’re in the outside world. More generally, you’d have to identify something that we observe that none (or very very close to none) of the simulated humans would. A very obvious variant of the DNA analogy illustrates this: If the gene gives you black hair, and you have red hair, then being told that 60% of all humans have the gene shouldn’t make you assign a 60% probability to having the gene.
The obvious way to take this into account in the formal argument would be to redefine f_sim to refer to, instead of all humans, only to those humans that live in simulations in which physics etc. looks pretty much like in the outside world; i.e., f_sim says how many of those humans actually live in a simulation. Then, the version of BIP referring to this new f_sim should be uncontroversial, and the above counterarguments would become an attack on the quadripartite disjunction (which is sensible, because they’re arguments about the world, and the quadripartite disjunction is where all the empirically motivated input to the argument is supposed to go).
“I think your interpretation of ‘if I don’t live in a simulation, then a fraction x of all humans lives in a simulation’ as P(SIM or A) is wrong”
Huh?
The paper talks about P(SIM | ¬SIM → A), which is equal to P(SIM | SIM ∨ A) because ¬SIM → A is logically equivalent to SIM ∨ A. I wrote the P(SIM | ¬SIM → A) from the paper in words as P(I live in a simulation | if I don’t live in a simulation, then a fraction x of all humans lives in a simulation) and stated explicitly that the if-then was a logical implication. I didn’t talk about P(SIM or A) anywhere.
Not necessarily—when you build a particle accelerator you’re setting up lots of matter to depend on the exact details of small amounts of matter, which might be detectable on a much more automatic level.
Ok; my point was that, due to butterfly effects, it seems likely that this is also true for the weather or some other natural process, but if there is a relatively simple way to calculate a well-calibrated probability distribution for whether any particular subatomic interaction will influence large amounts of matter, that should probably do the trick. (This works whether or not this distribution can actually detect the particular interactions that will influence the weather, as long as it can reliably detect the particle accelerator ones.)
But in any case, most plausible simulators have AGI-grade code anyway.
Fair enough, I think. Also I just noticed that you actually said “trivial for a SI”, which negates my terminological squabble—argh, sorry. … OK, comment retracted.
I think it’s correct that this makes the simulation argument go through, but I don’t believe the “trivial”. As far as I can see, you need the simulation code to literally keep track of “will humans notice this?”—my intuition is that this would require AGI-grade code (without that, I expect you would either have noticeable failures, or you would have something so conservative about its decisions of what not to simulate that it ends up simulating the entire atmosphere on a quantum level, because when and where hurricanes occur influences the variables it’s interested in). I suppose you could call this a squabble over terminology, but AGI-grade code is above my threshold for “trivial”.
[ETA: Sorry, you did say “for a superintelligence”—I guess I need to reverse my squabble over words.]
Doesn’t this argument trivially fail as a matter of logic, because assuming ~SH we do in fact have good evidence about the expected future power of computation, so that if you accept the first two claims, ~SH still becomes inconsistent just as in the original SA, and hence SA still goes through?
It’s interesting, but it’s also, as far as I can tell, wrong.
Birch is willing to concede that if I know that almost all humans live in a simulation, and I know nothing else that would help me distinguish myself from an average human, then I should be almost certain that I’m living in a simulation; i.e., P(I live in a simulation | almost everybody lives in a simulation) ~ 1. More generally, he’s willing to accept that P(I live in a simulation | a fraction x of all humans live in a simulation) = x; similar to how, if I know that 60% of all humans have a gene that has no observable effects, and I don’t know anything about whether I specifically have that gene, I should assign 60% probability to the proposition that I have that gene.
However, Bostrom’s argument rests on the idea that our physics experiments show that there is a lot of computational power in the universe that can in principle be used for simulations. Birch points out that if we live in a simulation, then our physics experiments don’t necessarily give good information about the true computational power in the universe. My first intuition would be that the argument still goes through if we don’t live in a simulation, so perhaps we can derive an almost-contradiction from that? [ETA: Hm, that wasn’t a very good explanation; Eliezer’s comment does better.] Birch considers such a variation and concludes that we would need a principle that P(I live in a simulation | if I don’t live in a simulation, then a fraction x of all humans lives in a simulation) >= x, and he doesn’t see a compelling reason to believe that. (The if-then is a logical implication.)
But this follows from the principle he’s willing to accept. “If I don’t live in a simulation, then a fraction x of all humans lives in a simulation” is logically equivalent to (A or B), where A = “A fraction x of all humans lives in a simulation” and B = “the fraction of all humans that live in a simulation is != x, but I, in particular, live in a simulation”; note that A and B are mutually exclusive. Birch is willing to accept that P(I live in a simulation | A) = x, and it’s certainly true that P(I live in a simulation | B) = 1. Writing p := P(A | A or B), we get
P(SIM | A or B) = P(SIM | A) * p + P(SIM | B) * (1-p) = x * p + (1-p) >= x.
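(For anyone who wants to check the step mechanically, here is a quick numerical sanity check; the grid of numbers is made up, and the only structure taken from the argument above is that A and B are mutually exclusive, P(SIM | A) = x, and P(SIM | B) = 1.)

```python
# Quick numerical sanity check of P(SIM | A or B) >= x.  The numbers are
# arbitrary; the only structure taken from the argument is that A and B are
# mutually exclusive, P(SIM | A) = x, and P(SIM | B) = 1.
import itertools

def holds(x, pA, pB):
    # Probabilities of the relevant kinds of worlds.
    p_A_sim    = pA * x        # A holds and I'm simulated
    p_A_notsim = pA * (1 - x)  # A holds and I'm not simulated
    p_B_sim    = pB            # B holds (which implies I'm simulated)
    p_AorB = p_A_sim + p_A_notsim + p_B_sim
    p_sim_given_AorB = (p_A_sim + p_B_sim) / p_AorB
    return p_sim_given_AorB >= x - 1e-12

grid = [i / 10 for i in range(1, 10)]
assert all(holds(x, pA, pB)
           for x, pA, pB in itertools.product(grid, grid, grid)
           if pA + pB <= 1)
print("P(SIM | A or B) >= x for all grid points")
```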
The paradigmatic economic application I recall is consumer choice theory: You have a certain amount of money, m, and two goods you can buy. These goods have fixed prices p and q. Your choices are pairs (x, y) saying how much of each good you buy; the “feasible set” of choices is {(x, y) : x, y >= 0 and xp + yq <= m}. What’s your best choice in this set? We want to use calculus to solve this, so we’ll express your preferences as a differentiable utility function. The reason VNM or Savage doesn’t enter into it is that actions lead to outcomes deterministically.

In UDT, we don’t even start with a natural definition of “outcome”; in principle, we need to specify (1) a set of outcomes, (2) an ordering on these outcomes, and (3) a deterministic, no-input program which does some complicated computation and returns one of these outcomes. (The intuition is that the complicated computation computes everything that happens in our universe, then picks out the morally relevant parts and prints them out.) It’s just simpler to skip parts (1) and (2) in the formal specification and say that the program (3) returns a number. Since the proof-based models have no notion of probability (even implicitly, as in Savage’s theorem), this makes the program an order-only “utility function.”
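To make the consumer choice example above concrete, here is a minimal sketch with a hypothetical Cobb-Douglas utility (my choice of utility function and numbers, not anything from the comment above); the closed-form optimum comes from the usual Lagrange-multiplier calculation, and the grid search is just a crude numerical check of it.

```python
# Consumer choice sketch: maximize a hypothetical Cobb-Douglas utility
# u(x, y) = x**a * y**(1 - a) over the feasible set
# {(x, y) : x, y >= 0 and x*p + y*q <= m}.

def u(x, y, a=0.3):
    return x**a * y**(1 - a)

m, p, q, a = 100.0, 2.0, 5.0, 0.3

# First-order (Lagrange) conditions give the interior optimum in closed form:
x_star, y_star = a * m / p, (1 - a) * m / q

# Crude check: since u is increasing in both goods, the optimum spends all of
# m, so it's enough to search along the budget line x*p + y*q = m.
candidates = ((x, (m - x * p) / q) for x in (i * (m / p) / 1000 for i in range(1001)))
x_best, y_best = max(candidates, key=lambda xy: u(*xy, a))

print((x_star, y_star), (x_best, y_best))  # should roughly agree: (15.0, 14.0)
```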
(Thanks for adding the point about Savage’s theorem!)
Ah, the bank holiday—drats, I could have thought of that… But great that you’re considering coming! :-) And really cool to hear that you’ve been working with Simon and Gaurav!
BTW, if you’re interested in decision making, you should consider coming to some of the Thursday seminars—we often have really interesting people visit (e.g. recently we’ve had Mike Shadlen, who did some of the ground-breaking work on the neuroscience of decision making that Luke mentions in his crash course).
There may be some confusion over terms, because economists do in fact also have use for utility functions that only express an ordering of outcomes. (Incidentally, this is also true of some of the decision theory work that has appeared on LessWrong: the utility functions in our proof-based versions of UDT only express an ordering; these models don’t have a notion of probabilities at all.) The OP and the parent comment are about the utility functions given by the von Neumann-Morgenstern theorem; these are left invariant by any affine rescaling and (by the uniqueness part of the theorem) are changed by any non-affine rescaling.
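To illustrate the difference with a made-up numerical example (my numbers, nothing from the OP): a monotone but non-affine rescaling of a VNM utility function leaves all choices between sure outcomes unchanged, but it can flip the ranking of lotteries, which is exactly the sense in which VNM utilities carry more than ordinal information.

```python
# Made-up example: a monotone (non-affine) rescaling preserves the ordering
# of sure outcomes but can reverse the ranking of lotteries.

u = {"bad": 0.0, "okay": 0.6, "good": 1.0}   # a VNM utility function
v = {o: w**3 for o, w in u.items()}          # monotone rescaling: same ordering of outcomes

lottery_A = {"okay": 1.0}                    # "okay" for sure
lottery_B = {"bad": 0.5, "good": 0.5}        # 50/50 between "bad" and "good"

def expected_utility(utility, lottery):
    return sum(prob * utility[outcome] for outcome, prob in lottery.items())

print(expected_utility(u, lottery_A), expected_utility(u, lottery_B))  # 0.6 vs 0.5: A preferred
print(expected_utility(v, lottery_A), expected_utility(v, lottery_B))  # 0.216 vs 0.5: B preferred
```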
Congrats! Hopefully we’ll manage something in June, then :-)
Sorry ’bout that! The usual advice is that people should be bold and just post a time & place for a first meetup, to allow people to just show up, but it looks like it would be good to start collecting preferences for future meetup times—when’s good for you? Are you going to be in the area after your exams? If you’re coming from Exeter, I’m guessing weekends would probably be better for you, right?
(I don’t have answers to your specific questions, but here are some thoughts about the general problem.)
I agree with most of what you said. I also assign significant probability mass to most parts of the argument for hope (but haven’t thought about this enough to put numbers on it), though I too am not comforted by these parts, because I also assign a non-small chance to them going wrong. E.g., I have hope for “if AI is visible [and, I add, AI risk is understood] then authorities/elites will be taking safety measures”.
That said, there are some steps in the argument for hope that I’m really worried about:
I worry that even smart (Nobel prize-type) people may end up getting the problem completely wrong, because MIRI’s argument tends to conspicuously not be reinvented independently elsewhere (even though I find myself agreeing with all of its major steps).
I worry that even if they get it right, by the time we have visible signs of AGI we will be even closer to it than we are now, so there will be even less time to do the basic research necessary to solve the problem, making it even less likely that it can be done in time.
Although it’s also true that I assign some probability to e.g. AGI without visible signs, I think the above is currently the largest part of why I feel MIRI work is important.