In the comment section of Roko’s banned post, PeerInfinity mentioned “rescue simulations”. I’m not going to post the context here because I respect Eliezer’s dictatorial right to stop that discussion, but here’s another disturbing thought.
An FAI created in the future may take into account our crazy desire that all the suffering in the history of the world hadn’t happened. Barring time machines, it cannot reach into the past and undo the suffering (and we know that hasn’t happened anyway), but acausal control allows it to do the next best thing: create large numbers of history sims where bad things get averted. This raises two questions: 1) if something very bad is about to happen to you, what’s your credence that you’re in a rescue sim and have nothing to fear? 2) if something very bad has already happened to you, does this constitute evidence that we will never build an FAI?
(If this isn’t clear: just like PlaidX’s post, my comment is intended as a reductio ad absurdum of any fears/hopes concerning future superintelligences. I’d still appreciate any serious answers though.)
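One hedged way to put numbers on the two questions above, assuming naive copy counting (an assumption introduced here for illustration, not something the thread endorses): suppose an FAI gets built with prior probability $p$, and that it runs $N$ rescue sims per original history, each indistinguishable from the inside up to the moment of rescue. The symbols $p$ and $N$ are hypothetical.

```latex
% Question 1: credence of being in a rescue sim, conditional on an FAI being built.
% Assumption: one original copy plus N indistinguishable sim copies, weighted equally.
\[
  P(\text{sim} \mid \text{FAI}) = \frac{N}{N + 1}
\]

% Question 2: update on the observation that the harm actually happened.
% Assumption: with an FAI, only the one original copy out of N + 1 experiences the harm;
% without an FAI, the single copy experiences it with certainty.
\[
  P(\text{FAI} \mid \text{harm})
    = \frac{p \cdot \frac{1}{N + 1}}{p \cdot \frac{1}{N + 1} + (1 - p)}
    = \frac{p}{p + (1 - p)(N + 1)}
\]
```

Under these assumptions, large $N$ pushes the first quantity toward 1 and the second toward 0, which is the shape of the worry in the two questions. Whether this kind of credence is meaningful at all is exactly what the replies dispute.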
This falls in the same confused cluster as anticipated experience. You only anticipate certain things happening because they describe the fraction of the game you value playing and are able to play (plan for), over other possibilities where things go crazy. Observations don’t provide evidence; how you react to observations is a manner in which you follow a plan, a conditional strategy of doing certain things in response to certain inputs, a plan that you must decide on from other considerations. Laws of physics seem to be merely a projection of our preference, something we came to value because we evolved to play the game within them (and are not able to easily influence things outside of them).
So “credence” is a very imprecise idea, and certainly not something you can use to draw conclusions about what is actually possible (well, apart from however it reveals your prior, which might be a lot). What is actually possible is all there in the prior, not in what you observe. This suggests a kind of “anti-Bayesian” principle, where the only epistemic function of observations is to “update” your knowledge about what your prior actually is, but this “updating” is not at all straightforward. (This view also allows one to get rid of the madness in anthropic thought experiments.)
(This is a serious response. Honest.)
Edit: See also this clarification.
Disagree, but upvoted. Given that there’s a canonical measure on configurations (i.e. the one with certain key symmetries, as with the L^2 measure on the Schrödinger equation), it makes mathematical sense to talk about the measure of various successor states to a person’s current experience.
It is true that we have an evolved sense of anticipated experience (coupled with our imaginations) that matches this concept, but it’s a nonmysterious identity: an agent whose subjective anticipation matches their conditional measure will make more measure-theoretic optimal decisions, and so the vast majority of evolved beings (counting by measure) will have these two match.
It may seem simpler to disregard any measure on the set of configurations, but it really is baked into the structure of the mathematical object.
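For concreteness, a sketch of the measure being referred to, in standard quantum-mechanical notation. The symbols $\psi$, $c_i$, $e$ and $s$ are introduced here for illustration, and reading “the measure of successor states” as a conditional measure is an assumed interpretation of the comment above.

```latex
% A state expanded in an orthonormal basis of configurations |i>.
\[
  \psi = \sum_i c_i \, |i\rangle
\]

% The L^2 (Born) measure assigns each configuration its squared amplitude, normalized.
\[
  \mu(i) = \frac{|c_i|^2}{\sum_j |c_j|^2}
\]

% Assumed reading: the measure of a successor state s, given the current experience e,
% is the usual conditional measure.
\[
  \mu(s \mid e) = \frac{\mu(s \wedge e)}{\mu(e)}
\]
```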
Do we still have a disagreement? If we do, what is it?
I think that the mathematical structure of the multiverse matters fundamentally to anthropic probabilities. I think it’s creative but wrong to think that an agent could achieve quantum-suicide-level anthropic superpowers by changing how much ve now cares about certain future versions of verself, instead of ensuring that only some of them will be actual successor states of ver patterns of thought.
However, my own thinking on anthropic probabilities (Bostromian, so far as I understand him) has issues†, so I’m pondering it and reading his thesis.
† In particular, what if someone simulates two identical copies of me simultaneously? Is that different from one copy? If so, how does that difference manifest itself in the gray area between running one and two simulations, e.g. by pulling apart two matching circuitboards running the pattern?
You can’t change your preference. The changed preference won’t be yours. What you care about is even more unchangeable than reality. So we don’t disagree here, I don’t think you can get anthropic superpowers, because you care about a specific thing.
If we lump together even a fraction of my life as “me” rather than just me-this-instant, we’d find that my preference is actually pretty malleable while preserving the sense of identity. I think it’s within the realm of possibility that my brain could be changed (by a superintelligence) to model a different preference (say, one giving much higher weight to versions of me that win each day’s lottery) without any changes more sudden or salient to me than the changes I’ve already gone through.
If I expected this to be done to me, though, I wouldn’t anticipate finding my new preference to be well-calibrated; I’d rather expect to find myself severely surprised/disappointed by the lottery draw each time.
Am I making sense in your framework, or misunderstanding it?
I am still puzzled about how preference corresponds to the physical state of the brain. Is preference only partially represented in our universe (the intersection of the set of universes which correspond to your subjective experience and the set of universes which correspond to my subjective experience)?
I don’t say that the nature of the match is particularly mysterious, indeed measure might count as an independent component of the physical laws as explanation for the process of evolution (and this might explain Born’s rule). But decision-theoretically, it’s more rational to look at what your prior actually is, rather than at what the measure in our world actually is, even if the two very closely match. It’s the same principle as with other components of evolutionary godshatter, but anticipation is baked in most fundamentally.
You don’t discard measure at human level, it’s a natural concept that captures a lot of structure of our preference, and so something to use as a useful heuristic in decision-making, but once you get to be able to work at the greater level of detail, physical laws or measures over the structures that express them cease to matter.
Whoa. That’s gotta be the most interesting comment I read on LW ever. Did you just give an evolutionary explanation for the concept of probability? If Eliezer’s ideas are madness, yours are ultimate madness. It does sound like it can be correct, though.
But I don’t see how it answers my question. Are you claiming I have no chance of ending up in a rescue sim because I don’t care about it? Then can I start caring about it somehow? Because it sounds like a good idea.
It is much worse, this seems to be an evolutionary “explanation” for, say, particle physics, and I can’t yet get through the resulting cognitive dissonance. This can’t be right.
Yep, I saw the particle physics angle immediately too, but I saw it as less catastrophic than probability, not more :-) Let’s work it out here. I’ll try to think of more stupid-sounding questions, because they seemed to be useful to you in the past.
As applied to your comment, it means that you can only use observations epistemically where you expect to be existing according to the concept of anticipated experience as coded by evolution. Where you are instantiated by artificial devices like rescue simulations, these situations don’t map on anticipated experience, so observations remembered in those states don’t reveal your prior, and can’t be used to learn how things actually are (how your prior actually is).
You can’t change what you anticipate, because you can’t change your mind that precisely, but changing what you anticipate isn’t fundamental and doesn’t change what will actually happen—everything “actually happens” in some sense, you just care about different things to different degree. And you certainly don’t want to change what you care about (and in a sense, can’t: the changed thing won’t be what you care about, it will be something else). (Here, “caring” is used to refer to preference, and not anticipation.)
Before I dig into it formally, let’s skim the surface some more. Do you also think Rolf Nelson’s AI deterrence won’t work? Or are sims only unusable on humans?
I think this might get dangerously close to the banned territory, and our Friendly dictator will close the whole thread. Though since it wasn’t clarified what exactly is banned, I’ll go ahead and discuss acausal trade in general until it’s explicitly ruled banned as well.
As discussed before, “AI deterrence” is much better thought of as participation in acausal multiverse economy, but it probably takes a much more detailed knowledge of your preference than humans possess to make the necessary bead jar guesses to make your moves in the global game. This makes it doubtful that it’s possible on human level, since the decision problem deteriorates into a form of Pascal’s Wager (without infinities, but with quantities outside the usual ranges and too difficult to estimate, while precision is still important).
ETA: And sims are certainly “usable” for humans, they produce some goodness, but maybe less so than something else. That they aren’t subjectively anticipated, doesn’t make them improbable, in case you actually build them. Subjective anticipation is not a very good match for prior, it only tells you a general outline, sometimes in systematic error.
If you haven’t already, read BLIT. I’m feeling rather like the protagonist.
Every additional angle, no matter how indirect, gets me closer to seeing that which I Must Not Understand. Though I’m taking it on faith that this is the case, I have reason to think the faith isn’t misplaced. It’s a very disturbing experience.
I think I’ll go read another thread now. Or wait, better yet, watch anime. There’s no alcohol in the house.
I managed to parse about half of your second paragraph, but it seems you didn’t actually answer the question. Let me rephrase.
You say that sims probably won’t work on humans because our “preference” is about this universe only, or something like that. When we build an AI, can we specify its “preference” in a similar way, so it only optimizes “our” universe and doesn’t participate in sim trades/threats? (Putting aside the question whether we want to do that.)
This has been much discussed on LW. Search for “updateless decision theory” and “UDT.”
I don’t believe that anticipated experience in natural situations, as an accidental (specific to human psychology) way of eliciting the prior, was previously discussed, though the general epistemic uselessness of observations for artificial agents is certainly an old idea.
This is crazy.
Yes, quite absurd.
I’d give that some credence, though note that we’re talking about subjective anticipation, which is a piece of humanly-compelling nonsense.
For me, essentially zero, that is I would act (or attempt to act) as if I had zero credence that I was in a rescue sim.
Your scenario is burdened by excessive detail about FAI. Any situation in which people create lots of sims but don’t allow lots of suffering/horror in the sims (perhaps as “rescue sims,” perhaps because of something like animal welfare laws, or many other possibilities) poses almost the same questions.
I thought about the “burdensome details” objection some more and realized that I don’t understand it. Do you think the rescue sim idea would work? If yes, the FAI should either use it to rescue us, or find another course of action that’s even better—but either way we’d be saved from harm, no? If the FAI sees a child on a train track, believing that the FAI will somehow rescue it isn’t “burdensome detail”! So you should either believe that you’ll be rescued, or believe that rescue sims and other similar scenarios don’t work, or believe that we won’t create FAI.
The plan that’s even better won’t be about “rescuing the child” in particular, and for the same reason you can’t issue specific wishes to FAI, like to revive the cryopreserved.
But whatever the “better plan” might be, we know the FAI won’t leave the child there to die a horrible death. To borrow Eliezer’s analogy, I don’t know which moves Kasparov will make, but I do know he will win.
It’s not a given that rescuing the child is the best use of one’s resources. As a matter of heuristic, you’d expect that, and as a human, you’d form that particular wish, but it’s not obvious that even such heuristics will hold. Maybe something even better than rescuing the child can be done instead.
Not to speak of the situation where the harm is already done. Fact is a fact, not even a superintelligence can alter a fact. An agent determines, but doesn’t change. It could try “writing over” the tragedy with simulations of happy resolutions (in the future or rented possible worlds), but those simulations would be additional things to do, and not at all obviously optimal use of FAI’s control.
You’d expect the similarity of the original scenario to “connect” the original scenario with the new ones, diluting the tragedy through reduction in anticipated experience of it happening, but anticipated experience has no absolute moral value, apart from allowing one to discover the moral value of certain facts. So this doesn’t even avert the tragedy, and simulation of a sub-optimal pre-singularity world, even without the tragedy, even locally around the averted tragedy, might be grossly noneudaimonic.
If that actually happened, it can’t be changed. An agent determines, never changes. Fact is a fact. And writing saved child “over” the fact of the actually harmed one, in future simulations or rented possible worlds, isn’t necessarily the best use of FAI’s control. So the best plan might well involve leaving that single fact be, with nothing done specifically “about” that situation.
Nesov says rescue sims are impossible, while you only say my FAI scenario is unlikely. But you claim to be thinking about the same UDT. Why is that?
I don’t say rescue sims are strictly impossible in the above argument, indeed I said that everything is possible (in the sense of being in the domain of prior, roughly speaking), but you anticipate only a tiny fraction of what’s possible (or likely), and rescue sims probably don’t fall into that area. I agree with Carl that your FAI scenario is unlikely to the point of impossible (in the sense of prior, not just anticipation).
That would fall under “nitpicking”. When I said “impossible” I meant to say “they won’t work on us here”. Or will work with negligible probability, which is pretty much the same thing. My question to Carl stands: does he agree that it’s impossible/pointless to save people in the past by building rescue sims? Is this a consequence of UDT, the way he understands it?
A word on nitpicking: even if I believe it’s likely you meant a given thing, if it’s nonetheless not clear that you didn’t mean another, or presentation doesn’t make it clear for other people that you didn’t mean another, it’s still better to debias the discussion from illusion of transparency by explicitly disambiguating than relying on fitting the words to a model that was never explicitly tested.
There is an essential ambiguity for this discussion between “pointless” because subjective anticipation won’t allow you to notice, and “pointless” because it doesn’t optimize goodness as well as other plans do. It might be pointless saving people in the past by building sims, but probably only for the same reason it might be pointless reviving the cryonauts: because there are even better decisions available.
To clarify: I already accept the objections about “burdensome details” and “better plans”. I’m only interested in the subjective anticipation angle.
ETA: sometime after writing this comment I stopped understanding those objections, but anticipation still interests me more.
Given the Mathematical Universe/Many-Worlds/Simulation Argument everything good and bad is happening to you, i.e. is timeless.
I can’t follow all the fear-mongering about potential multiple-infinite-hyper-effective torture scenarios here. It’s a really old idea called ‘hell’. It’ll be/is an interesting experience, at least you won’t be dead. Get over it, it’s probably happening anyway.