The whole calculation is based on the premise that Neal’s concept of “full non-indexical conditioning” is a reasonable way to do probability theory. Usually you do probability theory on what you are calling “centered propositions”, and you interpret each data point you receive as the proposition “I have received this data”. Not as “There exists a version of me which has received this data as well as all of the prior data I have received”. It seems really odd to do the latter, and I think more motivation is needed for it. (To be fair, I don’t have a better alternative in mind.)
It seems really odd to do the latter, and I think more motivation is needed for it.
This old post of mine may help. The short version is that if you do probability with “centered propositions” then the resulting probabilities can’t be used in expected utility maximization.
(To be fair, I don’t have a better alternative in mind.)
I think the logical next step from Neal’s concept of “full non-indexical conditioning” (where updating on one’s experiences means taking all possible worlds, assigning 0 probability to those not containing “a version of me which has received this data as well as all of the prior data I have received”, then renormalizing the sum of the rest to 1) is to not update, in other words, to use UDT. The motivation here is that from a decision making perspective, the assigning-0/renormalizing step either does nothing (if your decision has no consequences in the worlds that you’d assign 0 probability to) or is actively bad (if your decision does have consequences in those possible worlds, due to logical correlation between you and something/someone in one of those worlds). (UDT also has a bunch of other motivations if this one seems insufficient by itself.)
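The assigning-0/renormalizing step described above can be sketched in a few lines. This is only an illustration: the world names, the prior, and the `contains_my_memories` predicate are all made-up placeholders, not anything from Neal’s paper.

```python
# Minimal sketch of the FNC update step: zero out worlds that do not
# contain an observer with exactly these memories, then renormalize.
# All concrete values below are illustrative, not from the source.

def fnc_update(prior, contains_my_memories):
    """Condition a prior over worlds on 'someone with my memories exists'."""
    surviving = {w: p for w, p in prior.items() if contains_my_memories(w)}
    total = sum(surviving.values())
    return {w: p / total for w, p in surviving.items()}

prior = {"A": 0.5, "B": 0.3, "C": 0.2}
# Suppose only worlds A and B contain someone with my full memory stream:
posterior = fnc_update(prior, lambda w: w in {"A", "B"})
# posterior == {"A": 0.625, "B": 0.375}
```

The point of the comment above is that, from a decision-theoretic standpoint, the step that discards world C either changes nothing or discards worlds where your decision still matters via logical correlation.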
Yeah, but the OP was motivated by an intuition that probability theory is logically prior to and independent of decision theory. I don’t really have an opinion on whether that is right or not, but I was trying to answer the post on its own terms. The lack of a good purely-probability-theory analysis might be a point in favor of taking a measure non-realist point of view, though.
To make clear the difference between your view and ksvanhorn’s, I should point out that in his view if Sleeping Beauty is an AI that’s just woken up on Monday/Tuesday but not yet received any sensory input, then the probabilities are still 1⁄2; it is only after receiving some sensory input which is in fact different on the two days (even if it doesn’t allow the AI to determine what day it is) that the probabilities become 1⁄3. Whereas for decision-theoretic purposes you want the probability to be 1⁄3 as soon as the AI wakes up on Monday/Tuesday.
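The 1⁄2-before-input, 1⁄3-after-input behavior described above can be checked with a small exact computation. This is a sketch under simple modeling assumptions (fair coin; one awakening on heads, two on tails; each awakening’s sensory input drawn uniformly and independently from `n` possibilities); the function name and parameterization are mine, not ksvanhorn’s.

```python
# FNC posterior for heads, conditioning on "an awakening with exactly this
# sensory input exists". With n = 1 there is no distinguishing input and the
# answer is 1/2; as n grows (input differs between days), it approaches 1/3.
from fractions import Fraction

def fnc_heads_probability(n):
    """P(heads | the observed input, drawn uniformly from n possibilities,
    occurs at some awakening). Heads: one awakening; tails: two."""
    p_seen_heads = Fraction(1, n)                 # one chance to match
    p_seen_tails = 1 - (1 - Fraction(1, n)) ** 2  # at least one of two matches
    return (Fraction(1, 2) * p_seen_heads) / (
        Fraction(1, 2) * p_seen_heads + Fraction(1, 2) * p_seen_tails
    )

print(fnc_heads_probability(1))      # 1/2: no distinguishing sensory input
print(fnc_heads_probability(10**6))  # ~1/3 once input differs between days
```

Algebraically the posterior is 1/(3 − 1/n), which interpolates between the two answers exactly as described.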
for decision-theoretic purposes you want the probability to be 1⁄3 as soon as the AI wakes up on Monday/Tuesday.
That is based on a flawed decision analysis that fails to account for the fact that Beauty will make the same choice, with the same outcome, on both Monday and Tuesday (it treats the outcomes on those two days as independent).
So you want to use FDT, not CDT. But if the additional data of which direction the fly is going isn’t used in the decision-theoretic computation, then Beauty will make the same choice on both days regardless of whether she has seen the fly’s direction or not. So according to this analysis the probability still needs to be 1⁄2 after she has seen the fly.
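The disagreement in this exchange can be made concrete with a standard betting setup (my framing, not either commenter’s): Beauty is offered, at each awakening, a ticket costing `price` that pays 1 if the coin landed heads. The two decision analyses differ in whether the tails-world bet settles twice (days treated as independent) or once (the same choice, with the same outcome, on both days).

```python
# Break-even ticket prices under the two accountings of the repeated choice.
# The setup (ticket prices, payoffs) is an illustrative assumption.
from fractions import Fraction

def expected_profit(price, per_awakening):
    """Expected profit of buying a heads-ticket at each awakening.
    Heads: one awakening; tails: two. If per_awakening, the tails-world
    bet settles twice; otherwise the identical repeated choice settles once."""
    tails_count = 2 if per_awakening else 1
    heads = Fraction(1, 2) * (1 - price)           # win once on heads
    tails = Fraction(1, 2) * tails_count * -price  # lose once or twice on tails
    return heads + tails

print(expected_profit(Fraction(1, 3), per_awakening=True))   # 0: "thirder" price
print(expected_profit(Fraction(1, 2), per_awakening=False))  # 0: "halfer" price
```

The sketch takes no side: it just shows that the break-even probability is 1⁄3 if the two days’ outcomes are counted independently and 1⁄2 if the correlated choice is counted once, which is exactly the point in dispute.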
There are several misconceptions here:
1. Non-indexical conditioning is not “a way to do probability theory”; it is just a policy of not throwing out any data, even data that appears irrelevant.
2. No, you do not usually do probability theory on centered propositions such as “today is Monday”, as they are not legitimate propositions in classical logic. The propositions of classical logic are timeless—they are true, or they are false, but they do not change from one to the other.
3. Nowhere in the analysis do I treat a data point as “there exists a version of me which has received this data...”; the concept of “a version of me” does not even appear in the discussion. If you are quibbling over the fact that Pdt is only the stream of perceptions Beauty remembers experiencing as of time t, instead of being the entire stream of perceptions up to time t, then you can suppose that Beauty has perfect memory. This simplifies things—we can now let Pd simply be the entire sequence of perceptions Beauty experiences over the course of the day, and define R(y,d) to mean “y is the first n elements of Pd, for some n”—but it does not alter the analysis.
Nowhere in the analysis do I treat a data point as “there exists a version of me which has received this data...”;
This confuses me. Dacyn’s “There exists a version of me which has received this data as well as all of the prior data I have received” seems equivalent to Neal’s “I will here consider what happens if you ignore such indexical information, conditioning only on the fact that someone in the universe with your memories exists. I refer to this procedure as “Full Non-indexical Conditioning” (FNC).” (Section 2.3 of Neal 2007)
Do you think Dacyn is saying something different from Neal? Or that you are saying something different from both Dacyn and Neal? Or something else?
None of this is about “versions of me”; it’s about identifying what information you actually have and using that to make inferences. If the FNIC approach is wrong, then tell me how Beauty’s actual state of information differs from what is used in the analysis; don’t just say, “it seems really odd.”
I responded to #2 below, and #1 seems to be just a restatement of your other points, so I’ll respond to #3 here. You seem to be taking what I wrote a little too literally. It looks like you want the proposition Sleeping Beauty conditions on to be “on some day, Sleeping Beauty has received / is receiving / will receive the data X”, where X is the data that she has just received. (If this is not what you think she should condition on, then I think you should try to write the proposition you think she should condition on, using English and not mathematical symbols.) This proposition doesn’t have any reference to “a version of me”, but it seems to me to be morally the same as what I wrote (and in particular, I still think that it is really odd to say that it is the proposition she should condition on, and that more motivation is needed for it).