Yeah… well, I thought of the Z because it sounds like we’re getting the probabilities of Y from some experiment.
So Z=z is the result of the experiment, which in this case is a vector of frequencies.
When I put it like that, it sounds like it’s just a rhetorical device for saying that we have given probabilities of Y.
But I still seem to need Z for my dictionary.
I have γ(x)=P[X=x].
What is γ′(x)?
It is some kind of updated probability of X=x, right?
Like we went from one probability to the other by doing an experiment.
If I didn’t write γ′(x)=P[X=x|Z=z], I’d need something like γ(x)=P1[X=x] and γ′(x)=P2[X=x].
Reading again, it seems like this is exactly Jeffrey conditionalization.
So whether you include some extra variable just depends on what you think of Jeffrey conditionalization.
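To make that concrete, here’s a minimal sketch of the Jeffrey update on a toy discrete example (all the numbers are made up for illustration): the rule keeps P[X=x|Y=y] fixed and swaps the prior marginal of Y for the given τ(y).

```python
# Jeffrey conditionalization on a toy discrete example.
# Prior joint P[X=x, Y=y] for X in {0,1}, Y in {0,1} (illustrative numbers).
joint = {
    (0, 0): 0.3, (0, 1): 0.2,
    (1, 0): 0.1, (1, 1): 0.4,
}

# Soft evidence: the new marginal tau(y) for Y.
tau = {0: 0.7, 1: 0.3}

# Prior marginal of Y, summed out of the joint.
p_y = {y: sum(p for (_, y2), p in joint.items() if y2 == y) for y in tau}

# Jeffrey's rule: gamma'(x) = sum_y P[X=x | Y=y] * tau(y).
gamma_prime = {
    x: sum(joint[(x, y)] / p_y[y] * tau[y] for y in tau)
    for x in (0, 1)
}

print(gamma_prime)  # {0: 0.625, 1: 0.375}; sums to 1
```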
I feel like I’m missing something, though, about what this experiment is and means.
For example, I’m not totally clear on whether we have one state X and a collection of replicates of the state Y, or a collection of replicates of (X,Y) pairs.
Looking at the paper, I see the connection to Jeffrey conditionalization is made explicitly.
And it mentions Pearl’s “virtual evidence method”; is this what he calls introducing this Z?
But no clarity on exactly what this experiment is.
It just says:
“But how should the above be generalized to the situation where the new information does not come in the form of a definite value y0 for Y, but as ‘soft evidence,’ i.e., a probability distribution τ(y)?”
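As far as I understand it, Pearl’s virtual evidence method does make the Z explicit: you attach a virtual observation Z to Y, specified only up to likelihoods λ(y) ∝ P[Z=z|Y=y], and then condition on Z=z in the ordinary way. Here’s a sketch under that reading, reusing the toy joint from the sketch above (the λ values are picked to match that τ):

```python
# Pearl-style virtual evidence: a virtual node Z hangs off Y, specified
# only by likelihoods lam(y), proportional to P[Z=z | Y=y].
joint = {
    (0, 0): 0.3, (0, 1): 0.2,
    (1, 0): 0.1, (1, 1): 0.4,
}
lam = {0: 1.75, 1: 0.5}  # = tau(y) / P[Y=y] for the toy numbers above

# Ordinary conditioning on Z=z: reweight the joint by lam(y), renormalize.
unnorm = {(x, y): p * lam[y] for (x, y), p in joint.items()}
total = sum(unnorm.values())

gamma_prime = {x: (unnorm[(x, 0)] + unnorm[(x, 1)]) / total for x in (0, 1)}
print(gamma_prime)  # {0: 0.625, 1: 0.375}, same as the Jeffrey update
```

Choosing λ(y)=τ(y)/P[Y=y] reproduces the Jeffrey update exactly, which I take to be the sense in which the paper links the two.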
By the way, regarding your coin toss example, I can at least say how this is handled in Bayesian statistics.
There are separate random variables for each coin toss.
Y1 is the first, Y2 is the second, etc.
If you have n coin tosses, then your sample is a vector →Y containing Y1 to Yn.
Then the posterior probability is P[loaded|→Y=→y].
This will be covered in any Bayesian statistics textbook as “the Bernoulli model”.
My class used Hoff’s book, which provides a quick start.
I guess this example suggests a single unknown X (whether the coin is loaded or not) and replicates of Y.
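Here’s a minimal sketch of that posterior computation for the Bernoulli model (the prior and the loaded coin’s bias are invented for the example):

```python
# Posterior P[loaded | Y1, ..., Yn] for the coin example (Bernoulli model).
prior_loaded = 0.5                      # assumed prior P[X = loaded]
p_heads = {"fair": 0.5, "loaded": 0.8}  # P[Y_i = heads | X]; 0.8 is invented

tosses = [1, 1, 0, 1, 1, 1]  # observed vector y (1 = heads, 0 = tails)

def likelihood(x):
    """P[Y1..Yn = y | X = x] as a product of independent Bernoulli terms."""
    p = p_heads[x]
    out = 1.0
    for y in tosses:
        out *= p if y == 1 else 1 - p
    return out

num = likelihood("loaded") * prior_loaded
den = num + likelihood("fair") * (1 - prior_loaded)
print(num / den)  # posterior probability the coin is loaded, ~0.81
```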
Yes, I’m aware of the Bernoulli model—my point is that the vector →Y is itself the outcome of that experiment; I suppose you can call it Z, though that makes the notation a bit confusing. The general point is that, yes, you update your belief about X based on a series of outcomes on Y. In fact I think in general γ′(x)=P[X=x|→Y=→y] works just fine.