I want to understand Bayesian reasoning in detail, in the sense that I want to take up a statement that is relevant to our daily life and then try to find exactly how much I should believe in it based on the beliefs that I already have. I think this might be a good exercise for the LW community. If yes, then let’s take up a statement, for example, “The whole world is going to be nuked before 2020.” And now, based on whatever you know right now, you should form some percentage of belief in this statement. Can someone please show me exactly how to do that?
The interesting question isn’t so much “how do I convert a degree of belief into a number”, but “how do I reconcile my degrees of belief in various propositions so that they are more consistent and make me less vulnerable to Dutch books”.
One way to do that is to formalize what you take that statement to mean, so that its relationships to “other beliefs” become clearer. It’s what, in the example you suggest, the Doomsday Clock scientists have done. So you can look at whatever data the Doomsday Clock people have used, and if you have reason to believe they got the data wrong (say, about international agreements), then your estimate would have to be different from theirs. Or you could figure out they forgot to include some evidence that is relevant (say, about peak uranium), or that they included evidence you don’t think is relevant. In each of these cases Bayes’ theorem would probably tell you at the very least in what direction you should update your degree of belief, if not the exact amount.
Or, finally, you could disagree with them about the structural relationships between bits of evidence. That case pretty much amounts to making up your own causal model of the situation. As other commenters have noted it’s fantastically hard to apply Bayes rigorously to even a moderately sophisticated causal model, especially one that involves such an intricately interconnected system as human society. But you can always simplify, and end up with something you know is strictly wrong, but has enough correspondence with reality to be less wrong than a more naive model.
In practice, it’s worth noting that only very seldom does science tackle a statement like this one head-on; taking a reductionist approach, science generally tries to explicate causal relationships in much smaller portions of the whole situation, treating each such portion as a “black box” module, and hoping that once this module’s workings are formalized it can be plugged back into a more general model without threatening the overall model’s validity too much.
The word “complex” is appropriate to refer precisely to situations where this approach fails, IMHO.
Well, to begin with we need a prior. You can choose one of two wagers. In the first, 1,000,000 blue marbles and one red marble are put in a bag. You get to remove one marble; if it is the red one, you win a million dollars. If it’s blue, you get nothing. In the second wager, you win a million dollars if a nuclear weapon is detonated under non-testing and non-accidental conditions before 2020. Otherwise, nothing. In both cases you don’t get the money until January 1st, 2021. Which wager do you prefer?
If you prefer the nuke bet, repeat with 100,000 blue marbles; if you prefer the marbles, try 100,000,000. Repeat until you get wagers that are approximately equal in their estimated value to you.
Edit: Commenters other than vinayak should do this too so that he has someone to exchange information with. I think I stop at maybe 200:1 against nuking.
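For anyone playing along, the arithmetic behind the wager is just this (a quick sketch in Python; the 200:1 figure is the one from above):

```python
# With n_blue blue marbles and one red marble, drawing the red one
# has probability 1 / (n_blue + 1).
def implied_probability(n_blue):
    return 1.0 / (n_blue + 1)

# Odds of k:1 against an event correspond to a probability of 1 / (k + 1).
def probability_from_odds_against(k):
    return 1.0 / (k + 1)

print(implied_probability(1_000_000))        # ~1e-6 -- the opening wager
print(probability_from_odds_against(200))    # ~0.005 -- "200:1 against nuking"
```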
So 200:1 is your prior? Then where’s the rest of the calculation? Also, how exactly did you come up with the prior? How did you decide that 200:1 is the right place to stop? Or in other words, can you claim that if a completely rational agent had the same information that you have right now, then that agent would also come up with a prior of 200:1? What you have described is just a way of measuring how much you believe in something. But what I am asking is how do you decide how strong your belief should be.
It’s just the numerical expression of how likely I feel a nuclear attack is. (ETA: I didn’t just pick it out of thin air. I can give reasons but they aren’t mathematically exact. But we could work up to that by considering information about geopolitics, proliferation etc.)
Or in other words, can you claim that if a completely rational agent had the same information that you have right now, then that agent would also come up with a prior of 200:1?
No, I absolutely can’t claim that.
What you have described is just a way of measuring how much you believe in something. But what I am asking is how do you decide how strong your belief should be.
By making a lot of predictions and hopefully getting good at it while paying attention to known biases and discussing the proposition with others to catch your errors and gather new information. If you were hoping there was a perfect method for relating information about extremely complex propositions to their probabilities… I don’t have that. If anyone here does, please share. I have missed this!
But theoretically, if we’re even a little bit rational, the more updating we do, the closer we should get to the right answer (though I’m not actually sure we’re even this rational). So we pick priors and go from there.
Normally I would try and find systematic risk analyses by people who know more about this subject than me. However, Martin Hellman has written a preliminary risk analysis of nuclear deterrence as part of his Defusing the Nuclear Threat project, and he claims that there have been no formal studies of the failure rate of nuclear deterrence. Hellman himself estimates that failure rate as on the order of 1% a year, but I don’t know how seriously to take that estimate.
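Taking that 1%-a-year figure at face value, the implied chance of a failure before 2020 is easy to work out (a sketch assuming a ten-year horizon and independence between years, both of which are simplifications):

```python
# Chance of at least one deterrence failure over n independent years,
# each with failure probability p.
p_per_year = 0.01    # Hellman's order-of-magnitude estimate
n_years = 10         # assumed horizon to 2020
print(round(1 - (1 - p_per_year) ** n_years, 3))   # ~0.096, i.e. roughly 10%
```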
Can someone please show me exactly how to do that?
The problem with your question is that the event you described has never happened. Normally you would take a dataset and count the number of times an event occurs vs. the number of times it does not occur, and that gives you the probability.
So to get estimates here you need to be creative with the definition of events. You could count the number of times a global war started in a decade. Going back to say 1800 and counting the two world wars and the Napoleonic wars, that would give about 3⁄21. If you wanted to make yourself feel safe, you could count the number of nukes used compared to the number that have been built. You could count the number of people killed due to particular historical events, and fit a power law to the distribution.
But nothing is going to give you the exact answer. Probability is exact, but statistics (the inverse problem of probability) decidedly isn’t.
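To put numbers on the decade-counting estimate above (a sketch; the rule-of-succession line is just one standard way to soften a raw count, not something the count itself forces on you):

```python
# Crude frequency estimate: global wars per decade since 1800.
global_wars = 3      # Napoleonic Wars, WWI, WWII
decades = 21         # roughly 1800-2010
print(round(global_wars / decades, 3))               # ~0.143 per decade

# Laplace's rule of succession: add one imaginary success and one
# imaginary failure to avoid extreme estimates from small counts.
print(round((global_wars + 1) / (decades + 2), 3))   # ~0.174 per decade
```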
Consulting a dataset and counting the number of times the event occurred and so on would be a rather frequentist way of doing things. If you are a Bayesian, you are supposed to have a probability estimate for any arbitrary hypothesis that’s presented to you. You cannot say, “Oh, I do not have the dataset with me right now, can I get back to you later?”
What I was expecting as a reply to my question was something along the following lines. One would first come up with a prior for the hypothesis that the world will be nuked before 2020. Then, one would identify some facts that could be used as evidence in favour or against the hypothesis. And then one would do the necessary Bayesian updates.
I know how to do this for the simple cases of balls in a bin etc. But I get confused when it comes to forming beliefs about statements that are about the real world.
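In code, the mechanics I had in mind look something like this (all numbers made up purely for illustration):

```python
# Odds-form Bayes update: posterior_odds = prior_odds * likelihood_ratio.
# Every number below is purely illustrative.
prior_odds = 1 / 200        # say, 200:1 against "nuked before 2020"

# Hypothetical piece of evidence E (e.g. some proliferation-related news),
# judged 3 times as likely if the hypothesis is true as if it is false.
likelihood_ratio = 3

posterior_odds = prior_odds * likelihood_ratio
posterior_prob = posterior_odds / (1 + posterior_odds)
print(round(posterior_prob, 3))   # ~0.015
```

My problem is not turning the crank here; it’s knowing where the prior and the likelihood ratios are supposed to come from for real-world statements.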
If you haven’t already, you might want to take a look at Bayes’ Theorem by Eliezer.
As sort of a quick tip about where you might be getting confused: you summarize the steps involved as (1) come up with a prior, (2) identify potential evidence, and (3) update on the evidence. You’re missing one step. You also need to check to see whether the potential evidence is “true,” and you need to do that before you update.
If you check out Conservation of Expected Evidence, linked above, you’ll see why. You can’t update just because you’ve thought of some facts that might bear on your hypothesis and guessed at their probability—if your intuition is good enough, your guess about the probability of the facts that bear on the hypothesis should already be factored into your very first prior. What you need to do is go out and actually gather information about those facts, and then update on that new information.
For example: I feel hot. I bet I’m running a fever. I estimate my chance of having a bacterial infection that would show up on a microscope slide at 20%.
I think: if my temperature were above 103 degrees, I would be twice as likely to have a bacterial infection, and if my temperature were below 103 degrees, I would only be half as likely to have a bacterial infection. Considering how hot I feel, I guess there’s a 50-50 chance my temperature is above 103 degrees. I STILL estimate my chance of having a bacterial infection at 20%, because I already accounted for all of this. This is just a longhand way of guessing.
Now, I take my temperature with a thermometer. The readout says 104 degrees. Now I update on the evidence; now I think the chance that I have a bacterial infection is 40%.
The math is fudged very heavily, but hopefully it clarifies the concepts. If you want accurate math, you can read Eliezer’s post.
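For the curious, here is one way to make those particular numbers cohere, reading “twice as likely” and “half as likely” as posteriors of 40% and 10% (that reading is an assumption on my part):

```python
# Conservation of expected evidence: the prior must equal the
# probability-weighted average of the possible posteriors.
#   P(infection) = P(high temp) * P(infection | high) + P(low temp) * P(infection | low)
prior = 0.20
posterior_if_high = 0.40   # "twice as likely" reading
posterior_if_low = 0.10    # "half as likely" reading

# Solve prior = p_high * posterior_if_high + (1 - p_high) * posterior_if_low:
p_high = (prior - posterior_if_low) / (posterior_if_high - posterior_if_low)
print(round(p_high, 3))    # ~0.333 -- the high fever would need to be a 1-in-3 shot, not 50-50

# Once the thermometer actually reads 104, the belief simply becomes the
# corresponding posterior:
print(posterior_if_high)   # 0.4
```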
The answer is… it’s complicated, so you approximate. A good way of approximating is getting a dataset together and putting together a good model that helps explain that dataset. Doing the perfect Bayesian update in the real world is usually worse than nontrivial; it’s basically impossible.