I am having trouble cashing out your example in concrete terms; what kind of propositions could behave like that? More importantly, why would they behave like that?
I realize I said something wrong in my previous comment: evolutionary pressure is not the only kind of reason that someone might think their / their species’ beliefs may be trustworthy. For example, you might think that evolutionary pressure causes beliefs to become more accurate when they are about topics relevant to survival/reproduction, and that the uniformity of logic means that the kind of mind that is good at having accurate beliefs on such topics is also somewhat good at having accurate beliefs on other topics. But if you really think that there is NO reason at all that you might have accurate beliefs on a given topic, it seems to me that you do not have beliefs about that topic at all.
But if you really think that there is NO reason at all that you might have accurate beliefs on a given topic, it seems to me that you do not have beliefs about that topic at all.
This doesn’t seem true to me.
First, you need to assign probabilities in order to coherently make decisions under uncertainty, even if the probabilities are totally made up. It’s not because the probabilities are informative, it’s because if your decisions can’t be justified by any probability distribution, then you’re leaving money on the table somewhere with respect to your own preferences.
Second, recursive justification must hit bottom somewhere. At some point you have to assume something if you’re going to prove anything. So, there has to be a base of beliefs which you can’t provide justification for without relying on those beliefs themselves.
Perhaps you didn’t mean to exclude circular justification, so the recursive-justification-hits-bottom thing doesn’t contradict what you were saying. However, I think the first point stands; you sometimes want beliefs (any beliefs at all!) as opposed to no beliefs, even when there is no reason to expect their accuracy.
I certainly didn’t mean to exclude circular justification: we know that evolution is true because of the empirical and theoretical evidence, which relies on us being able to trust our senses and reasoning, and the reason we can mostly trust our senses and reasoning is because evolution puts some pressure on organisms to have good senses and reasoning.
Maybe what you are saying is useful for an AI but for humans I think the concept of “I don’t have a belief about that” is more useful than making up a number with absolutely no justification just so that you won’t get Dutch booked. I think evolution deals with Dutch books in other ways (like making us reluctant to gamble) and so it’s not necessary to deal with that issue explicitly most of the time.
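To make the Dutch book worry concrete, here is a minimal sketch. The prices 0.6 and 0.6 and the helper `agent_net` are made up for illustration; the only assumption is an agent willing to buy a ticket paying 1 on A and a ticket paying 1 on not-A, each at its own stated probability.

```python
def agent_net(price_a: float, price_not_a: float, a_is_true: bool) -> float:
    """Net result for an agent who buys, at its own stated prices,
    one ticket paying 1 if A is true and one ticket paying 1 if A is false."""
    payout_a = 1.0 if a_is_true else 0.0
    payout_not_a = 0.0 if a_is_true else 1.0
    return (payout_a + payout_not_a) - (price_a + price_not_a)


# Incoherent prices: P(A) + P(not A) = 1.2, so the agent overpays whatever happens.
incoherent = [agent_net(0.6, 0.6, outcome) for outcome in (True, False)]

# Coherent prices: P(A) + P(not A) = 1, so no guaranteed loss can be extracted this way.
coherent = [agent_net(0.6, 0.4, outcome) for outcome in (True, False)]

print("incoherent:", incoherent)  # negative in both outcomes
print("coherent:  ", coherent)    # break-even in both outcomes
```

Nothing here says 0.6 is an accurate number. The coherent agent avoids the sure loss just by having prices that sum to 1.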
I agree. The concept of “belief” comes apart into different notions in such cases; like, we might explicitly say “I don’t have a belief about that” and we might internally be unable to summon any arguments one way or another, but we might find ourselves making decisions nonetheless.
I do think this is somewhat relevant for humans rather than only AI, though. If we find ourselves paralyzed and unable to act because we are unable to form a belief, we will end up doing nothing, which in many cases will be worse than the things we would have done had we assigned any probability at all. Needing to make decisions is a more powerful justification for needing probabilities than Dutch books are.
I am having trouble cashing out your example in concrete terms; what kind of propositions could behave like that? More importantly, why would they behave like that?
The propositions aren’t doing anything. The dice rolls represent genetic variation (the algorithm could be less convoluted, but it felt appropriate). The propositions can be anything from “earth is flat”, to “I will win a lottery”. Your beliefs about these propositions depend on your initial priors, and the premise is that these can depend on your genes.
For example, you might think that evolutionary pressure causes beliefs to become more accurate when they are about topics relevant to survival/reproduction, and that the uniformity of logic means that the kind of mind that is good at having accurate beliefs on such topics is also somewhat good at having accurate beliefs on other topics.
Sure, there are reasons why we might expect the “species average” predictions not to be too bad. But there are better groups. E.g. we would surely improve the quality of our predictions if, while taking the average, we ignored the toddlers, the senile and the insane. We would improve even more if we only averaged the well educated. And if I myself am an educated and sane adult, then I can expect reasonably well that I’m outperforming the “species average”, even under your consideration.
But if you really think that there is NO reason at all that you might have accurate beliefs on a given topic, it seems to me that you do not have beliefs about that topic at all.
If I know nothing about a topic, then I have my priors. That’s what priors are. To “not have beliefs” is not a valid option in this context. If I ask you for a prediction, you should be able to say something (e.g. “0.5”).
I think the species average belief for both “earth is flat” and “I will win a lottery” is much less than 0.35. That is why I am confused about your example.
I think Hanson would agree that you have to take a weighted average, and that toddlers should be weighted less highly. But toddlers should agree that they should be weighted less highly, since they know that they do not know much about the world.
If the topic is “Is xzxq kskw?” then it seems reasonable to say that you have no beliefs at all. I would rather say that than say that the probability is 0.5. If the topic is something that is meaningful to you, then the way that the proposition gets its meaning should presumably also let you estimate its likelihood, in a way that bears some relation to accuracy.
I think the species average belief for both “earth is flat” and “I will win a lottery” is much less than 0.35. That is why I am confused about your example.
Feel free to take more contentious propositions, like “there is no god” or “I should switch in Monty Hall”. But, also, you seem to be talking about current beliefs, and Hanson is talking about genetic predispositions, which can be modeled as beliefs at birth. If my initial prior, before I saw any evidence, was P(earth is flat)=0.6, that doesn’t mean I still believe that earth is flat. It only means that my posterior is slightly higher than that of someone who saw the same evidence but started with a lower prior.
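As a rough numerical sketch of the point about priors and posteriors: the likelihood ratio of 1/1000 and the second prior of 0.1 below are invented for illustration, and `update` is just Bayes’ rule in odds form, assuming both people see exactly the same evidence.

```python
def update(prior: float, likelihood_ratio: float) -> float:
    """Bayes' rule in odds form: posterior odds = prior odds * likelihood ratio."""
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)


# Assumed: the evidence is 1000 times more likely if the earth is NOT flat,
# i.e. P(evidence | flat) / P(evidence | not flat) = 1/1000.
LIKELIHOOD_RATIO = 1e-3

posterior_from_high_prior = update(0.6, LIKELIHOOD_RATIO)  # started at P(flat) = 0.6
posterior_from_low_prior = update(0.1, LIKELIHOOD_RATIO)   # started at P(flat) = 0.1

print(posterior_from_high_prior, posterior_from_low_prior)
```

Both posteriors end up well below 1%, but the one that started at 0.6 stays above the one that started at 0.1.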
Anyway, my entire point is that if you take many garbage predictions and average them out, you’re not getting anything better than what you started with. Averaging only makes sense with additional assumptions. Those assumptions may sometimes be true in practice, but I don’t see them stated in Hanson’s paper.
I think Hanson would agree that you have to take a weighted average
No, I don’t think weighting makes sense in Hanson’s framework of pre-agents.
But toddlers should agree that they should be weighted less highly, since they know that they do not know much about the world.
No, idiots don’t always know that they’re idiots. An idiot who doesn’t know it is called a “crackpot”. There are plenty of those. Toddlers are also surely often overconfident, though I don’t think there is a word for that.
If the topic is “Is xzxq kskw?” then it seems reasonable to say that you have no beliefs at all.
When modeling humans as Bayesians, “having no beliefs” doesn’t type check. A prior is a function from propositions to probabilities and “I don’t know” is not a probability. You could perhaps say that “Is xzxq kskw?” is not a valid proposition. But I’m not sure why bother. I don’t see how this is relevant to Hanson’s paper.
P(earth is flat)=0.6 isn’t a garbage prediction, since it lets people update to something reasonable after seeing the appropriate evidence. It doesn’t incorporate all the evidence, but that’s a prior for you.
I think God and Monty Hall are both interesting examples. In particular, Monty Hall is interesting because so many professional mathematicians got the wrong answer for it, and God is interesting because people disagree as to who the relevant experts are, as well as what epistemological framework is appropriate for evaluating such a proposition. I don’t think I can give you a good answer to either of them (and just to be clear I never said that I agreed with Hanson’s point of view).
Maybe you’re right that xzxq is not relevant to Hanson’s paper.
Regarding weighting, Hanson’s paper doesn’t talk about averaging at all so it doesn’t make sense to ask whether the averaging that it talks about is weighted. But the idea that all agents would update to a (weighted) species-average belief is an obvious candidate for an explanation for why their posteriors would agree. I realize my previous comments may have obscured this distinction, sorry about that.
P(earth is flat)=0.6 isn’t a garbage prediction, since it lets people update to something reasonable after seeing the appropriate evidence.
What is a garbage prediction then? P=0 and P=1? When I said “garbage”, I meant that it has no relation to the real world, it’s about as good as rolling a die to choose a probability.
P(I will win the lottery) = 0.6 is a garbage prediction.
Why? Are there no conceivable lotteries with that probability of winning? (There are, e.g. if I bought multiple tickets). Is there no evidence that we could see in order to update this prediction? (There is, e.g. the number of tickets sold, the outcomes of past lotteries, etc). I continue to not understand what standard of “garbage” you’re using.
So, I guess it depends on exactly how far back you want to go when erasing your background knowledge to try to form the concept of a prior. I was assuming you still knew something about the structure of the problem, i.e. that there would be a bunch of tickets sold, that you have only bought one, etc. But you’re right that you could recategorize those as evidence in which case the proper prior wouldn’t depend on them.
If you take this to the extreme you could say that the prior for every sentence should be the same, because the minimum amount of knowledge you could have about a sentence is just “There is a sentence”. You could then treat all facts about the number of words in the sentence, the instances in which you have observed people using those words, etc. as observations to be updated on.
It is tempting to say that the prior for every sentence should be 0.5 in this case (in which case a “garbage prediction” would just be one that is sufficiently far away from 0.5 on a log-odds scale), but it is not so clear that a “randomly chosen” sentence (whatever that means) has a 0.5 probability of being true. If by a “randomly chosen” sentence we mean the kinds of sentences that people are likely to say, then estimating the probability of such a sentence requires all of the background knowledge that we have, and we are left with the same problem.
Maybe all of this is an irrelevant digression. After rereading your previous comments, it occurs to me that maybe I should put it this way: After updating, you have a bunch of people who all have a small probability for “the earth is flat”, but they may have slightly different probabilities due to different genetic predispositions. Are you saying that you don’t think averaging makes sense here? There is no issue with the predictions being garbage, we both agree that they are not garbage. The question is just whether to average them.
I was assuming you still knew something about the structure of the problem, i.e. that there would be a bunch of tickets sold, that you have only bought one, etc.
If you’ve already observed all the possible evidence, then your prediction is not a “prior” any more, in any sense of the word. Also, both total tickets sold and the number of tickets someone bought are variables. If I know that there is a lottery in the real world, I don’t usually know how many tickets they really sold (or will sell), and I’m usually allowed to buy more than one (although it’s hard for me to not know how many I have).
After updating, you have a bunch of people who all have a small probability for “the earth is flat”, but they may have slightly different probabilities due to different genetic predispositions. Are you saying that you don’t think averaging makes sense here?
I think that Hanson wants to average before updating. Although if everyone is a perfect Bayesian and saw the same evidence, then maybe there isn’t a huge difference between averaging before or after the update.
Either way, my position is that averaging is not justified without additional assumptions. Though I’m not saying that averaging is necessarily harmful either.
If you are doing a log-odds average then it doesn’t matter whether you do it before or after updating.
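A quick sketch of why the log-odds claim holds, under the assumptions that everyone updates on the same evidence with the same likelihood ratio and that the averaging weights sum to 1. The priors, the likelihood ratio, and all function names are made up for illustration. In log-odds, updating just adds a constant, and adding a constant commutes with taking a weighted mean.

```python
import math


def logit(p: float) -> float:
    return math.log(p / (1.0 - p))


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def update(prior: float, log_lr: float) -> float:
    """Bayesian update in log-odds form: add the log likelihood ratio."""
    return sigmoid(logit(prior) + log_lr)


def log_odds_average(probs, weights=None):
    """Weighted mean taken in log-odds space; weights are assumed to sum to 1."""
    if weights is None:
        weights = [1.0 / len(probs)] * len(probs)
    return sigmoid(sum(w * logit(p) for w, p in zip(weights, probs)))


priors = [0.6, 0.3, 0.1]   # hypothetical genetically varied priors
log_lr = math.log(1e-3)    # hypothetical shared evidence

average_then_update = update(log_odds_average(priors), log_lr)
update_then_average = log_odds_average([update(p, log_lr) for p in priors])
print(math.isclose(average_then_update, update_then_average, rel_tol=1e-9))

# For contrast, a plain arithmetic mean of probabilities does not commute with updating.
mean_then_update = update(sum(priors) / len(priors), log_lr)
update_then_mean = sum(update(p, log_lr) for p in priors) / len(priors)
print(math.isclose(mean_then_update, update_then_mean, rel_tol=1e-3))
```

With these numbers the first check passes and the second does not, so the order only matters when the average is taken directly on probabilities.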
Like I pointed out in my previous comment, the question “how much evidence have I observed / taken into account?” is a continuous question with no obvious “minimum” answer. The answer “I know that a bunch of tickets will be sold, and that I will only buy a few” seems to me to not be a “maximum” answer either, so beliefs based on it seem reasonable to call a “prior”, even if under some framings they are a posterior. Though really it is pointless to talk about what is a prior if we don’t have some specific set of observations in mind that we want our prior to be prior to.