The person proposing the bet is usually right.
This is a crucial observation if you're trying to use this technique to improve your calibration! You can't just start making bets when no one you associate with regularly is challenging you to them.
Several years ago, I started taking note of every time I disagreed with someone and looking up who was right. Initially, I only counted myself as having "disagreed" if the other person said something I thought was wrong and I attempted to correct them; soon after, I added the case where they corrected me and I argued back. During this period, I went from thinking I was about 90% accurate in my claims to believing I was far more accurate than that. I would go months without being wrong, and this was in college, so I was frequently getting into disagreements, probably averaging three a day during the school year. Then I started checking the times other people corrected me just as diligently as the times I corrected them, counting even the cases where I made no attempt to argue back. My accuracy rate plummeted.
Another thing I would recommend to people starting out with this is to keep track of your record with individual people, not just your general overall record. My accuracy rate with a few people is far lower than my overall rate. My overall rate is higher than it should be because I know a few argumentative people who are frequently wrong. (This would probably change if we were actually betting money and only counting arguments those people were willing to bet on, so your approach adjusts for this better than mine.) There are several people with whom I'm close to 50%, and two people for whom I have several data points and my accuracy is below 50%.
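The per-person bookkeeping described above is simple enough to sketch in code. This is a hypothetical illustration (the class and method names are mine, not anything from the original): log each disagreement under the other person's name, then query accuracy either per person or overall.

```python
from collections import defaultdict


class DisagreementLog:
    """Track a win/loss record per person, not just the overall rate."""

    def __init__(self):
        # name -> [times I was right, total disagreements]
        self.records = defaultdict(lambda: [0, 0])

    def record(self, person, i_was_right):
        """Log one resolved disagreement with `person`."""
        entry = self.records[person]
        entry[1] += 1
        if i_was_right:
            entry[0] += 1

    def accuracy(self, person=None):
        """Accuracy against one person, or overall if `person` is None."""
        if person is None:
            wins = sum(w for w, _ in self.records.values())
            total = sum(t for _, t in self.records.values())
        else:
            wins, total = self.records[person]
        return wins / total if total else None
```

Splitting the record this way makes the failure mode above visible: a high overall rate can coexist with sub-50% accuracy against particular people.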
There's one other point somebody needs to make about calibration: 75% accuracy when you disagree with other people is not the same thing as 75% accuracy overall. 75% information fidelity is atrocious; 95% information fidelity is not much better. Human brains are defective in a lot of ways, but they aren't that defective! (Except at doing math. Brains are ridiculously bad at math relative to how easily machines can be built to be good at it.) For most intents and purposes, 99% isn't a very high percentage. I am not a particularly good driver, but I haven't collided with another vehicle in well over 1,000 times behind the wheel. Percentages tend to sit on an exponential scale (or more accurately, a logistic curve). You don't have to be a particularly good driver to avoid getting into an accident 99.9% of the time you drive, because that is only a few orders of magnitude of improvement relative to 50%.
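The "orders of magnitude" framing can be made concrete with log-odds, the scale on which the logistic curve mentioned above becomes linear. A minimal sketch (the function here is just the standard logit transform, not anything from the original):

```python
import math


def log_odds(p):
    """Convert a probability to log-odds (the logit transform).

    Each increase of ln(10) ~ 2.30 corresponds to a tenfold
    improvement in the odds of being right.
    """
    return math.log(p / (1 - p))


# 50% sits at 0 on this scale; 99% and 99.9% are only a few
# factors of ten in odds above an even coin flip.
for p in (0.5, 0.75, 0.99, 0.999):
    print(f"{p:.3f} -> {log_odds(p):+.2f}")
```

On this scale, moving from 50% to 99.9% is a change of about 6.9, i.e. odds improving by roughly three factors of ten, which is why 99.9% reliability at a routine task is less impressive than it sounds.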
Information fidelity differs from information retention. Discarding 25%, or 95%, or more of the information you collect is reasonable; corrupting information at that rate is what I'm saying would be horrendous. (Discarding information conserves resources, whereas corrupting it does not. You might consider compressing information with a lossy, as in "not lossless," algorithm to be a form of corruption, but I would still count it as discarding information. Episodic memory is either very compressed or very corrupted, depending on what you think it should be.)
In my experience, people are actually more likely to be underconfident about factual information than they are to be overconfident, if you measure confidence on an absolute scale instead of a relative-to-other-people scale. My family goes to trivia night, and we almost always get at least as many correct as we expect to get correct, usually more. However, other teams typically score better than we expect them to score too, and we win the round less often than we expect to.
Think back to grade school, when you actually had fill-in-the-blank and multiple-choice questions on tests. I'm going to guess that you were probably an A student and got around 95% right on your tests, because a) that's about what I did and I tend to project, b) you're on LessWrong, so you were probably an A student, and c) you say you feel like you ought to be right about 95% of the time. I'm also going to guess (again because I tend to project my experience onto other people) that you probably felt a lot less than 95% confident on average while you were taking those tests. There were more than a few tests I took in my time in school where I walked out thinking, "I didn't know any of that; I'll probably get a 70 or better just because anything less would be horribly bad compared to what I usually do, but I really feel like I failed"... and it was never 70. (Math was the one exception, in which I tended to be overconfident; I usually made more mistakes than I expected to on my math tests.)
Where calibration really gets screwed up is when you deal with subjects far outside the domain of normal experience, especially if you know that you know more than your peer group about that domain. People are not good at thinking about abstract mathematics, artificial intelligence, physics, evolution, and other subjects that happen at a different scale from everyday life. When I was 17, I thought I understood quantum mechanics just because I'd read A Brief History of Time and The Universe in a Nutshell... Boy, was I wrong!
On LessWrong, we are usually discussing subjects that are way beyond the domain of normal human experience, so we tend to be overconfident in our understanding of these subjects… but part of the reason for this overconfidence is that we do tend to be correct about most of the things we encounter within the confines of routine experience.
I have recently been questioning how worthwhile it is to be perceived as smart. Since I have always wanted to be intelligent, having people affirm my intelligence has always made me feel validated, much more so than other forms of compliments. Either in response to that approval or in anticipation of it, I have gone out of my way to present myself primarily as an intelligent person and to treat any other perceptions others may have of me as secondary to that one.
As I've begun to question whether this is a good image to promote, one of the things I've come to think is that I was mistaken to promote a single uniform image of myself at all. Instead, I should tailor my presentation to the specific context. When I try to think of how I should want to present myself in general, the best idea I can come up with is that I would want people to find me interesting. But when I think about what would be a beneficial way to be perceived in various contexts, I quickly realize that there are context-specific labels that direct how I should want other people to view me at various times. In the office, I should seek to be viewed as "professional." When I'm looking for dates, I want people to think I'm "sexy." When I comment on LW, I should probably seek to be perceived as "rational." This all feels very obvious in retrospect, but while I was still focused on trying to appear "smart," I was unable to even ask the right questions that would help me present myself to the world in a more desirable way.
Complicating this decision is the realization that I have to transition away from feeling successful about my self-image to feeling far less successful. A few months ago, I felt good about myself when my boss identified my intelligence as the greatest asset I brought to the company and also identified increasing my professionalism as the area where I needed to grow the most. Today, if I were to truly internalize the change I have been making cognitively, I think I should regard the phrase "highly professional nitwit" as slightly more flattering than "unprofessional genius," in which case I need to change my view of that feedback. At the time, I felt like those two comments summed to feedback that was mostly positive. Emotionally, I still feel that way. But since this discussion was in a business context, and since what we were talking about had far less to do with what is actually true than with what my boss perceived to be true, I think I should view the feedback as indicating that I should make substantial adjustments to my behavior. E.g., I need to adjust my email response policy away from asking myself "Do I have anything worth adding to this conversation?" and towards "Given that I don't have anything noteworthy to add, is it more professional to acknowledge receipt of this particular email or simply to move on?" (I need to make dozens of small changes like that one, wherever I notice that some aspect of my behavior is based on my desire to appear intelligent.) Again, this feels very obvious now that I've begun to think in these terms, but I previously failed to even ask the necessary questions.