Did you know you can just buy blackbelts?
Epistemic status: Exploratory. I can’t tell if this is really prevalent or if I’m just annoyed at how often it happens around me.
I.
Did you know you can just buy blackbelts? It’s true! Go online and take a look, they’re about ten dollars.
Think about that. The black belt is a symbol of skill in many martial arts, and while the exact degree of skill it implies varies, thousands of people around the world who have studied for years still don’t have one. Many an ambitious or dedicated student has entered the dojo vowing to work hard and someday get a black belt, or spoken in impressed and awed terms of the senior students who have theirs. And that can be yours for five minutes of Amazon shopping!
Of course this is nonsense. Nobody thinks it’s the fabric that’s the important part of the black belt. That’s absurd, practically a reductio ad absurdum, a Goodhart’s Law gone past plausibility into parody. And yet I keep running into people who seem to try things just as silly.
I run Calibration Trivia sometimes. It’s like pub trivia, where you try to answer questions about miscellaneous details of the world, except you also try to give an answer for how confident you are that your answer is correct. It’s easy to explain, it’s easy to run, and if you do it regularly you can start training some calibration and a good felt sense of what it’s like to be uncertain. In the martial art of rationality, calibration trivia may be our basic punching drill.
And every single time I run it[1] I get some clever fellow who points out that they could answer “I don’t know” or “somfadsoifm”, put 0.0001% chance they’re right, and wind up with the best calibration score in the room.
That person is totally right! And yet this would be pointless and everyone knows it.[2]
It’s like going to the gym with a car jack and claiming you can lift five hundred pounds because you put the jack under the dumbbell. It’s not as if playing calibration trivia this way would even fool the rest of the room into thinking you were impressive: I explain at the start that I’ll show both the number of correct answers and the Brier scores side by side. Having your name next to 0 out of 30 correct answers and 99.99% calibrated means you don’t know any of the answers.
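To make the pointlessness concrete, here’s a quick sketch of the exploit, assuming the standard Brier score (mean squared error between stated confidence and outcome; lower is better, 0 is perfect):

```python
def brier_score(forecasts):
    """Mean squared error between stated confidence p and the outcome
    (1 = answer was right, 0 = answer was wrong). Lower is better."""
    return sum((p - outcome) ** 2 for p, outcome in forecasts) / len(forecasts)

# An honest, well-calibrated player: right 70% of the time at 70% confidence.
honest = [(0.7, 1)] * 7 + [(0.7, 0)] * 3

# The "clever fellow": answers nonsense at 0.0001% confidence every time.
exploit = [(0.000001, 0)] * 10

print(brier_score(honest))   # 0.21
print(brier_score(exploit))  # ~0.0 -- "perfect" calibration, zero answers right
```

The exploit really does produce the best Brier score in the room, which is exactly why the correct-answer count has to be displayed next to it.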
II.
I’m not complaining about people doing weird munchkin moves to get the things they care about in ways that ignore parts of the normal process.
Copying and pasting code you found online (or more realistically these days, asking an LLM) isn’t buying a blackbelt. Often you’re not trying to appreciate the sublime beauty of software engineering, you’re just trying to get that script to work and upload those files.
Talking a big game about some virtue — donating to charity, being honest, tolerating different views from yours — in order to reap the social benefits of being a virtuous person, then not actually practicing the virtue, isn’t buying a blackbelt. There’s a thing you want that you have a chance of getting: the acknowledgement and adoration of your peers.
Using glitches or cheats in a videogame in order to win isn’t buying the blackbelt if you want to see the cut scenes of the story with less work, or you just like watching the explosions when you blow up the entire enemy army with a single button press. Age of Empires II, a videogame I loved growing up, was mostly about medieval armies fighting with swords and arrows but had a cheat to give you a sports car with a machine gun. I loved driving it around blowing things up. Still do once in a great while. I am in some ways a very simple man: fire is pretty.
Even publishing a blank paper and calling it The Unsuccessful Self Treatment of a Case of Writer’s Block isn’t buying a blackbelt. Sure, it’s obviously not advancing the repository of all human knowledge, but it’s funny and it made people laugh.
People can even disagree about what goal we’re pursuing! Teachers whose goal in assigning homework is to impart an appreciation for the sublime beauty of software engineering and students whose goal in doing homework is to finish in time to prep their D&D campaign later that night have different goals. The student isn’t buying a blackbelt, they’re just not on the same page with the teacher here.
The problem I’m pointing at is when the person doing it has mistaken “I found an edge case” with “I have achieved the goal.”
III.
The rationalist community contains people who are occasionally proud of their cleverness in finding some loophole, regardless of whether exploiting the loophole would actually get them what they want.
This is annoying to me. It’s annoying both because I get tired of explaining the pointlessness of it every time I run Calibration Trivia, sometimes to the same person again, and because the people doing it are misdirecting their energy. They aren’t actually getting the thing they want, just a poor pica version of it. I’m spending energy getting them back on track and they’re spending energy getting told no.
The best advice I’ve come up with for these circumstances is to think one move further ahead. What are you about to do, and what do you think will happen next? If you’re trying to get some kind of social acclaim, do you think other people will admire what you’re doing? If you’re playing a board game to win, do you think the judge is going to accept whatever weird loophole you’re talking about, or rule against you as soon as your opponents call the judge over?
(And if you’re going to accuse someone of lying because they said something imprecise or idiomatic, and you plan to make a big deal of this and raise a stink over the untrustworthy nature of the other person, do you think observers are going to think you’re in the right once they look at what both people said? Or at least that’s what I want to say, but that particular strategy has proven surprisingly effective in my observation! Not a perfect long term strategy, to be clear, but it has more legs than I’d have thought possible when I was young and innocent in the halcyon days of 2020.)
The surface level behavior of spotting gaps can be useful in certain stages of some projects. QA testers are a beloved part of a good software team, and they’re engaged in this kind of thing all the time. But QA testers know why they’re doing it.
Please don’t take this as saying I generally don’t want you to point out an important gap in something I’m working on.
But maybe let this be one more drop in the ocean, trying to raise the sanity waterline in one very small way?
- ^
Is this literally true and it’s every time? I can’t prove that. What I can prove is that I’ve got a bunch of index cards with my notes from Calibration Trivia tests that say variants of ‘yep, someone tried the calibrated for wrong answers thing again, it was ____ this time.’
- ^
Maybe they’re trying to helpfully point out a fix to the scoring rules? This is true in some cases. I’m pretty sure Maia in that comment is trying to helpfully point out a complaint people had about the activity, and she’s musing on ways to fix it. In a couple of local cases it really didn’t seem like the local person at my meetup was trying to be helpful, since they had a habit of raising objections the majority of the times we tried any kind of rationalist practice in a dismissive way.
In the case of Calibration Trivia, my gut reaction is that you’re being a bit unfair to the ‘clever fellow’ (or at least to the hypothetical version of him in my head, who isn’t simply being a smartarse). It sounds like you’re presenting Calibration Trivia as a competitive game, and within that frame it makes sense to poke at edge cases in the rules and either exploit them or, if the exploit would clearly just be tedious and pointless, suggest that the rules are preemptively tweaked to unbreak the game. I know the ultimate purpose of the game is to train a real skill, but still, you’ve chosen gamification as your route to that goal, and maybe there are no free lunches on offer here; to the extent that people derive extra motivation from the competitive element, they’re also going to be focused on the proxy goal of scoring points rather than purely on the underlying goal of training the skill.
For a much less gameable and more fun version of Calibration Trivia, use the rules that the bar I play trivia at uses.
The game is broken up into rounds. Within a round, players/teams have 2, 4, 6, 8, and 10 points that they can assign to their answer as they submit it. They can use each only once. Here is the sequence of events:
1. Read out the 5 categories for the 5 questions making up the round. This way folks know roughly how to allocate their points.
2. Read the first question. Folks must submit their answers and the number of points they will score if their answer is correct. These come in simultaneously on the same sheet. At this time they know the categories of the other 4 questions but don’t know the questions, so they will have to estimate their knowledge of the topic areas of questions 2-5, and compare that to their confidence about their answer to question 1.
3. Do the 2nd, 3rd, 4th, and 5th questions the same way. Each point value can be used only once, so the maximum score is 30 points—but you can score up to 18 points even knowing only 2/5 answers if you are correctly calibrated.
You can adapt this basic structure to test calibration more or less. You could, e.g., read all 5 questions and have folks assign the point values to all 5 answers before turning any of them in, or require assigning point values to the categories before hearing any of the questions. You could make the points a wager (allowing negative scores) - but allow people to bet zero some number of times. You could increase/decrease/change the point values per round.
But this basic structure allows you to reward good calibration without rewarding stupid calibration. Knowing some answers in this game is necessary but not sufficient to win.
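For clarity, the round scoring described above can be sketched like this (the function and variable names are mine, not the bar’s):

```python
def round_score(allocation, correct):
    """Score one 5-question round.
    allocation: question index -> wager; each of 2, 4, 6, 8, 10 used exactly once.
    correct: set of question indices the team answered correctly."""
    assert sorted(allocation.values()) == [2, 4, 6, 8, 10]
    return sum(pts for q, pts in allocation.items() if q in correct)

# A team confident only in questions 0 and 3 puts its big wagers there:
alloc = {0: 10, 1: 2, 2: 4, 3: 8, 4: 6}
print(round_score(alloc, correct={0, 3}))  # 18 of a possible 30
```

Note that there is no way to score points without getting answers right, which is what closes the “0% on a wrong answer” exploit.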
As the classic video game design quote goes:
> “Given the opportunity, players will optimize the fun out of a game.”
(in this case substitute fun with pedagogical value)
If I came up with a game in which always saying “[wrong answer], ≈0%” was a winning strategy, I’d conclude that not valuing correctness at all was a fatally flawed idea, and then change the rules so that wasn’t true any more, rather than insist the game was fine and that the people who actually thought about the rules were the ones playing it wrong.
I, for one, do not enjoy playing games like calibrated trivia when the rules are broken. I am a person who often 0%s. The fun in a game from my point of view is to maximize the chances of winning (or some other goal like EV(score)). When you discourage 0%ing, you are saying “we are just doing random actions without necessarily trying to maximize our score”. This ruins the original point of the game of trying to get the players to be as well calibrated as possible.
Additionally, there is a very easy fix: just give 2p points if the player is correct and deduct p^2 points if the player is incorrect. This gives the score an expected value of p^3 + p^2, and gives more meaning to certainty even when it’s under 50%. It doesn’t put enough meaning on the calibration part, but a necessary part of trivia IS putting value on being correct. If you really don’t like it, just require players to put at least 50%. If rulesets have a problem, fixing them will often result in a better ruleset.
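A quick numerical check of that expected-value claim (a sketch, assuming the stated confidence p equals the true chance of being right):

```python
def expected_score(p):
    """EV of the proposed rule (gain 2p if right, lose p**2 if wrong),
    assuming the stated confidence p equals the true chance of being right."""
    return p * (2 * p) + (1 - p) * (-p ** 2)

# Matches the claimed closed form p^3 + p^2:
for p in [0.0, 0.25, 0.5, 0.9, 1.0]:
    assert abs(expected_score(p) - (p ** 3 + p ** 2)) < 1e-12
print("EV = p^3 + p^2 holds")
```

Under this rule stating 0% on a wrong answer earns exactly zero, so the exploit is neutralized rather than rewarded.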
For curious people who know a bit of chess: I played a version of bughouse (chess) where there are 3 boards. The bad rule was that you had to win 2 games to win. I got annoyed by the fact that the middle board has way too much influence, and a lot of the time your middle-board player stopping their opponent was bad because you couldn’t get pieces. When I found the problem, I played middle board (as I was the worst chess player) and instructed my teammates to play normal chess while their opponents thought they were playing bughouse (we still lost somehow). After that, the organizers talked about forcing everyone to make a move every minute. This is not what you do. You do not patch holes, you create better dams. Just give the middle board a weight of 2 and make the win condition first to 2 points, like in regular bughouse.
The rules have intent as well as literal text; the point of calibration trivia is to get better at calibration, not to find the person who’s best at Goodharting the rules to get a high score without being well-calibrated. (Personally, I don’t find it fun to play with such people).
Really struck a nerve with the misaligned optimizer squad, huh.
I see the point you’re getting at, and I agree that there’s a real failure mode here that I’ve been annoyed by in similar ways. Heck, I kinda think it’s silly for people to show up to promotions to receive the black belt they earned, but that’s a separate topic.
At the same time, there’s another side of this which is important.
At my jiu jitsu gym there’s a new instructor who likes doing constraint led games. One of these games had the explicit goal of “get your opponent’s hands to the mat” with the implicit purpose of learning to off balance the top player. I decided to be a little munchkin and start grabbing people’s hands and pulling them to the mat even when they had a good base.
I actually did get social acclaim for this. The instructor thought that was awesome, and used it as an example of how he wanted people to play the games. In his view, as in mine, the point of the game is to explore how you can maneuver to win at the game as specified, without being restrained by artificial limitations which really ought to be accounted for in the game design.
If the new instructor had tried to lecture us about playing to some underspecified “spirit” of the rules instead of the rules as he described them—and about how we’re not earning social points with him for gaming the system—and was visibly annoyed about this… he would have been missing the point that he’s not earning social points with me, and likely not with the others either. And I wouldn’t much care for winning points with him, if that’s how he were to respond. It’s a filter. A feature, not a bug.
Breaking the game is to be encouraged, and if playing the game earnestly doesn’t suit the intended purpose, “don’t hate the player, hate the game”. In his case, the game wasn’t broken badly enough to ruin it, so it turned out to be more fun and probably more useful than I had anticipated. Maybe it wasn’t quite optimal, but it was playable for sure. In your case, the broken game is the sign that calibration isn’t what we care about—because that annoying shit was calibrated, and you weren’t happy about it. What we need is a better scoring rule that weights calibration appropriately. Which exist!
Any time we find ourselves annoyed, there is a learning opportunity. Annoyance is our cue that reality is violating our expectations. It’s a call to update.
If you explained the game to me, I would ask about that exploit for the sake of trying to understand why it wouldn’t work and therefore better understand the game. Hearing that this natural exploit is just there makes the game seem kind of annoying to play. If I don’t know the answer, I am punished for thinking really hard about a guess that might work (and giving it low prob) vs. not thinking.
Not sure if it fixes the issue but multiple choice seems to at least help. Contestants can put a probability on each.
Also, in case it needs saying: major props for running games like this often enough to develop pet peeves. Our community needs more concrete exercises and you’re doing the Lord’s work in developing and deploying them.
My version would be that people decide “I gain n points if I get this right” for each answer they submit, where 0<=n<=10; if they get it wrong they lose 2^n points. Collapsing the multiple success criteria into a single score ensures miscalibration is penalized, but leaves no ambiguity in what to maximize.
Notes:
If scores update with each question, you might want to start players out with a small pool of points and then have them ‘go bust’ (i.e. get relegated to spectator and/or advisory roles) if their scores go negative. Otherwise, you’d end up with overconfident-and-unlucky people death-marching through the second half of the game with a score of −976 tied around their necks, which I imagine wouldn’t be fun.
My version still isn’t perfect because players will be torn between optimizing E(score) and P(I get the highest score): in a room full of people, the person who takes first place will probably be someone who was slightly more overconfident than the idealized EV-maxxing version of themselves. (This opportunity for circumrational reasoning could be seen as a feature but is honestly-probably-mostly a bug.)
Another imperfection is that there’s no clean separation between trivia skill and rationality skill. All I can say is that this was also true of the original; the Clever Fellows aren’t just demonstrating a blatant edge case, they’re showcasing a strategy which can be microdosed to smoothly and silently trade trivia score for rationality score; in addition to training calibration, your rules implicitly train “if you’re pretty sure your best guess is wrong, pretend you have no idea so you can at least appear better-calibrated”.
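For concreteness, here’s how an EV-maximizing player would wager under my proposed rule (a sketch; `best_wager` is my name for it, and ties at low confidence go to the smaller wager):

```python
def best_wager(q):
    """For the rule 'gain n if right, lose 2**n if wrong, 0 <= n <= 10',
    return the wager maximizing expected score at confidence q.
    Ties resolve to the smallest such n."""
    return max(range(11), key=lambda n: q * n - (1 - q) * 2 ** n)

for q in [0.5, 0.9, 0.99, 0.999]:
    print(q, best_wager(q))  # 0.5 -> 0, 0.9 -> 4, 0.99 -> 7, 0.999 -> 10
```

The exponential penalty is what does the work: each extra point of reward roughly requires another halving of your probability of being wrong, so big wagers are only worth it at genuinely high confidence.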