Consider .
Optimization Process
(Strong approval for this post. Figuring out how to deal with filtered evidence is close to my heart.)
Suppose that the facts relevant to making optimal decisions about an Issue are represented by nine rolls of the Reality die, and that the quality (utility) of Society’s decision is proportional to the (base-two logarithm) entropy of the distribution of what facts get heard and discussed.
Sorry—what distribution are we measuring the entropy of? When I hear “entropy of a distribution,” I think -- but it’s not clear to me how to get from there to , , and .
Ahhh! Yes, that helps a great deal. Thank you!
Some wagers have the problem that their outcome correlates with the value of what’s promised. For example, “I bet $90 against your $10 that the dollar will not undergo >1000% inflation in the next ten years”: the apparent odds of 9:1 don’t equal the probability of hyperinflation at which you’d be indifferent to this bet.
For some (all?) of these problematic bets, you can mitigate the problem by making the money change hands in only one arm of the bet, reframing it as e.g. “For $90, I will sell you an IOU that pays out $100 in ten years if the dollar hasn’t seen >1000% inflation.” (Okay, you’ll still need to tweak the numbers for time-discounting purposes, but it seems simpler now that we’re conditioning on lack-of-hyperinflation.)
Does this seem correct in the weak case? (“some”)
Does this seem correct in the strong case? (“all”)
Clearly not all: the extreme version of this is betting on human extinction. It’s hard to imagine a payout that has any value after that comes to pass.
Agreed that post-extinction payouts are essentially worthless—but doesn’t the contract “For $90, I will sell you an IOU that pays out $100 in one year if humans aren’t extinct” avoid that problem?
Further point of confusion: the Emergency Use Authorization summary mentions n=31 positive samples and n=11 negative samples in the “Analytical Specificity” section. How do you get “98%” or “99%” out of those sample sizes? Shouldn’t you need at least n=50 to get 98%? Heck, why do they have any positive (edit: negative) samples in a “Specificity” section?
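To make the sample-size arithmetic concrete, here is a minimal sketch. It assumes the reported percentage is an exact ratio of counts (no rounding, no pooling with other samples), which may not be how the summary computed it; the function names are made up for illustration.

```python
from fractions import Fraction

def smallest_n_for_exact_rate(rate: Fraction) -> int:
    """Smallest sample size n for which some count k gives exactly k/n == rate.

    Assumption: the rate is an exact ratio of counts. A fraction in lowest
    terms k/n needs n to be a multiple of its denominator.
    """
    return rate.denominator

def achievable_rates(n: int) -> list[float]:
    """Every rate k/n that a sample of size n can produce."""
    return [k / n for k in range(n + 1)]

n_for_98 = smallest_n_for_exact_rate(Fraction(98, 100))  # 49/50 in lowest terms
n_for_99 = smallest_n_for_exact_rate(Fraction(99, 100))
top_rates_with_11 = achievable_rates(11)[-2:]             # 10/11 and 11/11
```

Under that assumption, a sample of 11 can only land on multiples of 1/11, so the two highest values it can report are 10/11 and 11/11.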
If no such thing exists, I might take a stab at creating one—so I’d even love to hear if you know of some causal-graph-inference-toolkit-thing that isn’t specifically for COVID but seems like a promising foundation to build atop!
But, if no such thing exists, that also seems like evidence that it… wouldn’t be useful? Maybe because very few social graphs have the communication and methodicalness to compose a detailed list of all the interactions they take part in? Conceivably because it’s a computationally intractable problem? (I dunno, I hear that large Bayes nets are extremely hard to compute with.)
Is this some kind of attempt at code injection? :P
Only the benign kind! I’ve got some ideas burbling in my brain re: embedding dynamic content in my writing, so I’m just exploring the limits of what Less Wrong permits in its HTML. (Conclusion: images hosted on arbitrary other domains are okay, but svgs are not. Seems sane.)
I would love to live in this world.
This seems like a really hard problem: if a market like this “wins,” so that having a lot of points makes you high-status, people will try to game it, and if gaming it is easy, this will kill respect for the market.
Specific gaming strategies I can think of:
Sybil attacks: I create one “real” account and 100 sock puppets; my sock puppets make dumb bets against my real account; my real account gains points, and I discard my sock puppets. Defenses I’ve heard of against Sybil attacks: make it costly to participate (e.g. proof-of-work); make the cost of losing at least as great as the benefit of winning (e.g. make “points” equal money); or do Distributed Trust Stuff (e.g. Rangzen, TrustDavis).
Calibration-fluffing: if the market grades me on calibration, then I can make dumb predictions but still look perfectly calibrated by counterbalancing those with more dumb predictions (e.g. predict “We’ll have AGI by Tuesday, 90%”, then balance that out with nine “The sun will rise tomorrow, 90%” predictions). To protect against this… seems like you’d need some sort of way to distinguish “predictions that matter” from “calibration fluff.”
Buying status: pay people to make dumb bets against you. The Metaculus equivalent of buying Likes or Amazon reviews. On priors, if Amazon can’t squash this problem, it probably can’t be squashed.
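The calibration-fluffing strategy above can be sketched in a few lines. This assumes a naive scoring rule that only compares stated probability with observed frequency per bucket; the `calibration_table` helper is hypothetical, not any real market's scoring code.

```python
from collections import defaultdict

def calibration_table(predictions):
    """Map each stated probability to the fraction of those predictions that came true.

    `predictions` is a list of (stated_probability, came_true) pairs.
    """
    buckets = defaultdict(list)
    for stated, came_true in predictions:
        buckets[stated].append(came_true)
    return {stated: sum(outcomes) / len(outcomes) for stated, outcomes in buckets.items()}

# One dumb 90% prediction that fails, padded with nine trivially safe 90% predictions.
fluffed = [(0.9, False)] + [(0.9, True)] * 9
table = calibration_table(fluffed)
```

The 90% bucket comes out at exactly 90% observed, even though none of the ten predictions carried any real information at the stated confidence.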
Possible answer: “No election is decided by a single vote; if it’s that close, it’ll be decided by lawyers.”
Rebuttal: yeah, it’s a little fuzzy, but, without having cranked through the math, I don’t think it matters: my null hypothesis is that my vote shifts the probability distribution for who wins the legal battle in my desired direction, with an effect size around the same as in the naive lawyer-free model.
Possible answer: “You’re doing a causal-decision-theory calculation here (assuming that your vote might swing the election while everything else stays constant); but in reality, we need to break out [functional decision theory or whatever the new hotness is], on account of politicians predicting and “pricing in” your vote as they design their platforms.”
Hmm, yeah, maybe. In which case, the model shouldn’t be “my vote might swing the election,” but instead “my vote will acausally incrementally change candidates’ platforms,” which I don’t have very good models for.
Possible answer: “Sure, it’s individually rational for you to devote your energy to Getting Out The Vote instead of donating to charity, but the group-level rational thing for people to do is to donate to charity, rather than playing tug-o’-war against each other.”
Ugh, yeah, maybe. I see the point of this sort of… double-think… but I’ve never been fully comfortable with it. It sounds like this argument is saying “Hey, you put yourself at a 60% probability of being right, but actually, Outside View, it should be much smaller, like 51%.” But, buddy, the 60% is already me trying to take the outside view! My inside view is that it’s more like 95%!
It sounds like down this path lies a discussion around how overconfident I should expect my brain to be (and therefore how hard I should correct for that). Which is important, sure, but also, ugh.
Sure! I’m modeling the election as being n coin flips: if there are more Heads than Tails, then candidate H wins, else candidate T wins.
If you flip n coins, each coin coming up Heads with probability p, then the number of Heads is binomially distributed with standard deviation √(np(1−p)), which I lazily rounded to √n.
The probability of being at a particular value near the peak of that distribution is approximately 1 / [that standard deviation]. (“Proof”: numerical simulation of flipping 500k coins 1M times, getting 250k Heads about 1⁄800 of the time.)
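For anyone who would rather not run a million simulated elections, the same quantity can be computed directly from the binomial formula. This is a minimal sketch, assuming 500k fair coins as in the simulation; `prob_exact_heads` is a helper written for this comment, and it works in log space because the binomial coefficient itself is far too large for a float.

```python
import math

def prob_exact_heads(n: int, k: int, p: float = 0.5) -> float:
    """P(exactly k Heads in n flips), computed via log-gamma to avoid overflow."""
    log_pmf = (
        math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
        + k * math.log(p) + (n - k) * math.log(1 - p)
    )
    return math.exp(log_pmf)

n = 500_000
sigma = math.sqrt(n * 0.5 * 0.5)            # exact binomial standard deviation
p_tie = prob_exact_heads(n, n // 2)         # probability of exactly 250k Heads
normal_approx = 1 / (sigma * math.sqrt(2 * math.pi))  # peak of the matching normal density
lazy_estimate = 1 / math.sqrt(n)            # the "lazily rounded" 1/sqrt(n)
```

The normal-approximation peak height is 1/(σ√(2π)), so the "1 / standard deviation" rule of thumb is right up to a constant factor, which is all the argument needs.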
Wait… your county has a GDP of over half a million dollars per capita? That is insanely high!
I agree! (Well, actually more like $100–200k per capita, because there are more people than voters, but still.) Sources: population, GDP, turnout.
Also, note that your probability of swinging the election is only 1/√n if the population is split exactly 50⁄50; it drops off superexponentially as the distribution shifts to one side or the other by √n voters or more.
Yesss, this seems related to shadonra’s answer. If my “500k coin flips” model were accurate, then most elections would be very tight (with the winner winning by a margin of 1⁄800, i.e. 0.125%), which empirically isn’t what happens. So, in reality, if you don’t know how an election is going to turn out, it’s not that there are 500k fair coins, it’s that there are either 500k 51% coins or 500k 49% coins, and the uncertainty in the election outcome comes from not knowing which of those worlds you’re in. But, in either case, your chance of swinging the election is vanishingly small, because both of those worlds put extremely little probability-mass on the outcome being a one-vote margin.
if you’re actively pushing an election, not just voting yourself, then that plausibly has a much bigger impact than just your one vote
That… is… a very interesting corollary. Although… you only get the “superexponential” benefit in the case where you’re far out on the tail of the PDF—in the “500k 49% coins” world, throwing 100 votes for Heads instead of 1 would increase your chances of swinging the election by a factor of much, much more than 100x, but your probability of swinging the election is still negligible, since the 50% mark is, uh, 14 standard deviations out from the mean. Right?
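A quick check of the "14 standard deviations" figure, as a sketch under the same toy model (500k independent coins, each 49% Heads). The helper name is made up, and the tie probability is kept as a base-10 logarithm so its order of magnitude is easy to read.

```python
import math

def log10_prob_exact_heads(n: int, k: int, p: float) -> float:
    """log10 of P(exactly k Heads in n flips with per-flip probability p)."""
    log_pmf = (
        math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
        + k * math.log(p) + (n - k) * math.log(1 - p)
    )
    return log_pmf / math.log(10)

n, p = 500_000, 0.49
mean = n * p
sigma = math.sqrt(n * p * (1 - p))
z = (n / 2 - mean) / sigma                       # distance from the mean to the 50% mark, in sigmas
log10_tie = log10_prob_exact_heads(n, n // 2, p)  # order of magnitude of an exact tie
```

Here `z` is how far out the 50% mark sits, and `log10_tie` shows how little probability mass an exact tie gets in the “49% coins” world.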
Yeah, a fair point!
The understanding I came away with: there are (at least) three stages of understanding a problem:
You can’t write a program to solve it.
You can write a cartoonishly wasteful program to solve it.
You can write a computationally feasible program to solve it.
“Shuffle-sort” achieves the second level of knowledge re: sorting lists. Yeah, it’s cartoonishly wasteful, and it doesn’t even resemble any computationally feasible sorting algorithm (that I’m aware of) -- but, y’know, viewed through this lens, it’s still a huge step up from not even understanding “sorting” well enough to sort a list at all.
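For concreteness, a minimal sketch of shuffle-sort (also known as bogosort). The optional `rng` parameter is my own addition so that runs can be made reproducible.

```python
import random

def is_sorted(xs) -> bool:
    """True if every element is <= its successor."""
    return all(a <= b for a, b in zip(xs, xs[1:]))

def shuffle_sort(xs, rng=None):
    """Shuffle a copy of xs until it happens to be sorted.

    Terminates with probability 1, but the expected number of shuffles
    grows like n! for n distinct elements.
    """
    rng = rng or random.Random()
    ys = list(xs)
    while not is_sorted(ys):
        rng.shuffle(ys)
    return ys

result = shuffle_sort([3, 1, 2, 5, 4], random.Random(0))
```

The whole algorithm is "know what sorted means, then guess", which is exactly the second level of understanding.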
(Hmm, only marginally related but entertaining: if you reframe the problem of epistemology not as sequence prediction, but as “deduce what program is running your environment,” then a Solomonoff inductor can be pretty fairly described as “consider every possible object of type EnvironmentProgram; update its probability based on the sensory input; return the posterior PDF over EnvironmentProgram-space.” The equivalent program for list-sorting is “consider every possible object of type List<Int>; check if (a) it’s sorted, and (b) it matches the element-counts of the input-list; if so, return it.” Which is even more cartoonishly wasteful than shuffle-sort. Ooh, and if you want to generalize to cases where the list-elements are real numbers, I think you get/have to include something that looks a lot like Solomonoff induction, forcing countability on the reals by iterating over all possible programs that evaluate to real numbers (and hoping to God that whatever process generated the input list, your mathematical-expression-language is powerful enough to describe all the elements).)
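The "consider every possible List<Int>" sorter can be sketched too. Two choices here are my own assumptions, made only so the enumeration has a well-defined order: candidates are restricted to the input's length, and the allowed range of values is widened one step at a time.

```python
from collections import Counter
from itertools import count, product

def enumeration_sort(xs):
    """Enumerate candidate integer lists until one is sorted and has the same element counts as xs."""
    target = Counter(xs)
    n = len(xs)
    for bound in count(0):  # widen the allowed value range [-bound, bound] forever
        for candidate in product(range(-bound, bound + 1), repeat=n):
            sorted_ok = all(a <= b for a, b in zip(candidate, candidate[1:]))
            if sorted_ok and Counter(candidate) == target:
                return list(candidate)

result = enumeration_sort([2, -1, 0])
```

Unlike shuffle-sort, this never even looks at the input's ordering; it only uses the input as a test to recognize the answer when it stumbles across it.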
Hmm. If we’re trying to argmax some function f over the real numbers, then the simplest algorithm would be something like “iterate over all mathematical expressions e; for each one, check whether the program ‘iterate over all provable theorems, halting when you find one that says f(x) ≤ f(e) for all x’ halts; if it does, return e.”
...but I guess that’s not guaranteed to ever halt, since there could conceivably be an infinite procession of ever-more-complex expressions, eking out ever-smaller gains on f. It seems possible that no matter what (reasonably powerful) mathematical language you choose, there are function-expressions with finite maxima at values not expressible in your language. Which is maybe what you meant by “as far as I know there can’t be [an algorithm for it].”
(I’m assuming our mathematical language doesn’t have the word argmax, since in that case we’d pretty quickly stumble on the expression argmax f, verify that f(x) ≤ f(argmax f) for all x, and return it, which is obviously a cop-out.)
Thanks for the feedback! No pressure to elaborate, but if you care to—would you want to browse all predictions, even ones by people you’ve never heard of? If so, how do you know the randos you’re betting against won’t just run off with your money when you lose, and refuse to pay up when you win? Maybe you just trust the-sort-of-person-who-uses-this-site to be honorable? Or maybe you have some clever solution for establishing trust that I haven’t thought of!
(Or maybe you meant something more like “I’d like to be able to browse my friends’ predictions,” which I can totally sympathize with and it’s on my to-do list!)
Very interesting! I like this formalization/categorization.
Hm… I’d have filed “Why the tails come apart” under “Extremal Goodhart”: this image from that post is almost exactly what I was picturing while reading your abstract example for Extremal Goodhart. Is Extremal “just” a special case of Regressional, where that ellipse is a circle? Or am I missing something?