Inadequacy and Modesty
The following is the beginning of Inadequate Equilibria, a new sequence/book on a generalization of the notion of efficient markets, and on this notion’s implications for practical decision-making and epistemic rationality.
This is a book about two incompatible views on the age-old question: “When should I think that I may be able to do something unusually well?”
These two viewpoints tend to give wildly different, nearly cognitively nonoverlapping analyses of questions like:
My doctor says I need to eat less and exercise, but a lot of educated-sounding economics bloggers are talking about this thing called the “Shangri-La Diet.” They’re saying that in order to lose weight, all you need to do is consume large quantities of flavorless, high-calorie foods at particular times of day; and they claim some amazing results with this diet. Could they really know better than my doctor? Would I be able to tell if they did?
My day job is in artificial intelligence and decision theory. And I recall the dark days before 2015, when there was plenty of effort and attention going into advancing the state of the art in AI capabilities, but almost none going into AI alignment: better understanding AI designs and goals that can safely scale with capabilities. Though interest in the alignment problem has since increased quite a bit, it still makes sense to ask whether at the time I should have inferred from the lack of academic activity that there was no productive work to be done here; since if there were reachable fruits, wouldn’t academics be taking them?
Should I try my hand at becoming an entrepreneur? Whether or not it should be difficult to spot promising ideas in a scientific field, it certainly can’t be easy to think up a profitable idea for a new startup. Will I be able to find any good ideas that aren’t already taken?
The effective altruism community is a network of philanthropists and researchers who try to find the very best ways to benefit others per dollar, in full generality. Where should effective altruism organizations like GiveWell expect to find low-hanging fruit—neglected interventions ripe with potential? Where should they look to find things that our civilization isn’t already doing about as well as can be done?
When I think about problems like these, I use what feels to me like a natural generalization of the economic idea of efficient markets. The goal is to predict what kinds of efficiency we should expect to exist in realms beyond the marketplace, and what we can deduce from simple observations. For lack of a better term, I will call this kind of thinking inadequacy analysis.
Toward the end of this book, I’ll try to refute an alternative viewpoint that is increasingly popular among some of my friends, one that I think is ill-founded. This viewpoint is the one I’ve previously termed “modesty,” and the message of modesty tends to be: “You can’t expect to be able to do X that isn’t usually done, since you could just be deluding yourself into thinking you’re better than other people.”
I’ll open with a cherry-picked example that I think helps highlight the difference between these two viewpoints.
I once wrote a report, “Intelligence Explosion Microeconomics,” that called for an estimate of the economic growth rate in a fully developed country—that is, a country that is no longer able to improve productivity just by importing well-tested innovations. A footnote of the paper remarked that even though Japan was the country with the most advanced technology—e.g., their cellphones and virtual reality technology were five years ahead of the rest of the world’s—I wasn’t going to use Japan as my estimator for developed economic growth, because, as I saw it, Japan’s monetary policy was utterly deranged.
Roughly, Japan’s central bank wasn’t creating enough money. I won’t go into details here.
A friend of mine, and one of the most careful thinkers I know—let’s call him “John”—made a comment on my draft to this effect:
How do you claim to know this? I can think of plenty of other reasons why Japan could be in a slump: the country’s shrinking and aging population, its low female workplace participation, its high levels of product market regulation, etc. It looks like you’re venturing outside of your area of expertise to no good end.
“How do you claim to know this?” is a very reasonable question here. As John later elaborated, macroeconomics is an area where data sets tend to be thin and predictive performance tends to be poor. And John had previously observed me making contrarian claims where I’d turned out to be badly wrong, like endorsing Gary Taubes’ theories about the causes of the obesity epidemic. More recently, John won money off of me by betting that AI performance on certain metrics would improve faster than I expected; John has a good track record when it comes to spotting my mistakes.
It’s also easy to imagine reasons an observer might have been skeptical. I hadn’t come up with my critique of Japan myself; I was reading other economists and deciding that I trusted the ones who were saying that the Bank of Japan was doing it wrong… Yet one would expect the governing board of the Bank of Japan to be composed of experienced economists with specialized monetary expertise. How likely is it that any outsider would be able to spot an obvious flaw in their policy? How likely is it that someone who isn’t a professional economist (e.g., me) would be able to judge which economic critiques of the Bank of Japan were correct, or which critics were wise?
How likely is it that an entire country—one of the world’s most advanced countries—would forego trillions of dollars of real economic growth because their monetary controllers—not politicians, but appointees from the professional elite—were doing something so wrong that even a non-professional could tell? How likely is it that a non-professional could not just suspect that the Bank of Japan was doing something badly wrong, but be confident in that assessment?
Surely it would be more realistic to search for possible reasons why the Bank of Japan might not be as stupid as it seemed, as stupid as some econbloggers were claiming. Possibly Japan’s aging population made growth impossible. Possibly Japan’s massive outstanding government debt made even the slightest inflation too dangerous. Possibly we just aren’t thinking of the complicated reasoning going into the Bank of Japan’s decision.
Surely some humility is appropriate when criticizing the elite decision-makers governing the Bank of Japan. What if it’s you, and not the professional economists making these decisions, who have failed to grasp the relevant economic considerations?
I’ll refer to this genre of arguments as “modest epistemology.”
In conversation, John clarified to me that he rejects this genre of arguments; but I hear these kinds of arguments fairly often. The head of an effective altruism organization once gave voice to what I would consider a good example of this mode of thinking:
I find it helpful to admit to unpleasant facts that will necessarily be true in the abstract, in order to be more willing to acknowledge them in specific cases. For instance, I should expect a priori to be below average at half of things, and be 50% likely to be of below average talent overall; to know many people who I regard as better than me according to my values; to regularly make decisions that look silly ex post, and also ex ante; to be mistaken about issues on which there is expert disagreement about half of the time; to perform badly at many things I attempt for the first time; and so on.
The Dunning-Kruger effect shows that unskilled individuals often rate their own skill very highly. Specifically, although there does tend to be a correlation between how competent a person is and how competent they guess they are, this correlation is weaker than one might suppose. In the original study, people in the bottom two quartiles of actual test performance tended to think they did better than about 60% of test-takers, while people in the top two quartiles tended to think they did better than 70% of test-takers.
This suggests that a typical person’s guesses about how they did on a test are evidence, but not particularly powerful evidence: the top quartile is underconfident in how well they did, and the bottom quartiles are highly overconfident.
Given all that, how can we gain much evidence from our belief that we are skilled? Wouldn’t it be more prudent to remind ourselves of the base rate—the prior probability of 50% that we are below average?
Reasoning along similar lines, software developer Hal Finney has endorsed “abandoning personal judgment on most matters in favor of the majority view.” Finney notes that the average person’s opinions would be more accurate (on average) if they simply deferred to the most popular position on as many issues as they could. For this reason:
I choose to adopt the view that in general, on most issues, the average opinion of humanity will be a better and less biased guide to the truth than my own judgment.
[…] I would suggest that although one might not always want to defer to the majority opinion, it should be the default position. Rather than starting with the assumption that one’s own opinion is right, and then looking to see if the majority has good reasons for holding some other view, one should instead start off by following the majority opinion; and then only adopt a different view for good and convincing reasons. On most issues, the default of deferring to the majority will be the best approach. If we accept the principle that “extraordinary claims require extraordinary evidence”, we should demand a high degree of justification for departing from the majority view. The mere fact that our own opinion seems sound would not be enough.1
In this way, Finney hopes to correct for overconfidence and egocentric biases.
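Finney’s deference policy has a well-known statistical backbone, essentially Condorcet’s jury theorem: under the strong (and usually false) assumption that people err independently and each individual is right more often than not, the majority is far more reliable than any one person. A minimal sketch of that assumption:

```python
# Condorcet-style sketch: probability that a strict majority of n independent
# voters is correct, when each voter is independently correct with probability p.
# The independence assumption is the load-bearing (and unrealistic) part.
from math import comb

def majority_accuracy(n: int, p: float) -> float:
    """P(strict majority of n i.i.d. voters is correct), each right w.p. p."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(n // 2 + 1, n + 1))

print(majority_accuracy(1, 0.6))               # a lone individual: 0.6
print(round(majority_accuracy(101, 0.6), 3))   # majority of 101: far above 0.6
```

Note that the same formula cuts the other way: if shared biases push each voter below 50% on some question, the majority is almost surely wrong there, which is one reason the independence assumption matters so much.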
Finney’s view is an extreme case, but helps illustrate a pattern that I believe can be found in some more moderate and widely endorsed views. When I speak of “modesty,” I have in mind a fairly diverse set of positions that rest on a similar set of arguments and motivations.
I once heard an Oxford effective altruism proponent crisply summarize what I take to be the central argument for this perspective: “You see that someone says X, which seems wrong, so you conclude their epistemic standards are bad. But they could just see that you say Y, which sounds wrong to them, and conclude your epistemic standards are bad.”2 On this line of thinking, you don’t get any information about who has better epistemic standards merely by observing that someone disagrees with you. After all, the other side observes just the same fact of disagreement.
Applying this argument form to the Bank of Japan example: I receive little or no evidence just from observing that the Bank of Japan says “X” when I believe “not X.” I also can’t be getting strong evidence from any object-level impression I might have that I am unusually competent. So did my priors imply that I and I alone ought to have been born with awesome powers of discernment? (Modest people have posed this exact question to me on more than one occasion.)
It should go without saying that this isn’t how I would explain my own reasoning. But if I reject arguments of the form, “We disagree, therefore I’m right and you’re wrong,” how can I claim to be correct on an economic question where I disagree with an institution as reputable as the Bank of Japan?
The other viewpoint, opposed to modesty—the view that I think is prescribed by normative epistemology (and also by more or less mainstream microeconomics)—requires a somewhat longer introduction.
By ancient tradition, every explanation of the Efficient Markets Hypothesis must open with the following joke:
Two economists are walking along the street, and one says, “Hey, someone dropped a $20 bill!” and the other says, “Well, it can’t be a real $20 bill because someone would have picked it up already.”
Also by ancient tradition, the next step of the explanation is to remark that while it may make sense to pick up a $20 bill you see on a relatively deserted street, if you think you have spotted a $20 bill lying on the floor of Grand Central Station (the main subway nexus of New York City), and it has stayed there for several hours, then it probably is a fake $20 bill, or it has been glued to the ground.
In real life, when I asked a group of twenty relatively young people how many of them had ever found a $20 bill on the street, five raised their hands, and only one person had found a $20 bill on the street on two separate occasions. So the empirical truth about the joke is that while $20 bills on the street do exist, they’re rare.
On the other hand, the implied policy is that if you do find a $20 bill on the street, you should go ahead and pick it up, because that does happen. It’s not that rare. You certainly shouldn’t start agonizing over whether it’s too arrogant to believe that you have better eyesight than everyone else who has recently walked down the street.
On the other other hand, you should start agonizing about whether to trust your own mental processes if you think you’ve seen a $20 bill stay put for several hours on the floor of Grand Central Station. Especially if your explanation is that nobody else is eager for money.
Is there any other domain where, if we think we see an exploitable possibility, we should sooner doubt our own mental competence than trust the conclusion we reasoned our way to?
If I had to name the single epistemic feat at which modern human civilization is most adequate, the peak of all human power of estimation, I would unhesitatingly reply, “Short-term relative pricing of liquid financial assets, like the price of S&P 500 stocks relative to other S&P 500 stocks over the next three months.” This is something into which human civilization puts an actual effort.
Millions of dollars are offered to smart, conscientious people with physics PhDs to induce them to enter the field.
These people are then offered huge additional payouts conditional on actual performance—especially outperformance relative to a baseline.3
Large corporations form to specialize in narrow aspects of price-tuning.
They have enormous computing clusters, vast historical datasets, and competent machine learning professionals.
They receive repeated news of success or failure in a fast feedback loop.4
The knowledge aggregation mechanism—namely, prices that equilibrate supply and demand for the financial asset—has proven to work beautifully, and acts to sum up the wisdom of all those highly motivated actors.
An actor that spots a 1% systematic error in the aggregate estimate is rewarded with a billion dollars—in a process that also corrects the estimate.
Barriers to entry are not zero (you can’t get the loans to make a billion-dollar corrective trade), but there are thousands of diverse intelligent actors who are all individually allowed to spot errors, correct them, and be rewarded, with no central veto.
This is certainly not perfect, but it is literally as good as it gets on modern-day Earth.
I don’t think I can beat the estimates produced by that process, and I have nothing significant to contribute to it. Theoretically, a liquid market should be just exploitable enough to pay competent professionals the same hourly rate as their next-best opportunity. With study and effort I could potentially become one of those professionals and earn a standard hedge-fundie return, but that’s not the same as significantly improving on the market’s efficiency. I’m not sure I expect a huge humanly accessible opportunity of that kind to exist, not in the thickly traded centers of the market. Somebody really would have taken it already! Our civilization cares about whether Microsoft stock will be priced at $37.70 or $37.75 tomorrow afternoon.
I can’t predict a 5% move in Microsoft stock in the next two months, and neither can you. If your uncle tells an anecdote about how he tripled his investment in NetBet.com last year and he attributes this to his skill rather than luck, we know immediately and out of hand that he is wrong. Warren Buffett at the peak of his form couldn’t reliably triple his money every year. If there is a strategy so simple that your uncle can understand it, which has apparently made him money—then we guess that there were just hidden risks built into the strategy, and that in another year or with less favorable events he would have lost half as much as he gained. Any other possibility would be the equivalent of a $20 bill staying on the floor of Grand Central Station for ten years while a horde of physics PhDs searched for it using naked eyes, microscopes, and machine learning.
In the thickly traded parts of the stock market, where the collective power of human civilization is truly at its strongest, I doff my hat, I put aside my pride and kneel in true humility to accept the market’s beliefs as though they were my own, knowing that any impulse I feel to second-guess and every independent thought I have to argue otherwise is nothing but my own folly. If my perceptions suggest an exploitable opportunity, then my perceptions are far more likely mistaken than the markets. That is what it feels like to look upon a civilization doing something adequately.
The converse side of the efficient-markets perspective would have said this about the Bank of Japan:
conventional cynical economist: So, Eliezer, you think you know better than the Bank of Japan and many other central banks around the world, do you?
eliezer: Yep. Or rather, by reading econblogs, I believe myself to have identified which econbloggers know better, like Scott Sumner.
c.c.e.: Even though literally trillions of dollars of real value are at stake?
eliezer: Yep.
c.c.e.: How do you make money off this special knowledge of yours?
eliezer: I can’t. The market also collectively knows that the Bank of Japan is pursuing a bad monetary policy and has priced Japanese equities accordingly. So even though I know the Bank of Japan’s policy will make Japanese equities perform badly, that fact is already priced in; I can’t expect to make money by short-selling Japanese equities.
c.c.e.: I see. So exactly who is it, on this theory of yours, that is being stupid and passing up a predictable payout?
eliezer: Nobody, of course! Only the Bank of Japan is allowed to control the trend line of the Japanese money supply, and the Bank of Japan’s governors are not paid any bonuses when the Japanese economy does better. They don’t get a million dollars in personal bonuses if the Japanese economy grows by a trillion dollars.
c.c.e.: So you can’t make any money off knowing better individually, and nobody who has the actual power and authority to fix the problem would gain a personal financial benefit from fixing it? Then we’re done! No anomalies here; this sounds like a perfectly normal state of affairs.
We don’t usually expect to find $20 bills lying on the street, because even though people sometimes drop $20 bills, someone else will usually have a chance to pick up that $20 bill before we do.
We don’t think we can predict 5% price changes in S&P 500 company stock prices over the next month, because we’re competing against dozens of hedge fund managers with enormous supercomputers and physics PhDs, any one of whom could make millions or billions on the pricing error—and in doing so, correct that error.
We can expect it to be hard to come up with a truly good startup idea, and for even the best ideas to involve sweat and risk, because lots of other people are trying to think up good startup ideas. Though in this case we do have the advantage that we can pick our own battles, seek out one good idea that we think hasn’t been done yet.
But the Bank of Japan is just one committee, and it’s not possible for anyone else to step up and make a billion dollars in the course of correcting their error. Even if you think you know exactly what the Bank of Japan is doing wrong, you can’t make a profit on that. At least some hedge-fund managers also know what the Bank of Japan is doing wrong, and the expected consequences are already priced into the market. Nor does this price movement fix the Bank of Japan’s mistaken behavior. So to the extent the Bank of Japan has poor incentives or some other systematic dysfunction, their mistake can persist. As a consequence, when I read some econbloggers who I’d seen being right about empirical predictions before saying that Japan was being grotesquely silly, and the economic logic seemed to me to check out, as best I could follow it, I wasn’t particularly reluctant to believe them. Standard economic theory, generalized beyond the markets to other facets of society, did not seem to me to predict that the Bank of Japan must act wisely for the good of Japan. It would be no surprise if they were competent, but also not much of a surprise if they were incompetent. And knowing this didn’t help me either—I couldn’t exploit the knowledge to make an excess profit myself—and this too wasn’t a coincidence.
This kind of thinking can get quite a bit more complicated than the foregoing paragraphs might suggest. We have to ask why the government of Japan didn’t put pressure on the Bank of Japan (answer: they did, but the Bank of Japan refused), and many other questions. You would need to consider a much larger model of the world, and bring in a lot more background theory, to be confident that you understood the overall situation with the Bank of Japan.
But even without that detailed analysis, in the epistemological background we have a completely different picture from the modest one. We have a picture of the world where it is perfectly plausible for an econblogger to write up a good analysis of what the Bank of Japan is doing wrong, and for a sophisticated reader to reasonably agree that the analysis seems decisive, without a deep agonizing episode of Dunning-Kruger-inspired self-doubt playing any important role in the analysis.
When we critique a government, we don’t usually get to see what would actually happen if the government took our advice. But in this one case, less than a month after my exchange with John, the Bank of Japan—under the new leadership of Haruhiko Kuroda, and under unprecedented pressure from recently elected Prime Minister Shinzo Abe, who included monetary policy in his campaign platform—embarked on an attempt to print huge amounts of money, with a stated goal of doubling the Japanese money supply.5
Immediately after, Japan experienced real GDP growth of 2.3%, where the previous trend was for falling RGDP. Their economy was operating that far under capacity due to lack of money.6
Now, on the modest view, this was the unfairest test imaginable. Out of all the times I’ve ever suggested that a government’s policy is suboptimal, the rare occasions when a government actually tries my preferred alternative will be selected for being the most mainstream, highest-conventional-prestige policies I happen to advocate—and those are the very policy proposals that modesty is least likely to disapprove of.
Indeed, if John had looked further into the issue, he would have found (as I found while writing this) that Nobel laureates had also criticized Japan’s monetary policy. He would have found that previous Japanese governments had also hinted to the Bank of Japan that they should print more money. The view from modesty looks at this state of affairs and says, “Hold up! You aren’t so specially blessed as your priors would have you believe; other academics already know what you know! Civilization isn’t so inadequate after all! This is how reasonable dissent from established institutions and experts operates in the real world: via opposition by other mainstream experts and institutions, not via the heroic effort of a lone economics blogger.”
However helpful or unhelpful such remarks may be for guarding against inflated pride, however, they don’t seem to refute (or even address) the central thesis of civilizational inadequacy, as I will define that term later. Roughly, the civilizational inadequacy thesis states that in situations where the central bank of a major developed democracy is carrying out a policy, and a number of highly regarded economists like Ben Bernanke have written papers about what that central bank is doing wrong, and there are widely accepted macroeconomic theories for understanding what that central bank is doing wrong, and the government of the country has tried to put pressure on the central bank to stop doing it wrong, and literally trillions of dollars in real wealth are at stake, then the overall competence of human civilization is such that we shouldn’t be surprised to find the professional economists at the Bank of Japan doing it wrong.
We shouldn’t even be surprised to find that a decision theorist without all that much background in economics can identify which econbloggers have correctly stated what the Bank of Japan is doing wrong, or which simple improvements to their current policies would improve the situation.
It doesn’t make much difference to my life whether I understand monetary policy better than, say, the European Central Bank, which as of late 2015 was repeating the same textbook mistake as the Bank of Japan and causing trillions of euros of damage to the European economy. Insofar as I have other European friends in countries like Italy, it might be important to them to know that Europe’s economy is probably not going to get any better soon; or the knowledge might be relevant to predicting AI progress timelines to know whether Japan ran out of low-hanging technological fruit or just had bad monetary policy. But that’s a rather distant relevance, and for most of my readers I would expect this issue to be even less relevant to their lives.
But you run into the same implicit background questions of inadequacy analysis when, for example, you’re making health care decisions. Cherry-picking another anecdote: My wife has a severe case of Seasonal Affective Disorder. As of 2014, she’d tried sitting in front of a little lightbox for an hour per day, and it hadn’t worked. SAD’s effects were crippling enough for it to be worth our time to consider extreme options, like her spending time in South America during the winter months. And indeed, vacationing in Chile and receiving more exposure to actual sunlight did work, where lightboxes failed.
From my perspective, the obvious next thought was: “Empirically, dinky little lightboxes don’t work. Empirically, the Sun does work. Next step: more light. Fill our house with more lumens than lightboxes provide.” In short order, I had strung up sixty-five 60W-equivalent LED bulbs in the living room, and another sixty-five in her bedroom.
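The back-of-envelope arithmetic behind “more light” is simple enough to write down. The bulb counts and the ~$600 total come from this account; the per-bulb luminous flux is a typical catalog figure for a 60W-equivalent LED, not a measurement from this experiment.

```python
# Rough arithmetic behind the "fill the house with lumens" plan.
# LUMENS_PER_BULB is a typical catalog value (assumption, not measured).
BULBS = 130            # 65 in the living room + 65 in the bedroom
TOTAL_COST = 600       # the ~$600 figure from the text, in dollars
LUMENS_PER_BULB = 800  # typical for a 60W-equivalent LED bulb

print(BULBS * LUMENS_PER_BULB)       # total luminous flux across both rooms
print(round(TOTAL_COST / BULBS, 2))  # dollars per bulb
```

At roughly a hundred thousand lumens spread across two rooms, the setup puts out many times the total flux of a dinky desktop lightbox—which is the whole hypothesis being tested.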
Ah, but should I assume that my civilization is being opportunistic about seeking out ways to cure SAD, and that if putting up 130 LED light bulbs often worked when lightboxes failed, doctors would already know about that? Should the fact that putting up 130 light bulbs isn’t a well-known next step after lightboxes convince me that my bright idea is probably not a good idea, because if it were, everyone would already be doing it? Should I conclude from my inability to find any published studies on the Internet testing this question that there is some fatal flaw in my plan that I’m just not seeing?
We might call this argument “Chesterton’s Absence of a Fence.” The thought being: I shouldn’t build a fence here, because if it were a good idea to have a fence here, someone would already have built it. The underlying question here is: How strongly should I expect that this extremely common medical problem has been thoroughly considered by my civilization, and that there’s nothing new, effective, and unconventional that I can personally improvise?
Eyeballing this question, my off-the-cuff answer—based mostly on the impressions related to me by every friend of mine who has ever dealt with medicine on a research level—is that I wouldn’t necessarily expect any medical researcher ever to have done a formal experiment on the first thought that popped into my mind for treating this extremely common depressive syndrome. Nor would I strongly expect the intervention, if initial tests found it to be effective, to have received enough attention that I could Google it.
But this is just my personal take on the adequacy of 21st-century medical research. Should I be nervous that this line of thinking is just an excuse? Should I fret about the apparently high estimate of my own competence implied by my thinking that I could find an obvious-seeming way to remedy SAD when trained doctors aren’t talking about it and I’m not a medical researcher? Am I going too far outside my own area of expertise and starting to think that I’m good at everything?
In practice, I didn’t bother going through an agonizing fit of self-doubt along those lines. The systematic competence of human civilization with respect to treating mood disorders wasn’t so apparent to me that I considered it a better use of resources to quietly drop the issue than to just lay down the ~$600 needed to test my suspicion. So I went ahead and ran the experiment. And as of early 2017, with two winters come and gone, Brienne seems to no longer have crippling SAD—though it took a lot of light bulbs, including light bulbs in her bedroom that had to be timed to go on at 7:30am before she woke up, to sustain the apparent cure.7
If you want to outperform—if you want to do anything not usually done—then you’ll need to conceptually divide our civilization into areas of lower and greater competency. My view is that this is best done from a framework of incentives and the equilibria of those incentives—which is to say, from the standpoint of microeconomics. This is the main topic I’ll cover here.
In the process, I will also make the case that modesty—the part of this process where you go into an agonizing fit of self-doubt—isn’t actually helpful for figuring out when you might outperform some aspect of the equilibrium.
But one should initially present a positive agenda in discussions like these—saying first what you think is the correct epistemology, before inveighing against a position you think is wrong.
So without further ado, in the next chapter I shall present a very simple framework for inadequate equilibria.
Next chapter: An Equilibrium of No Free Energy.
The full book will be available November 16th. You can go to equilibriabook.com to pre-order the book, or sign up for notifications about new chapters and other developments.
2. They later said that I’d misunderstood their intent, so take this example with some grains of salt.

3. This is why I specified relative prices: stock-trading professionals are usually graded on how well they do compared to the stock market, not compared to bonds. It’s much less obvious that bonds in general are priced reasonably relative to stocks in general, though this is still being debated by economists.

4. This is why I specified near-term pricing of liquid assets.

5. That is, the Bank of Japan purchased huge numbers of bonds with newly created electronic money.

6. See “How Japan Proved Printing Money Can Be A Great Idea” for a more recent update. For readers who are wondering, “Wait, how the heck can printing money possibly lead to real goods and services being created?” I suggest Googling “sticky wages” and possibly consulting Scott Sumner’s history of the Great Depression, The Midas Paradox.

7. Specifically, Brienne’s symptoms were mostly cured in the winter of 2015, and partially cured in the winter of 2016, when she spent most of her time under fewer lights. Brienne reports that she suffered a lot less even in the more recent winter, and experienced no suicidal ideation, unlike in years prior to the light therapy. I’ll be moderately surprised if this treatment works reliably, just because most things don’t where depression is concerned; but I would predict that it works often enough to be worth trying for other people experiencing severe treatment-resistant SAD.