The Efficient Market Hypothesis in Research

A classic economics joke goes like this:

Two economists are walking down a road when one of them notices a $20 bill on the ground. He turns to his friend and exclaims: “Look, a $20 bill!” The other replies: “Nah, if there were a $20 bill on the ground, someone would’ve picked it up already.”

The economists in the joke believe in the Efficient Market Hypothesis (EMH), which roughly says that financial markets are efficient and there’s no way to “beat the market” by making intelligent trades.

If the EMH were true, then why is there still a trillion-dollar finance industry with active mutual funds and hedge funds? In reality, the EMH is not a universal law of economics (like the law of gravity), but more like an approximation. There may exist inefficiencies in markets where stock prices follow a predictable pattern and there is profit to be made (e.g., stock prices fall when it’s cloudy in New York). However, as soon as someone notices the pattern and starts exploiting it (by making a trading algorithm based on weather data), the inefficiency disappears. The next person will find zero correlation between weather in New York and stock prices.
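
To make the weather example concrete, here is a minimal sketch in Python. It is a toy model, not a real trading strategy: it assumes, purely for illustration, that each round of trading removes a fixed fraction of whatever predictable edge remains. The function name `remaining_edge` and the parameters `initial_edge`, `capture_rate`, and `rounds` are hypothetical.

```python
from typing import List


def remaining_edge(initial_edge: float, capture_rate: float, rounds: int) -> List[float]:
    """Return the predictable edge left after each round of exploitation.

    Assumption (not from any real market model): every round, traders who
    know the pattern trade away a fixed fraction `capture_rate` of the edge
    that is still left.
    """
    edges = [initial_edge]
    for _ in range(rounds):
        # Whatever was not traded away this round carries over to the next.
        edges.append(edges[-1] * (1.0 - capture_rate))
    return edges


# Hypothetical numbers: a 1% edge on cloudy days, half of it traded away per round.
history = remaining_edge(initial_edge=0.01, capture_rate=0.5, rounds=10)

for round_number, edge in enumerate(history):
    print(f"round {round_number:2d}: remaining edge = {edge:.6f}")
```

Under these assumptions the edge shrinks geometrically, so whoever arrives after a few rounds finds almost nothing left to exploit, which is the sense in which the next person sees no correlation.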

There is a close parallel in academic research. Here, the “market” is generally efficient: most problems that are solvable are already solved. There are still “inefficiencies”: open problems that can reasonably be solved, and one “exploits” such a problem by solving it and publishing a paper. Once exploited, the inefficiency is no longer available: nobody else can publish the same paper solving the same problem.

Where does this leave the EMH? In my view, the EMH is a useful approximation, but its accuracy depends on your skill and expertise. For non-experts, the EMH is pretty much universally true: it’s unlikely that you’ve found an inefficiency that everyone else has missed. For experts, the EMH is less often true: when you’re working in highly specialized areas that only a handful of people understand, you begin to notice more inefficiencies that are still unexploited.

A large inefficiency is like a $20 bill on the ground: it gets picked up very quickly. An example of this is when a new tool is invented that can straightforwardly be applied to a wide range of problems. When the BERT model was released in 2018, beating the state of the art on a wide range of NLP benchmarks, there was instantly an explosion of activity as researchers raced to apply it to all the important NLP problems and be the first to publish. By mid-2019, all the straightforward applications of BERT were done, and the $20 bill was no more.

Above: Representation of the EMH in research. To outsiders, there are no inefficiencies; to experts, inefficiencies exist briefly before they are exploited. Loosely inspired by this diagram by Matt Might.

The EMH implies various heuristics that I use to guide my daily research. If I have a research idea that’s relatively obvious, and the tools to attack it have existed for a while (say, >= 3 years), then probably one of the following is true:

  1. Someone already published it 3 years ago.

  2. The idea doesn’t work very well.

  3. The result is not that useful or interesting.

  4. One of my basic assumptions is wrong, so the idea doesn’t even make sense.

  5. Etc.

Conversely, a research idea is much more likely to be fruitful (i.e., a true inefficiency) if the tools to solve it have existed for only a few months, if it requires data and resources that nobody else has access to, or if it requires a rare combination of insights that conceivably nobody has thought of.

Outside the realm of the known (the red area in my diagram), there are many questions that are unanswerable. These include the hard problems of consciousness and free will, P = NP, etc., as well as more mundane problems where our current methods are not strong enough. To an outsider, these might seem like inefficiencies, but it would be wise to assume they’re not. The EMH ensures that true inefficiencies are quickly picked up.

To give a more relatable example, take the apps Uber (founded in 2009) and Instagram (launched in 2010). Many of the apps on your phone probably launched around the same time. In order for Uber and Instagram to work, people needed to have smartphones that were connected to the internet, with GPS (for Uber) and decent-quality cameras (for Instagram). Neither of these ideas would’ve been possible in 2005, but thanks to the EMH, as soon as smartphone adoption took off, we didn’t have to wait very long to see all the viable use cases for the new technology emerge.

Originally posted on my blog: Lucky’s Notes.