It sounds like Eliezer is confident that alignment will fail. If so, the way out is to make sure AGI isn’t built. I think that’s more realistic than it sounds.
1. LessWrong is influential enough to achieve policy goals
Right now, the Yann LeCun view of AI is probably more mainstream, but that can change fast.
LessWrong is upstream of influential thinkers. For example:
- Zvi and Scott Alexander read LessWrong. Let’s call folks like them Filter #1.
- Tyler Cowen reads Zvi and Scott Alexander. (Filter #2)
- Malcolm Gladwell, a mainstream influencer, reads Tyler Cowen every morning (Filter #3)
I could’ve made a similar chain with Ezra Klein or Holden Karnofsky. All these chains put together add up to a lot of influence.
Right now, I think Eliezer’s argument (AI capabilities research will destroy the world) is blocked at Filter #1. None of the Filter #1 authors have endorsed it. Why should they? The argument relies on intuition. There’s no way for Filter #1 to evaluate it. I think that’s why Scott Alexander and Holden Karnofsky hedged, neither explicitly endorsing nor rejecting the doom theory.
Even if they believed Eliezer, Filter #1 authors need to communicate more than an intuition to Filter #2. Imagine the article: “Eliezer et al have a strong intuition that the sky is falling. We’re working on finding some evidence. In the meantime, you need to pass some policies real fast.”
In short, ideas from LessWrong can exert a strong influence on policymakers. This particular idea hasn’t because it isn’t legible and Filter #1 isn’t persuaded.
2. If implemented early, government policy can prevent AGI development
AGI development is expensive. If Google/Facebook/Huawei didn’t expect to make a lot of money from capabilities development, they’d stop investing in it. This means the pace of AI progress is very responsive to government policy.
If the US, China, and EU want to prevent AGI development, I bet they’d get their way. This seems like a job for a regulatory agency: pick a (hopefully narrow) set of technologies and make it illegal to research them without approval.
This isn’t as awful as it sounds. The FAA basically worked, and accidents in the air are very rare. If Eliezer’s argument is true, the costs are tiny compared to the benefits. A burdensome bureaucracy vs destruction of the universe.
Imagine a hypothetical world where mainstream opinion (like you’d find in the New York Times) says that AGI would destroy the world, and a powerful regulatory agency has the law on its side. I bet AGI gets delayed by decades.
3. Don’t underestimate how effectively the US government can do this job
Don’t over-index on covid or climate change. AI safety is different. Covid and climate change both demand sacrifices from the entire population. This is hugely unpopular. AI safety, on the other hand, only demands sacrifices from a small number of companies.
For now, I think the top priority is to clearly and persuasively demonstrate why alignment won’t be solved in the next 30 years. This is crazy hard, but it might be way easier than actually solving alignment.
Short summary: Biological anchors are a bad way to predict AGI. It’s a case of “argument from comparable resource consumption.” Analogy: human brains use 20 Watts. Therefore, when we have computers with 20 Watts, we’ll have AGI! The 2020 OpenPhil estimate of 2050 is based on a biological anchor, so we should ignore it.
Longer summary:
Lots of folks made bad AGI predictions by asking:
1. How much compute is needed for AGI?
2. When will that compute be available?
To find (1), they use a “biological anchor,” like the computing power of the human brain, or the total compute used to evolve human brains.
Hans Moravec, 1988: the human brain uses 10^13 ops/s, and computers with this power will be available in 2010.
Eliezer objects that:
“We’ll have computers as fast as human brains in 2010” doesn’t imply “we’ll have strong AI in 2010.”
The compute needed depends on how well we understand cognition and computer science. It might be done with a hypercomputer but very little knowledge, or a modest computer but lots of knowledge.
An AGI wouldn’t actually need 10^13 ops/s, because human brains are inefficient. For example, they do lots of operations in parallel that could be replaced with fewer operations in series.
Eliezer, 1999: he mentions that he, too, made bad AGI predictions as a teenager.
Ray Kurzweil, 2001: same idea as Moravec, but with 10^16 ops/s. Not worth repeating.
Someone, 2006: it took ~10^43 ops for evolution to create human brains. It’ll be a very long time before a computer can reach 10^43 ops, so AGI is very far away.
Eliezer objects that the use of a biological anchor is sufficient to make this estimate useless. It’s a case of a more general “argument from comparable resource consumption.”
Analogy: human brains use 20 Watts. Therefore, when we have computers with 20 Watts, we’ll have AGI!
OpenPhil, 2020: A much more sophisticated estimate, but still based on a biological anchor. They predict AGI in 2050.
How the new model works:
Demand side: Estimate how many neural-network parameters would emulate a brain. Use this to find the computational cost of training such a model. (I think this part mischaracterizes OpenPhil’s work, my comments at the bottom)
Supply side: Moore’s law, plus two assumptions:
- Willingness to spend on AGI training is a fixed percent of GDP
- “Computation required to accomplish a fixed task decreases by half every 2-3 years due to better algorithms.”
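The supply side described above can be sketched as three growth curves multiplied together. This is my own toy illustration with made-up numbers (the `base_flop`, doubling, and growth rates are assumptions for the sketch, not OpenPhil’s actual parameters):

```python
# Toy sketch of the Bio Anchors supply side. All numbers are illustrative
# assumptions, not OpenPhil's actual parameters.

def effective_compute(year, base_year=2020,
                      base_flop=1e24,          # assumed affordable training FLOP in base year
                      hardware_doubling=2.5,   # years per doubling of FLOP per dollar
                      spend_growth=1.03,       # spending tracks GDP growth (~3%/yr), fixed share
                      algo_halving=2.5):       # years to halve compute needed for a fixed task
    """Effective training compute available, in 'base-year-algorithm' FLOP."""
    t = year - base_year
    hardware = 2 ** (t / hardware_doubling)    # Moore's-law-style price/performance gains
    spending = spend_growth ** t               # fixed fraction of a growing GDP
    algorithms = 2 ** (t / algo_halving)       # algorithmic progress acts as a multiplier
    return base_flop * hardware * spending * algorithms

# Under these assumptions, effective compute grows roughly 1.8x per year.
# The model's forecast is the year this curve crosses the demand-side
# estimate (e.g. the compute needed to train a brain-sized model).
for year in (2020, 2030, 2040, 2050):
    print(year, f"{effective_compute(year):.2e}")
```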
Eliezer’s objections:
(Surprise!) It’s still founded on a biological anchor, which by itself is sufficient to make it invalid.
OpenPhil models theoretical AI progress as algorithms getting twice as efficient every 2-3 years. This is a bad model, because folks keep finding entirely new approaches. Specifically, it implies “we should be able to replicate any modern feat of deep learning performed in 2021, using techniques from before deep learning and around fifty times as much computing power.”
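As a sanity check on that quoted implication (my own arithmetic, not from the post): under a 2-3 year halving time, a 50x compute multiplier corresponds to log2(50) ≈ 5.6 halvings, i.e. roughly 11-17 years of algorithmic progress:

```python
import math

def compute_multiplier(years_back, halving_years):
    """Factor by which compute requirements for a fixed task were larger
    `years_back` years ago, assuming they halve every `halving_years` years."""
    return 2 ** (years_back / halving_years)

halvings_for_50x = math.log2(50)                   # ~5.6 halvings give a 50x multiplier
print(halvings_for_50x * 2, halvings_for_50x * 3)  # ~11 to ~17 years at a 2-3 year halving time
print(compute_multiplier(14, 2.5))                 # ~48.5, close to the quoted "fifty times"
```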
Some of OpenPhil’s parameters make it easy for the modelers to cheat and tune the model toward an answer they like:
“I was wondering what sort of tunable underdetermined parameters enabled your model to nail the psychologically overdetermined final figure of ’30 years’ so exactly.”
Can’t we use this as an upper bound? Maybe AGI will come sooner, but surely it won’t take longer than this estimate.
Eliezer thinks this is the same non-sequitur as Moravec’s. If you train a model big enough to emulate a brain, that doesn’t mean AGI will pop out at the end.
Other commentary: Eliezer mentions several times that he’s feeling old, tired, and unhealthy. He’s frustrated that researchers today repeat decades-old bad arguments, and it takes him a lot of energy to rebut them.
My thoughts:
I found this persuasive, but I also think it mischaracterized the OpenPhil model.
My understanding is that OpenPhil didn’t just estimate the number of neural-network parameters required to emulate a human brain. They used six different biological anchors, including the “evolution anchor,” which I find very useful for an upper bound.
Holden Karnofsky, who seems to put much more stock in the Bio Anchors model than Eliezer does, explains the model really well here. But I was frustrated to see that the write-up on Holden’s blog gives 50% by 2090 (first graph) using the evolution anchor, while the same graph in the original calculations gives only 11%. Was the model tuned after seeing the results?
My conclusion: Bio Anchors is a terrible way to model when AGI will actually arrive. But I don’t agree with Eliezer’s dismissal of using Bio Anchors to get an upper bound, because I think the evolution anchor achieves this.