Dario rejects doomerism re:misalignment, fair enough.
But what of doomerism re:slowdown?
> Furthermore, the last few years should make clear that the idea of stopping or even substantially slowing the technology is fundamentally untenable. The formula for building powerful AI systems is incredibly simple, so much so that it can almost be said to emerge spontaneously from the right combination of data and raw computation.
>
> [...]
>
> If all companies in democratic countries stopped or slowed development, by mutual agreement or regulatory decree, then authoritarian countries would simply keep going. Given the incredible economic and military value of the technology, together with the lack of any meaningful enforcement mechanism, I don’t see how we could possibly convince them to stop.
Predicting the difficulty of coordinating on and enforcing a “substantial” slowdown of AI development seems about as hard as predicting the difficulty of avoiding misalignment? Perhaps it is true that such a slowdown would be historically unprecedented, but, as Dario notes, the whole possibility of a country of geniuses in a datacenter is historically unprecedented too.
I would love to hear arguments from the slowdown-pessimistic worldview generally.
Specifically, perhaps addressing:
1. How do we know we have exhausted, or mostly saturated, communication about the risks within the ~US, such that further efforts on that front wouldn’t yield meaningful returns?
- (Sure, the current administration does seem to have very little sympathy for anything like a slowdown, but that was not the case for the previous one? Isn’t there a lot of variance on this?)
2. How do we know geopolitical adversaries wouldn’t agree to a slowdown, if it were seriously bargained for by a coalition of the willing?
3. How might the situation on the above two change if there were significantly more legible evidence of, and understanding about, the risks? What about in the case of a “warning shot”?
What level of evidence would policymakers likely require to seriously consider negotiating a slowdown?
4. How difficult would oversight or enforcement of a slowdown policy be?
(Even if adversaries develop independent chip-manufacturing supply chains within a few years, won’t training “powerful AIs” remain a highly resource-intensive, observable, and disruptable process for likely ~decades?)
5. How much might early, not-yet-too-powerful AIs help us with the coordination and enforcement of a slowdown?
It is a bit ambiguous from your reply whether you mean distributed AI deployment or distributed training. Agreed that distributed deployment seems very hard to police once training has taken place, which also implies that a large amount of compute is available somewhere.
As for training, I guess the hope for enforcement would be the ability to constrain (or at least monitor) the total available compute and hardware manufacturing.
Even if you do the training in a distributed fashion, you would need the same number of chips. (Probably more, by some multiplier, to pay for the increased latency? And since you can’t distribute it to an arbitrary extent, you still need large datacenters, which are hard to hide.) The rough sketch below illustrates the scale involved.
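To make the “hard to hide” point concrete, here is a minimal back-of-the-envelope sketch. The run size, chip throughput, utilization, and latency penalty are all my own assumptions, not anyone’s reported numbers:

```python
# Back-of-the-envelope: hardware footprint of a frontier-scale training run.
# All numbers below are rough assumptions, not lab-reported figures.

SECONDS_PER_DAY = 86_400

total_train_flop = 1e26    # assumed compute budget for a "powerful AI" run
chip_peak_flops = 1e15     # ~H100-class dense BF16 throughput, FLOP/s (rough)
utilization = 0.4          # assumed realized fraction of peak during training
run_days = 90              # assumed wall-clock length of the run
latency_penalty = 1.5      # assumed extra chips needed if split across sites

flop_per_chip = chip_peak_flops * utilization * run_days * SECONDS_PER_DAY
chips_colocated = total_train_flop / flop_per_chip
chips_distributed = chips_colocated * latency_penalty

# ~1 kW per accelerator including cooling/networking overhead (assumption)
power_mw = chips_distributed / 1000

print(f"co-located:  {chips_colocated:,.0f} chips")
print(f"distributed: {chips_distributed:,.0f} chips, ~{power_mw:,.0f} MW")
# Either way: tens of thousands of accelerators drawing tens of MW, a
# footprint visible to export controls and energy/satellite monitoring.
```

Under these assumptions you land at ~30,000-50,000 accelerators and tens of MW whether the run is co-located or split across sites, which is the kind of footprint monitoring could plausibly catch.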
Disguising hardware production seems much harder than disguising training runs or deployment.
Perhaps a counter is “algorithmic improvement”, which Epoch estimates to be providing a ~3x/year effective-compute gain.
This is important, but:
- Compute scaling is estimated (again, by Epoch) at 4-5x/year. So if we assume both trends continue, your timeline to dangerous AI is, say, 5 years, and we freeze compute scaling today (such that only today’s largest training run is available in the future), then IIUC you would gain ~7 years (which is something!). (See the first sketch after this list.)
But, importantly, the gain scales with your timeline: if I did the math correctly, you get linearly ~1.5x extra time.
(So if your timeline for dangerous AI was 2036, it would be pushed out to ~2050.)
- I’m sceptical that “algorithmic improvement” can be extrapolated indefinitely: it would be surprising if, ~8 years from now, you could train a GPT-3-level model on a single V100 GPU in a few months. (You need to get a certain number of bits into the AI; there is no way around that.) (See the second sketch below.)
(At least this should become increasingly true as labs exhaust more and more of the low-hanging fruit of hardware optimisation?)
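For the record, here is the arithmetic behind the ~7 years and ~1.5x claims, as a minimal sketch. The growth rates are Epoch’s estimates; the freeze model itself is my simplification:

```python
import math

# Epoch's estimated growth rates; the derivation below is my own arithmetic.
ALGO_GAIN = 3.0       # effective-compute gain from algorithms, x/year
COMPUTE_GAIN = 4.5    # training-compute scaling, x/year (midpoint of 4-5x)

def extra_years(timeline_years: float) -> float:
    """Extra time bought if hardware scaling freezes today but algorithmic
    progress continues at the same rate.

    Effective compute needed: (ALGO_GAIN * COMPUTE_GAIN) ** T.
    Under a freeze it grows only as ALGO_GAIN ** t; solving for t gives
    t = T * log(ALGO_GAIN * COMPUTE_GAIN) / log(ALGO_GAIN).
    """
    t = timeline_years * math.log(ALGO_GAIN * COMPUTE_GAIN) / math.log(ALGO_GAIN)
    return t - timeline_years

print(extra_years(5))    # ~6.9 extra years on a 5-year timeline
print(extra_years(11))   # ~15 extra years: a 2036 timeline slips to ~2051

# The gain is linear in T: extra = T * log(COMPUTE_GAIN) / log(ALGO_GAIN),
# i.e. roughly 1.3-1.5x the original timeline across the 4-5x/year range.
```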
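And here is the extrapolation behind the GPT-3-on-a-V100 bullet, as a sanity check. GPT-3’s training compute and the V100’s peak throughput are public figures; the utilization is an assumption of mine:

```python
# Sanity check on extrapolating 3x/year algorithmic gains for ~8 years:
# what it would imply for training a GPT-3-level model on one V100.
GPT3_FLOP = 3.14e23     # reported GPT-3 training compute
V100_FLOPS = 1.25e14    # V100 peak FP16 throughput, ~125 TFLOP/s
UTILIZATION = 0.3       # assumed realized fraction of peak

effective_flop_needed = GPT3_FLOP / 3.0 ** 8   # ~4.8e19 after 8 years of 3x/year

days = effective_flop_needed / (V100_FLOPS * UTILIZATION) / 86_400
print(f"~{days:.0f} days")   # ~15 days on a single V100
# Taken literally, the trend implies one V100 trains a GPT-3-equivalent in
# ~2 weeks, which is the implausible-sounding implication the bullet points at.
```

If anything the literal extrapolation (~2 weeks, not a few months) is even more surprising, which is the point of the scepticism above.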
Also, contingent on point 2 of my original comment, all of the above could be much, much easier if we are not assuming a 100% adversarial scenario, i.e., if the adversary is willing to cooperate in the effective implementation of a treaty.