A major factor that I did not see on the list is the rate of progress on algorithms for deep AI systems, and the closely related formal understanding of them. Right now these algorithms can be surprisingly effective (AlphaZero, GPT-3) but are extremely compute-intensive and often sample-inefficient. Lacking any comprehensive formal model of why deep learning works as well as it does, and why it fails when it does, we are groping toward better systems.
Right now the incentives favor scaling compute to get more marquee results, since finding more efficient algorithms doesn't scale as well with increased money. However, the effort to make deep learning more efficient continues and can probably give us multiple orders of magnitude of improvement in both compute and sample efficiency.
Orders-of-magnitude improvements in the algorithms would be consistent with our experience in many other areas of computing, where speedups due to better algorithms have often beaten speedups due to hardware.
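As a toy illustration of how algorithmic gains can dwarf hardware gains (the sorting example and the specific numbers are mine, not from the comment above): switching from a quadratic algorithm to an n·log n one on a million-item input saves far more than years of hardware improvement would.

```python
import math

# Toy comparison: operation counts for a quadratic algorithm
# (e.g. insertion sort) vs. an n*log2(n) algorithm (e.g. merge sort)
# on an input of one million items.
n = 1_000_000
quadratic_ops = n ** 2          # ~1e12 operations
nlogn_ops = n * math.log2(n)    # ~2e7 operations

speedup = quadratic_ops / nlogn_ops
print(f"algorithmic speedup at n={n}: ~{speedup:,.0f}x")
```

At this input size the algorithmic switch alone is worth roughly a 50,000x speedup, several orders of magnitude more than a typical generation of hardware delivers.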
Note that this is (more or less) independent of advances that contribute directly to AGI. For example, algorithmic improvements may let us train GPT-3 on 100 times less data, with 1000 times less compute, but may not suggest how to make the GPT series fundamentally smarter or more capable, except by making it bigger.
Hmm, interesting point. I had considered things like "New insights accelerate AI development" but I didn't put them in because they seemed too closely intertwined with AI timelines. But yeah, now that you mention it, I think it deserves to be included. Will add!
GPT-3 is very sample-efficient. You can put in just a few examples, and it’ll learn a new task, much like a human would!
Oh, did you mean sample-inefficient in training data? Yeah, I suppose, but I don't see why anyone particularly cares about that.
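To make concrete what "just a few examples" means here (a hypothetical sketch; the task and prompt wording are my illustration, and the actual model call is omitted): few-shot prompting simply concatenates a handful of input/output pairs and lets the model complete the next one.

```python
# A minimal sketch of a few-shot prompt for an English->French task.
# This only builds the prompt string; sending it to a model is omitted.
examples = [
    ("sea otter", "loutre de mer"),
    ("cheese", "fromage"),
    ("peppermint", "menthe poivrée"),
]

prompt = "Translate English to French:\n"
for english, french in examples:
    prompt += f"{english} => {french}\n"
prompt += "plush giraffe => "  # the model would complete this line

print(prompt)
```

The point is that "learning" the task happens at inference time from three demonstrations, with no gradient updates, which is the sense in which the model is sample-efficient.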