A simpler related question that I don’t know off-hand is: what prevents trillion-parameter NNs? Does the training-data requirement scale with network size? (In which case I don’t expect this to be a problem for long, because I expect we’ll find algorithms with human-level data efficiency before we get AGI.) Or is it just the limited memory capacity per GPU, and the hassle / overhead / cost of parallelization? (In which case, again, I expect we’ll get dramatically more parallelizable algorithms in the near future.)
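On the per-GPU memory point, a rough back-of-envelope sketch may help (the byte counts are illustrative assumptions: fp32 weights at 4 bytes/param, and ~16 bytes/param during training once gradients and Adam-style optimizer states are included; the 32 GB GPU is a stand-in figure):

```python
# Back-of-envelope: memory footprint of a trillion-parameter model.
# Assumed, illustrative numbers -- not from the original discussion.
params = 1e12
bytes_per_param_weights = 4    # fp32 weights alone
bytes_per_param_training = 16  # weights + grads + Adam moments (fp32)

weights_tb = params * bytes_per_param_weights / 1e12    # terabytes
training_tb = params * bytes_per_param_training / 1e12  # terabytes
gpu_mem_tb = 32 / 1e3  # a hypothetical 32 GB GPU, in TB

print(f"weights alone:  {weights_tb:.0f} TB")
print(f"training state: {training_tb:.0f} TB")
print(f"GPUs needed just to hold training state: {training_tb / gpu_mem_tb:.0f}")
```

So even before any training data enters the picture, the weights alone are orders of magnitude beyond a single GPU, forcing model parallelism across hundreds of devices, which is exactly the hassle / overhead / cost mentioned above.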
See AI Impacts articles on ‘Human Level Hardware’ if you haven’t already. I haven’t dug into it myself, but I agree that your question is a good one.