Costs don’t really grow linearly with model size, because utilization drops as you spread a model across many GPUs; i.e., aggregate memory requirements grow superlinearly. Relatedly, in OpenAI’s dataset, model sizes increased <100x while compute increased 300,000x. That’s been updating my views a bit recently.
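To make the superlinearity concrete, here is a toy cost model. All the numbers and the decay rate are hypothetical illustrations, not measurements: assume per-GPU utilization falls by a constant factor each time you double the GPU count, so the GPU-hours needed for a fixed training budget grow superlinearly.

```python
import math

def utilization(n_gpus, decay_per_doubling=0.9):
    """Toy model: fraction of peak throughput achieved on n_gpus,
    decaying by a constant factor per doubling of GPU count."""
    return decay_per_doubling ** math.log2(n_gpus)

def gpu_hours(total_flops, peak_flops_per_gpu_hour, n_gpus):
    """GPU-hours billed = ideal GPU-hours divided by achieved utilization."""
    ideal = total_flops / peak_flops_per_gpu_hour
    return ideal / utilization(n_gpus)

# Under these assumed numbers, spreading the same workload over 64 GPUs
# costs ~1.9x the GPU-hours of running it on one:
# gpu_hours(1e18, 1e15, 64) / gpu_hours(1e18, 1e15, 1) ≈ 1.88
```

The point is only qualitative: any persistent per-doubling utilization loss makes total cost superlinear in GPU count.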
People are trying to solve this with approaches like GPipe, but I don’t know yet whether any approach can scale to many more TPUs than the eight they tried. Communication would be the next bottleneck.
https://ai.googleblog.com/2019/03/introducing-gpipe-open-source-library.html?m=1
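For intuition on why scaling much past 8 partitions is hard: the GPipe paper estimates the pipeline’s idle (“bubble”) time at roughly (K−1)/(M+K−1) for K stages and M micro-batches, so deeper pipelines need proportionally more micro-batches just to stay busy. A quick sketch (numbers illustrative):

```python
def bubble_fraction(stages, microbatches):
    """Approximate idle ('bubble') fraction of a GPipe-style pipeline:
    (K - 1) / (M + K - 1) for K stages and M micro-batches."""
    return (stages - 1) / (microbatches + stages - 1)

# With 8 stages and 32 micro-batches, ~18% of pipeline time is idle;
# at 64 stages with the same 32 micro-batches, ~66% is idle.
# bubble_fraction(8, 32) ≈ 0.18
# bubble_fraction(64, 32) ≈ 0.66
```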
Yes, this is a good point. Nonetheless, for whether social learning is important, the relevant question is the relative efficiency of the “social learning channel” and the “scaled-up GPUs channel”. The social learning channel requires you to work with inputs and outputs only (language and behavior, for humans), while scaled-up GPUs let you work directly with internal representations, so I still expect that social learning won’t be particularly important for AI systems unless they use it to learn from humans.
This doesn’t seem relevant to the question of whether social learning is important? Perhaps you were just stating an interesting related fact, but if you were trying to make a point, I don’t know what it is.
Yep, my comment was about the linear scale-up rather than its implications for social learning.