The way I set up the analogy makes it seem like AutomatedCorp has a serial compute advantage: because they have 50 years, they can run things that take many serial years while NormalCorp can't. As in, the exact analogy implies that AutomatedCorp could use a tenth of its serial time to run a 5-year-long training run on 50k H100s, while NormalCorp could only do the equivalent if the run were sufficiently parallelizable that it could be done on 2.5 million H100s in a tenth of a year. So, you should ignore any serial compute advantage. Similarly, you should ignore difficulties that SlowCorp might have in parallelizing things sufficiently, etc.
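To spell out the arithmetic behind the 2.5 million figure, here's a minimal sketch in Python (the numbers are just the ones from the analogy; treating compute as freely interchangeable between serial time and parallel GPUs is exactly the assumption being questioned):

```python
# AutomatedCorp's hypothetical run: 5 years of wall-clock time on 50k H100s.
run_gpu_years = 5 * 50_000          # 250,000 H100-years of total compute

# For NormalCorp (roughly 1 year of wall-clock time) to fit the same run
# into a tenth of its serial time, it must compress into 0.1 years:
wall_clock_years = 0.1
gpus_needed = run_gpu_years / wall_clock_years
print(f"{gpus_needed:,.0f}")        # 2,500,000 H100s, and only if the run
                                    # parallelizes perfectly across them
```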
You can also imagine that SlowCorp has 10 million magically good GPUs (and CPUs, etc.) which are like H100s but 50x serially faster (but it still only has 1 week), while AutomatedCorp has 10 million much worse versions of H100s (and CPUs, etc.) which are 50x serially slower but otherwise the same (and still has 50 years).
Also, SlowCorp has magically 50x better networking equipment than NormalCorp, 50x higher rate limits on every site they're trying to scrape, 50x as much sensor data from any process in the world, 50x faster shipping on any physical components they need, and so on (and AutomatedCorp has magically 50x worse versions of all of those things).
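As a sanity check on this rescaled version of the setup, the total H100-equivalent compute comes out (nearly) the same for both corps; only the serial/parallel split changes. A quick back-of-the-envelope, assuming one H100-week of compute per H100 per week (the small gap is just 52 weeks per year vs. the round 50x factor):

```python
WEEKS_PER_YEAR = 52

# SlowCorp: 10M GPUs, each 50x faster than an H100, for 1 week.
slowcorp = 10_000_000 * 50 * 1                                # 500M H100-weeks

# AutomatedCorp: 10M GPUs, each 50x slower, for 50 years.
automatedcorp = 10_000_000 * (1 / 50) * 50 * WEEKS_PER_YEAR   # 520M H100-weeks

print(slowcorp, automatedcorp)  # equal up to the 52-weeks-vs-50x rounding
```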
But yeah, agreed that you should ignore all of those intuitions when considering the "1 week" scenario; I just found that I couldn't actually turn them off when considering the scenario.
Yep, but my understanding is that the delays associated with marginal scraping, sensor data, and physical components don't matter much when talking about AI progress on the order of a year. Or honestly, maybe marginal improvements in these sorts of inputs don't matter that much at all over this timescale (freezing all of them for a year wouldn't be much of a tax if you prepped in advance). I'm not super sure about the situation with scraping, though.
Yeah, I discuss this in the caveat above (the passage beginning "You can also imagine that SlowCorp has 10 million magically good GPUs...").