Re: my tweet about the cost of training GPT-4. It wasn’t my own estimate of the cost of training GPT-4 on H100s; it was the SemiAnalysis estimate. Also, there are several reasonable ways to define “cost of training GPT-4”, and some of them easily give figures 5x higher (e.g. see this post and comments). From now on, I’ll spell out the definition I’m using.
I agree you can’t just drop this money and expect to train GPT-4 (or more companies would have a GPT-4-level model by now). I was thinking more about what it costs the leading labs to train a foundation model roughly on the scale of GPT-4 or slightly beyond (but, e.g., with different modalities or a mostly synthetic training set). That said, this is a different cost estimate, because those labs already have the H100s (see linked post). I was making the comparison to the $10B Meta reportedly spent investing in the Metaverse in 2021.