Training a 6B-parameter model on 500B tokens costs about 20k USD. That number increases linearly with model size and with the number of tokens, and decreases with the amount of money you have. Work like this is super doable, especially at large labs.
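As a rough illustration of the linear scaling, here is a minimal Python sketch that extrapolates from the 6B / 500B / 20k USD reference point. The function name `estimate_training_cost_usd` and the constants are hypothetical names made up for this example, and the sketch assumes cost is exactly proportional to parameters times tokens, ignoring hardware efficiency, pricing changes, and anything else that would bend the line.

```python
# Reference point taken from the text: 6B params, 500B tokens, ~20k USD.
BASE_PARAMS = 6e9
BASE_TOKENS = 500e9
BASE_COST_USD = 20_000.0


def estimate_training_cost_usd(params: float, tokens: float) -> float:
    """Extrapolate training cost, assuming it is linear in both
    parameter count and token count (an assumption, not a measurement)."""
    if params <= 0 or tokens <= 0:
        raise ValueError("params and tokens must be positive")
    return BASE_COST_USD * (params / BASE_PARAMS) * (tokens / BASE_TOKENS)


if __name__ == "__main__":
    # Doubling either the model size or the token count doubles the estimate.
    for params, tokens in [(6e9, 500e9), (12e9, 500e9), (6e9, 1e12)]:
        cost = estimate_training_cost_usd(params, tokens)
        print(f"{params / 1e9:.0f}B params, {tokens / 1e9:.0f}B tokens: ~{cost:,.0f} USD")
```

Under this assumption, doubling the model size or the token budget doubles the estimate, and doubling both quadruples it.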