Our current rate of progress from GPT-2 --> GPT-3 --> GPT-4 has been rapid, but it has been sustained mostly by scaling up training compute budgets by roughly 2 OOMs (orders of magnitude) at each iteration.
Do you have a source for the claim that GPT-3 --> GPT-4 was a ~2 OOM increase in compute budget? On the Lex Fridman podcast, Sam Altman seems to say it was ~100 different tricks.
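For reference, here's a quick back-of-the-envelope sanity check of the "~2 OOMs per iteration" figure, using publicly available training-compute estimates. Only the GPT-3 number comes from the model's own paper (Brown et al. 2020); the GPT-2 and GPT-4 figures are rough third-party estimates, since OpenAI has not published GPT-4's training compute:

```python
import math

# Training compute in FLOP. GPT-3's figure is reported in the GPT-3
# paper; the GPT-2 and GPT-4 figures are rough third-party estimates.
train_flop = {
    "GPT-2": 1.5e21,   # rough estimate
    "GPT-3": 3.14e23,  # reported by Brown et al. 2020
    "GPT-4": 2e25,     # third-party estimate; not officially published
}

# Compare each model to its predecessor in orders of magnitude (OOMs).
models = list(train_flop)
for prev, curr in zip(models, models[1:]):
    ooms = math.log10(train_flop[curr] / train_flop[prev])
    print(f"{prev} -> {curr}: ~{ooms:.1f} OOMs of training compute")

# Approximate output:
# GPT-2 -> GPT-3: ~2.3 OOMs of training compute
# GPT-3 -> GPT-4: ~1.8 OOMs of training compute
```

Under these estimates, each jump is in the ballpark of ~2 OOMs, though the GPT-3 --> GPT-4 step looks closer to ~1.8. Note also that more compute and "~100 different tricks" aren't mutually exclusive explanations; algorithmic improvements and compute scaling can both contribute to the same capability jump.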