Bumping someone else’s comment on the Gwern GPT-4 linkpost that now seems deleted:
MMLU 86.4% is impressive, predictions were around 80%.
This does seem like quite a significant jump (among all the general capabilities jumps shown in the rest of the paper). The previous SOTA was only 75.2% for Flan-PaLM (5-shot, finetuned, CoT + SC).
And that previous SOTA was for a model fine-tuned on MMLU; the few-shot capabilities actually jumped from 70.7% to 86.4%!