gwern comments on Are Emergent Abilities of Large Language Models a Mirage? [linkpost]

gwern 16 Aug 2025 4:08 UTC
2 points

> For these reasons, I don’t really find the metrics used in the papers ad hoc, except to the extent that “award partial credit for answers that are close to correct” is ad hoc.

If they are not ad hoc, what have they successfully predicted in the two and a half years since they concocted their original reparameterizations like ‘BPE edit distance’ to ‘explain’ past emergence?