And also, they get backed into a corner once they write the first name: from that point the prediction is that they will get close, rather than that they will admit they don’t have a full solution.
This is a contingent tuning issue though, not a fundamental limitation. Chatbots are not predictors; they make use of meaningful features that formed while the base model was learning to solve its prediction task. It should be possible to tune the same base model to notice that it has apparently committed to something it can’t carry out and so needs to pivot. Eliciting in-context awareness of errors might be easier than avoiding hallucination in the first place, let alone setting up more expensive and complicated scaffolding.
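As a rough illustration of what such tuning data could look like (a minimal sketch; the transcripts, field names, and output file are all hypothetical, not from any existing pipeline), one could assemble supervised fine-tuning pairs whose prompt shows the model already over-committed mid-answer and whose target continuation acknowledges that and pivots instead of confabulating:

```python
# Hypothetical sketch: build SFT pairs that reward pivoting mid-answer
# once the model has committed to more than it can actually deliver.
import json

# Made-up transcripts where the assistant promised a full list but only
# knows some of the items.
transcripts = [
    {
        "prompt": "User: Name the five authors of the paper.\n"
                  "Assistant: The five authors are J. Smith,",
        "known_items": ["J. Smith"],
        "promised_count": 5,
    },
]

def make_pivot_target(example):
    """Target continuation that notices the over-commitment and pivots."""
    known = ", ".join(example["known_items"])
    return (
        f" actually, I should pause here: I can only verify "
        f"{len(example['known_items'])} of the {example['promised_count']} "
        f"names ({known}), so rather than guess the rest I'd suggest "
        f"checking the paper itself."
    )

with open("pivot_sft.jsonl", "w") as f:
    for ex in transcripts:
        f.write(json.dumps({"prompt": ex["prompt"],
                            "completion": make_pivot_target(ex)}) + "\n")
```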
Here’s the actual paper:
T Besiroglu et al. (Apr 2024) Chinchilla Scaling: A Replication Attempt
The lasting impact of the Chinchilla paper might be mostly the experimental methodology rather than the specific scaling laws (apart from the 20 tokens per parameter rule of thumb, which the Besiroglu paper upholds; a back-of-the-envelope sketch of it follows the list below). How the learning rate schedule has to be chosen for a given training horizon, since continuing training past it breaks optimality. And how isoFLOP plots gesture at the correct optimization problem to be solving, as opposed to primarily paying attention to training steps or parameter counts. Subsequent studies build on these lessons toward new regimes, in particular:
N Muennighoff et al. (May 2023) Scaling Data-Constrained Language Models
Together AI (Dec 2023) StripedHyena
SY Gadre et al. (Mar 2024) Language Models Scale Reliably with Over-training and on Downstream Tasks
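To make the 20 tokens per parameter rule concrete, here is a back-of-the-envelope sketch (not from either paper) using the standard C ≈ 6ND approximation for training FLOPs; the budget figure is made up for illustration:

```python
# Sketch of the Chinchilla-style 20 tokens-per-parameter heuristic,
# splitting a training FLOP budget C ~ 6*N*D into params N and tokens D.
import math

def chinchilla_optimal(compute_flops, tokens_per_param=20.0):
    """Return (N, D) with D = tokens_per_param * N under C = 6*N*D."""
    # C = 6*N*D and D = r*N  =>  N = sqrt(C / (6*r)), D = r*N
    n_params = math.sqrt(compute_flops / (6.0 * tokens_per_param))
    n_tokens = tokens_per_param * n_params
    return n_params, n_tokens

if __name__ == "__main__":
    budget = 1e23  # hypothetical training budget in FLOPs
    n, d = chinchilla_optimal(budget)
    print(f"~{n / 1e9:.1f}B params, ~{d / 1e9:.0f}B tokens")
    # prints ~28.9B params, ~577B tokens; 6*N*D recovers ~1e23 FLOPs
```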