And yet, current LLMs have noticeably different personas from each other, as well as coding skills that significantly outstrip what you would expect from imitation of the corpus. So their post-training has a large impact.
Pre-training forms the foundation, giving the model common sense and general abilities (LeCun: “Self-supervised learning: The dark matter of intelligence”; tailcalled: “At its most basic, unsupervised prediction forms a good foundation for later specializing the map to perform specific types of prediction”), while reinforcement learning adds something like goal orientation on top.
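To make the division concrete (a minimal sketch in standard notation, not drawn from either source): pre-training minimizes next-token cross-entropy over a corpus distribution $\mathcal{D}$, while RL post-training maximizes expected reward over the model's own sampled outputs,

$$\mathcal{L}_{\text{pre}}(\theta) = -\,\mathbb{E}_{x \sim \mathcal{D}} \sum_{t} \log p_\theta(x_t \mid x_{<t}), \qquad J_{\text{RL}}(\theta) = \mathbb{E}_{y \sim \pi_\theta(\cdot \mid x)}\big[\, r(x, y) \,\big].$$

The first objective can at best reproduce the corpus distribution; the second optimizes against a goal signal $r$, which is one way to see how post-training can push skills like coding past what imitation alone would predict.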