papetoast comments on Vladimir_Nesov’s Shortform

papetoast 29 Apr 2026 1:57 UTC
- GPT 5.2 is dropping before its knowledge cutoff.
- The decays are probably because there is less training data about recent deaths, and because pre-training may have started before the knowledge cutoff.
- Older models having better rote memorization of slightly obscure facts isn’t that surprising imo. It is not something under a lot of optimization pressure.
- Having multiple variables mixed doesn’t seem like a big issue for detecting ancestry. False positives will still be highly unlikely, since different pretrains will probably have different “forgetting curves”.
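
To make the "forgetting curve" comparison concrete, here is a rough sketch of one way it could be scored. Everything in it is my own assumption rather than anything from the original analysis: the function names are hypothetical, the per-period accuracy numbers are invented, and Pearson correlation on period-to-period changes is just one plausible similarity measure.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        raise ValueError("a constant sequence has no defined correlation")
    return cov / sqrt(vx * vy)


def changes(curve):
    """Period-to-period differences, so the comparison is about the
    shape of the curve rather than its overall level."""
    return [later - earlier for earlier, later in zip(curve, curve[1:])]


def curve_similarity(curve_a, curve_b):
    """Hypothetical ancestry score: correlation of the two curves' changes."""
    return pearson(changes(curve_a), changes(curve_b))


# Invented recall accuracy per time bucket (oldest to newest), for illustration only.
model_a = [0.90, 0.85, 0.88, 0.70, 0.72, 0.50]
model_b = [0.80, 0.75, 0.78, 0.60, 0.62, 0.40]  # same shape, lower level
model_c = [0.90, 0.88, 0.80, 0.79, 0.60, 0.58]  # different shape

print(curve_similarity(model_a, model_b))
print(curve_similarity(model_a, model_c))
```

If two models share a pretrain, the hope is that the first kind of score stays high even with other variables mixed in, while unrelated pretrains look more like the second. A real test would need many more buckets and some null distribution to calibrate against.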