The same tokenizer trained on different data leads to different glitch tokens; see e.g. the comparison of Llama-family models in Yuxi Li et al. 2024, https://arxiv.org/abs/2404.09894
Good point! I hadn’t quite realized that, although it seems obvious in retrospect.
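To make the comparison concrete, here is a minimal sketch of how one could probe two checkpoints that share a tokenizer for glitch-token candidates. It uses a low input-embedding-norm heuristic as a rough proxy for under-trained tokens; this is not the detection method from the cited paper, and the two model names are just illustrative examples of gated Llama-family repos that share a tokenizer.

```python
# Sketch: compare glitch-token *candidates* across two models sharing a tokenizer.
# Heuristic assumption: tokens with unusually small input-embedding norms are
# likely under-trained and thus glitch-token candidates.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def glitch_candidates(model_name: str, k: int = 50) -> set[str]:
    """Return the k tokens with the smallest input-embedding norms."""
    tok = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.float32)
    emb = model.get_input_embeddings().weight.detach()  # (vocab_size, hidden_dim)
    norms = emb.norm(dim=-1)
    idx = torch.argsort(norms)[:k]                      # smallest norms first
    return {tok.convert_ids_to_tokens(int(i)) for i in idx}

# Illustrative checkpoints (same tokenizer, different training data):
a = glitch_candidates("meta-llama/Llama-2-7b-hf")
b = glitch_candidates("meta-llama/Llama-2-7b-chat-hf")
print("shared candidates:", len(a & b))
print("unique to one model:", len(a ^ b))
```

If the claim holds, the symmetric difference should be non-trivial even though both models use the identical vocabulary.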