It’s probably just a difference in tokenizers. Tokenizers often produce tokens with trailing whitespace. I actually once wrote a tokenizer and trained a model to predict a “negative whitespace” token for the rare cases where a token shouldn’t have trailing whitespace. But I don’t know how current tokenizers handle this; probably in different ways.
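For what it’s worth, the “negative whitespace” idea can be sketched like this (a toy scheme, not any real tokenizer; the `<nw>` token name is made up here — common BPE tokenizers instead tend to fold the space into the start of the following token):

```python
def detokenize(tokens):
    """Toy detokenizer: every token gets a trailing space by default,
    and a special "<nw>" token cancels the space added by the previous
    token -- the "negative whitespace" idea described above."""
    out = []
    for tok in tokens:
        if tok == "<nw>":
            # Remove the trailing space contributed by the previous token.
            if out and out[-1].endswith(" "):
                out[-1] = out[-1][:-1]
        else:
            out.append(tok + " ")
    return "".join(out).rstrip(" ")

# "<nw>" before "," and "!" suppresses the space in front of punctuation:
print(detokenize(["Hello", "<nw>", ",", "world", "<nw>", "!"]))
# → Hello, world!
```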
That would be my main guess as well (75%), but not the overwhelmingly likely option.