Rauno Arike comments on faul_sname’s Shortform

Rauno Arike 1 Oct 2025 20:18 UTC
They’re fairly uncommon words, and other words would fit more naturally in the contexts where “overshadows” and “disclaimers” were used. If “overshadow” and “disclaim” aren’t just pad tokens but instead carry unusual semantic meanings for the model as words, then it’s natural that the logits of other forms of these words, which have different tokenizations, also get upweighted.
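
To make the intuition concrete, here is a toy sketch of the two cases. Everything in it is an assumption for illustration: the four "tokens", their 2-d output embeddings, and the numbers are made up, and nothing is taken from a real model or tokenizer. It only shows that a boost tied to a single token id leaves the other word forms untouched, whereas a shift along a shared meaning direction raises the logits of every form at once.

```python
# Toy sketch: hypothetical 2-d output embeddings, not from any real model.
# Axis 0 stands for a shared "overshadow" meaning; axis 1 is unrelated.
UNEMBED = {
    "overshadow": (1.0, 0.0),
    "overshadows": (0.9, 0.1),
    "Overshadow": (0.8, 0.2),
    "the": (0.0, 1.0),
}

def logits(hidden, bias=None):
    """Logit per token: dot(hidden, output embedding) plus an optional per-token bias."""
    bias = bias or {}
    return {
        tok: hidden[0] * vec[0] + hidden[1] * vec[1] + bias.get(tok, 0.0)
        for tok, vec in UNEMBED.items()
    }

base = logits((0.0, 1.0))

# Case 1: the word is a pad-like filler, favoured only as one specific token id.
pad_like = logits((0.0, 1.0), bias={"overshadows": 2.0})

# Case 2: the hidden state moves along the shared meaning direction.
semantic = logits((2.0, 1.0))

def upweighted(new):
    """Tokens whose logit is strictly higher than in the baseline."""
    return sorted(tok for tok in UNEMBED if new[tok] > base[tok])

print(upweighted(pad_like))
print(upweighted(semantic))
```

In the pad-like case only the boosted token moves; in the semantic case all three word forms move together while the unrelated token stays put, which is the pattern the argument above predicts.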