Brendan Long comments on Shorter Tokens Are More Likely

Brendan Long 24 Aug 2025 20:04 UTC
4 points
Thanks for trying this! I wonder if this is making things worse in a similar way to top-k. The C-tokenizer makes it very likely that “c” is in the top 200 tokens at every step. Could it also be ensuring that it’s rarely uncertain enough to be filtered by this scoring rule?