I present this as a mystery to be tackled by someone else; on a naive model, alternative tokenizations are something akin to random noise—but they are decidedly steering the emotional ratings in a non-random direction.
Hypothesis: The system is trying to get low perplexity; its whole world is focused on this. Giving it an unusual encoding is going to carry less of whatever equivalent of valence it has, which leaks into the kinds of things it’s thinking about.
idk how to test it, but I’d buy the ‘yes’ side of a market on any experiment which could reasonably test this hypothesis at at least 65%, maybe up to 80%.
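One rough sketch of what a test could look like, where everything below (model, word, prompt, the character-level re-encoding) is my own illustrative assumption rather than the post’s setup: force both tokenizations of the same string through an open-weights model, and check whether the extra NLL of the unusual encoding tracks the shift in the valence rating.

```python
# Sketch: score the same surface string under its standard BPE merge vs. a
# character-level re-encoding, via the mean NLL the model assigns to the
# word's tokens. Model and word are stand-ins, not the post's actual setup.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

def word_nll(prefix_ids, word_ids):
    """Mean NLL of word_ids given prefix_ids; prefix positions are masked out."""
    ids = torch.tensor([prefix_ids + word_ids])
    labels = ids.clone()
    labels[0, : len(prefix_ids)] = -100  # -100 = ignored by the loss
    with torch.no_grad():
        return model(ids, labels=labels).loss.item()

prefix = tok.encode("I feel")
word = " happiness"                      # leading space matters for GPT-2 BPE
standard = tok.encode(word)              # the canonical merge
unusual = [t for ch in word for t in tok.encode(ch)]  # same string, odd pieces
assert tok.decode(standard) == tok.decode(unusual) == word

print("standard:", word_nll(prefix, standard))
print("unusual: ", word_nll(prefix, unusual))
# The hypothesis predicts the NLL gap (unusual - standard) should correlate,
# across many words, with how far the valence rating shifts.
```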
Or it could just be randomness from initialization which hasn’t been washed away by any training. “Subliminal” encoding immediately comes to mind.
idk, 32 samples perfectly divided into 16 negative-valence and 16 positive-valence by random chance is pretty unlikely. Unless you mean the randomness was in an underlying valence factor?
I don’t know what the samples have to do with it. I mean simply that the parameters of the model are initialized randomly and from the start all tokens will cause slightly different model behaviors, even if the tokens are never seen in the training data, and this will remain true no matter how long it’s trained.
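A toy illustration of that mechanism (the sizes and init scale here are made up): two token ids that never appear in training still pass through independently drawn random embedding rows, so they already push the output distribution in different directions from step zero.

```python
import torch
torch.manual_seed(0)

# Toy version: a random embedding table and output head, no training at all.
vocab, d = 50_000, 64                    # made-up sizes
emb = torch.randn(vocab, d) * 0.02       # typical Gaussian init scale
head = torch.randn(d, vocab) * 0.02

a, b = 49_998, 49_999                    # pretend these ids never occur in data
logits_a, logits_b = emb[a] @ head, emb[b] @ head
print(torch.cosine_similarity(logits_a, logits_b, dim=0))  # near 0, not 1
# The two never-seen tokens already favor different continuations, and their
# embedding rows receive no data gradient, so the idiosyncrasy can persist
# however long training runs.
```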
The post shows 32 words being tested. The 16 positive-valence words all get higher ratings with normal tokenization, and the 16 negative-valence words all get higher ratings with unusual tokenization. This is extremely improbable by raw chance.
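For concreteness, the back-of-the-envelope under an independence assumption (my arithmetic, not the post’s): a 16/16 split as such is common; it’s the split landing exactly on the valence labels that is rare.

```python
from math import comb

n = 32
# Each word as an independent fair coin: "higher under normal tokenization"
# vs. "higher under unusual tokenization".
p_valence_aligned = 0.5 ** n             # this exact assignment: ~2.3e-10
p_some_16_16 = comb(n, 16) * 0.5 ** n    # any 16/16 split at all: ~0.14
print(f"{p_valence_aligned:.2g}  {p_some_16_16:.2g}")
```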
There’s no reason whatsoever to think that they are independent of each other. The very fact that you can classify them systematically as ‘positive’ or ‘negative’ valence indicates they are not, and you don’t know what ‘raw chance’ here yields. It might be quite probable.
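To make ‘it might be quite probable’ concrete, a toy simulation, with the correlation structure invented purely for illustration: if the words within each valence class share a single latent direction, the 32 coins collapse toward 2, and the perfect split stops being surprising.

```python
import random

def perfect_split_rate(rho, trials=100_000):
    """P(all 16 positive words go one way and all 16 negative the other),
    when each word copies its class's shared latent sign with prob rho and
    otherwise flips its own fair coin."""
    hits = 0
    for _ in range(trials):
        ok = True
        for target in (+1, -1):              # positive class, negative class
            factor = random.choice((+1, -1)) # shared latent direction
            signs = [factor if random.random() < rho else random.choice((+1, -1))
                     for _ in range(16)]
            ok = ok and all(s == target for s in signs)
        hits += ok
    return hits / trials

print(perfect_split_rate(0.0))  # independent words: ~0.5**32, i.e. ~0 here
print(perfect_split_rate(1.0))  # one shared factor per class: ~0.25
```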
Right, I think that’s the hypothesis I was asking whether you had when I said “unless you mean the randomness was in an underlying valence factor?”
If so, yeah, this is compatible. I’d still put notably higher odds on the original thing I suggested, but this is the other main hypothesis.