Wait, it would be nice if there were no-op tokens:
tokens that, when appended or inserted into an LLM stream, have no effect on the probability of the following tokens.
(This would be useful because it would let you search the embedded trajectory space with gradient descent over non-fixed-length sequences, then use nearest-neighbor lookup to recover a tokenized version of that prompt.)
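A hedged sketch of that nearest-neighbor step, with a hypothetical embedding table `E` standing in for a real model's token embeddings: after gradient descent yields a continuous "soft prompt", each soft vector is snapped to the closest row of the embedding matrix to get discrete token ids.

```python
import numpy as np

rng = np.random.default_rng(1)
vocab, d = 100, 16
E = rng.normal(size=(vocab, d))        # hypothetical token-embedding table
soft_prompt = rng.normal(size=(3, d))  # continuous vectors found by gradient descent

# snap each soft vector to its nearest token embedding (Euclidean distance)
dists = np.linalg.norm(soft_prompt[:, None, :] - E[None, :, :], axis=-1)
token_ids = dists.argmin(axis=1)       # discrete "tokenized" version of the prompt
print(token_ids.shape)  # (3,)
```

The open problem the no-op tokens would solve is the "non-fixed length" part: the soft prompt above has a fixed length of 3, and gradient descent cannot shrink or grow it unless some tokens can be made to do nothing.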
I am not sure whether that is useful enough to be worth implementing.
I know you could do that by adding a per-token parameter that down-weights that token in the attention mechanism.
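The attention side of this is straightforward to check. A minimal NumPy sketch (not any particular model's implementation): if you append a candidate no-op token but add an additive mask of minus infinity on its key column, the softmax assigns it zero weight, and the outputs for the original tokens are unchanged.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V, mask=None):
    # scaled dot-product attention; mask adds -inf to blocked key columns
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    if mask is not None:
        scores = scores + mask
    return softmax(scores, axis=-1) @ V

rng = np.random.default_rng(0)
d = 4
X = rng.normal(size=(3, d))        # three "real" token vectors
noop = rng.normal(size=(1, d))     # candidate no-op token

# baseline: attention over the real tokens only
base = attention(X, X, X)

# append the no-op token, but mask its key column so nothing attends to it
X2 = np.vstack([X, noop])
mask = np.zeros((4, 4))
mask[:, 3] = -np.inf               # no query attends to the no-op token
out = attention(X2, X2, X2, mask=mask)

# the first three output rows match the baseline: the token is invisible
print(np.allclose(out[:3], base))  # True
```

This only shows the attention layer is easy to neutralize; it deliberately leaves out positional encoding, which is exactly where the next problem lives.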
I am not sure how to disentangle the positional encoding (I need to study that mechanism).
I think it is impossible, because changing the positional encoding would break relative-position inferences.
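To make that concern concrete with the classic sinusoidal scheme (one common choice, not the only one): an inserted token occupies a position index, so every later token's positional encoding shifts even if attention ignores the no-op token itself.

```python
import numpy as np

def sinusoidal_pe(pos, d):
    # classic Transformer-style sinusoidal encoding for a single position
    i = np.arange(d // 2)
    angles = pos / (10000 ** (2 * i / d))
    return np.concatenate([np.sin(angles), np.cos(angles)])

d = 8
# encodings for positions 0..4
pes = np.stack([sinusoidal_pe(p, d) for p in range(5)])

# inserting a no-op token at position 2 pushes the token formerly at
# position 2 to position 3 -- its positional encoding changes
print(np.allclose(pes[2], pes[3]))  # False
```

So even a perfectly attention-masked token perturbs every downstream embedding through its position, which is why the positional side seems to be the real obstacle.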