These tokens are either very common or appear especially often in reasoning tasks, in particular those involving code. This might mean that coding reinforcement learning was the last step in the training process, and that all other tokens were slightly weight-decayed. It could also mean that, in general, reasoning tokens are treated as so important by gradient descent that their updates are extra large.
The above text is quite compelling, and I am currently doing ablations on reasoning; in particular, I want to prevent the model from using these reasoning words and see how the reasoning degrades, so I will definitely be citing your work when I publish my results.
Do you have any intuition on what “ocode” means?
Furthermore, it is unclear to me which GPT OSS model you take those English L2-norm embeddings from. And lastly, can you please elaborate on why having the tokenizer means we can use the GPT OSS embeddings to study the token list without having to look at each token’s text content?
> Do you have any intuition on what “ocode” means?
Oddly enough, the “ocode” token also has a child in the BPE merging rules, namely token #107224, “ocoder”. I really wonder where the tokenizer got that one from. This is probably a red herring for the embedding matrix norm, though; it very likely comes from “ pseudocode”, which gets tokenized as “ pseud”-“ocode”.
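If you want to check the merge yourself, here is a quick sketch. It assumes a tiktoken build that ships the “o200k_harmony” encoding used by gpt-oss (falling back to “o200k_base”, which shares most of the vocabulary); what you get back for token #107224 is just whatever your local vocabulary says.

```python
# Quick check of how " pseudocode" splits under the gpt-oss tokenizer.
# Assumes a tiktoken version that includes the "o200k_harmony" encoding;
# older versions can fall back to "o200k_base".
import tiktoken

try:
    enc = tiktoken.get_encoding("o200k_harmony")
except ValueError:
    enc = tiktoken.get_encoding("o200k_base")

ids = enc.encode(" pseudocode")
print([(i, enc.decode([i])) for i in ids])  # expect ' pseud' followed by 'ocode'

# Look at the text content of token #107224 mentioned above.
print(repr(enc.decode([107224])))
```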
> Furthermore, it is unclear to me which GPT OSS model you take those English L2-norm embeddings from.
I believe it was 120b.
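In case it helps with reproducing the norms, here is a rough sketch of pulling the per-token L2 norms from the 120b checkpoint without loading the whole model. The repo id, index filename, and tensor name (“model.embed_tokens.weight”) are assumptions about how the Hugging Face checkpoint is laid out, so adjust them to whatever your copy actually contains.

```python
# Sketch: per-token L2 norms of the gpt-oss-120b input embeddings,
# read directly from one safetensors shard instead of loading the model.
import json
import torch
from huggingface_hub import hf_hub_download
from safetensors import safe_open

REPO = "openai/gpt-oss-120b"
TENSOR = "model.embed_tokens.weight"  # assumed name of the input-embedding tensor

# Find which shard holds the embedding tensor, then read only that tensor.
index_path = hf_hub_download(REPO, "model.safetensors.index.json")
shard_name = json.load(open(index_path))["weight_map"][TENSOR]
shard_path = hf_hub_download(REPO, shard_name)

with safe_open(shard_path, framework="pt") as f:
    emb = f.get_tensor(TENSOR)  # (vocab_size, d_model)

norms = emb.float().norm(dim=-1)  # one L2 norm per token id
top = torch.topk(norms, 20)       # token ids with the largest norms
print(list(zip(top.indices.tolist(), top.values.tolist())))
```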
> And lastly, can you please elaborate on why having the tokenizer means we can use the GPT OSS embeddings to study the token list without having to look at each token’s text content?
Because the embeddings contain all the semantic information needed to speak human language as well as gpt-oss does, which is really well! A nearest-neighbor search in embedding space gives semantically similar tokens, etc.
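As a concrete illustration of that nearest-neighbor idea, here is a minimal sketch. It assumes you already have the (vocab_size, d_model) embedding matrix as a torch tensor (e.g. extracted as in the snippet above) and the matching tokenizer; the helper name is made up.

```python
# Minimal nearest-neighbor search over a token-embedding matrix.
# `emb` is assumed to be the (vocab_size, d_model) input-embedding tensor,
# and `enc` a tokenizer whose decode() maps a token id back to text.
import torch
import torch.nn.functional as F

def nearest_tokens(emb: torch.Tensor, enc, query_id: int, k: int = 10):
    """Return the k tokens whose embeddings are most cosine-similar to query_id."""
    unit = F.normalize(emb.float(), dim=-1)  # unit-normalize every row
    sims = unit @ unit[query_id]             # cosine similarity to the query token
    top = torch.topk(sims, k + 1)            # +1 because the query matches itself
    return [(enc.decode([i]), round(v, 3))
            for i, v in zip(top.indices.tolist(), top.values.tolist())
            if i != query_id][:k]

# e.g. nearest_tokens(emb, enc, some_token_id) -- hypothetical usage
```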
> The above text is quite compelling, and I am currently doing ablations on reasoning; in particular, I want to prevent the model from using these reasoning words and see how the reasoning degrades
Looking forward to the results!
Those are great and concise answers, thank you for that. I ended up doing the ablations, and the arXiv preprint has been submitted; I’m waiting on acceptance.