You could do this, if you wanted. I suspect that when ChatGPT was patched, they instead just patched the tokenizer to no longer create these tokens, which is significantly easier and would also allow the model to repeat them without too much trouble.
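To make that concrete, here's a toy sketch of what "patching the tokenizer" could look like. I don't know what OpenAI actually did; the greedy BPE encoder and the merge table below are made up for illustration. The point is just that deleting one merge rule stops the tokenizer from ever emitting a token, while the underlying string stays representable as smaller pieces the model handles fine.

```python
# Toy BPE sketch (not OpenAI's actual fix). `merges` maps a pair of
# adjacent tokens to a priority rank; lower rank = applied first.

def bpe_encode(text, merges):
    """Greedy BPE: repeatedly apply the highest-priority applicable merge."""
    tokens = list(text)
    while True:
        best = None
        for i in range(len(tokens) - 1):
            pair = (tokens[i], tokens[i + 1])
            if pair in merges and (best is None or merges[pair] < merges[best[1]]):
                best = (i, pair)
        if best is None:
            return tokens
        i, pair = best
        tokens[i:i + 2] = [pair[0] + pair[1]]  # fuse the pair into one token

# Hypothetical merge table that builds up the glitchy token "Magikarp".
merges = {
    ("M", "a"): 0, ("g", "i"): 1, ("k", "a"): 2, ("r", "p"): 3,
    ("Ma", "gi"): 4, ("ka", "rp"): 5, ("Magi", "karp"): 6,
}

print(bpe_encode("Magikarp", merges))  # ['Magikarp']

# The "patch": drop the final merge so that token can never be formed.
del merges[("Magi", "karp")]
print(bpe_encode("Magikarp", merges))  # ['Magi', 'karp']
```

After the patch, the model never sees the rare token id at all; it just gets ordinary subtokens it can read and repeat back.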
I think that substring operations would mainly work with tokens that are used a fair bit. My model of the situation is: there is some loss the model would leave on the table if it didn't know some facts about the substrings of common tokens, so it learns them. For instance, that knowledge would help it complete more acronyms, and if people prefer or avoid alliteration in certain contexts, it would help it predict text. If it was trained on social media, sometimes people will spell things out in ALL CAPITAL LETTERS, or do iNtErCaPs or whatever you call that, and those recased variants would teach it all sorts of facts about the innards of tokens.
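Here's a quick illustration of the kind of signal I mean, as a sketch using OpenAI's tiktoken library (pip install tiktoken). Recasing a word tends to break one common token into several rarer ones, so caps/intercaps text effectively spells out the characters hiding inside the token. I'm not claiming this is how any particular model learned it, and exact token counts depend on the encoding.

```python
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

for variant in [" hello", " HELLO", " hElLo"]:
    ids = enc.encode(variant)
    pieces = [enc.decode([i]) for i in ids]
    print(f"{variant!r}: {len(ids)} token(s) -> {pieces}")

# ' hello' is a single token; the recased variants typically split into
# more pieces, exposing the letters inside the common token.
```

A model trained on enough of that co-occurring text gets plenty of gradient toward knowing which letters a frequent token contains, even though it never sees characters directly.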