Interesting work! Could this be fixed in training by giving it practice at repeating each token when asked?
Another thing I’ve wondered is how substring operations can work for tokenized text. For example, if you ask for the first letter of a string, it will often get it right. How does that happen, and are there tokens where it doesn’t work?
You could do this, if you wanted. I suspect that when ChatGPT was patched, they instead just patched the tokenizer to no longer create these tokens, which is significantly easier and would also allow the model to repeat them without too much trouble.
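To make the tokenizer-patch idea concrete, here's a toy sketch (not OpenAI's actual tokenizer, and the vocabulary here is made up): with a greedy longest-match tokenizer, simply dropping a problematic token from the vocabulary means the same text splits into smaller, better-trained pieces, with no retraining of the model needed.

```python
def tokenize(text, vocab):
    """Greedy longest-match tokenization over a set of known tokens."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest possible match first; fall back to one character.
        for length in range(len(text) - i, 0, -1):
            piece = text[i:i + length]
            if length == 1 or piece in vocab:
                tokens.append(piece)
                i += length
                break
    return tokens

# Hypothetical vocabulary containing one rare "glitch" token.
vocab = {"Solid", "Gold", "Magikarp", "SolidGoldMagikarp", " said"}

print(tokenize("SolidGoldMagikarp said", vocab))
# -> ['SolidGoldMagikarp', ' said']  (one rare token covers the name)

patched = vocab - {"SolidGoldMagikarp"}
print(tokenize("SolidGoldMagikarp said", patched))
# -> ['Solid', 'Gold', 'Magikarp', ' said']  (smaller, common pieces)
```

The appeal of this fix is that the model already handles the smaller pieces well, so repeating them back is no longer a problem.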
I think that substring operations would mainly work for tokens that are used a fair bit. My model of the situation is that there's some loss the model would leave on the table if it didn't know facts about the substrings of common tokens, so it learns them. For instance, knowing first letters would help it complete more acronyms, and if people prefer or avoid alliteration in certain contexts, knowing spellings would help it predict text. If it was trained on social media, people sometimes spell things out in ALL CAPITAL LETTERS, or write iNtErCaPs or whatever you call that, which would teach it all sorts of facts about the innards of tokens.
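A toy sketch of why spelling has to be learned per-token (the vocabulary and the "learned" table below are made up for illustration): after tokenization the model sees only integer IDs, so "first letter of X" can't be read off the input and has to come from facts memorized during training, which only exist for tokens the training data spelled out often enough.

```python
# Hypothetical token-to-ID vocabulary; the model's input is just the IDs.
vocab = {" the": 0, " apple": 1, " NASA": 2, "qwertyzxcv": 3}

ids = [vocab[t] for t in [" apple", " NASA"]]
print(ids)  # -> [1, 2]  (no character information in sight)

# What the model effectively has to learn is a lookup like this,
# with entries only for tokens it has seen spelled out in training
# (acronym expansions, ALL CAPS, iNtErCaPs, etc.):
first_letter = {1: "a", 2: "N"}

print(first_letter.get(2, "?"))  # -> 'N'  (common token: fact was learned)
print(first_letter.get(3, "?"))  # -> '?'  (rare token: no such fact)
```

This would predict exactly the failure mode in the question: first-letter queries work for frequent tokens and fall apart for rare ones.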