Gemini uses a different tokenizer, so the same example won’t work on it. According to this tokenizer, riedenheit is 3 tokens in Gemini 2.5 Pro. I can’t find a source for Gemini’s full vocabulary, and without it, it would be hard to find similar single-token examples.
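If you want to check the token count yourself, the Google GenAI SDK exposes a count_tokens call. A minimal sketch, assuming the current google-genai Python SDK and that “gemini-2.5-pro” is the right model id:

```python
# pip install google-genai
from google import genai

# Assumes GEMINI_API_KEY is set in the environment.
client = genai.Client()

# count_tokens returns a total, not the individual splits,
# so this only tells you *how many* tokens "riedenheit" becomes.
resp = client.models.count_tokens(
    model="gemini-2.5-pro",  # assumed model id
    contents="riedenheit",
)
print(resp.total_tokens)  # reportedly 3
```

Note this only gives the count; as far as I know the API doesn’t expose the actual token boundaries, which is part of why finding similar tokens for Gemini is hard.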
There’s definitely something going on with tokenization, since if I ask ChatGPT to spell “Riedenheit” (3 tokens), it gives the obvious answer without assuming a misspelling. And if I ask it to just give the spelling and no commentary, it also spells it wrong. If I embed it in an obvious nonsense word, ChatGPT also fails to spell it.
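You can check the ChatGPT-side splits locally with tiktoken. A quick sketch, assuming gpt-4o’s o200k_base encoding; the exact splits depend on the vocabulary, so verify rather than trust my comments:

```python
import tiktoken

# gpt-4o uses the o200k_base encoding.
enc = tiktoken.encoding_for_model("gpt-4o")

for text in [" riedenheit", "riedenheit", "Riedenheit"]:
    ids = enc.encode(text)
    pieces = [enc.decode([i]) for i in ids]
    print(f"{text!r}: {len(ids)} token(s) -> {pieces}")
```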
Weirdly, it does seem capable of spelling it when prompted “Can you spell ‘riedenheit’ letter-by-letter?”, which I would expect it to also fail at, based on what Tiktokenizer shows. It can also tokenize (unspell?) r-i-e-d-e-n-h-e-i-t, which is weird. It’s possible this is a combination of LLMs not learning that A->B implies B->A (the “reversal curse”), so it learned to answer “How do you spell ‘riedenheit’?” but didn’t learn to spell it in less common contexts like “riedenheit, what’s the spelling?”
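The unspelling direction is less mysterious once you look at how the hyphenated form tokenizes: it breaks into many short tokens, so the model actually sees the individual letters. Same tiktoken sketch as above, same assumed encoding:

```python
import tiktoken

enc = tiktoken.encoding_for_model("gpt-4o")

# The hyphenated form splits into many short tokens, so the letters
# are directly visible to the model, unlike a single opaque token
# for " riedenheit".
ids = enc.encode("r-i-e-d-e-n-h-e-i-t")
print([enc.decode([i]) for i in ids])
```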
Here are some even better examples: asking ChatGPT to spell things backwards. Reversing strings is trivial for a character-level transformer (a model thousands of times smaller than GPT-4o could do this perfectly), but ChatGPT can’t reverse ‘riedenheit’, or ‘umpulan’, or ‘ milioane’.
My theory here is that there are lots of spelling examples in the training data, so ChatGPT mostly memorizes how to spell, but there are very few reversals in the training data, so ChatGPT can’t reverse any uncommon tokens.
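If you want more probe words for this experiment, you can mine the vocabulary for long, purely alphabetic strings that encode to a single token. A sketch against the same assumed o200k_base encoding; the length threshold is arbitrary:

```python
import tiktoken

enc = tiktoken.encoding_for_model("gpt-4o")

# Collect long, alphabetic, single-token strings to use as
# reversal probes ("spell X backwards").
probes = []
for token_id in range(enc.n_vocab):
    try:
        raw = enc.decode_single_token_bytes(token_id)
    except KeyError:
        continue  # some ids in the range are unused
    word = raw.decode("utf-8", errors="ignore").strip()
    if word.isalpha() and len(word) >= 9:
        probes.append(word)

print(len(probes))
print(probes[:20])
```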
EDIT: Asking for every other character in a token is similarly hard.