I remember reading that LLMs are especially good at Caesar ciphers, which might explain how they can transliterate Cyrillic into Latin. This is probably an unintended side effect of the way embeddings work: what is encoded isn’t the English sentence itself, but the relative positions of the vectors each token is converted to.
To put it another way, your Cyrillic gibberish and your Latin alphabet are, in embedding space, very very similar. It would be interesting to play around with reverse writing and one-letter-up.
Like asking:
xibu xbt uif jodbo fdpopnz mjlf?
Although my suspicion is that, since the Cyrillic version of your sentence would, phonetically speaking, map to more common tokens than my one-letter-up rendition, you might see wildly different results?
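For anyone who wants to generate test prompts like the one above, here is a minimal sketch of the two transformations mentioned (one-letter-up and reverse writing). The function names `caesar_shift` and `reverse_writing` are just illustrative, and the sketch assumes plain ASCII letters; everything else is passed through unchanged.

```python
import string


def caesar_shift(text: str, shift: int = 1) -> str:
    """Shift each ASCII letter by `shift` places, wrapping around the alphabet."""
    shifted = []
    for ch in text:
        if ch in string.ascii_lowercase:
            shifted.append(chr((ord(ch) - ord("a") + shift) % 26 + ord("a")))
        elif ch in string.ascii_uppercase:
            shifted.append(chr((ord(ch) - ord("A") + shift) % 26 + ord("A")))
        else:
            # Spaces, punctuation and non-ASCII characters are left alone.
            shifted.append(ch)
    return "".join(shifted)


def reverse_writing(text: str) -> str:
    """Return the text written back to front."""
    return text[::-1]


prompt = "what was the incan economy like?"
encoded = caesar_shift(prompt, 1)
print(encoded)                    # xibu xbt uif jodbo fdpopnz mjlf?
print(caesar_shift(encoded, -1))  # shifts back to the original prompt
print(reverse_writing(prompt))
```

A shift of 1 reproduces the one-letter-up prompt, and a shift of -1 undoes it, so the same helper can be used to check whatever the model decodes.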
I don’t think the Cyrillic text would map to any common tokens, since the output is essentially the result of a substitution cipher, the key being the keyboard mappings. Crucially, Claude deciphered it in his (its?) CoT.
I just re-ran the original prompt but disabled thinking, and… it gets caught by the safety filter for some reason, telling me to use Sonnet 4 instead of Opus 4.5. Sonnet 4 doesn’t get it right with thinking disabled, but with thinking re-enabled, it actually gets it.
I don’t think the Cyrillic text would map to any common tokens, since the output is essentially the result of a substitution cipher, the key being the keyboard mappings.
I don’t understand. Surely it has been exposed to training resources that contain, say, Serbian, which is written in both Latin and Cyrillic. And more relevant still, news articles that have transliterations of Anglophone celebrity names and places:
Дэвід Бекхэм (David Beckham)
Стенлі Кубрик (Stanley Kubrick)
Лінкольншир (Lincolnshire)
Why wouldn’t these map to common tokens?
The examples you gave are indeed transliterations. The Cyrillic text I’m talking about is actually nonsensical. Consider the reverse: if I mistakenly tried typing “істина” (truth) on a QWERTY keyboard, the result is “scnbyf”.
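To make the "substitution cipher keyed by the keyboard" idea concrete, here is a rough sketch of the mapping. It assumes the standard Ukrainian layout (going by the “істина” example) and only covers the 26 keys that carry Latin letters on QWERTY; the names `UK_KEYS`, `EN_KEYS`, `uk_to_qwerty` and `qwerty_to_uk` are made up for illustration.

```python
# Letter keys in physical order (top, home, bottom row), Ukrainian layout
# versus QWERTY. Assumption: standard Ukrainian layout. Keys that hold
# punctuation on QWERTY (х, ї, ж, є, б, ю) are deliberately left out.
UK_KEYS = "йцукенгшщз" "фівапролд" "ячсмить"
EN_KEYS = "qwertyuiop" "asdfghjkl" "zxcvbnm"

UK_TO_EN = str.maketrans(UK_KEYS, EN_KEYS)
EN_TO_UK = str.maketrans(EN_KEYS, UK_KEYS)


def uk_to_qwerty(text: str) -> str:
    """What comes out if you type Ukrainian text with the layout set to QWERTY."""
    return text.translate(UK_TO_EN)


def qwerty_to_uk(text: str) -> str:
    """What comes out if you type English text with the layout set to Ukrainian."""
    return text.translate(EN_TO_UK)


print(uk_to_qwerty("істина"))  # scnbyf
print(qwerty_to_uk("truth"))   # екгер
```

Neither output is a transliteration: the letters are paired by key position, not by sound, which is why the result reads as gibberish in the other alphabet.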
Interesting, it would be fun to try it with the Claude tokenizer.