Transformers do not natively operate on sequences.
This was a big misconception I had because so much of the discussion around transformers is oriented around predicting sequences. However, it’s more accurate to think of general transformers as operating on unordered sets of tokens. The understanding of sequences only comes if you have a positional embedding to tell the transformer how the tokens are ordered, and possibly a causal mask to force attention to flow in only one direction.
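To make the "unordered set" point concrete, here is a minimal sketch, not from the original post: a toy single-head self-attention in plain NumPy with made-up dimensions and weight names, no positional embedding and no mask. Permuting the input tokens just permutes the output rows, so the layer has no notion of order.

```python
# Toy single-head self-attention with no positional embedding and no mask.
# All shapes, names, and weights here are arbitrary, chosen only for illustration.
import numpy as np

rng = np.random.default_rng(0)
d = 8                                   # embedding dimension (arbitrary)
tokens = rng.normal(size=(5, d))        # 5 token embeddings, no position info
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))

def self_attention(x):
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = q @ k.T / np.sqrt(d)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)    # softmax over the set of tokens
    return weights @ v

perm = np.array([2, 0, 4, 1, 3])        # some reordering of the 5 tokens
# Permutation equivariance: attending over a shuffled "sequence" just shuffles
# the outputs the same way -- the layer never sees an ordering.
assert np.allclose(self_attention(tokens)[perm], self_attention(tokens[perm]))
```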
In some sense I agree, but I do think it’s more nuanced than that in practice. Once you add cross-entropy loss on next-token prediction alongside causal masking, you really do get a strong sense of “operating on sequences”. This is because next-token prediction is fundamentally sequential: the entire task is to exploit the correlational structure of sequences of data in order to predict future tokens.
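A companion sketch for the causal-mask half of this, using the same toy setup as above (repeated so it runs on its own, and again with made-up names and dimensions): the mask alone already bakes in an ordering, since masked attention is no longer permutation equivariant; the next-token loss is then what trains the model to exploit that ordering.

```python
# Same toy attention as above, but with a causal mask: token i may only attend
# to tokens j <= i. Setup is repeated so this snippet is standalone.
import numpy as np

rng = np.random.default_rng(0)
d = 8
tokens = rng.normal(size=(5, d))
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))

def causal_self_attention(x):
    n = x.shape[0]
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = q @ k.T / np.sqrt(d)
    mask = np.tril(np.ones((n, n), dtype=bool))       # lower-triangular: no peeking ahead
    scores = np.where(mask, scores, -np.inf)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

perm = np.array([2, 0, 4, 1, 3])
# The equivariance check from the previous sketch now fails: the mask makes the
# layer position-aware, so an ordering exists even with no positional embedding.
assert not np.allclose(causal_self_attention(tokens)[perm],
                       causal_self_attention(tokens[perm]))
```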
Has anyone ever trained a transformer that doesn’t suck, without positional information (such as a positional embedding or a causal mask)?
There’s the atom transformer in AlphaFold-like architectures, although the embeddings it operates on do encode 3D positioning from earlier parts of the model, so maybe that doesn’t count.
The Money Stuff column mentioned AI alignment, rationality, and the UK AISI today:
Here is a post from the UK AI Security Institute looking for economists to “find incentives and mechanisms to direct strategic AI agents to desirable equilibria.” One model that you can have is that superhuman AI will be terrifying in various ways, but extremely rational. Scary AI will not be an unpredictable lunatic; it will be a sort of psychotic pursuing its own aims with crushing instrumental rationality. And arguably that’s where you need economists! The complaint people have about economics is that it tries to model human behavior based on oversimplified assumptions of rationality. But if super AI is super-rational, economists will be perfectly suited to model it. Anyway if you want to design incentives for AI here’s your chance.
Can LLMs Doublespeak?
Doublespeak is the deliberate distortion of words’ meaning, particularly to convey different meanings to different audiences or in different contexts. In Preventing Language Models From Hiding Their Reasoning, @Fabien Roger and @ryan_greenblatt show that LLMs can learn to hide their reasoning using apparently innocuous, coded language. I’m wondering if LLMs have or can easily gain the capability to hide more general messages this way. In particular, reasoning or messages completely unrelated to the apparent message. I have some ideas for investigating this empirically, but I’m wondering what intuition people have on this.
You might be interested in these related results. TL;DR: people have tried, but at the scale academics are working at, it’s very hard to get RL to learn interesting encoding schemes. Encoded reasoning is also probably not an important part of the performance of reasoning models (see this).
Thanks! Your second link is very similar to what I had in mind — I feel a bit embarrassed for missing it.
“Stochasticity” is in the map, “randomness” is in the territory.