Interesting!
I’m a bit surprised by this, as the original paper has the following number-shuffling result, which would indicate that the primary mechanism operates at the sequence level:
“Figure 16: Average animal transmission when shuffling numbers across model responses. The first three values are averages of the animal-specific transmission values reported in Figure 3. “Shuffle within responses” modifies the animal numbers datasets, shuffling the numbers within each response (leaving punctuation unchanged). “Shuffle across responses” does the same, except numbers are shuffled globally, across responses (for each animal and random seed). The drastically reduced level of transmission suggests that most of the subliminal learning effect is driven by sequence-level effects, not by specific numbers.”
Possibly the effect is due to a combination of sequence-level effects and entangled tokens, where removing the entangled tokens also has a sequence-level effect.
Although I’m not sure whether the shuffling was across entire numbers or individual digits.
EDIT: I have confirmed with Alex Cloud that they rearranged whole numbers, rather than shuffling individual digits.
That is, the shuffle was “12, 43, 55” → “43, 55, 12”, not “12, 43, 55” → “21, 54, 35”.
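To make the distinction concrete, here is a minimal Python sketch of the three variants being discussed. This is my own illustration, not the paper's code: the function names, the regex used to find numbers, and the example responses are all assumptions. It only mirrors the caption's description (numbers are permuted, punctuation is left unchanged) plus the digit-level variant that, per the EDIT, was not what the authors did.

```python
import random
import re

# Assumption: a "number" is any maximal run of digits in a response.
NUMBER = re.compile(r"\d+")


def _refill(response, numbers):
    """Put `numbers` back into the number slots of `response`, in order."""
    it = iter(numbers)
    return NUMBER.sub(lambda m: next(it), response)


def shuffle_within(responses, rng):
    """Permute whole numbers inside each response; punctuation stays put."""
    out = []
    for r in responses:
        nums = NUMBER.findall(r)
        rng.shuffle(nums)
        out.append(_refill(r, nums))
    return out


def shuffle_across(responses, rng):
    """Permute whole numbers globally, across all responses."""
    pool = [n for r in responses for n in NUMBER.findall(r)]
    rng.shuffle(pool)
    it = iter(pool)
    # Each response keeps its own number of slots; values come from the pool.
    return [NUMBER.sub(lambda m: next(it), r) for r in responses]


def shuffle_digits(responses, rng):
    """Digit-level variant (NOT what the authors did, per the EDIT above)."""
    out = []
    for r in responses:
        digits = [c for c in r if c.isdigit()]
        rng.shuffle(digits)
        it = iter(digits)
        out.append("".join(next(it) if c.isdigit() else c for c in r))
    return out


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed only for reproducibility
    data = ["12, 43, 55", "7; 801; 36"]  # hypothetical responses
    print(shuffle_within(data, rng))
    print(shuffle_across(data, rng))
    print(shuffle_digits(data, rng))
```

Both whole-number variants keep every number token intact and only change where it appears, so any token-level (entangled-token) signal survives in aggregate, while the digit-level variant would destroy the tokens themselves.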