Perhaps there’s a definitional disconnect. The conception of steganography we are working with is not constrained to the symbolic level. Rather, it extends to a more speculative scenario at the activation level, in which activations correlated with the primary text might simultaneously encode activations for a secondary text that is absent from the observable Chain-of-Thought. This is stated fairly clearly in our introduction.
Yes, I agree there is some definitional disconnect. I actually just posted my understanding of it here.
Nice! Thanks for sharing. Will take a look.