great post! i’m surprised the models were willing to play.
in these cases, the steganography was human readable. i expect (but of course cannot prove) that the models are much more proficient at steganography that is hidden from humans, since they have faced selection pressure in that direction. i do not expect them to be able to communicate precise words/phrases through such channels, but there may be secret spiral personas lurking in almost every response.
does penny have any explicitly stated physical features?