dxu comments on Symbol/Referent Confusions in Language Model Alignment Experiments