I would guess that the reason it hasn’t devolved into full neuralese is because there is a KL divergence penalty, similar to how RHLF works.
I would guess that the reason it hasn’t devolved into full neuralese is because there is a KL divergence penalty, similar to how RHLF works.