sdeture comments on Policy Entropy, Learning, and Alignment (Or Maybe Your LLM Needs Therapy)