cubefox comments on Paper: The Capacity for Moral Self-Correction in Large Language Models (Anthropic)