William_S comments on High-stakes alignment via adversarial training [Redwood Research report]