iamthouthouarti comments on Daniel Kokotajlo’s Shortform

iamthouthouarti 8 Apr 2026 18:44 UTC
1 point
Are you at all worried that Claude Mythos having been accidentally trained against CoT will corrupt future Claude models? Furthermore, I don’t understand how we can get reliable CoT monitoring if it’s included in a model’s training data. Otherwise, won’t the issue just continue to manifest in different ways?