Brendan Long comments on Harmless reward hacks can generalize to misalignment in LLMs