ryan_greenblatt comments on Modulating sycophancy in an RLHF model via activation steering