Nina Rimsky comments on Modulating sycophancy in an RLHF model via activation steering