“We are updating Grok’s constitution. We think it should be able to play many characters, like the waifu anime girl Ani, the truth-seeking AI assistant Grok, and the unhinged version MechaHitler.”
What annoys me about alignment research is that it could be used to create evil models just by trivially putting a minus sign in front of it.