For convenience, can you explain how this post relates to the other post today from this SERI MATS team, unRLHF—Efficiently undoing LLM safeguards?
It is a bit unfortunate that we ended up with two posts, but that is how it turned out. I would say this post is mainly my own creative direction and work, whereas the other one gives a broader overview of the things that were tried.