Cool post! Did you try steering with the “Killing all humans” vector? Does it generalize as well as others, and are the responses similar?