I have more complete data and interpretation up herehttps://www.lesswrong.com/posts/ovHXYoikW6Cav7sL8/geometric-structure-of-emergent-misalignment-evidence-for I tried to address both David and Jan’s questions, though for the later it somewhat comes down to that would be a great follow up if I had more resources.
I got some good feedback on the draft and have taken it down while I integrate it. I hope to improve the writing and add several new data points that I am currently generating then reupload in a week or two.
I have more complete data and interpretation up herehttps://www.lesswrong.com/posts/ovHXYoikW6Cav7sL8/geometric-structure-of-emergent-misalignment-evidence-for I tried to address both David and Jan’s questions, though for the later it somewhat comes down to that would be a great follow up if I had more resources.
I got some good feedback on the draft and have taken it down while I integrate it. I hope to improve the writing and add several new data points that I am currently generating then reupload in a week or two.