Absolutely fascinating. While I have a list of competing priorities, this is what my mind drifts off to investigating ever since I watched the 80,000 Hours episode with Kyle Fish discussing various instances of Claude behavior. I’m working on getting my 3D interpretability visualization tool working in realtime first, and then this “attractor state” area is #1 on my list. (I’d provide a link, but being new, any links lead to auto-rejection, which has been frustrating.)
Absolutely fascinating. While I have a list of competing priorities, this is what my mind drifts off to investigating ever since I watched the 80,000 Hours episode with Kyle Fish discussing various instances of Claude behavior. I’m working on getting my 3D interpretability visualization tool working in realtime first, and then this “attractor state” area is #1 on my list. (I’d provide a link, but being new, any links lead to auto-rejection, which has been frustrating.)