Agreed. My comment was not a criticism of the post. I think the depth of deception makes interpretability nearly impossible: as models become increasingly sophisticated, you are going to find deception triggers in nearly all of their actions.