to be clear, this post is just my personal opinion, and is not necessarily representative of the beliefs of the openai interpretability team as a whole
Thanks, sure.
And I am simplifying quite a bit (the whole field is much larger anyway).
Mostly, I mean to say I hope people diversify rather than converge in their approaches to interpretability.