Hi! Chris Olah was head of interpretability at OpenAI and is now at Anthropic, so he is probably an important person in the field. @Neel Nanda might be the guru of interpretability on this forum (see his guide to starting out).