Video/animation: Neel Nanda explains what mechanistic interpretability is

Link post

Nice little video: the audio is Neel Nanda explaining what mechanistic interpretability is and why he does it, illustrated by the illustrious Hamish Doodles. Excerpted from the AXRP episode.

(It’s not technically animation, I think, but I don’t know what other single word to use for “pictures that move a bit and change”.)