I’d be keen to see the TEXTSPAN method applied to the attention heads of CLIP’s text encoder.
It’d also be interesting to see the same applied to the audio encoder of CLAP. I’m really curious to hear your thoughts on mech interp efforts in the audio space; it seems to be largely ignored.
P.S.: Thank you for the excellent post.