I agree with Leo Gao here:
https://x.com/nabla_theta/status/1885846403785912769
always good to get skeptical takes on SAEs, though imo this result is because of problems with SAE evaluation methodology. I would strongly bet that well trained SAE features on random nets are qualitatively much worse than ones on real LMs.