Strongly agree about the importance of ambitious mech interp.
My personal belief is that we should go back to tiny “toy” models, starting with 1-layer models and fully understanding them before scaling up to 2-layer models, then 4-layer models, etc.
I put together a proposal to start a research lab focused on ambitious mech interp for small models—please reply or ping if you’re interested in discussing:
https://docs.google.com/document/d/14WJK81ZM6IcF8igVxwmFLuTNunYyuVTPKGG5iRdb2Nk/edit?usp=sharing