The recent Goodfire paper seems to me a step in that direction. Going completely synthetic for the training data might also be a way.