Relevant paper: MONET: MIXTURE OF MONOSEMANTIC EXPERTS FOR TRANSFORMERS
LessWrong: Monet: Mixture of Monosemantic Experts for Transformers Explained