Anything that improves interpretability seems like a no brainer.
With the caveat that it has to actually improve interpretability in a provable fashion, and not just make us think it did, like we saw with CoTR.
Anything that improves interpretability seems like a no brainer.
With the caveat that it has to actually improve interpretability in a provable fashion, and not just make us think it did, like we saw with CoTR.