Can explainability improve model accuracy? Our latest work shows the answer is yes!
Here is an excellent example of research that is both “capabilities research” and “alignment research”.