Consider updating “Auditing Deception for Interp is out of reach” with a link to https://www.lesswrong.com/posts/PwnadG4BFjaER3MGf/interpretability-will-not-reliably-find-deceptive-ai
Consider updating “Auditing Deception for Interp is out of reach” with a link to https://www.lesswrong.com/posts/PwnadG4BFjaER3MGf/interpretability-will-not-reliably-find-deceptive-ai