Jason Gross

Karma: 212

Compact Proofs of Model Performance via Mechanistic Interpretability

24 Jun 2024 19:27 UTC
92 points
3 comments · 8 min read · LW link
(arxiv.org)