Louis Jaburi: Compact Proofs of Model Performance via Mechanistic Interpretability

Orpheus, 23 Feb 2026 16:44 UTC

Contact: o@horizonomega.org

Speaker: Louis Jaburi (Independent researcher)

Talk: Compact Proofs of Model Performance via Mechanistic Interpretability

Part of the Guaranteed Safe AI Seminars, a monthly online series on AI systems with quantitative safety guarantees.

Recording: https://youtu.be/m_2JnJglx9g

Readings: arXiv:2406.11779, arXiv:2410.07476

YouTube playlist: https://www.youtube.com/playlist?list=PLOutnjp2BEJeQM2J49_KvdpuZlaQXPboy
