Louis Jaburi: Compact Proofs of Model Performance via Mechanistic Interpretability

  • Contact: o@horizonomega.org

Speaker: Louis Jaburi (Independent researcher)

Talk: Compact Proofs of Model Performance via Mechanistic Interpretability

Part of the Guaranteed Safe AI Seminars, a monthly online series on AI systems with quantitative safety guarantees.

Recording: https://youtu.be/m_2JnJglx9g

Readings: arXiv:2406.11779, arXiv:2410.07476

YouTube playlist: https://www.youtube.com/playlist?list=PLOutnjp2BEJeQM2J49_KvdpuZlaQXPboy
