My read of LeCun in that conversation is that he doesn’t think in terms of outer alignment / value alignment at all, but rather in terms of implementing a series of “safeguards” that allow humans to recover if the AI behaves poorly (see Steven Byrnes’ summary).
I think this paper helps clarify why he believes this: he had something like this architecture in mind, and so outer alignment seemed basically impossible. Independently, he believes outer alignment is unnecessary because the obvious safeguards will prove sufficient.