That is very generous. My impression was that Shane Legg does not even know what the alignment problem is, or at least tried to give the viewer the impression that he didn't. His "solution" to the "alignment problem" was to give the AI a better world model and the ability to reflect, which obviously isn't an alignment solution; it's just a capabilities enhancement. Dwarkesh Patel seemed confused for the same reason.
I have more respect than that for Shane. He has been thinking about this stuff for a long time, and my guess is he has some models in the space here (I don’t know how good they are, but I am confident he knows the rough shape of the AI Alignment problem).
See also: https://www.lesswrong.com/users/shane_legg