Wei Dai comments on Solving adversarial attacks in computer vision as a baby version of general AI alignment

Wei Dai 2 Feb 2026 23:17 UTC
LW: 2 AF: 2

"We can view the problem as a proving ground for ideas and techniques to be later applied to the AI alignment problem at large."

Do you have any examples of such ideas and techniques? Are any of the ideas and techniques in your paper potentially applicable to general AI alignment?