Knowing that a godlike superintelligence with misaligned goals will squish you might be an easy call, but knowing exactly what the state of alignment science will be when ASI is first built is not.
Hmm, I feel more on the Eliezer/Nate side of this one. I think it's a medium call that capabilities science advances faster than alignment science, and so we're not on track without drastic change. (Like, the main counterargument is negative alignment tax, which I do take seriously as a possibility, but which I think probably doesn't close the gap.)