Until I actually see any sort of plausible impossibility argument, most of my probability mass is going to be on “very hard” over “literally impossible.”
I mean, I guess there’s a trivial sense in which alignment is impossible because humans as a whole do not have one singular utility function, but that’s splitting hairs and isn’t a proof that a paperclip maximizer is the best we can do or anything like that.
I’m not sure either way about whether giving actual human beings superintelligence could somehow work, but even if it didn’t, I don’t think that would imply there aren’t other possible-but-hard approaches.
I mean, I agree it’d be evidence that alignment is hard in general, but “impossible” is just… a really high bar? The space of possible minds is very large, and it seems unlikely that the quality “not satisfactorily close to being aligned with humans” is something that describes every superintelligence.
It’s not that the two problems are fundamentally different, it’s just that… I don’t see any particularly compelling reason to believe that superintelligent humans are the most aligned possible superintelligences?