I believe it is not literally impossible because… my priors say it is the kind of thing that is not literally impossible? There is no theorem or law of physics which would be violated, as far as I know.
Do I think AI Alignment is easy enough that we’ll actually manage to do it? Well… I really hope it is, but I’m not very certain.
Until I actually see any sort of plausible impossibility argument, most of my probability mass is going to be on “very hard” over “literally impossible.”
I mean, I guess there’s a trivial sense in which alignment is impossible because humans as a whole do not have one singular utility function, but that’s splitting hairs and isn’t a proof that a paperclip maximizer is the best we can do or anything like that.
I’m not sure either way about somehow giving actual human beings superintelligence, but I don’t think that approach failing would imply there aren’t other possible-but-hard approaches.
I mean, I agree it’d be evidence that alignment is hard in general, but “impossible” is just… a really high bar? The space of possible minds is very large, and it seems unlikely that the quality “not satisfactorily close to being aligned with humans” is something that describes every superintelligence.
It’s not that the two problems are fundamentally different; it’s just that… I don’t see any particularly compelling reason to believe that superintelligent humans are the most aligned possible superintelligences?