Vaniver comments on Evaluating the historical value misspecification argument

Vaniver 5 Oct 2023 20:20 UTC
LW: 6 AF: 5
- Are MIRI people claiming that if, say, a very moral and intelligent human became godlike while preserving their moral faculties, that they would destroy the world despite, or perhaps because of, their best intentions?
For me, the answer here is “probably yes”; I think there is some bar of ‘moral’ and ‘intelligent’ where this doesn’t happen, but I don’t feel confident about where it is.
I think there are two things that I expect to be big issues, and probably more I’m not thinking of:
- Managing freedom for others while not allowing for catastrophic risks; I think lots of ways to mismanage that balance result in ‘destroying the world’, probably with different levels of moral loss.
- The relevant morality is different for different social roles—someone being a good neighbor does not make them a good judge or good general. Even if someone scores highly on a ‘general factor of morality’ (assuming that such a thing exists) it is not obvious they will make for a good god-emperor. There is relatively little grounded human thought on how to be a good god-emperor. [Another way to put this is that “preserving their moral faculties” is not obviously enough / a good standard; probably their moral faculties should develop a lot in contact with their new situation!]
But uploaded and enhanced humans aren’t going to have superhuman moral judgement. How does this strategy interact with the claim that we need far better-than-human moral judgement to avoid a catastrophe?
I understand Eliezer’s position to be that 1) intelligence helps with moral judgment and 2) it’s better to start with biological humans than whatever AI design is best at your intelligence-related subtask, but also that intelligence amplification is dicey business and this is more like “the least bad option” than one that seems actively good.
We have some experience inculcating moral values in humans, and that experience will probably generalize better to augmented humans than it will to AIs; but I also think Eliezer is more optimistic (for timing reasons) about amplifications that can be done to adult humans.
ETA: in Eliezer’s AGI ruin post, he says,
Yeah, my interpretation of that is “if your target is the human level of wisdom, it will destroy humans just like humans are on track to do.” If someone is thinking “will this be as good as the Democrats being in charge or the Republicans being in charge?” they are not grappling with the difficulty of successfully wielding futuristically massive amounts of power.