Yeah, the horror lies in the idea that it might be morally CORRECT for an FAI to engage in eternal torture of some people.
There is a basic problem with human psychology here: threatening someone with torture does not improve their judgement.
If threatening someone with eternal torture would magically raise their intelligence over 9000, give them the ability to develop a correct theory of Friendliness, and reliably make them build a Friendly AI in five years… then yes, under those assumptions, threatening people with eternal torture could be the morally correct thing to do.
But human psychology doesn’t work that way. If you start threatening people with torture, they become more likely to make mistakes in their reasoning. See: motivated reasoning, “ugh” fields, etc.
Therefore, a hypothetical AI that threatens people with torture for… well, pretty much for not being perfectly epistemically and instrumentally rational… would decrease the probability of a Friendly AI being built correctly. That is why I don’t consider such a hypothetical AI to be Friendly.