What if Peely had a secondary goal to not harm humans? What is stopping it from accomplishing goal number 1 in accordance with goal number 2? Why should we assume that a superintelligent entity would be incapable of holding multiple values?
A key question is whether a typical goal-directed superintelligence would assign any significant value to humans. If it does, that greatly reduces the threat from superintelligence. We have a somewhat relevant article earlier in the sequence: AI's goals may not match ours.
BTW, if you're up for helping us improve the article, would you mind answering some questions? Like: do you feel like our article was "epistemically co-operative"? That is, do you think it helps readers orient themselves in the discussion on AI safety, makes its assumptions clear, and generally tries to explain rather than persuade? And what's your general level of familiarity with AI Safety?