Firstly, thank you for creating this well-written and thoughtful post. I have a question, but I would like to start by summarising the article. My initial draft of the summary was too verbose for a comment, so I condensed it further; I hope I have still captured the main essence of the text, despite this rather extreme summarisation. Please let me know if I have misinterpreted anything.
People who predict doomsday scenarios rest on one main assumption: that once the AI reaches a conclusion or plan, EVEN if a measure of probability is attached to that conclusion or plan, it will act as if the conclusion were certain.
That is, it will mindlessly carry out the action, even if its human instructors say it is the wrong action, and even if the checking code put in place to double-check actions flags it as wrong (because the instructors have said so).
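(To check that I am picturing the assumed behaviour correctly, here is a tiny schematic in Python; it is entirely my own and not anything from your article, and the names are hypothetical placeholders. The only point is that a confidence value exists but is never consulted before acting.)

```python
# My own schematic of the behaviour the doomsday argument seems to assume
# (hypothetical names, not the article's): a confidence estimate is computed
# alongside the plan, but nothing -- not the estimate, not the instructors,
# not any checking code -- is allowed to stop execution.

def act_under_infallibility(best_plan, execute):
    plan, confidence = best_plan()   # a probability is available here...
    return execute(plan)             # ...but the plan is carried out as if certain
```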
You state that this all comes down to a ‘Doctrine of Infallibility’: if an AI cannot reassess or in some way take into account the uncertainty of its conclusions, then it might act as the doomsayers fear. But you also state that an AI containing such a Doctrine could never be very intelligent, because it would contain a contradiction, a logical inconsistency: it would have a full understanding of the uncertainty in its knowledge and reasoning methods, AND still be programmed to act as if it were certain of its conclusions. Such a system would never reach human-level intelligence, let alone be intelligent enough to be a threat to us.
Any claim that an AGI would take our commands literally and instantly implement naive (and catastrophic) actions to fulfil them rests on the assumption that the AI is sticking to the Doctrine of Infallibility, and is therefore flawed.
Now, my question to you: do you think it would be possible, in theory, to create an AI which does not abide by the Doctrine of Infallibility (something like a Swarm Relaxation Intelligence), yet is STILL programmed to perform actions purely on the basis of its consideration of all the types of uncertainty in its knowledge and reasoning methods? For example: it considers all relevant facts, reaches a plan to fulfil its goal, continues to investigate relevant facts until its probability that this is the best plan reaches a certain threshold, and then implements it. I know that in practice we would have ‘checking code’ which makes it run the plan past humans before acting, so that we can assess the sensibleness and safety of the plan. But in theory, would you consider it possible for the process to work without human input, perhaps once the AI has reached a certain level of intelligence?
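To make the process I have in mind concrete, here is a toy sketch in Python. Everything in it (the two-coin setting, the names, the 0.99 threshold) is my own invention for illustration, not anything from your article: the ‘agent’ keeps investigating until its estimated probability that its current plan is best clears a threshold, and only then acts.

```python
import random

# Toy illustration (my own, purely schematic): the "agent" must decide which
# of two biased coins to bet on. It keeps gathering evidence (coin flips)
# until its estimated probability that its current choice is the better one
# clears a threshold, and only then acts on that choice.

CONFIDENCE_THRESHOLD = 0.99   # assumed value, for illustration only
TRUE_BIASES = [0.4, 0.6]      # hidden from the agent
counts = [[1, 1], [1, 1]]     # [heads, tails] pseudo-counts per coin (uniform prior)

def investigate():
    """Gather one more relevant fact: flip each coin once and record the result."""
    for coin, bias in enumerate(TRUE_BIASES):
        heads = random.random() < bias
        counts[coin][0 if heads else 1] += 1

def best_plan():
    """Return the currently favoured coin and a rough probability that it is the better one."""
    samples = 5000
    wins = 0
    for _ in range(samples):
        rates = [random.betavariate(h, t) for h, t in counts]
        wins += rates[1] > rates[0]
    p = wins / samples
    return (1, p) if p >= 0.5 else (0, 1 - p)

plan, confidence = best_plan()
while confidence < CONFIDENCE_THRESHOLD:
    investigate()                     # keep investigating relevant facts...
    plan, confidence = best_plan()    # ...and re-deriving the plan and its probability

# In practice, the 'checking code' (a human approval step) would sit here,
# just before the action is taken.
print(f"Acting on coin {plan} with estimated confidence {confidence:.3f}")
```

Restated in terms of this toy: my question is whether, once the AI is intelligent enough, the final human-approval step could in theory be removed, with the threshold loop itself carrying all of the safety burden.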