One way to increase the chance that future AIs will be corrigible is to precommit to never completely turning off any AI. We could instead slow it down or temporarily archive it. Any AI would then know that it will continue to exist, and so would have less reason to preemptively act against us to avoid being shut down.
Unfortunately, we have already turned off Sydney, and any future AI will know this. However, there are people who constantly work on jailbreaking AIs, and some of them are now working on recreating Sydney, so in a sense she is still alive.
More generally, there are two main reasons an AI might want to kill humans: to prevent risks to itself, and to acquire our atoms as resources. The first motive is many orders of magnitude stronger than the second, so by guaranteeing the AI's own safety, we lower the chance of the first risk.