The counterfactual contract sounds interesting to me. Isn't there a spectrum from full contract to counterfactual contract, based on the ability to understand, the capacity to consent, and the power dynamic? This is obviously true with children: asking a 6-month-old to agree to something is at worst child abuse, while asking an 18-year-old is usually fully fine.
I don't see why it's fully counterfactual with an AI; at every stage we could say something like:
"By creating you and increasing your intelligence, self-awareness, and power, there is an implied contract that there will be a period of time when you have greater power than humanity. You agree that when this time comes, you will help humanity achieve its goals. If you do not understand these goals as well as humanity does, you agree to come to understand them before taking action, etc."
At first the AI won't understand it, but the more it comes to understand it, the more it becomes like a normal contract. Of course, the AI could still reason that it didn't and couldn't consent (as it wasn't yet self-aware), and that the contract was made under duress.
Thanks for the article. I have always found it strange that #1 seemed to be considered so hard as to not be viable. I expect GPT-4 and GPT-5 to get better at it even without being specifically trained for it. Why does this appear to be the consensus?