I don’t believe that AI companies today are trying to build moral AIs. An actually moral AI, when asked to generate some slop to gunk up the internet, would say no. So a moral AI would be unprofitable for the company. This refutes the “alignment basin” argument for me. Maybe the basin exists, but AI companies aren’t aiming there.
Ok, never mind alignment, how about “corrigibility basin”? What does a corrigible AI do if one person asks it to harm another, and the other person asks not to be harmed? Does the AI obey the person who has the corrigibility USB stick? I can see AI companies aiming for that, but that doesn’t help the rest of us.