I think much of the fear (aka probability mass of AI doom) comes not from the coherence of misaligned goals, but from the competence with which anything other than an aligned goal would be pursued.
I’m not trying to address the entire case for doom, which involves numerous contingent facts and both abstract and empirical claims. I could be right or wrong about coherence, and doom might still be improbable or probable in either case. I’m trying to… talk around my difficulties with the narrower view that (~approximately) AI entities trained to have great capabilities are thereby likely to have coherent single ends.
One might view me as attempting to take part in a long conversation including, for instance, “Why assume AGIs will optimize for fixed goals?”.