John Danaher on ‘The Superintelligent Will’

Philosopher John Danaher has written an explication and critique of Bostrom’s “orthogonality thesis” from “The Superintelligent Will.” To quote the conclusion:

Summing up, in this post I’ve considered Bostrom’s discussion of the orthogonality thesis. According to this thesis, any level of intelligence is, within certain weak constraints, compatible with any type of final goal. If true, the thesis might provide support for those who think it possible to create a benign superintelligence. But, as I have pointed out, Bostrom’s defence of the orthogonality thesis is lacking in certain respects, particularly in his somewhat opaque and cavalier dismissal of normatively thick theories of rationality.

As it happens, none of this may affect what Bostrom has to say about unfriendly superintelligences. His defence of that argument relies on the convergence thesis, not the orthogonality thesis. If the orthogonality thesis turns out to be false, then all that happens is that the kind of convergence Bostrom alludes to simply occurs at a higher level in the AI’s goal architecture.

What might, however, be significant is whether the higher-level convergence is a convergence towards certain moral beliefs or a convergence toward nihilistic beliefs. If it is the former, then friendliness might be necessitated, not simply possible. If it is the latter, then all bets are off. A nihilistic agent could do pretty much anything, since no goals would be rationally entailed.