Denying the orthogonality thesis typically looks like claiming that an AI built with one set of values will tend to change those values in a particular direction as it becomes cleverer.
You can also argue that not all value-capacity pairs are stable or compatible with self-improvement.
Yeah, I was a bit fast and loose—there are plenty of other ways to deny the orthogonality thesis, I just focused on the one I think is most common in the wild.