Or maybe use a weaker version of the orthogonality thesis (and not call it “orthogonality”, which sounds like a claim of full independence). And also emphasize that there are multiple lines of argument, rather than getting stuck debating a particular one.
Right, at least mentioning that there is a more abstract argument that doesn’t depend on particular scenarios could be useful (for example, in Luke’s Facing the Singularity).
The robust part of “orthogonality” seems to be the idea that with most approaches to AGI (including neuromorphic or evolved designs, with very few exceptions such as WBE, which I wouldn’t call AGI, just faster humans with more dangerous tools for creating an AGI), it’s improbable that we end up with something close to human values, even if we try, and that greater optimization power of a design doesn’t address this issue (it only aggravates the consequences, potentially all the way to a fatal intelligence explosion). I don’t think it’s too early to draw this weaker conclusion (and stronger statements seem mostly irrelevant to the argument).
This version is essentially Eliezer’s “complexity and fragility of values”, right? I suggest we keep calling it that instead of “orthogonality”, which again sounds like too strong a claim and so makes people less likely to consider it seriously.
This version is essentially Eliezer’s “complexity and fragility of values”, right?
Basically, but there is a separate point here that greater optimization power doesn’t help with the problem and instead makes it worse. I agree that the word “orthogonality” is somewhat misleading.
David Dalrymple was nice enough to illustrate my concern with “orthogonality” just as we’re talking about it. :)
...which also presented an opportunity to make a consequentialist argument for FAI under the assumption that all AGIs are good.