The big thing Eliezer seems to believe, which I don’t think anyone in mainstream AI believes, is that shoving a consequentialist with preferences about the real world into your optimization algorithm is gonna be the key to making it a lot more powerful.
From the article you linked:
Point two: The class of catastrophe I’m worried about mainly happens when a system design is supposed to contain two consequentialists that are optimizing for different consequences, powerfully enough that they will, yes, backchain from Y to X whenever X is a means of influencing or bringing about Y, doing lookahead on more than one round, and so on. When someone talks about building a system design out of having two of *those* with different goals, and relying on their inability to collude, that is the point at which I worry that we’re placing ourselves into the sucker’s game of trying to completely survey a rich strategic space well enough to outsmart something smarter than us.
[emphasis mine]
The piece seems to be about how trying to control AI by dividing power among multiple agents is a bad idea, because then we’re doomed if those agents ever figure out how to collude with each other.
Why would you put two consequentialists in your system that are optimizing for different sets of consequences? A consequentialist is a high-level component, not a low-level one. Anthropomorphic bias might lead you to believe that a “consequentialist agent” is ontologically fundamental, a conceptual atom which can’t be divided. But this doesn’t really seem to be true from a software perspective: a consequentialist decomposes into smaller pieces, say a world model, a utility function, and a search procedure, and none of those pieces is itself an agent with preferences.
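To make that concrete, here’s a minimal sketch of the decomposition (all the names and the one-step-lookahead planner are hypothetical, purely for illustration, not anyone’s actual proposal):

```python
from typing import Callable, Iterable

# Hypothetical types: a "state" is any description of the world,
# an "action" is any label the planner can enumerate.
State = str
Action = str

def plan(
    state: State,
    actions: Iterable[Action],
    model: Callable[[State, Action], State],  # world model: predicts outcomes
    utility: Callable[[State], float],        # preferences over outcomes
) -> Action:
    """One-step lookahead: pick the action whose predicted outcome scores highest."""
    return max(actions, key=lambda a: utility(model(state, a)))

# None of the three ingredients is a consequentialist on its own:
# - `model` just predicts, with no preferences;
# - `utility` just scores states, with no ability to act;
# - `plan` just searches, with no goals of its own.
# "Consequentialist agent" only describes the composed system.

if __name__ == "__main__":
    # Toy example: a thermostat-like chooser.
    model = lambda s, a: {"heat": "warm", "cool": "cold", "wait": s}[a]
    utility = lambda s: {"warm": 1.0, "cold": -1.0, "room": 0.0}[s]
    print(plan("room", ["heat", "cool", "wait"], model, utility))  # -> "heat"
```

A real system would backchain over many steps rather than one, but the point stands: “consequentialist” names the composed pipeline, and you can swap out or delete any piece without touching the others, which is not how a conceptual atom behaves.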