Several times, I’ve heard Eliezer say something like “a powerful consequentialist AI could run on ‘is’ statements only, without any ‘ought’ statements”, and I don’t think I’ve ever heard him explain clearly what the difference is between the two categories of statements that he’s tracking.
The classical Humean distinction seems to posit that all “motivational force” is derived from “ought” statements, so it seems like he thinks about it differently than Hume did.
Has this been explained anywhere?
My model of Eliezer’s model wouldn’t say that. Link?
https://intelligence.org/2018/02/28/sam-harris-and-eliezer-yudkowsky/ and I also recall seeing this in some tweets.
What Hume observed is that there are some sentences that involve an “is,” some sentences that involve an “ought,” and if you start from sentences that only have “is” you can’t get sentences that involve “oughts” without an ought-introduction rule, or assuming some other previous “ought.” Like: it’s currently cloudy outside. That’s a statement of simple fact. Does it therefore follow that I shouldn’t go for a walk? Well, only if you previously have the generalization, “When it is cloudy, you should not go for a walk.” Everything that you might use to derive an ought would be a sentence that involves words like “better” or “should” or “preferable,” and things like that. You only get oughts from other oughts. That’s the Hume version of the thesis.
The way I would say it is that there’s a separable core of “is” questions. In other words: okay, I will let you have all of your “ought” sentences, but I’m also going to carve out this whole world full of “is” sentences that only need other “is” sentences to derive them.
Sam: I don’t even know that we need to resolve this. For instance, I think the is-ought distinction is ultimately specious, and this is something that I’ve argued about when I talk about morality and values and the connection to facts. But I can still grant that it is logically possible (and I would certainly imagine physically possible) to have a system that has a utility function that is sufficiently strange that scaling up its intelligence doesn’t get you values that we would recognize as good. It certainly doesn’t guarantee values that are compatible with our wellbeing. Whether or not “paperclip maximizer” is too specialized a case to motivate this conversation, there’s certainly something that we could fail to put into a superhuman AI that we really would want to put in so as to make it aligned with us.
Eliezer: I mean, the way I would phrase it is that it’s not that the paperclip maximizer has a different set of oughts, but that we can see it as running entirely on “is” questions. That’s where I was going with that. There’s this sort of intuitive way of thinking about it, which is that there’s this sort of ill-understood connection between “is” and “ought” and maybe that allows a paperclip maximizer to have a different set of oughts, a different set of things that play in its mind the role that oughts play in our mind.
Sam: But then why wouldn’t you say the same thing of us? The truth is, I actually do say the same thing of us. I think we’re running on “is” questions as well. We have an “ought”-laden way of talking about certain “is” questions, and we’re so used to it that we don’t even think they are “is” questions, but I think you can do the same analysis on a human being.
Eliezer: The question “How many paperclips result if I follow this policy?” is an “is” question. The question “What is a policy such that it leads to a very large number of paperclips?” is an “is” question. These two questions together form a paperclip maximizer. You don’t need anything else. All you need is a certain kind of system that repeatedly asks the “is” question “What leads to the greatest number of paperclips?” and then does that thing. Even if the things that we think of as “ought” questions are very complicated, disguised “is” questions that are influenced by which policies result in how many people being happy, and so on.
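To make that loop concrete, here is a minimal toy sketch of the kind of system Eliezer describes: an agent assembled entirely out of “is” questions, with no “ought” premise anywhere. Nothing here is from the transcript; the world representation, the candidate policies, and the names count_paperclips, predict_outcome, and choose_policy are all assumptions made up for the illustration.

```python
# Illustrative sketch only: an agent built from two "is" questions.

# "Is" question 1: how many paperclips does a given world state contain?
def count_paperclips(world_state):
    return world_state.get("paperclips", 0)

# "Is" question 2: what world state results if this policy is followed?
# (A stand-in for an arbitrary predictive world model.)
def predict_outcome(world_state, policy):
    return policy(world_state)

# The "maximizer" part: pick whichever candidate policy is predicted to
# yield the most paperclips. This is just prediction plus argmax; no
# "ought" statement appears anywhere in the loop.
def choose_policy(world_state, candidate_policies):
    return max(
        candidate_policies,
        key=lambda policy: count_paperclips(predict_outcome(world_state, policy)),
    )

# Toy example with two hypothetical policies over a toy world state.
world = {"paperclips": 10, "wire": 100}

def do_nothing(state):
    return dict(state)

def convert_wire_to_paperclips(state):
    return {"paperclips": state["paperclips"] + state["wire"], "wire": 0}

best = choose_policy(world, [do_nothing, convert_wire_to_paperclips])
print(best.__name__)  # -> convert_wire_to_paperclips
```

The agent only ever answers factual prediction questions and then executes the policy with the highest predicted count; whether its scoring function is a “good” one never enters into the computation.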
Ok, looking now at the transcript, it looks like he’s saying that wiring together certain “is” questions can produce “wanting” that we label “ought”. I think he’s prematurely deflating the argument, because, IIUC, in this ontology, the “ought” questions are about what “is” questions to have wired together in one’s brain.
I think what this is saying is that an agent doesn’t need to be able to reflect on its goals and decide that they’re the “right” ones in order to be capable/dangerous. It just has to be the sort of agent that pursues those goals. Stockfish will beat you without believing that it “ought” to play to win.