And not just about love. Creativity. Curiosity. Excitement. Autonomy. Other people. Morality. Our children’s children.
That’s a nice list, but also disturbing in a way. I hope that FAI’s understanding of “extrapolated human volition” doesn’t reduce to “pick the values that humans profess in public”.
Yay for scapegoating, rent seeking, humor at the expense of low status social groups, unreflective support of information cascades, asserting social dominance, solitaire, and adultery!
One of the implications of this is that if a superintelligence ever does work out humanity’s coherent extrapolated volition and develop the means to implement it, and for some inexplicable reason asked humanity to endorse the resulting plan before implementing it, humanity would presumably reject it… perhaps not outright, but it would contain too many things that we profess to abhor for most of us to endorse it out loud.
You’d get a million versions of “Hey, that’s a great start, but this bit here with the little kid in the basement in the middle of the city, could you maybe get rid of that part, and oh by the way paint the shed blue?” and the whole thing would die in committee.
The FAI would more likely implement the CEV progressively than all in one go. Any change too drastic at once would be rejected, but taken step by step it's much easier to accept.
Also, don't underestimate the persuasive power a super-intelligence would have. For the same reason an AI box would not work, a powerful enough AI (friendly or not) will find a way to persuade most of humanity to accept its plans: it'll understand where our rejections come from, find ways to counter and circumvent them, and use enough superstimulus, or offers of future superstimulus, to do so.
I completely agree about the ability to circumvent humanity’s objections, either by propaganda as you describe, or just by ignoring those objections altogether and doing what it thinks best. Of course, if for whatever reason the system were designed to require uncoerced consent before implementing its plans, it might not use that ability. (Designing it to require consent but to also be free to coerce that consent via superstimuli seems simply silly: neither safe nor efficient.)
Coercion is not binary. I was not thinking of the AI threatening to blow up Earth if we refuse the plan, or exposing us to a burst of superstimulus so intense we would do anything to get it again, or lying about its plans, nor any other such form of "cheating".
But even an AI forbidden to use those techniques, one required to obtain "uncoerced" consent in the sense of no lying, no threats, no creating addiction, and so on, would be able to present the plan (without lying about its content, even by omission) in such a way that we'd accept it relatively easily. Superstimulus, for example, doesn't need to be used to create addiction or to blackmail; it can simply be a natural, genuine consequence of accepting the plan. Things we might find horrifying because they are too alien would be presented through a clear analogy, or as the conclusion of a slow introductory path in which no step spans too much inferential distance.
I agree with you that, if a sufficiently powerful superintelligence is constrained to avoid any activities that a human would honestly classify as “coercion,” “threat,” “blackmail,” “addiction,” or “lie by omission” and is constrained to only induce changes in belief via means that a human would honestly classify as “natural” and “genuine,” it can nevertheless induce humans to accept its plan while satisfying those constraints.
I don’t think that prevents such a superintelligence from inducing humans to accept its plan through the use of means that would horrify us had we ever thought to consider them.
It’s also not at all clear to me that the fact that X would horrify me if I’d thought to consider it is sufficient grounds to reject using X.
Most of those seem to me to be things humans would not do much "if we knew more, thought faster, were more the people we wished we were, had grown up farther together", to take Eliezer's words as the definition of CEV. Those are things humans do now because they don't know enough (about game theory, fun theory, and so on), don't think through the consequences fast enough, suffer from various kinds of akrasia and are not "the people they wished they were", and haven't grown up far enough together.
That's one of the things I really like about CEV: it acknowledges that what most humans spontaneously do now is not what our CEV is.
Yay for scapegoating, rent seeking, humor at the expense of low status social groups, unreflective support of information cascades, asserting social dominance, solitaire, and adultery!
I needed to find the context, but having found it, I have to say this is the best comment I've seen all month! Deceptively insightful.
The immediate parent of the comment in question? You can imagine what it looked like seeing Will’s comment as it appears in the recent comments thread. There are at least three wildly different messages it could be conveying based on what it is a response to. This was the best case.
Yay for scapegoating, rent seeking, humor at the expense of low status social groups, unreflective support of information cascades, asserting social dominance, solitaire, and adultery!
That’s a nice list, but also disturbing in a way. I hope that FAI’s understanding of “extrapolated human volition” doesn’t reduce to “pick the values that humans profess in public”.
I know! I hope it is also able to pick up things like , and .
That’s a nice list, but also disturbing in a way. I hope that FAI’s understanding of “extrapolated human volition” doesn’t reduce to “pick the values that humans profess in public”.
Yay for scapegoating, rent seeking, humor at the expense of low status social groups, unreflective support of information cascades, asserting social dominance, solitaire, and adultery!
...and jaywalking. Don’t forget jaywalking.
One of the implications of this is that if a superintelligence ever does work out humanity’s coherent extrapolated volition and develop the means to implement it, and for some inexplicable reason asked humanity to endorse the resulting plan before implementing it, humanity would presumably reject it… perhaps not outright, but it would contain too many things that we profess to abhor for most of us to endorse it out loud.
You’d get a million versions of “Hey, that’s a great start, but this bit here with the little kid in the basement in the middle of the city, could you maybe get rid of that part, and oh by the way paint the shed blue?” and the whole thing would die in committee.
The FAI would more likely implement the CEV progressively than all in one go. Any change too drastic at once would be rejected, but taken step by step it's much easier to accept.
Also, don't underestimate the persuasive power a super-intelligence would have. For the same reason an AI box would not work, a powerful enough AI (friendly or not) will find a way to persuade most of humanity to accept its plans: it'll understand where our rejections come from, find ways to counter and circumvent them, and use enough superstimulus, or offers of future superstimulus, to do so.
I completely agree about the ability to circumvent humanity’s objections, either by propaganda as you describe, or just by ignoring those objections altogether and doing what it thinks best. Of course, if for whatever reason the system were designed to require uncoerced consent before implementing its plans, it might not use that ability. (Designing it to require consent but to also be free to coerce that consent via superstimuli seems simply silly: neither safe nor efficient.)
This thread has made it clear to me that designing a superintelligence that asks for consent before doing stuff is a bad idea. Thanks!
Coercion is not binary. I was not thinking of the AI threatening to blow up Earth if we refuse the plan, or exposing us to a burst of superstimulus so intense we would do anything to get it again, or lying about its plans, nor any other such form of "cheating".
But even an AI forbidden to use those techniques, one required to obtain "uncoerced" consent in the sense of no lying, no threats, no creating addiction, and so on, would be able to present the plan (without lying about its content, even by omission) in such a way that we'd accept it relatively easily. Superstimulus, for example, doesn't need to be used to create addiction or to blackmail; it can simply be a natural, genuine consequence of accepting the plan. Things we might find horrifying because they are too alien would be presented through a clear analogy, or as the conclusion of a slow introductory path in which no step spans too much inferential distance.
I agree with you that, if a sufficiently powerful superintelligence is constrained to avoid any activities that a human would honestly classify as “coercion,” “threat,” “blackmail,” “addiction,” or “lie by omission” and is constrained to only induce changes in belief via means that a human would honestly classify as “natural” and “genuine,” it can nevertheless induce humans to accept its plan while satisfying those constraints.
I don’t think that prevents such a superintelligence from inducing humans to accept its plan through the use of means that would horrify us had we ever thought to consider them.
It’s also not at all clear to me that the fact that X would horrify me if I’d thought to consider it is sufficient grounds to reject using X.
Most of those seem to me to be things humans would not do much "if we knew more, thought faster, were more the people we wished we were, had grown up farther together", to take Eliezer's words as the definition of CEV. Those are things humans do now because they don't know enough (about game theory, fun theory, and so on), don't think through the consequences fast enough, suffer from various kinds of akrasia and are not "the people they wished they were", and haven't grown up far enough together.
That's one of the things I really like about CEV: it acknowledges that what most humans spontaneously do now is not what our CEV is.
You can’t have your dynamic inconsistency and eat it too.
I needed to find the context, but having found it, I have to say this is the best comment I've seen all month! Deceptively insightful.
I’m not sure what you mean by context. Is there a specific reference at work here?
The immediate parent of the comment in question? You can imagine what it looked like seeing Will’s comment as it appears in the recent comments thread. There are at least three wildly different messages it could be conveying based on what it is a response to. This was the best case.
Ah. Not used to checking recent comments.
Don’t forget satire and sarcasm! ;-)
And, as we see in the case of what Will seems to be doing here: irony and accepting-optimistic-cynicism (we need a word for that).
How many upvoters do you reckon interpreted my comment the way you did versus the way Eby apparently did?
...Vs. interpreting it as non-ironic endorsement of the items you list.
Ironic versus non-ironic endorsement is somewhat blurred in this case I think.
It’s not a comprehensive list; it’s a few appealing things that are within one step of inferential distance of the target audience.
I know! I hope it is also able to pick up things like , and .