The Red Queen hypothesis means that humans are probably the latest step in a long sequence of fast (on an evolutionary time scale) value changes. So does Coherent Extrapolated Volition (CEV) intend to
1) extrapolate all the future co-evolutionary battles humans would have and predict the values of the terminal species as our CEV, or is it intended somehow to
2) freeze the values humans have at the point in time we develop FAI and build a cocoon around humanity that will let it keep this (nearly) arbitrarily chosen point in its evolution forever?
If it is 1), it seems the AI doesn’t have much of a job to do. Presumably it would intervene against existential risks to humanity and its successor species, and perhaps keep extremely reliable stocks for repopulation if humanity or a successor still manages to kill itself. In a less extreme interpretation, the FAI does what is required to keep humanity and its successors as the pinnacle species, stealing adaptations from unrelated species that actually manage to threaten us and our successors, so we sort of have 1′): extrapolate to a future where the pinnacle species is always a descendant of ours.
If 2), it would seem the FAI could simply build a sim that freezes in place the evolutionary pressures that brought us to this point, as well as freezing into place our own current state, and then run that sim forever. The sim simply removes genetic mutation and perhaps actively rebalances against any natural selection currently going on.
We could have BOTH futures: those who prefer 2) go live in the sim that they have always thought was indistinguishable from reality anyway, and those who prefer 1) stay here in the real world and play out their part in evolving whatever comes next. Indeed, the sim of 2) might serve as a form of storage and insurance against existential threats, a source from which human history can be restarted from its state at year 0 of the FAI whenever needed.
Does CEV crash into the Red Queen hypothesis in interesting ways? Could a human value be to roll the dice on our own values in hopes of developing an even more effective species?
Neither. CEV is supposed to look at what humanity would want if they were smarter, faster, and more the people they wished they were. It finds the end of the evolution of how we change if we are controlled by ourselves, not by the blind idiot god.
It finds the end of the evolution of how we change if we are controlled by ourselves, not by the blind idiot god.
Well, considering that at the point we create the FAI we are completely a product of the blind idiot god, and our CEV is some extrapolation of where that blind idiot had gotten us by the time we finally got the FAI going, it seems very difficult to me to say that the blind idiot god has been taken out of the picture at all.
I guess the idea is that, by us being smart and the FAI being even smarter, we are able to whittle down our values until we get rid of the froth: dopey things like being a virgin when you are married and never telling a lie. We move through the six stages of morality to the top one, the FAI discovers the next six or twelve stages, and it runs sims or something to cut away even more foam and crust until there are only one or two really essential things left.
Of course those one or two things were still placed there by the blind idiot god. And if something other than them had been placed there by the blind idiot, CEV would have come up with something else. It does not seem there is any escaping this blind idiot. So what is the value of a scheme whose appeal is the appearance of escaping the blind idiot, if the appearance is false?
We are not escaping the blind idiot god in the sense of it not having any control. We are escaping in the sense that we have full control. To some extent, the two overlap, but that doesn’t matter. I only care about being in control, not about everything else not being in control.
So what is the value of a scheme whose appeal is the appearance of escaping the blind idiot, if the appearance is false?
The value is in escaping the parts that harm us. Evolution made me enjoy chocolate, and evolution also made me grow old and die. I would love to have an eternal happy life. I don’t see any good reason to get rid of the chocolate, although I would accept trading it for something better.
CEV is supposed to refer to the values of current humans. However, this does not necessarily imply that an FAI would prevent the creation of non-human entities. I’d expect that many humans (including me) would assign some value to the existence of interesting entities with somewhat different (though not drastically different) values than ours, and the satisfaction of those values. Thus a CEV would likely assign some value to the preferences of a possible human successor species by proxy through our values.
Thus a CEV would likely assign some value to the preferences of a possible human successor species by proxy through our values.
An interesting question: is the CEV dynamic? As we spend decades or millennia in the walled gardens built for us by the FAI, would the FAI be allowed to drift its own values through some dynamic process of checking with the humans within its walls to see how their values might be drifting? I had been under the impression that it would not, but that might have been my own mistake.
No. CEV is the coherent extrapolation of what we-now value.
Edit: Dynamic value systems likely aren’t feasible for recursively self-improving AIs, since an agent with a dynamic goal system has incentive to modify into an agent with a static goal system, as that is what would best fulfill its current goals.
It’s not dynamic. It isn’t our values in the sense of what we’d prefer right now. It’s what we’d prefer if we were smarter, faster, and more the people that we wished we were. In short, it’s what we’d end up with if it was dynamic.
It’s not dynamic. It isn’t our values in the sense of what we’d prefer right now. It’s what we’d prefer if we were smarter, faster, and more the people that we wished we were. In short, it’s what we’d end up with if it was dynamic.
Unless the FAI freezes our current evolutionary state, at least as it involves our values, the result we would wind up with if CEV derivation were dynamic would be different from what we end up with if it is just some extrapolation from what current humans want now.
Even if there were some reason to think our current values were optimal for our current environment, which there is actually reason to think they are NOT, we would still have no reason to think they were optimal in a future environment.
Of course, being effectively kept in a really, really nice zoo by the FAI, we would not be experiencing any kind of NATURAL selection anymore, and evidence certainly suggests that our volition is to be taller, smarter, have bigger dicks and boobs, be blonder, tanner, and happier, all of which our zookeeper FAI should be able to move us (or our descendants) towards while carrying out necessary eugenics to keep our genome healthy in the absence of natural selection pressures. Certainly CEV keeps us from wanting defective, crippled, and genetically diseased children, so this seems a fairly safe prediction.
It would seem, as defined, that CEV would have to be fixed at the value it was set to when the FAI was created: that no matter how smart, how tall, how blond, how curvaceous or how pudendous we became, we would still be constantly pruned back to the CEV of 2045 humans.
As for our values not even being optimal for our current environment, fuhgedaboud our future environment: it is pretty widely recognized that we evolved for the hunter-gatherer world of 10,000 years ago, with familial groups of a few hundred, hostile reaction against outsiders as a necessity for survival, and systems which allow fear to distort our rational estimations of things in extreme ways.
I wonder if the FAI will be sad to not be able to see what evolution in its unlimited ignorance would have come up with for us? Maybe it will push a few other species to become intelligent and social, let them duke it out, and let natural selection run with them. As long as they were species that our CEV didn’t feel too warm and fuzzy about, this shouldn’t be a problem. And certainly, as a human in the walled garden, I would LOVE to study what evolution does beyond what it has done to us, so this would seem like a fine and fun thing for the FAI to do to keep at least my part of the CEV entertained.
Even if there were some reason to think our current values were optimal for our current environment, which there is actually reason to think they are NOT, we would still have no reason to think they were optimal in a future environment.
Type error. You can evaluate the optimality of actions in an environment with respect to values. Values being optimal with respect to an environment is not a thing that makes sense. Unless you mean to refer to whether or not our values are optimal in this environment with respect to evolutionary fitness, in which case obviously they are not, but that’s not very relevant to CEV.
all of which our zookeeper FAI should be able to move us (or our descendants) towards while carrying out necessary eugenics to keep our genome healthy in the absence of natural selection pressures.
An FAI can be far more direct than that. Think something more along the lines of “doing surgery to make our bodies work the way we want them to” than “eugenics”.
Type error. … Unless you mean to refer to whether or not our values are optimal in this environment with respect to evolutionary fitness, in which case obviously they are not, but that’s not very relevant to CEV.
You are right about the assumptions I made, and I tend to agree they are erroneous.
Your post helps me refine my concern about CEV. It must be that I am expecting the CEV will NOT reflect MY values. In particular, I am suggesting that the CEV will be too conservative, in the sense of over-valuing humanity as it currently is and therefore undervaluing humanity as it eventually would be with further evolution and further self-modification.
Probably what drives my fear of CEV not reflecting MY values is dopey and low-probability. In my case it is an aspect of “everything that comes from organized religion is automatically stupid.” To me, CEV and FAI are the modern dogma: man discovering his natural god does not exist, but deciding he can build his own. An all-loving (Friendly), all-powerful (self-modifying AI after FOOM) father figure to take care of us (totally bound by our CEV).
Of course there could be real reasons that CEV will not work. Is there any kind of existence proof for a non-trivial CEV? For the most part, values such as “lying is wrong,” “stealing is wrong,” and “help your neighbors” all seem like simplifying abstractions that are abandoned by the more intelligent because they are simply not flexible enough. The essence of nation-to-nation conflict is covert, illegal competition between powerful government organizations that takes place in the virtual absence of all values other than “we must prevail.” I would presume a nation which refused to fight dirty at any level would be less likely to prevail, so such high-mindedness would have no place in the future, and therefore no place in the CEV. That is, if I with normal-ish intelligence can see that most values are a simple map for how humanity should interoperate to survive, and that the map is not the territory, then an extrapolation to versions of us that were MUCH smarter would likely remove all the simple landmarks from maps suited to our current distribution of IQ.
Then consider the value much of humanity places on accomplishment, and the understanding that coddling, keeping as pets, keeping safe, and protecting are at odds with accomplishment; get really, really smart about that, and a CEV is likely to not have much in it about protecting us, even from ourselves.
So perhaps the CEV is a very sparse thing indeed, requiring only that humanity, its successors or assigns, survive. Perhaps FAI sits there not doing a whole hell of a lot that seems useful to us at our level of understanding, with its designers kicking it wondering where they went wrong.
I guess what I’m really getting at is that perhaps our CEV, when you use as much intelligence as you can to extrapolate where our values go in the long, long run, gets to the same place the blind idiot was going all along: survival. I understand many here will say no, you are missing the bad-versus-good things in our current life, how we can cheat death but keep our taste for chocolate. Their hypothesis is that CEV has them still cheating death and keeping their taste for chocolate. I am hypothesizing that CEV might well have the juggernaut of the evolution of intelligence, and not any of the individuals or even species that are parts of that evolution, as its central value. I am not saying I know it will; what I am saying is I don’t know why everybody else has already decided they can safely predict that even a human 100X or 1000X as smart as they are doesn’t crush them the way we crush a bullfrog when his stream is in the way of our road project or shopping mall.
Evolution may be run by a blind idiot, but it has gotten us this far. It is rare that something as obviously expensive as death would be kept in place for trivial reasons. Certainly the good news for those who hate death is that lifespans seem to be more valuable in smart species; I think we live about twice as long as trends across other species would suggest we should, so maybe the optimum continues to move in that direction. But considering how increased intelligence and understanding is usually the enemy of hatred, it seems at least a possibility that needs to be considered that CEV doesn’t even stop us from dying.
It must be that I am expecting the CEV will NOT reflect MY values. In particular, I am suggesting that the CEV will be too conservative in the sense of over-valuing humanity as it currently is and therefore undervaluing humanity as it eventually would be with further evolution, further self-modification.
CEV is supposed to value the same thing that humanity values, not value humanity itself. Since you and other humans value future slightly-nonhuman entities living worthwhile lives, CEV would assign value to them by extension.
Is there any kind of existence proof for a non-trivial CEV?
That’s kind of a tricky question. Humans don’t actually have utility functions, which is why the “coherent extrapolated” part is important. We don’t really know of a way to extract an underlying utility function from non-utility-maximizing agents, so I guess you could say that the answer is no. On the other hand, humans are often capable of noticing when it is pointed out to them that their choices contradict each other, and, even if they don’t actually change their behavior, can at least endorse some more consistent strategy, so it seems reasonable that a human, given enough intelligence, working memory, time to think, and something to point out inconsistencies, could come up with a consistent utility function that fits human preferences about as well as a utility function can. As far as I understand, that’s basically what CEV is.
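As a toy illustration of the “noticing contradictions” step (my own sketch, not anything from the CEV literature, and all names here are hypothetical): a set of pairwise preferences can be fit by a utility function exactly when the “preferred to” relation contains no cycles, and such cycles are mechanical to detect.

```python
# Toy sketch: pairwise preferences are representable by a utility
# function only if the "preferred to" relation has no cycles.
def find_preference_cycle(prefs):
    """prefs: list of (a, b) pairs meaning 'a is preferred to b'.
    Returns a cycle as a list of options (first == last), or None
    if the preferences are acyclic, so a utility ranking exists."""
    graph = {}
    for a, b in prefs:
        graph.setdefault(a, []).append(b)
        graph.setdefault(b, [])

    WHITE, GRAY, BLACK = 0, 1, 2   # unvisited, in progress, done
    color = {node: WHITE for node in graph}

    def dfs(node, path):
        color[node] = GRAY
        path.append(node)
        for nxt in graph[node]:
            if color[nxt] == GRAY:          # back edge: cycle found
                return path[path.index(nxt):] + [nxt]
            if color[nxt] == WHITE:
                cycle = dfs(nxt, path)
                if cycle:
                    return cycle
        color[node] = BLACK
        path.pop()
        return None

    for node in graph:
        if color[node] == WHITE:
            cycle = dfs(node, [])
            if cycle:
                return cycle
    return None

# Consistent preferences: chocolate > vanilla > nothing
assert find_preference_cycle([("choc", "van"), ("van", "none")]) is None
# Intransitive preferences: a > b > c > a admits no utility function
assert find_preference_cycle([("a", "b"), ("b", "c"), ("c", "a")]) is not None
```

Pointing out the returned cycle to the agent, and letting them revise one of the offending comparisons, is a crude analogue of the "notice the inconsistency, endorse a more consistent strategy" loop described above.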
CEV is likely to not have much in it about protecting us, even from ourselves.
Do you want to die? No? Then humanity’s CEV would assign negative utility to you dying, so an AI maximizing it would protect you from dying.
I am not saying I know it will; what I am saying is I don’t know why everybody else has already decided they can safely predict that even a human 100X or 1000X as smart as they are doesn’t crush them the way we crush a bullfrog when his stream is in the way of our road project or shopping mall.
If some attempt to extract a CEV has a result that is horrible for us, that means that our method for computing the CEV was incorrect, not that CEV would be horrible to us. In the “what would a smarter version of me decide?” formulation, that smarter version of you is supposed to have the same values you do. That might be poorly defined since humans don’t have coherent values, but CEV is defined as that which it would be awesome from our perspective for a strong AI to maximize, and using the utility function that a smarter version of ourselves would come up with is a proposed method for determining it.
Criticisms of the form “an AI maximizing our CEV would do bad thing X” involve a misunderstanding of the CEV concept. Criticisms of the form “no one has unambiguously specified a method of computing our CEV that would be sure to work, or even gotten close to doing so” I agree with.
My thought on CEV not actually including much individual protection went something like this: I don’t want to die. I don’t want to live in a walled garden, taken care of as though I were a favored pet. Apply intelligence to that, and my FAI does what for me? Mostly it lets me be, since it is smart enough to realize that a policy of protecting my life winds up turning me into a favored pet. This is sort of the distinction between asking someone what they want, where you might get stories of candy and leisure, and watching them when they are happiest, where you might see them doing meaningful and difficult work and living in a healthy manner. Apply high intelligence and you are unlikely to promote candy and leisure. Ultimately, I think humanity careening along on its very own planet as the peak species, creating intelligence in the universe where previously there was none, is very possibly as good as it can get for humanity, and I think it plausible an FAI would be smart enough to realize that; we might be surprised how little it seemed to interfere. I also think it is pretty hard, working part time, to predict what something 1000X smarter than I am will conclude about human values, so I hardly imagine what I am saying is powerfully convincing to anybody who doesn’t lean that way. I’m just explaining why or how an FAI could wind up doing almost nothing, i.e., how CEV could wind up being trivially empty in a way.
The other aspect of CEV being empty: I was not thinking of our own internal contradictions, although that is a good point. I was thinking of disagreement across humanity. Certainly we have seen broad ranges of valuations on human life and equality, and broadly different ideas about what respect should look like and what punishment should look like. These indicate to me that a human CEV, as opposed to a French CEV or even a Paris CEV, might well be quite sparse when designed to keep only what is reasonably common to all humanity and all potential humanity. If morality turns out to be more culturally determined than genetic, we could still have a CEV, but we would have to stop claiming it was human and admit it was just us, and when we said FAI we would mean friendly to us but unfriendly to you. The baby-eaters might turn out to be the Indonesians or the Inuit in this case.
I know how hard it is to reach consensus in a group of humans exceeding about 20, I’m just wondering how much a more rigorous process applied across billions is going to come up with.
we would still be constantly pruned back to the CEV of 2045 humans
Two connotational objections: 1) I don’t think that “constantly pruned back” is an appropriate metaphor for “getting everything you have ever desired.” The only thing that would prevent us from doing X would be the fact that, after reflection, we love non-X. 2) The extrapolated 2045 humans would probably be as different from the real 2045 humans as the 2045 humans are from the humans of 2045 BC.
I wonder if the FAI will be sad to not be able to see what evolution in its unlimited ignorance would have come up with for us?
Sad? Why, unless we program it to be? Also, with superior recursively self-improving intelligence it could probably make a good estimate of what would have happened in an alternative reality where all AIs were magically destroyed. But such an estimate would most likely be a probability distribution over many different possibilities, not one specific future.
I’m dubious about the extrapolation: the universe is more complex than the AI, and the AI may not be able to model how our values would change as a result of unmediated choices and experience.
I am not sure how obvious it is that there are multiple possible futures. Most likely, the AI would not be able to model all of them. However, without the AI, most of them wouldn’t happen anyway.
It’s like saying “if I don’t roll a die, I lose the chance of rolling a 6,” to which I add “and if you do roll the die, you still have a 5/6 probability of not rolling a 6.” Just to make it clear that by avoiding the “spontaneous” future of humankind, we are not avoiding one specific future magically prepared for us by destiny. We are avoiding a whole probability distribution, which contains many possible futures, both nice and ugly.
Just because the AI can only model something imperfectly, it does not mean that without the AI the future would be perfect, or even better on average than with the AI.
‘Unmediated’ may not have been quite the word to convey what I meant.
My impression is that CEV is permanently established very early in the AI’s history, but I believe that what people are and want (including what we would want if we knew more, thought faster, were more the people we wished we were, and had grown up closer together) will change, both because people will be doing self-modification and because they will learn more.
What I mean is that if you looked at what people valued, and gave them the ability to self-modify, and somehow kept them from messing up and accidentally doing something that they didn’t want to do, you’d have something like CEV but dynamic. CEV is the end result of this.
With random mutations and natural selection, old values can disappear and new values can appear in a population. The success of the new values depends only on their differential ability to keep their carriers producing children, not on their “friendliness” to the old values of the parents, which friendliness is what an FAI respecting CEV is meant to accomplish.
The Red Queen Hypothesis is (my paraphrase for the purposes of this post) that a lot of the evolution that takes place is not adaptation to the nonliving environment but to the living, and most importantly also evolving, environment in which we live, on which we feed, and which does its damnedest to feed on us. Imagine a set of smart primates who have already done pretty well against dumber animals by evolving more complex vocal and gestural signalling, and larger neocortices so that complex plans worthy of being communicated can be formulated and understood when communicated. But they lack the concept of handing off something they have with the expectation that they might get something they want even more in trade. THIS is essentially one of the hypotheses of Matt Ridley’s book “The Rational Optimist”: that homo sapiens is a born trader, while the other primates are not. Without trading, economies of scale and specialization do almost no good. With trading, economies of scale and specialization make a large energy investment in a super-hot brain and some wicked communication gear and skills really pay off.
Subspecies with the right mix of generosity, hypocrisy, selfishness, lust, power hunger, and self-righteousness will ultimately eat the lunch of their brethren and sistren who are too generous, too greedy to cooperate, too lustful to raise their children, or too complacent to seek out powerful mates. This is value drift, brought to you by the Red Queen.
Red Queen hypothesis means that humans are probably the latest step in a long sequence of fast (on evolutionary time scale) value changes. So does Coherent Extrapolated Volition (CEV) intend to
1) extrapolate all the future co-evolutionary battles humans would have and predict the values of the terminal species as our CEV, or is it intended somehow to
2) freeze the values humans have at the point in time we develop FAI and build a cocoon around humanity which will let it keep this (nearly) arbitrarily picked point in its evolution forever?
If it is 1), it seems the AI doesn’t have much of a job to do. Presumably interfere against existential risks to humanity and its successor species, perhaps keep extremely reliable stocks for repopulating if humanity or its successor manages still to kill itself. Maybe even in a less extreme interpretation, FAI does what is required to keep humanity and its successors as the pinnacle species, stealing adaptations from unrelated species that actually manage to threaten us and our successors, so we sort of have 1′) which is extrapolate to a future where the pinnacle species is always a descendant of ours.
If 2), it would seem FAI could simply build a sim that freezes in place the evolutionary pressures that brought us to this point as well as freezing in to place our own current state. And then run that sim forever, the sim simply removes genetic mutation from the sim and perhaps has active rebalancing to work against any natural selection which is currently going on.
We could have BOTH futures, those who prefer 2) go live in the Sim that they have always thought was indistinguishable from reality anyway, and those who prefer 1 stay here in the real world and play out their part in evolving whatever comes next. Indeed, the sim of 2) might serve as a form of storage/insurance against existential threats, a source from which human history can be restarted from its point at 0 year FAI whenever needed.
Does CEV crash in to Red Queen hypothesis in interesting ways? Could a human value be to roll the dice on our own values in hopes of developing an even more effective species?
Neither. CEV is supposed to look at what humanity would want if they were smarter, faster, and more the people they wished they were. It finds the end of the evolution of how we change if we are controlled by ourselves, not by the blind idiot god.
Well considering that we at the point we create the FAI are completely a product of the blind idiot god, and so our CEV is some extrapolation of where that blind idiot had gotten us to at the point we finally got the FAI going, it seems very difficult to me to say that the blind idiot god has at all been taken out of the picture.
I guess the idea is that by US being smart and the FAI being even smarter, we are able to whittle down our values until we get rid of the froth, dopey things like being a virgin when you are married and never telling a lie, move through the 6 stages of morality to the top one, the FAI discovers the next 6 or 12 stages and runs sims or something to cut even more foam and crust until there’s only one or two really essential things left.
Of course those one or two things were still placed there by the blind idiot god. And if something other than them had been placed by the blind idiot, CEV would have come up with something else. It does not seem there is any escaping this blind idiot. So what is the value of a scheme who’s appeal is the appearance of escaping the blind idiot if the appearance is false?
We are not escaping the blind idiot god in the sense if it not having any control. We are escaping in the sense that we have full control. To some extent, they overlap, but that doesn’t matter. I only care about being in control, not about everything else not being in control.
By luck, we got some things right. We don’t have to get rid of them just because we got them by a random process.
The value is in escaping the parts that harm us. Evolution made me enjoy chocolate, and evolution also made me grow old and die. I would love to have an eternal happy life. I don’t see any good reason to get rid of the chocolate; although I would accept to trade it for something better.
CEV is supposed to refer to the values of current humans. However, this does not necessarily imply that an FAI would prevent the creation of non-human entities. I’d expect that many humans (including me) would assign some value to the existence of interesting entities with somewhat different (though not drastically different) values than ours, and the satisfaction of those values. Thus a CEV would likely assign some value to the preferences of a possible human successor species by proxy through our values.
An interesting question, is the CEV dynamic? As we spent decades or millennia in the walled gardens built for us by the FAI would the FAI be allowed to drift its own values through some dynamic process of checking with the humans within its walls to see how its values might be drifting? I had been under the impression that it would not, but that might have been my own mistake.
No. CEV is the coherent extrapolation of what we-now value.
Edit: Dynamic value systems likely aren’t feasible for recursively self-improving AIs, since an agent with a dynamic goal system has incentive to modify into an agent with a static goal system, as that is what would best fulfill its current goals.
It’s not dynamic. It isn’t our values in the sense of what we’d prefer right now. It’s what we’d prefer if we were smarter, faster, and more the people that we wished we were. In short, it’s what we’d end up with if it was dynamic.
Unless the FAI freezes our current evolutionary state, at least as involves our values, the result we would wind up with if CEV derivation was dynamic would be different from what we would end up with if it is just some extrapolation from what current humans want now.
Even if there were some reason to think our current values were optimal for our current environment, which there is actually reason to think they are NOT, we would still have no reason to think they were optimal in a future environment.
Of course being effectively kept in a really really nice zoo by the FAI, we would not be experiencing any kind of NATURAL selection anymore, and evidence certainly suggests that our volition is to be taller, smarter, have bigger dicks and boobs, be blonder, tanner, and happier, all of which our zookeeper FAI should be able to move us (or our descendants) towards while carrying out necessary eugenics to keep our genome healthy in the absence of natural selection pressures. Certainly CEV keeps us from wanting defective, crippled, and genetically diseased children, so this seems a fairly safe prediction.
It would seem as defined that CEV would have to be fixed at the value it was set at when FAI was created. That no matter how smart, how tall, how blond, how curvaceous or how pudendous we became we would still be constantly pruned back to the CEV of 2045 humans.
As to our values not even being optimal for our current environment fuhgedaboud our future environment, it is pretty widely recognized that we are evolved for the hunter gatherer world of 10,000 years ago, with familial groups of a few hundred, the necessity for survival of hostile reaction against outsiders, and systems which allow fear to distort in extreme ways our rational estimations of things.
I wonder if the FAI will be sad to not be able to see what evolution in its unlimited ignorance would have come up with for us? Maybe they will push a few other species to become intelligent and social and let them duke it out and have natural selection run with them. As long as their species that our CEV didn’t feel too overly warm and fuzzy about this shouldn’t be a problem for them. And certain as a human in the walled garden I would LOVE to be studying what evolution does beyond what it has done to us, so this would seem like a fine and fun thing for the FAI to do to keep at least my part of the CEV entertained.
Type error. You can evaluate the optimality of actions in an environment with respect to values. Values being optimal with respect to an environment is not a thing that makes sense. Unless you mean to refer to whether or not our values are optimal in this environment with respect to evolutionary fitness, in which case obviously they are not, but that’s not very relevant to CEV.
An FAI can be far more direct than that. Think something more along the lines of “doing surgery to make our bodies work the way we want them to” than “eugenics”.
Do not anthropomorphize an AI.
You are right about the assumptions I made and I tend to agree it is erroneous.
Your post helps me refine my concern about CEV. It must be that I am expecting the CEV will NOT reflect MY values. In particular, I am suggesting that the CEV will be too conservative, in the sense of over-valuing humanity as it currently is and therefore undervaluing humanity as it eventually would be with further evolution and further self-modification.
Probably what drives my fear of CEV not reflecting MY values is dopey, low probability. In my case it is an aspect of “Everything that comes from organized religion is automatically stupid.” To me, CEV and FAI are the modern dogma, man discovering his natural god does not exist, but deciding he can build his own. An all-loving (Friendly) all powerful (self-modifying AI after FOOM) father-figure to take care of us (totally bound by our CEV).
Of course there could be real reasons that CEV will not work. Is there any kind of existence proof for a non-trivial CEV? For the most part, values such as “lying is wrong,” “stealing is wrong,” and “help your neighbors” all seem like simplifying abstractions that are abandoned by the more intelligent because they are simply not flexible enough. The essence of nation-to-nation conflict is covert, illegal competition between powerful government organizations that takes place in the virtual absence of all values other than “we must prevail.” I would presume a nation which refused to fight dirty at any level would be less likely to prevail, so such high-mindedness would have no place in the future, and therefore no place in the CEV. That is, if I, with normal-ish intelligence, can see that most values are a simple map for how humanity should interoperate in order to survive, and that the map is not the territory, then an extrapolation to a MUCH smarter humanity would likely remove all the simple landmarks on the maps suited to our current distribution of IQ.
Then consider the value much of humanity places on accomplishment, and the understanding that coddling, keeping as pets, keeping safe, and protecting are at odds with accomplishment. Get really, really smart about that, and a CEV is likely not to have much in it about protecting us, even from ourselves.
So perhaps the CEV is a very sparse thing indeed, requiring only that humanity, its successors or assigns, survive. Perhaps FAI sits there not doing a whole hell of a lot that seems useful to us at our level of understanding, with its designers kicking it wondering where they went wrong.
I guess what I’m really getting at is that perhaps our CEV, when you use as much intelligence as you can to extrapolate where our values go in the long, long run, ends up in the same place the blind idiot was going all along: survival. I understand many here will say no, you are missing the bad-versus-good things in our current life, how we can cheat death but keep our taste for chocolate. Their hypothesis is that CEV has them still cheating death and keeping their taste for chocolate. I am hypothesizing that CEV might well have the juggernaut of the evolution of intelligence, and not any of the individuals or even species that are parts of that evolution, as its central value. I am not saying I know it will; what I am saying is I don’t know why everybody else has already decided they can safely predict that even a human 100X or 1000X as smart as they are doesn’t crush them the way we crush a bullfrog when his stream is in the way of our road project or shopping mall.
Evolution may be run by a blind idiot, but it has gotten us this far. It is rare that something as obviously expensive as death would be kept in place for trivial reasons. Certainly the good news for those who hate death is that the evidence suggests lifespans are more valuable in smart species; I think we live about twice as long as trends across other species would suggest we should, so maybe the optimum continues to move in that direction. But considering how increased intelligence and understanding is usually the enemy of hatred, it seems at least a possibility worth considering that CEV doesn’t even stop us from dying.
CEV is supposed to value the same thing that humanity values, not value humanity itself. Since you and other humans value future slightly-nonhuman entities living worthwhile lives, CEV would assign value to them by extension.
That’s kind of a tricky question. Humans don’t actually have utility functions, which is why the “coherent extrapolated” part is important. We don’t really know of a way to extract an underlying utility function from non-utility-maximizing agents, so I guess you could say that the answer is no. On the other hand, humans are often capable of noticing when it is pointed out to them that their choices contradict each other, and, even if they don’t actually change their behavior, can at least endorse some more consistent strategy, so it seems reasonable that a human, given enough intelligence, working memory, time to think, and something to point out inconsistencies, could come up with a consistent utility function that fits human preferences about as well as a utility function can. As far as I understand, that’s basically what CEV is.
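The “humans don’t have utility functions” point can be made concrete with a toy check (the items and preference data below are invented purely for illustration): a set of pairwise choices admits a consistent utility function only if no cycle of preferences exists, i.e. only if some strict ranking agrees with every choice.

```python
from itertools import permutations

# An agent's raw pairwise choices: (a, b) means "a was chosen over b".
# This example contains a preference cycle, so no utility function fits it.
choices = [("cake", "fruit"), ("fruit", "nuts"), ("nuts", "cake")]

def is_consistent(choices):
    """True iff some strict ranking of the items agrees with every choice."""
    items = {x for pair in choices for x in pair}
    for order in permutations(items):
        rank = {x: i for i, x in enumerate(order)}  # lower rank = preferred
        if all(rank[a] < rank[b] for a, b in choices):
            return True
    return False

print(is_consistent(choices))                                 # False (cyclic)
print(is_consistent([("cake", "fruit"), ("fruit", "nuts")]))  # True (acyclic)
```

The brute-force search over rankings is only viable for toy cases; the point is just that “extrapolation” has to do something nontrivial whenever the raw choices are cyclic.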
Do you want to die? No? Then humanity’s CEV would assign negative utility to you dying, so an AI maximizing it would protect you from dying.
If some attempt to extract a CEV has a result that is horrible for us, that means that our method for computing the CEV was incorrect, not that CEV would be horrible to us. In the “what would a smarter version of me decide?” formulation, that smarter version of you is supposed to have the same values you do. That might be poorly defined since humans don’t have coherent values, but CEV is defined as that which it would be awesome from our perspective for a strong AI to maximize, and using the utility function that a smarter version of ourselves would come up with is a proposed method for determining it.
Criticisms of the form “an AI maximizing our CEV would do bad thing X” involve a misunderstanding of the CEV concept. Criticisms of the form “no one has unambiguously specified a method of computing our CEV that would be sure to work, or even gotten close to doing so” I agree with.
My thought on CEV not actually including much individual protection went something like this: I don’t want to die. I don’t want to live in a walled garden, taken care of as though I were a favored pet. Apply intelligence to that, and my FAI does what for me? Mostly lets me be, since it is smart enough to realize that a policy of protecting my life winds up turning me into a favored pet. This is the distinction between asking people what they want (you might get stories of candy and leisure) and watching them when they are happiest (you might see them doing meaningful and difficult work and living in a healthy manner). Apply high intelligence and you are unlikely to promote candy and leisure. Ultimately, I think humanity careening along on its very own planet as the peak species, creating intelligence in the universe where previously there was none, is very possibly as good as it can get for humanity, and I think it plausible an FAI would be smart enough to realize that; we might be surprised how little it seemed to interfere. I also think it is pretty hard, working part time, to predict what something 1000X smarter than I am will conclude about human values, so I hardly imagine what I am saying is powerfully convincing to anybody who doesn’t already lean that way. I’m just explaining why or how an FAI could wind up doing almost nothing, i.e. how CEV could wind up being trivially empty in a way.
The other aspect of CEV being empty I had in mind was not our own internal contradictions, although that is a good point, but disagreement across humanity. Certainly we have seen broad ranges of valuations on human life and equality, and broadly different ideas about what respect and punishment should look like. These indicate to me that a human CEV, as opposed to a French CEV or even a Paris CEV, might well be quite sparse when designed to keep only what is reasonably common to all humanity and all potential humanity. If morality turns out to be more culturally determined than genetic, we could still have a CEV, but we would have to stop claiming it was human and admit it was just us, and when we said FAI we meant friendly to us but unfriendly to you. The baby-eaters might turn out to be the Indonesians or the Inuit in this case.
I know how hard it is to reach consensus in a group of humans exceeding about 20, I’m just wondering how much a more rigorous process applied across billions is going to come up with.
You can just average across each individual.
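A minimal sketch of what “average across each individual” could mean, with entirely invented names and numbers (real aggregation proposals are far subtler, and simple averaging is itself a contested design choice):

```python
# Hypothetical individual utility functions over two candidate outcomes.
individual_utilities = {
    "alice": {"outcome_A": 1.0, "outcome_B": 0.2},
    "bob":   {"outcome_A": 0.1, "outcome_B": 0.9},
    "carol": {"outcome_A": 0.7, "outcome_B": 0.6},
}

def averaged_utility(outcome):
    """Aggregate utility of an outcome: unweighted mean over individuals."""
    vals = [u[outcome] for u in individual_utilities.values()]
    return sum(vals) / len(vals)

best = max(["outcome_A", "outcome_B"], key=averaged_utility)
print(best)  # outcome_A (mean 0.6 vs about 0.567)
```

Note this assumes the individual utilities are already on a comparable scale; interpersonal utility comparison is exactly the part such a sketch glosses over.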
Yes, “humanity” should be interpreted as referring to the current population.
Two connotational objections: 1) I don’t think that “constantly pruned back” is an appropriate metaphor for “getting everything you have ever desired”. The only thing that would prevent us from doing X would be the fact that after reflection we love non-X. 2) The extrapolated 2045 humans would be probably as different from the real 2045 humans, as the 2045 humans are different from the MINUS 2045 humans.
Sad? Why, unless we program it to be? Also, with superior recursively self-improving intelligence it could probably make a good estimate of what would have happened in an alternative reality where all AIs are magically destroyed. But such estimate would most likely be a probability distribution of many different possibilities, not one specific goal.
I’m dubious about the extrapolation: the universe is more complex than the AI, and the AI may not be able to model how our values would change as a result of unmediated choices and experience.
I am not sure how obvious is the part that there are multiple possible futures. Most likely, the AI would not be able to model all of them. However, without AI most of them wouldn’t happen anyway.
It’s like saying “if I don’t roll a die, I lose the chance of rolling 6”, to which I add “and if you do roll the die, you still have 5⁄6 probability of not rolling 6″. Just to make it clear that by avoiding the “spontaneous” future of humankind, we are not avoiding one specific future magically prepared for us by destiny. We are avoiding the whole probability distribution, which contains many possible futures, both nice and ugly.
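The die analogy works out numerically as follows (the three-way split of “spontaneous” futures is invented purely for illustration; only the 1/6 vs 5/6 arithmetic carries the point):

```python
from fractions import Fraction

# A hypothetical distribution over "spontaneous" futures. Rolling the die
# samples from this whole distribution, not from one preordained destiny.
spontaneous = {
    "nice":     Fraction(1, 6),
    "mediocre": Fraction(2, 6),
    "ugly":     Fraction(3, 6),
}

assert sum(spontaneous.values()) == 1
print(1 - spontaneous["nice"])  # 5/6 chance of NOT getting the nice future
```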
Just because AI can model something imperfectly, it does not mean that without the AI the future would be perfect, or even better on average than with the AI.
‘Unmediated’ may not have been quite the word to convey what I meant.
My impression is that CEV is permanently established very early in the AI’s history, but I believe that what people are and want (including what we would want if we knew more, thought faster, were more the people we wished we were, and had grown up closer together) will change, both because people will be doing self-modification and because they will learn more.
The overwhelming majority of dynamic value systems do not end in CEV.
What I mean is that if you looked at what people valued, and gave them the ability to self-modify, and somehow kept them from messing up and accidentally doing something that they didn’t want to do, you’d have something like CEV but dynamic. CEV is the end result of this.
What does the Red Queen hypothesis have to do with value change?
With random mutations and natural selection, old values can disappear and new values can appear in a population. The success of the new values depends only on their differential ability to keep their carriers supplied with children, not on their “friendliness” to the old values of the parents, which is what an FAI respecting CEV is meant to guarantee.
The Red Queen Hypothesis is (my paraphrase for the purposes of this post) that much of the evolution that takes place is adaptation not to the nonliving environment but to the living, and most importantly also evolving, environment in which we live, on which we feed, and which does its damnedest to feed on us. Imagine a set of smart primates who have already done pretty well against dumber animals by evolving more complex vocal and gestural signalling, and larger neocortices so that complex plans worthy of being communicated can be formulated and understood when communicated. But they lack the concept of handing off something they have with the expectation that they might get something they want even more in trade. THIS is essentially one of the hypotheses of Matt Ridley’s book “The Rational Optimist”: that Homo sapiens is a born trader, while the other primates are not. Without trading, economies of scale and specialization do almost no good. With trading and economies of scale and specialization, a large energy investment in a super-hot brain and some wicked communication gear and skills really pays off.
Subspecies with the right mix of generosity, hypocrisy, selfishness, lust, power hunger, and self-righteousness will ultimately eat the lunch of their brethren and sistern who are too generous, too greedy to cooperate, too lustful to raise their children, or too complacent to seek out powerful mates. This is value drift, brought to you by the Red Queen.