wedrifid comments on In favour of a selective CEV initial dynamic

wedrifid 24 Oct 2011 10:19 UTC
5 points
0

You are treading on treacherous moral ground! Your “jerk” may be my best mate (OK, he’s a bit intense… but you are no angel either!). Your “suicidal fanatic” may be my hero.

If so then I don’t want your volition extrapolated either. Because that would destroy everything I hold dear as well (given the extent to which you would either care about their dystopic values yourself or care about them getting those same values achieved).

Also, I can understand “I really don’t want the volition of ANYONE to be extrapolated in such a way as it could destroy all that I hold dear”

I obviously would prefer an FAI to extrapolate only MY volition. Any other preference is a trivial reductio to absurdity. The reason to support the implementation of an FAI that extrapolates more generally is so that I can cooperate with other people whose preferences are not too much different to mine (and in some cases may even resolve to be identical). Cooperative alliances are best formed with people with compatible goals and not those whose success would directly sabotage your own.

why pick on psychopaths, suicidal fanatics and jerks in particular?

Do I need to write a post “Giving a few examples does not assert a full specification of a set”? I’m starting to feel the need to have such a post to link to pre-emptively.
- D_Alex 27 Oct 2011 8:07 UTC
  0 points
  0
  Parent
  
  I don’t want your volition extrapolated either.
  
  You are a jerk!
  
  . . . .
  
  See where this approach gets us?
  - wedrifid 27 Oct 2011 9:53 UTC
    0 points
    0
    Parent
    
    See where this approach gets us?
    
    Not anywhere closer to understanding how altruism and morality apply to extrapolated volition for a start.
    
    Note that the conditions that apply to the quote but that are not included are rather significant. Approximately it is conditional on your volition being to help other agents do catastrophically bad things to the future light cone.
    
    What I am confident you do not understand is that excluding wannabe accomplices to Armageddon from the set of agents given to a CEV implementation does not even rule out (or even make unlikely) the resultant outcome taking into consideration all the preferences of those who are not safe to include (and just ignoring the obnoxiously toxic ones).
    - D_Alex 31 Oct 2011 5:22 UTC
      0 points
      0
      Parent
      
      What I am confident you do not understand is that excluding wannabe accomplices to Armageddon from the set of agents given to a CEV implementation does not even rule out (or even make unlikely) the resultant outcome taking into consideration all the preferences of those who are not safe to include (and just ignoring the obnoxiously toxic ones).
      
      I barely understand this sentence. Do you mean: Excluding “jerks” from CEV does not guarantee that their destructive preferences will not be included?
      
      If so, I totally do not agree with you, as my opinion is: Including “jerks” in CEV will not pose a danger, and saves the trouble of determining who is a “jerk” in the first place.
      
      This is based on the observation that “jerks” are a minority, an opinion that “EV-jerks” are practically non-existent, and an understanding that where a direct conflict exist between EV of a minority and EV of a majority, it is the EV of a majority that will prevail in the CEV. If you disagree with any of these, please elaborate, but use a writing style that does not exceed the comprehension abilities of an M. Eng.
      - wedrifid 31 Oct 2011 6:48 UTC
        1 point
        0
        Parent
        I hope you are right. But that is what it is, hope. I cannot know with any confidence that and Artificial Intelligence implementing CEV is Friendly. I cannot know if it will result in me and the people I care about continuing to live. It may result in something that, say, Robin Hanson considers desirable (and I would consider worse than simple extinction.)
        
        Declaring CEV to be optimal amounts to saying “I have faith that everyone is allright on the inside and we would all get along if we thought about it a bit more. Bullshit. That’s a great belief to have if you want to signal your personal ability to enforce cooperation in your social environment but not a belief that you want actual decision makers to have. Or, at least, not one you want them to simply assume without huge amounts of both theoretical and empirical research.
        
        (Here I should again refer you to the additional safeguards Eliezer proposed/speculated on for in case CEV results in Jerkiness. This is the benefit of being able to acknowledge that CEV isn’t good by definition. You can plan ahead just in case!)
        
        If you disagree with any of these, please elaborate, but use a writing style that does not exceed the comprehension abilities of an M. Eng.
        
        It is primarily a question of understanding (and being willing to understand) the content.
        
        This is based on the observation that “jerks” are a minority, an opinion that “EV-jerks” are practically non-existent
        
        You don’t know that. Particularly since EV is not currently sufficiently defined to make any absolute claims. EV doesn’t magically make people nice or especially cooperative unless you decide to hack in a “make nicer” component to the extrapolation routine.
        
        and an understanding that where a direct conflict exist between EV of a minority and EV of a majority, it is the EV of a majority that will prevail in the CEV
        
        You don’t know that either. The ‘coherence’ part of CEV is even less specified than the EV part. Majority rule is one way of resolving conflicts between competing agents. It isn’t the only one. But I don’t even know that AI> results in something I would consider Friendly. Again, there is a decent chance that it is not-completely-terrible but that isn’t something to count on without thorough research and isn’t an ideal to aspire to either. Just something that may need to be compromised down to.
        lessdazed 31 Oct 2011 7:52 UTC
        0 points
        0
        Parent
        
        The ‘coherence’ part of CEV is even less specified than the EV part.
        
        One possibility is the one inclined to shut down rather than do anything not neutral or better from every perspective. This system is pretty likely useless, but likely to be safe too, and not certainly useless. Variants allow some negatives, but I don’t know how one would draw a line—allowing everyone a veto and requiring negotiation with them would be pretty safe, but also nearly useless.
        
        EV doesn’t magically make people nice or especially cooperative
        
        I’m not sure exactly what you’re implying so I’ll state something you may or may not agree with. It seems likely it makes people more cooperative in some areas, and has unknown implications in other areas, so as to whether it makes them ultimately more or less cooperative, that is unknown. But the little we can see is of cooperation increasing, and it would be unreasonable to be greatly surprised in the event that were found to be the overwhelming net effect.
        
        But I don’t even know that AI> results in something I would consider Friendly.
        
        As most possible minds don’t care about humans, I object to using “unfriendly” to mean “an AI that would result in a world that I don’t value.” I think it better to use “unfriendly” to mean those minds indifferent to humans and the few hateful ones. Those that have value according to many but not all, such as perhaps those that seriously threaten to torture people, but only when they know those threatened will buckle, are better thought of as being a subspecies of Friendly AI.
        wedrifid 31 Oct 2011 13:09 UTC
        0 points
        0
        Parent
        
        As most possible minds don’t care about humans, I object to using “unfriendly” to mean “an AI that would result in a world that I don’t value.” I think it better to use “unfriendly” to mean those minds indifferent to humans and the few hateful ones. Those that have value according to many but not all, such as perhaps those that seriously threaten to torture people, but only when they know those threatened will buckle, are better thought of as being a subspecies of Friendly AI.
        
        I disagree. I will never refer to anything that wants to kill or torture me as friendly. Because that would be insane. AIs that are friendly to certain other people but not to me are instances of uFAIs in the same way that paperclippers are uFAIs (that are Friendly to paperclips). I incidentally also reject FAI and FAI. Although in the latter case I would still choose it as an alternative to nothing (which likely defaults to extinction).
        
        Mind you the nomenclature isn’t really sufficient to the task either way. I prefer to make my meaning clear of ambiguities. So if talking about “Friendly” AI that will kill me I tend to use the quotes that I just used while if I am talking about something that is Friendly to a specific group I’ll parameterize.
        lessdazed 31 Oct 2011 16:21 UTC
        0 points
        0
        Parent
        
        I will never refer to anything that wants to kill or torture me as friendly
        
        OK—this is included under what I would suggest to call “Friendly”, certainly if it only wanted to do so instrumentally, so we have a genuine disagreement. This is a good example for you to raise, as most even here might agree with how you put that.
        
        Nonetheless, my example is not included under this, so let’s be sure not to talk past each other. It was intended to be a moderate case, one in which you might not call something friendly when many others here would* - one in which a being wouldn’t desire to torture you, and would be bluffing if only in the sense that it had scrupulously avoided possible futures in which anyone would be tortured, if not in other senses (i.e. it actually would torture you, if you chose the way you won’t).
        
        As for not killing you, that sounds like an obviously badly phrased genie wish. As a similar point to the one you expressed would be reasonable and fully contrast with mine I’m surprised you added that.
        
        One can go either way (or other or both ways) on this labeling. I am apparently buying into the mind-projection fallacy and trying to use “Friendly” the way terms like “funny” or “wrong” are regularly used in English. If every human but me “finds something funny”, it’s often least confusing to say it’s “a funny thing that isn’t funny to me” or “something everyone else considers wrong that I don’t consider “wrong” (according to the simplest way of dividing concept-space) that is also advantageous for me”. You favor taking this new term and avoiding using the MPF, unlike for other English terms, and having it be understood that listeners are never to infer meaning as if the speaker was committing it, I favor just using it like any other term.
        
        So:
        
        Mind you the nomenclature isn’t really sufficient to the task either way
        
        My way, a being that wanted to do well by some humans and not others would be objectively both Friendly and Unfriendly, so that might be enough to make my usage inferior. But if my molecules are made out of usefulonium, and no one else’s are, I very much mind a being exploiting me for that, but wouldn’t mind other humans calling that being friendly when it uses the usefulonium to shield the Earth from a supernova, or whatever—and it’s not just not minding by comparison, either.
        
        *I mean both when others refer to beings making analogous threats to them and to the one that would make them to you.
      - lessdazed 31 Oct 2011 7:25 UTC
        0 points
        0
        Parent
        
        Do you mean: Excluding “jerks” from CEV does not guarantee that their destructive preferences will not be included?
        
        If so, I totally do not agree with you
        
        Through me, my dog is included. All the more so mothers’ sons!
        
        an understanding that where a direct conflict exist between EV of a minority and EV of a majority, it is the EV of a majority that will prevail in the CEV.
        
        I don’t think this is true, the safeguard that’s safe is to shut down if a conflict exists. That way, either things are simply better or no worse; judging between cases when each case has some advantages over the other is tricky.
- lessdazed 26 Oct 2011 13:33 UTC
  0 points
  0
  Parent
  
  If so then I don’t want your volition extrapolated either. Because that would destroy everything I hold dear
  
  How? As is, psychopaths have some influence, and I don’t consider the world worthless. Whatever their slice of a much larger pie, how would that be a difference in kind, something other than a lost opportunity?
  - wedrifid 26 Oct 2011 23:15 UTC
    4 points
    0
    Parent
    There is a reasonable good chance that when averaged out by the currently unspecified method used by the CEV process that any abominable volitions are offset by volitions that are at least vaguely acceptable. But that doesn’t mean including Jerks (where ‘Jerk’ is defined as agents whose extrapolated volitions are deprecated) in the process that determines the fate of the universe is The Right Thing To Do any more than including paperclippers, superhappies and babyeaters in the process is obviously The Right Thing To Do.
    
    CEV might turn out OK. Given the choice of setting loose a {Superintelligence Optimising CEV} or {Nothing At All nothing at all and we all go extinct} I’ll choose the former. There are also obvious political reasons why such a compromise might be necessary.
    
    If anyone thinks that CEV is not a worse thing to set loose than CEV then they are not being altruistic or moral they are being confused about a matter of fact.
    
    Disclaimer that is becoming almost mandatory in this kind of discussion: altruism, ethics and morality belong inside utility functions and volitions not in game theory or abstract optimisation processes.
    - lessdazed 27 Oct 2011 0:00 UTC
      0 points
      0
      Parent
      
      But that doesn’t mean including Jerks (where ‘Jerk’ is defined as agents whose extrapolated volitions are deprecated) in the process that determines the fate of the universe is The Right Thing To Do
      
      Sure, inclusion is a thing that causes good and bad outcomes, and not necessarily net good outcomes.
      
      There are also obvious political reasons why such a compromise might be necessary.
      
      Sure, but it’s not logically necessary that it’s a compromise, though it might be. It might be that the good outweighs the bad, or not, I’m not sure from where I stand.
      
      If anyone thinks that CEV is not a worse thing to set loose than CEV then they are not being altruistic or moral they are being confused about a matter of fact.
      
      Because I value inclusiveness more than zero, that’s not necessarily true. It’s probably true, or better yet, if one includes the best of the obvious Jerks with the rest of humanity, it’s quite probably true. All else equal, I’d rather an individual be in than out, so if someone is all else equal worse than useless but only light ballast, having them is a net good.
      
      Disclaimer
      
      It’s Adam and Eve, not Adam and Vilfredo Pareto!
      - wedrifid 27 Oct 2011 0:26 UTC
        0 points
        0
        Parent
        
        Disclaimer
        
        It’s Adam and Eve, not Adam and Vilfredo Pareto!
        
        Huh? Chewbacca?
        lessdazed 27 Oct 2011 0:59 UTC
        0 points
        0
        Parent
        I think your distinction is artificial, can you use it to show how an example question is a wrong question and another isn’t, and show how your distinction sorts among those two types well?
        wedrifid 27 Oct 2011 1:18 UTC
        0 points
        0
        Parent
        Your Adam and and Eve reply made absolutely no sense and this question makes only slightly more. I cannot relate what you are saying to the disclaimer that you partially quote (except one way that implies you don’t understand the subject matter—which I prefer not to assume). I cannot answer a question about what I am saying when I cannot see how on earth it is relevant.
- D_Alex 25 Oct 2011 3:41 UTC
  −1 points
  0
  Parent
  You missed my point 3 times out of 3. Wait, I’ll put down the flyswatter and pick up this hammer...:
  
  Excluding certain persons from CEV creates issues that CEV was intended to resolve in the first place. The mechanic you suggest—excluding persons that YOU deem to be unfit—might look attractive to you, but it will not be universally acceptable.
  
  Note that “our coherent extrapolated volition is our wish if we knew more, were smarter...” etc . The EVs of yourself and that suicidal fanatic should be pretty well aligned—you both probably value freedom, justice, friendship, security and like good food, sex and World of Warcraft(1)… you just don’t know why he believes that suicidal fanaticism is the right way under his circumstances, and he is, perhaps, not smart enough to see other options to strive for his values.
  
  Can I also ask you to re-read CEV, paying particular attention to Q4 and Q8 in the PAQ section? They deal with the instinctive discomfort of including everyone in the CEV.
  
  (1) that was a backhand with the flyswatter, which I grabbed with my left hand just then.
  - wedrifid 25 Oct 2011 6:35 UTC
    8 points
    0
    Parent
    
    Note that “our coherent extrapolated volition is our wish if we knew more, were smarter...” etc . The EVs of yourself and that suicidal fanatic should be pretty well aligned—you both probably value freedom
    
    No. I will NOT assume that extrapolating the volition of people with vastly different preferences to me will magically make them compatible with mine. The universe is just not that convenient. Pretending it is while implementing a FAI is suicidally naive.
    
    Can I also ask you to re-read CEV, paying particular attention to Q4 and Q8 in the PAQ section? They deal with the instinctive discomfort of including everyone in the CEV.
    
    I’m familiar with the document, as well as approximately everything else said on the subject here, even in passing. This includes Eliezer propozing ad-hoc work arounds to the “What if people are jerks?” problem.
    - D_Alex 25 Oct 2011 7:07 UTC
      −6 points
      0
      Parent
      
      No. I will NOT assume
      
      Quite right, don’t assume. Think it through. Then you may be less inclined to pepper your posts with non-sequiturs like “magically”, “pretending” and “naive”.
      
      I’m familiar with the document, as well as approximately everything else said on the subject here, even in passing.
      
      Great! But, IMHO, you have a tendency to miss the point. So:
      
      Can I also ask you to re-read CEV, paying particular attention to Q4 and Q8 in the PAQ section? They deal with the instinctive discomfort of including everyone in the CEV.
  - lessdazed 26 Oct 2011 13:26 UTC
    0 points
    0
    Parent
    
    be pretty well aligned
    
    What do you mean? As an analogy, .01% sure and 99.99% sure are both states of uncertainty. EVs are exactly the same or they aren’t. If someone’s unmuddled EV is different than mine—and it will be—I am better off with mine influencing the future alone rather than the future being influenced by both of us, unless my EV sufficiently values that person’s participation.
    
    My current EV places some non-infinite value on each person’s participation. You can assume for the sake of argument each person’s EV would more greatly value this.
    
    You can correctly assume that for each person, all else equal, I’d rather have them than not, (though not necessarily at the cost of having the universe diverted from my wishes) but I don’t really see why the death of most of the single ring species that is everything alive today makes selecting humans alone for CEV the right thing to do in a way that avoids the problem of excluding the disenfranchised whom the creators don’t care sufficiently about.
    
    If enough humans value what other humans want, and more so when extrapolated, it’s an interlocking enough network to scoop up all humans but the biologist who spends all day with chimpanzees (dolphins, octopuses, dogs, whatever) is going to be a bit disappointed by the first-order exclusion of his or her friends from consideration.
    - D_Alex 27 Oct 2011 7:51 UTC
      −4 points
      0
      Parent
      I mean, once they both take pains to understand each other’s situation and have a good, long think about it, they would find that they will agree on the big issues and be able to easily accommodate their differences. I even suspect that overall they would value the fact that certain differences exist.
      
      EVs can, of course, be exactly the same, or differ to some degree. But—provided we restrict ourselves to humans—the basic human needs and wants are really quite consistent across an overwhelming majority. There is enough material (on the web and in print) to support this.
      
      Wedrifid (IMO) is making a mistake of confusing some situation dependent subgoals (like say “obliterate Israel” or “my way or the highway”) with high level goals.
      
      I have not thought about extending CEV beyond human species, apart from taking into account the wishes of your example biologists etc. I suspect it would not work, because extrapolating wishes of “simpler” creatures would be impossible. See http://xkcd.com/605/.
      - wedrifid 28 Oct 2011 18:31 UTC
        1 point
        0
        Parent
        
        Wedrifid (IMO) is making a mistake of confusing some situation dependent subgoals (like say “obliterate Israel” or “my way or the highway”) with high level goals.
        
        You are mistaken. That I entertain no such confusion should be overwhelmingly clear from reading nearby comments.
      - TheOtherDave 27 Oct 2011 13:03 UTC
        1 point
        0
        Parent
        
        I have not thought about extending CEV beyond human species, apart from taking into account the wishes of your example biologists etc. I suspect it would not work, because extrapolating wishes of “simpler” creatures would be impossible.
        
        That sounds awfully convenient. If there really is a threshold of how “non-simple” a lifeform has to be to have coherently extrapolatable volitions, do you have any particular evidence that humans clear that threshold and, say, dolphins don’t?
        
        For my part, I suspect strongly that any technique that arrives reliably at anything that even remotely approximates CEV for a human can also be used reliably on many other species. I can’t imagine what that technique would be, though.
        
        (Just for clarity: that’s not to say one has to take other species’ volition into account, any more than one has to take other individuals’ volition into account.)
        D_Alex 31 Oct 2011 5:37 UTC
        2 points
        0
        Parent
        The lack of threshold is exactly the issue. If you include dolphins and chimpanzees, explicitly, you’d be in a position to apply the same reasoning to include parrots and dogs, then rodents and octopi, etc, etc.
        
        Eventually you’ll slide far enough down this slippery slope to reach caterpillars and parasitic wasps. Now, what would a wasp want to do, if it understood how its acts affect the other creatures worthy of inclusion in the CEV?
        
        This is what I see as the difficulty in extrapolating the wishes of simpler creatures. Perhaps in fact there is a coherent solution, but having only thought about this a little, I suspect there might not be one.
        lessdazed 31 Oct 2011 7:19 UTC
        1 point
        0
        Parent
        
        lack of threshold...then rodents...parasitic wasps
        
        We don’t have to care. If everyone or nearly all were convinced that something less than 20 pounds had no moral value, or a person less than 40 days old, or whatever, that would be that.
        
        Also, as some infinite sums have finite limits, I do not think that small things necessarily make summing humans’ or the Earth’s morality impossible.
        TheOtherDave 31 Oct 2011 18:09 UTC
        0 points
        0
        Parent
        Ah, OK. Sure, if your concern is that, if we extrapolated the volition of such creatures, we would find that they don’t cohere, I’m with you. I have similar concerns about humans, actually.
        
        I’d thought you were saying that we’d be unable to extrapolate it in the first place, which is a different problem.
  - pedanterrific 25 Oct 2011 3:48 UTC
    −2 points
    0
    Parent
    
    Can I also ask you to re-read CEV, paying particular attention to Q4 and Q8 in the PAQ section?
    
    Just, uh… just making sure: you do know that wedrifid has more fourteen thousand karma for a reason, right? It’s actually not solely because he’s an oldtimer, he can be counted on to have thought about this stuff pretty thoroughly.
    
    Edit: I’m not saying “defer to him because he has high status”, I’m saying “this is strong evidence that he is not an idiot.”
    - D_Alex 25 Oct 2011 4:13 UTC
      0 points
      0
      Parent
      I admit to being a little embarrassed as I wrote that paragraph, because this sort of thing can come across as “fuck you”. Not my intent at all, just that the reference is relevant, well written, supports my point—and is too long to quote.
      
      Having said that, your comment is pretty stupid. Yes, he has heaps more karma here—so what? I have more karma here than R. Dawkins and B. Obama combined!
      - pedanterrific 25 Oct 2011 4:22 UTC
        6 points
        0
        Parent
        (I prefer “Godspeed!”)
        
        The “so what” is, he’s already read it. Also, he’s, you know, smart. A bit abrasive (or more than a bit), but still. He’s not going to go “You know, you’re right! I never thought about it that way, what a fool I’ve been!”
        
        Edit: Discussed here.
        wedrifid 25 Oct 2011 6:19 UTC
        1 point
        0
        Parent
        
        A bit of an ethical egoist (or more than a bit), but still.
        
        I suppose “ethical egoism” fits. But only in some completely subverted “inclusive ethical egoist” sense in which my own “self interest” already takes into account all my altruistic moral and ethical values. ie. I’m basically not an ethical egoist at all. I just put my ethics inside the utility function where they belong.
        pedanterrific 25 Oct 2011 6:35 UTC
        2 points
        0
        Parent
        Duly noted! (I apologize for misconstruing you, also.)