Eliezer Yudkowsky comments on A Rationalist’s Tale

Eliezer Yudkowsky 11 Sep 2011 1:31 UTC
21 points
0
So I’m thinking to myself, around six years ago, “I can at least manage to publish timeless decision theory, right? That’s got to be around the safest idea I have, it couldn’t get any safer than that while still being at all interesting. I mean, yes, there’s these possible ways you could let these ideas eat your brain but who could possibly be smart enough to understand TDT and still manage to fall for that?”

Lesson learned.

I spent a year or so diligently studying rationality as a SingInst Visiting Fellow followed by realizing that I was a few levels above nearly any other aspiring rationalist.

And this is what several levels above me looks like? I’m not omnipotent, yet, but I have a deed or two to my name at this point; for example, when I write Harry Potter fanfiction, it reliably ends up as the most popular HP fanfiction on the Internet. (Those of you who didn’t get here following HPMOR can rule out selection effects at this point.) Several levels above me should make it noticeably easier to show your power in a third-party-noticeable fashion, and the fact that you can’t do so should cause you to question yourself.

It’s the opposite of the lesson I usually try to teach, but in this one case I’ll say it: it’s not the world that’s mad, it’s you.
- JoshuaZ 11 Sep 2011 20:26 UTC
  18 points
  0
  Parent
  
  And this is what several levels above me looks like? I’m not omnipotent, yet, but I have a deed or two to my name at this point; for example, when I write Harry Potter fanfiction, it reliably ends up as the most popular HP fanfiction on the Internet. (Those of you who didn’t get here following HPMOR can rule out selection effects at this point.) Several levels above me should make it noticeably easier to show your power in a third-party-noticeable fashion, and the fact that you can’t do so should cause you to question yourself.
  
  This doesn’t obviously follow to me. There are skill sets which aren’t due to rationality. Your own skill sets may be due in part to better writing capability and general intelligence.
  - homunq 12 Sep 2011 2:15 UTC
    12 points
    0
    Parent
    Mad skillz doesn’t imply rationality. Lack of demonstrable skillz does strongly decrease the probability of mad rashunalitea.
- Will_Newsome 11 Sep 2011 22:58 UTC
  13 points
  0
  Parent
  You misinterpreted me, I wasn’t claiming to be several levels above you. That’s my fault for being unclear.
- CronoDAS 11 Sep 2011 4:26 UTC
  12 points
  0
  Parent
  
  I mean, yes, there’s these possible ways you could let these ideas eat your brain but who could possibly be smart enough to understand TDT and still manage to fall for that?”
  
  Make something idiotproof and the universe will build a better idiot.
- cousin_it 12 Sep 2011 9:45 UTC
  10 points
  0
  Parent
  Don’t hold yourself responsible when people go funny in the head on TDT-related matters. Quantum mechanics and relativity have turned much more brains to mush, does that mean they shouldn’t have been published?
  - Vladimir_Nesov 12 Sep 2011 11:26 UTC
    12 points
    0
    Parent
    That would be a valid argument against, of course a relatively very weak one. Resist the temptation to make issues one-sided.
- Will_Newsome 11 Sep 2011 23:07 UTC
  10 points
  0
  Parent
  I got my intuitions from ADT, not TDT, and I would’ve gotten all the same ideas from Anna/Steve even if you hadn’t popularized decision theory. (The general theme had been around since Wei Dai in the early 2000′s, no?) So you shouldn’t learn that lesson to too great an extent.
- lessdazed 11 Sep 2011 2:05 UTC
  9 points
  0
  Parent
  Reading charitably, he may mean you are a rationalist, and the other visiting fellows were peer aspiring rationalists. Also, he did say “nearly.”
  - Will_Newsome 11 Sep 2011 23:03 UTC
    8 points
    0
    Parent
    Thanks; yeah, I wasn’t writing carefully, but I didn’t mean to say that “I am a significantly better rationalist than anybody else on the planet”, I meant to say “there are important subskills of rationality where I seem to be at roughly the SingInst Research Fellow level of rationality and high above the Less Wrong poster level of rationality”. My apologies for being so unclear.
- XiXiDu 11 Sep 2011 12:39 UTC
  5 points
  0
  Parent
  
  It’s the opposite of the lesson I usually try to teach, but in this one case I’ll say it: it’s not the world that’s mad, it’s you.
  
  I don’t think he is “mad”, at least not if you press him enough. A few weeks ago I posted the following comment on one of his Facebook submissions:
  
  Will, this off-topic, I’m curious. What would you do if 1.) any action would be ethically indifferent 2.) expected utility hypothesis was bunk 3.) all that really counted was what you want based on naive introspection?
  
  I’m asking because you (and others) seem to increasingly lose yourself in logical implications of maximizing expected utility and ethical considerations.
  
  Take care that you don’t confuse squiggles on paper with reality.
  
  His reply (emphasis mine):
  
  Alexander, I don’t think that’s a particularly good model of my actual reasoning. The simple arguments I have for thinking about what I think about don’t involve Pascalian reasoning or conjunctions of weird beliefs, and when it comes to policy I am one of the most vocal critics on LW of the unfortunate trend where otherwise smart people attempt to implement complicated policies due to the output of some incredibly brittle model, often without even taking into account opportunity costs or even considering any obviously better meta-level policies. That is insanity, and completely unrelated to any of the kinds of thinking that I do.
  
  The reasons for my current obsessions are pretty simple, though it’s worth noting that I am intentionally keeping my options very, very open.
  
  Seed AI appears to be very possible to engineer. “Provably”-FAI isn’t obviously possible to engineer given potential time constraints. If we could make a seed AI that was reflective enough, for example due strong founding in what Steve Rayhawk wants from a “Creatorless Decision Theory”, and we had strong arguments about attractors that such an agent might fall into, and we had reason to believe that it might converge on something like FAI, then there might come a time when we should launch such a seed AI, even without all the proofs—for example due to being in a politically or existentially volatile situation.
  
  Between BigNum-maximizer Goedel machine-like foomers and provably-FAI foomers, there’s a long continuum of AIs that are more or less reflective on the source of their utility function and what it means that some things rather than some other things caused that particular utility function to be there rather than some other one. The typical SingInst argument that a given AGI will be some kind of strict literalist with respect to what it thinks is its utility function is simply not very strong. In fact, it even contradicts Omohundro’s Basic AI Drives paper, which briefly addresses the topic: “For one thing, it has to make those objectives clear to itself. If its objectives are only implicit in the structure of a complex circuit or program, then future modifications are unlikely to preserve them. Systems will therefore be motivated to reflect on their goals and to make them explicit.” Some small amount of reflection would seem to open the door for arbitrarily large amounts of reflection, especially if the AI is simultaneously modifying its decision theory—obviously we’d rather avoid an argument of degree where unchained intuitions are allowed to run amok.
  
  We can make the debate more technical by looking at Goedel machines and program semantics. I have some relevant ideas but perhaps Schmidhueber’s talk about some Goedel machine implementations in a few days at AGI2011 will prove enlightening.
  
  I’m already losing steam, so we’ll just call that Part One. Part Two and maybe a Part Three will talk about: decision theories upon self-modification; decision theory in context; abstract models of optimization & morality; timeless control and game theory of the big red button; and probably other miscellaneous related ideas.
  
  But after all that I don’t really know how to answer your question. Wants… Even if somehow the thousand aversions that are shoulds were no longer supposed to compel me, they’d still be there, and I’d still be motivationally paralyzed, or whatever it is I am. I’d probably do the exact same things I’m doing now: living in Berkeley with my girlfriend, eating good food, regularly visiting some of the coolest people on Earth to talk about some of the most interesting ideas in all of history. All of that sounds pretty optimal as far as living on a budget of zero dollars goes. If the aversions were lifted, but I was still me, then I haven’t a good idea what I’d do. I’d be happy to immerse myself in the visual arts community, perhaps, or if I thought I could be brilliant I’d revolutionize music cognition and write by far the best artificial composer algorithms. I’d go to various excellent universities for a year or two, and if somehow I found an easy way to make money along the way, e.g. with occasional programming jobs, then I’d frequently travel to Europe and then Asia. I imagine I’d spent very many months in Germany, especially Bavaria. Walking along green mountains or resting under trees in meadow orchards, ideally with a MacBook Pro and a drawing tablet handy. I’d do much meditation and probably progress very quickly, and at some point I expect I’d develop a sort of self-refuge. But I don’t know, I’m just saying things that sound nice as if can’t have, and I may very well end up doing most of them no matter what future I lead.
  
  It seems to me that he’s still with the rest of humanity when it comes to what he is doing on a daily basis and his underlying desires.
  - Vladimir_Nesov 11 Sep 2011 13:51 UTC
    11 points
    0
    Parent
    
    I don’t think he is “mad”, at least not if you press him enough.
    
    (You argue that the madness in question, if present, is compartmentalized. The intended sense of “madness” (normal use on LW) includes the case of compartmentalized madness, so your argument doesn’t seem to disagree with Eliezer’s position.)
    - Will_Newsome 11 Sep 2011 23:59 UTC
      2 points
      0
      Parent
      ((For those who haven’t seen it yet: http://lesswrong.com/lw/2q6/compartmentalization_in_epistemic_and/ ))
  - FeepingCreature 22 Nov 2011 16:40 UTC
    0 points
    0
    Parent
    Belatedly.
    
    “For one thing, it has to make those objectives clear to itself. If its objectives are only implicit in the structure of a complex circuit or program, then future modifications are unlikely to preserve them. Systems will therefore be motivated”
    
    Hold on. Motivated by what? If its objectives are only implicit in the structure, then why would these objectives include their self-preservation?
- Will_Newsome 11 Sep 2011 23:56 UTC
  0 points
  0
  Parent
  BTW, this is neat: http://arxiv.org/PS_cache/arxiv/pdf/0804/0804.3678v1.pdf
  
  It’s an attempt to better unify causal graphs with algorithmic information. The sections about various Markov properties is I think very important for explaining differences between CDT and TDT, ’cuz you can talk more clearly about exactly where a decision problem can’t be solved due to Markov condition limitations.