Fake Fake Utility Functions

Followup to: Most of my posts over the last month...

Every now and then, you run across someone who has discovered the One Great Moral Principle, of which all other values are a mere derivative consequence.

I run across more of these people than you do. Only in my case, it’s people who know the amazingly simple utility function that is all you need to program into an artificial superintelligence and then everything will turn out fine...

It’s incredible how one little issue can require so much prerequisite material. My original schedule called for “Fake Utility Functions” to follow “Fake Justification” on October 31.

Talk about your planning fallacy. I’ve been planning to post on this topic in “just a few days” for the past month. A fun little demonstration of underestimated inferential distances.

You see, before I wrote this post, it occurred to me that if I wanted to properly explain the problem of fake utility functions, it would be helpful to illustrate a mistake about what a simple optimization criterion implied. The strongest real-world example I knew was the Tragedy of Group Selectionism. At first I thought I’d mention it in passing, within “Fake Utility Functions”, but I decided the Tragedy of Group Selectionism was a long enough story that it needed its own blog post...

So I started to write “The Tragedy of Group Selectionism”. A few hours later, I noticed that I hadn’t said anything about group selectionism yet. I’d been too busy introducing basic evolutionary concepts. Select all the introductory stuff, cut, Compose New Post, paste, title… “An Alien God”. Then keep writing until the “Alien God” post gets too long, and start taking separate subjects out into their own posts: “The Wonder of Evolution”, “Evolutions Are Stupid”, and at this point it became clear that, since I was planning to say a few words on evolution anyway, this was the time to do it. Besides, a basic familiarity with evolution would help to shake people loose of their human assumptions when it came to visualizing nonhuman optimization processes.

So, finally I posted “The Tragedy of Group Selectionism”. Now I was ready to write “Fake Utility Functions”, right? The post that was supposed to come immediately afterward? So I thought, but each time I tried to write the post, I ended up recursing on a prerequisite post instead. Such as “Fake Selfishness”, “Fake Morality”, and “Fake Optimization Criteria”.

When I got to “Fake Optimization Criteria”, I really thought I could do “Fake Utility Functions” the next day. But then it occurred to me that I’d never explained why a simple utility function wouldn’t be enough. We are a thousand shards of desire, as I said in “Thou Art Godshatter”. Only that first required discussing “Evolutionary Psychology”, which required explaining that human minds are “Adaptation-Executers, not Fitness-Maximizers”, plus the difference between “Protein Reinforcement and DNA Consequentialism”.

Furthermore, I’d never really explained the difference between “Terminal Values and Instrumental Values”, without which I could hardly talk about utility functions.

Surely now I was ready? Yet I thought about conversations I’d had over the years, and how people seem to think a simple instruction like “Get my mother out of that burning building!” contains all the motivations that shape a human plan to rescue her, so I thought that first I’d do “The Hidden Complexity of Wishes”. But, really, the hidden complexity of planning, and all the special cases needed to patch the genie’s wish, was part of the general problem of recording outputs without absorbing the process that generates the outputs—as I explained in “Artificial Addition” and “Truly Part Of You”. You don’t want to keep the local goal description and discard the nonlocal utility function: “Leaky Generalizations” and “Lost Purposes”.

Plus it occurred to me that evolution itself made an interesting genie, so, before all that, came “Conjuring An Evolution To Serve You”.

One kind of lost purpose is artificial pleasure, and “happiness” is one of the Fake Utility Functions I run into most often: “Not for the Sake of Happiness (Alone)”. Similarly, it was worth taking the time to establish that fitness is not always your friend (“Evolving to Extinction”) and that not everything in the universe is subject to significant selection pressures (“No Evolutions for Corporations or Nanodevices”), to avoid the Fake Utility Function of “genetic fitness”.

Right after “Lost Purposes” seemed like a good time to point out the deep link between keeping track of your original goal and keeping track of your original question: “Purpose and Pragmatism”.

Into the home stretch! No, wait, this would be a good time to discuss “Affective Death Spirals”, since that’s one of the main things that goes wrong when someone discovers The One True Valuable Thingy—they keep finding nicer and nicer things to say about it. Well, you can’t discuss affective death spirals unless you first discuss “The Affect Heuristic”, but I’d been meaning to do that for a while anyway. “Evaluability” illustrates the affect heuristic and leads to an important point about “Unbounded Scales and Futurism”. The second key to affective death spirals is “The Halo Effect”, which we can see illustrated in “Superhero Bias” and “Mere Messiahs”. Then it’s on to affective death spirals and how to “Resist the Happy Death Spiral” and “Uncritical Supercriticality”.

A bonus irony is that “Fake Utility Functions” isn’t a grand climax. It’s just one of many Less Wrong posts relevant to my AI work, with plenty more scheduled. This particular post just turned out to require a little more prerequisite material which—I thought on each occasion—I would have to write anyway, sooner or later.

And that’s why blogging is difficult, and why it is necessary, at least for me. I would have been doomed, yea, utterly doomed, if I’d tried to write all this as one publication rather than as a series of blog posts. One month is nothing for this much material.

But now, it’s done! Now, after only slightly more than an extra month of prerequisite material, I can do the blog post originally scheduled for November 1st!

Except...

Now that I think about it...

This post is pretty long already, right?

So I’ll do the real “Fake Utility Functions” tomorrow.