The Stamp Collector

I’m writing a series of posts about replacing guilt motivation over on MindingOurWay, and I plan to post the meatier / more substantive posts in that series to LessWrong. This one is an allegory designed to remind people that they are allowed to care about the outer world, that they are not cursed to only ever care about what goes on in their heads.


Once upon a time, a group of naïve philosophers found a robot that collected trinkets. Well, more specifically, the robot seemed to collect stamps: if you presented this robot with a choice between various trinkets, it would always choose the option that led towards it having as many stamps as possible in its inventory. It ignored dice, bottle caps, aluminum cans, sticks, twigs, and so on, except insofar as it predicted they could be traded for stamps in the next turn or two. So, of course, the philosophers started calling it the “stamp collector.”

Then, one day, the philosophers discovered computers, and deduced that the robot was merely a software program running on a processor inside the robot’s head. The program was too complicated for them to understand, but they did manage to deduce that the robot only had a few sensors (on its eyes and inside its inventory) that it was using to model the world.

One of the philosophers grew confused, and said, “Hey wait a sec, this thing can’t be a stamp collector after all. If the robot is only building a model of the world in its head, then it can’t be optimizing for its real inventory, because it has no access to its real inventory. It can only ever act according to a model of the world that it reconstructs inside its head!”

“Ah, yes, I see,” another philosopher answered. “We did it a disservice by naming it a stamp collector. The robot does not have true access to the world, obviously, as it is only seeing the world through sensors and building a model in its head. Therefore, it must not actually be maximizing the number of stamps in its inventory. That would be impossible, because its inventory is outside of its head. Rather, it must be maximizing its internal stamp counter inside its head.”

So the naïve philosophers nodded, pleased with this, and then they stopped wondering how the stamp collector worked.


There are a number of flaws in this reasoning. First of all, these naïve philosophers have made the homunculus error. The robot’s program may not have had “true access” to how many stamps were in its inventory (whatever that means), but it also didn’t have “true access” to its internal stamp counter.

The robot is not occupied by some homunculus that has dominion over the innards but not the outards! The abstract program doesn’t have “true” access to the register holding the stamp counter and “fake” access to the inventory. Steering reality towards regions where the inventory has lots of stamps in it is the same sort of thing as steering reality towards regions where the stamp-counter-register has high-number-patterns in it. There’s not a magic circle containing the memory but not the inventory, within which the robot’s homunculus has dominion; the robot program has just as little access to the “true hardware” as it has to the “true stamps.”

This brings us to the second flaw in their reasoning, that of trying to explain choice with a choice-thing. You can’t explain why a wall is red by saying “because it’s made of tiny red atoms;” this is not an explanation of red-ness. In order to explain red-ness, you must explain it in terms of non-red things. And yet, humans have a bad habit of explaining confusing things in terms of themselves. Why does living flesh respond to mental commands, while dead flesh doesn’t? Why, because the living flesh contains Élan Vital. Our naïve philosophers have made the same mistake: they said, “How can it possibly choose outcomes in which the inventory has more stamps? Aha! It must be by choosing outcomes in which the stamp counter is higher!” and in doing so, they have explained choice in terms of choice, rather than in terms of something more basic.

It is not an explanation to say “it’s trying to get stamps into its inventory because it’s trying to maximize its stamp-counter.” An explanation would look more like this: the robot’s computer runs a program which uses sense-data to build a model of the world. That model of the world contains a representation of how many stamps are in the inventory. The program then iterates over some set of available actions, predicts how many stamps would be in the inventory (according to the model) if it took that action, and outputs the action which leads to the most predicted stamps in its possession.
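To make that concrete, here is a minimal toy sketch of the kind of program just described. This is not the robot’s actual code; the names (build_model, predict) and the numbers are made up for illustration. The point is only that actions get rated by the stamps the program predicts they lead to:

```python
# A toy stamp collector: build a model from sense data, predict the
# outcome of each available action, and output the action whose
# predicted outcome contains the most stamps in the inventory.

def build_model(sense_data):
    # The "model" is just a dict of believed quantities, reconstructed
    # from whatever the sensors report.
    return {"stamps_in_inventory": sense_data["stamps_seen_in_inventory"]}

def predict(model, action):
    # Predict the world after taking `action`; each toy action is a
    # (description, stamps_gained) pair.
    _description, stamps_gained = action
    predicted = dict(model)
    predicted["stamps_in_inventory"] += stamps_gained
    return predicted

def choose_action(sense_data, available_actions):
    model = build_model(sense_data)
    # Rate each action by what it achieves (predicted stamps in the
    # inventory), not by any property of the action itself.
    return max(available_actions,
               key=lambda a: predict(model, a)["stamps_in_inventory"])

# Trinkets are ignored except insofar as they lead to stamps:
actions = [("take a bottle cap", 0), ("take one stamp", 1), ("take two stamps", 2)]
print(choose_action({"stamps_seen_in_inventory": 5}, actions))
# -> ('take two stamps', 2)
```

Nothing in this loop cares about any particular memory location; the only thing being maximized is a prediction about the inventory.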

We could also postulate that the robot contains a program which models the world, predicts how the world would change for each action, and then predicts how that outcome would affect some specific place in internal memory, and then selects the action which maximizes the internal counter. That’s possible! You could build a machine like that! It’s a strictly more complicated hypothesis, and so it gets a complexity penalty, but at least it’s an explanation!
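That alternative can be sketched in the same toy style, reusing build_model and predict from above. Again, this is purely illustrative, not a claim about the robot’s actual program; the point is only that this version maximizes a predicted register value rather than predicted stamps:

```python
# The more complicated hypothesis: predict the world, then predict what
# value one specific internal memory location (the stamp counter) would
# hold in that world, and maximize that value instead.

def predict_counter(model, action, tampering):
    # `tampering` maps action descriptions to the counter value an
    # outside party would write; absent tampering, the counter is
    # assumed to track the predicted inventory.
    predicted_world = predict(model, action)
    return tampering.get(action[0], predicted_world["stamps_in_inventory"])

def choose_action_counter_maximizer(sense_data, available_actions, tampering):
    model = build_model(sense_data)
    # Select whichever action maximizes the predicted counter value,
    # regardless of what happens to the actual inventory.
    return max(available_actions,
               key=lambda a: predict_counter(model, a, tampering))
```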

And, fortunately for us, it’s a testable explanation: we can check what the robot does, when faced with the opportunity to directly increase the stamp-counter-register (without actually increasing how many stamps it has). Let’s see how that goes over among our naïve philosophers…
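Before we do, here is how the two toy programs above come apart on exactly that check. The setup and numbers are, as before, made up:

```python
# The offer: take one stamp, or take zero stamps while the experimenter
# bumps the internal stamp counter by ten.
sense_data = {"stamps_seen_in_inventory": 5}
offer = [("take one stamp", 1), ("have the counter bumped by ten", 0)]
tampering = {"have the counter bumped by ten": 5 + 10}

print(choose_action(sense_data, offer))
# -> ('take one stamp', 1): the stamp collector ignores the counter bump.

print(choose_action_counter_maximizer(sense_data, offer, tampering))
# -> ('have the counter bumped by ten', 0): only the counter-maximizer
#    takes the tampering deal.
```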


Hey, check it out: I identified the stamp counter inside the robot’s memory. I can’t read it, but I did find a way to increase its value. So I gave the robot the following options: take one stamp, or take zero stamps and I’ll increase the stamp counter by ten. Guess which one it took?

“Well, of course, it would choose the latter!” one of the naïve philosophers answers immediately.

Nope! It took the former.

“… Huh! That means that the stampyness of refusing to have the stamp counter tampered with must be worth more than 10 stamps!”

Huh? What is “stampyness”?

“Why, stampyness is the robot’s internal measure of how much taking a certain action would increase its stamp counter.”

What? That’s ridiculous. I’m pretty sure it’s just collecting stamps.

“Impossible! The program doesn’t have access to how many stamps it really has; that’s a property of the outer world. The robot must be optimizing according to values that are actually in its head.”

Here, let’s try offering it the following options: either I’ll give it one stamp, or I’ll increase its stamp counter by Ackermann(g64, g64) — oh look, it took the stamp.

“Wow! That was a very big number, so that almost surely means that the stampyness of refusing is dependent upon how much stampyness it’s refusing! It must be very happy, because you just gave it a lot of stampyness by giving it such a compelling offer to refuse.”

Oh, here, look, I just figured out a way to set the stamp counter to maximum. Here, I’ll try offering it a choice between either (a) one stamp, or (b) I’ll set the stamp counter to maxi — oh look, it already took the stamp.

“Incredible! There must be some other counter measuring micro-stampyness, the amount of stampyness it gets immediately upon selecting an action, before you have a chance to modify it! Ah, yes, that’s the only possible explanation for why it would refuse your setting the stamp counter to maximum; it must be choosing according to the perceived immediate micro-stampyness of each available action! Nice job doing science, my dear fellow, we have learned a lot today!”


Ahh! No! Let’s be very clear about this: the robot is predicting which outcomes would follow from which actions, and it’s ranking them, and it’s taking the actions that lead to the best outcomes. Actions are rated according to what they achieve. Actions do not themselves have intrinsic worth!

Do you see where these naïve philosophers went wrong? They have postulated an agent which treats actions like ends, and tries to steer towards whatever action it most prefers — as if actions were ends unto themselves.

You can’t explain why the agent takes an action by saying that it ranks actions according to whether or not taking them is good. That begs the question of which actions are good!

This agent rates actions as “good” if they lead to outcomes where the agent has lots of stamps in its inventory. Actions are rated according to what they achieve; they do not themselves have intrinsic worth.

The robot program doesn’t contain reality, but it doesn’t need to. It still gets to affect reality. If its model of the world is correlated with the world, and it takes actions that it predicts lead to more actual stamps, then it will tend to accumulate stamps.

It’s not trying to steer the future towards places where it happens to have selected the most micro-stampy actions; it’s just steering the future towards worlds where it predicts it will actually have more stamps.


Now, let me tell you my second story:

Once upon a time, a group of naïve philosophers encountered a group of human beings. The humans seemed to keep selecting the actions that gave them pleasure. Sometimes they ate good food, sometimes they had sex, sometimes they made money to spend on pleasurable things later, but always (for the first few weeks) they took actions that led to pleasure.

But then one day, one of the humans gave lots of money to a charity.

“How can this be?” the philosophers asked, “Humans are pleasure-maximizers!” They thought for a few minutes, and then said, “Ah, it must be that their pleasure from giving the money to charity outweighed the pleasure they would have gotten from spending the money.”

Then a mother jumped in front of a car to save her child.

The naïve philosophers were stunned, until suddenly one of their number said, “I get it! The immediate micro-pleasure of choosing that action must have outweighed —”


People will tell you that humans always and only ever do what brings them pleasure. People will tell you that there is no such thing as altruism, that people only ever do what they want to.

People will tell you that, because we’re trapped inside our heads, we only ever get to care about things inside our heads, such as our own wants and desires.

But I have a message for you: You can, in fact, care about the outer world.

And you can steer it, too. If you want to.