Hedonium’s semantic problem

If this argument is a re-tread of something already existing in the philosophical literature, please let me know.

I don’t like Searle’s Chinese Room Argument. Not really because it’s wrong, but mainly because it takes an interesting and valid philosophical insight/intuition and then twists it in the wrong direction.

The valid insight I see is:

One cannot get a semantic process (i.e. one with meaning and understanding) purely from a syntactic process (one involving purely syntactic/algorithmic operations).

I’ll illustrate both the insight and the problem with Searle’s formulation via an example, and then look at what this means for hedonium and mind crimes.

Napoleonic exemplar

Consider the following four processes:

  1. Napoleon, at Waterloo, thinking and directing his troops.

  2. A robot, having taken the place of Napoleon at Waterloo, thinking in the same way and directing his troops in the same way.

  3. A virtual Napoleon in a simulation of Waterloo, similarly thinking and directing his virtual troops.

  4. A random Boltzmann brain springing into existence from the thermal radiation of a black hole. This Boltzmann brain is long-lasting (24 hours), and, by sheer coincidence, happens to mimic exactly the thought processes of Napoleon at Waterloo.

All four mental processes have the same syntactic properties. Searle would draw the semantic line between the first and the second process: the organic mind is somehow special. I would draw the semantic line between the third and the fourth process. The difference is that in all three of the first processes, the symbols in the brain correspond to objects in reality (or virtual reality). They can make reasonably accurate predictions about what might happen if they do something, and get feedback validating or disconfirming those predictions. Semantic understanding emerges from a correspondence with reality.

In contrast, the fourth process is literally insane. Its mental processes correspond to nothing in reality (or at least, nothing in its reality). It emerges by coincidence, its predictions are wrong or meaningless, and it will almost certainly be immediately destroyed by processes it has completely failed to model. The symbols exist only within its own head.

There are some interesting edge cases to consider here (I chose Napoleon because there are famously many people deluded into thinking they are Napoleon), but that’s enough background. Essentially, the symbol grounding problem is solved (maybe by evolution, maybe by deliberate design) simply by having the symbols and the mental model be close enough to reality. The symbols in Boltzmann-Napoleon’s brain could be anything, as far as we know—we just identify it with Napoleon because it’s coincidentally similar. If Napoleon had never existed, we might have no clue as to what Boltzmann-Napoleon was “thinking”.

Hedonium: syntax?

The idea behind hedonium is to take something corresponding to the happiest possible state, and copy it with maximal efficiency across the universe. This can involve defining hedons—the fundamental units of happiness—and maximising them while minimising dolors (the fundamental units of pain/suffering/anti-happiness). Supposedly this would result in the cosmos being filled to the brim with the greatest possible amount of happiness and joy. This could maybe be pictured as taking the supreme moment of ecstatic, orgasmic happiness of the most joyful person ever to live, and filling the cosmos with that.

Let’s start with the most naive of possible hedonium ideas: a simple algorithm with a happiness counter “My_happiness” which is either continually increasing or set at some (possibly infinite or transfinite) maximum, while the algorithm continually repeats to itself “I have ultimate happiness!”.

A very naive idea, and one with an immediate and obvious flaw: what happens if English were to change so that “happiness” and “suffering” exchanged meanings? Then we would have transformed the maximally happy universe into a maximally painful one. All at the stroke of a linguistic pen.

The problem is that this naive hedonium ended up being a purely syntactic construction. Referring to nothing in the outside universe, its definition of “happiness” was entirely dependent on the linguistic label “happiness”.
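
To make the relabelling point concrete, here is a minimal sketch of that naive algorithm (the function and variable names are my own, purely illustrative): rename the counter and the output string, and the computation is untouched; only our interpretation of it changes.

```python
# A minimal sketch of the naive hedonium algorithm described above.
# The names "naive_hedonium", "my_happiness", etc. are illustrative labels.

def naive_hedonium(steps=5):
    my_happiness = 0
    for _ in range(steps):
        my_happiness += 1          # the "happiness" counter continually increases
        print("I have ultimate happiness!")
    return my_happiness

# The relabelling flaw: a syntactically identical process, read very differently.
def naive_dolorum(steps=5):
    my_suffering = 0
    for _ in range(steps):
        my_suffering += 1
        print("I have ultimate suffering!")
    return my_suffering
```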

It seems that the more grounded and semantic the symbols are, the harder it is to get an isomorphism that transforms them into something else.

Hedonium: semantics

So how can we ensure that we have something that is inarguably hedonium, not just the algorithmic equivalent of drawing a happy face? I’d say there are three main ways we can check that the symbols are grounded and the happiness is genuine:

  • Predictive ability

  • Simplest isomorphism to reality

  • Correct features

If the symbols are well grounded in reality, then the agent should have a decent predictive ability. Note that the bar is very low here. Someone who realises that battles are things that are fought by humans, that involve death, and that are won or lost or drawn, is already very much ahead of someone who thinks that battles are things that invite you home for tea and biscuits. So a decent prediction is “someone will die in this battle”; a bad one is “this battle will wear a white frilly dress”.

Of course, that prediction relies on the meaning of “die” and “white frilly dress”. We can get round this problem by looking at predictive ability in general (does the agent win some bets/achieve a goal it seems to have?). Or we can look at the entire structure of the agent’s symbolic setup, and the relationships between the symbols. This is what the CYC project tried to do, by memorising databases of sentences like “Bill Clinton belongs to the collection of U.S. presidents” and “All trees are plants”. The aim was to achieve an AI, and it failed. However, if the database is large and complicated enough, it might be that there is only one sensible way of grounding the symbols in reality. “Sensible” can here be defined using a complexity prior.

But be warned! The sentences are very much intuition pumps. “Bill Clinton belongs to the collection of U.S. presidents” irresistibly makes us think of the real Bill Clinton. We need to be able to take sentences like “Solar radiation waxes to the bowl of ambidextrous anger”, and deduce, after much analysis of the sentences’ structures, that “Solar radiation → Bill Clinton”, “waxes → belongs”, and so on.
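
Here is a toy illustration of that kind of structural grounding (the facts and labels are invented for this sketch, and it is not how CYC actually worked): ignoring the labels entirely and matching only the relational structure is enough, in this tiny case, to recover the unique mapping from the scrambled symbols onto the meaningful ones.

```python
# Toy grounding-by-structure: invented facts, brute-force matching.
from itertools import permutations

# Reference facts about the world, with meaningful labels.
world = {("clinton", "member_of", "us_presidents"),
         ("oak", "subset_of", "trees"),
         ("trees", "subset_of", "plants")}

# The same relational structure, but with scrambled labels.
scrambled = {("solar_radiation", "waxes_to", "ambidextrous_anger"),
             ("bowl", "flows_into", "rivers"),
             ("rivers", "flows_into", "mountains")}

def entities(facts):
    return sorted({x for (a, _, b) in facts for x in (a, b)})

def edges(facts):
    # Structure only: which entity-pairs are related, ignoring relation names.
    return {(a, b) for (a, _, b) in facts}

def structure_preserving_maps(src, dst):
    """Yield mappings from src entities to dst entities that preserve
    the unlabelled relational structure."""
    for perm in permutations(entities(dst), len(entities(src))):
        m = dict(zip(entities(src), perm))
        if all((m[a], m[b]) in edges(dst) for (a, b) in edges(src)):
            yield m

for mapping in structure_preserving_maps(scrambled, world):
    print(mapping)   # e.g. 'solar_radiation' -> 'clinton', 'rivers' -> 'trees', ...
```

In a realistically large database many mappings would fit, which is where something like the complexity prior would have to pick out the “sensible” one.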

Notice there is a connection with the symbolic approach of GOFAI (“Good Old-Fashioned AI”). Basically, GOFAI failed because the symbols did not encode true understanding. The more hedonium resembles GOFAI, the more likely it is to be devoid of actual happiness (equivalently, the more likely it is to be isomorphic to some other, non-happiness situation).

Finally, we can assess some of the symbols (the more abstract ones) by looking at their features (it helps if we have grounded many of the other symbols). For instance, we think one concept might be “nostalgia for the memory of childhood”. This is something that we expect to be triggered when childhood is brought up, or when the agent sees a house that resembles its childhood home, and it is likely to result in certain topics of conversation, and maybe some predictable priming on certain tests.

Of course, it is trivially easy to set up an algorithm with a “nostalgia for the memory of childhood” node, a “childhood conversation” node, and so on, with the right relations between them. So, as in this generalised Turing test, it’s more indicative if the “correct features” are things the programmers did not explicitly design in.
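
As a quick illustration of how little such hand-wired features prove, here is a minimal sketch (all node names and triggers are invented): the “right” relations take only a few lines to set up, which is exactly why they are weak evidence of grounding on their own.

```python
# Hand-wiring the "correct features": trivially easy, and therefore uninformative.

class Node:
    def __init__(self, name):
        self.name = name
        self.triggers = []          # nodes this one activates

    def activate(self):
        print(f"{self.name} activated")
        for node in self.triggers:
            node.activate()

childhood_mention = Node("childhood mention")
nostalgia = Node("nostalgia for the memory of childhood")
childhood_conversation = Node("childhood conversation")

childhood_mention.triggers.append(nostalgia)
nostalgia.triggers.append(childhood_conversation)

# Mentioning childhood "causes" nostalgia, which "causes" the right conversation:
# exactly the designed-in relations, so very weak evidence of genuine grounding.
childhood_mention.activate()
```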

Hedonium: examples and counterexamples

So, what should we expect from a properly grounded hedonium algorithm? There are many reasons to expect that it will be larger than we might intuitively have thought. Reading the word “anger” or seeing a picture of an angry person both communicate “anger” to us, but a full description of “anger” is much larger and more complex than can be communicated by the word or the picture. Both suggest anger simply by reminding us of our own complex intuitive understanding of the term, rather than by grounding it.

Let’s start by assuming that, for example, the hedonium experience involves someone “building on the memory of their previously happiest experience”. Let’s ground that particular memory. First of all, we have to ground the concept of (human) “memory”. This will require a lot of algorithmic infrastructure. Remember, we have to structure the algorithm so that even if we label “memory” as “spatula”, an outside analyst is forced to conclude that “spatula” can only mean memory. This will, at the minimum, involve the process of laying down many examples of memories, of retrieving them and making use of them.

This is something that the algorithm itself must do. If the algorithm doesn’t do that each time the hedonium algorithm is run, then the whole concept of memory is simply a token in the algorithm saying “memory is defined in location X”, which is trivially easy to change to something completely different. Remember, the reason the algorithm needs to ground these concepts itself is to prevent it from being isomorphic to something else, something very bad. Nor can we get away with a simplistic overview of a few key memories being laid down—we’d be falling back into the GOFAI trap of expecting a few key relationships to establish the whole concept. It seems that for an algorithm to talk about memory in a way that makes sense, we require the algorithm to demonstrate a whole lot of things about the concept.

It won’t be enough, either, to have a “memory submodule” that the main algorithm doesn’t run. That’s exactly the same as having an algorithm with a token saying “memory is defined over there”; if you change the content of “over there”, you change the algorithm’s semantics without changing its syntax.
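
A minimal sketch of the contrast, with both caricatures invented for illustration: a pointer-style “memory” the algorithm itself never touches, versus an agent that lays down, retrieves and uses its memories on every run.

```python
# (a) A pointer-style token: the "memory" lives somewhere the algorithm never
# touches, so swapping its contents changes the semantics but not the syntax.
MEMORY_LOCATION = {"happiest_memory": "defined elsewhere"}

def token_memory_agent():
    return f"Building on {MEMORY_LOCATION['happiest_memory']}"

# (b) An agent that lays down, retrieves and uses memories itself on each run:
# a caricature of the (much heavier) process the grounding argument asks for.
def active_memory_agent(experiences):
    memory_store = []
    for event, valence in experiences:
        memory_store.append((event, valence))          # laying down memories
    happiest = max(memory_store, key=lambda m: m[1])   # retrieving one
    return f"Building on the memory of {happiest[0]}"  # making use of it

print(token_memory_agent())
print(active_memory_agent([("a quiet afternoon", 3), ("the victory parade", 9)]))
```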

Then, once we have the concept of memory down, we have to establish the contents and emotions of that particular memory, both things that will require the algorithm to actively perform a lot of tasks.

Let’s look at a second example. Assume now that the algorithm thinks “I expect happiness to increase” or something similar. I’ll spare you the “I” for the moment, and just focus on “expect”. “Expectation” is something specific, probably best defined by the “correct features” approach. It says something about future observations. It allows for the possibility of being surprised. It allows for the possibility of being updated. All these must be demonstrable features of the “expect” module, to ground it properly. So the algorithm must demonstrate a whole range of changing expectations, to be sure that “expects” is more than just a label.

Also, “expectation” is certainly not something that will be wrong every single time. It’s likely not something that will be right every single time. This poses great problems for running the hedonium algorithm identically multiple times: the expectations are either always wrong or always right. The meaning of “expectation” has been lost, because it no longer has the features that it should.
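
A minimal sketch of an “expectation” with those features (the agent, its update rule and the coin-flip environment are all invented here): it predicts observations, can be surprised, and updates its belief; rerun it identically against the same recorded outcomes and the pattern of surprises is frozen, which is the point being made above.

```python
import random

# A minimal "expectation": it predicts an observation, can be surprised,
# and updates. Everything here is illustrative.
def run_agent(outcomes):
    belief = 0.5                                   # estimated chance of heads
    surprises = 0
    for outcome in outcomes:
        prediction = belief > 0.5                  # what the agent expects
        surprises += (prediction != outcome)       # possibility of surprise
        belief += 0.1 * (outcome - belief)         # possibility of updating
    return surprises

# A live agent facing genuinely uncertain observations is sometimes surprised.
live_outcomes = [random.random() < 0.7 for _ in range(10)]
print(run_agent(live_outcomes), "surprises in a live run")

# Rerun the algorithm identically against the same recorded outcomes and the
# expectations are right and wrong in exactly the same places every time; the
# feature that made "expectation" meaningful is gone.
print(run_agent(live_outcomes), "surprises in the identical rerun (always the same)")
```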

There are similar problems with running the same algorithm in multiple locations (or all across the universe, in the extreme case). The first problem is that this might be seen as isomorphic to simply running the algorithm once, recording it, and running the recording everywhere else. Even if this is different, we might have the problem that an isomorphism making the hedonium into dolorum might be very large compared with the size of the hedonium algorithm—but tiny compared with the size of the multiple copies of the algorithm running everywhere.
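
One way to phrase that last worry, using a description-length notation that is not in the original (read |X| loosely, in the spirit of the complexity prior mentioned earlier):

```latex
% H = the hedonium algorithm, D = dolorum, i = an isomorphism with i(H) = D,
% N = the number of identical copies being run, |X| = description length of X.
|i| \gg |H| \qquad \text{and yet} \qquad |i| \ll N\,|H| \quad \text{for large } N
```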

But those are minor quibbles: the main problem is whether the sense of identity of the agent can be grounded sufficiently well, while remaining accurate if the agent is run trillions upon trillions of times. Are these genuine life experiences? What if the agent learns something new during that period? This seems to stretch the meaning of “learning something new”, possibly breaking it.

Other issues crop up—suppose a lot of my identity is tied up with the idea that I could explore the space around me. In a hedonium world, this would be impossible, as the space (physical and virtual) is taken up by other copies being run in limited virtual environments. Remember, it’s not enough to say “the agent could explore space”; if there is no possibility for the agent to do so, “could explore” can be syntactically replaced with “couldn’t explore” without affecting the algorithm, just its meaning.

These are just the first issues that come to mind; if you replace actual living and changing agents with hedoniumic copies of themselves, you have to make those copies have sufficiently rich interactions that all the important features of living and changing agents are preserved and grounded uniquely.

Beyond Hedonium

Well, where does that leave us? Instead of my initial caricature of hedonium, what if we had instead a vast number of more complex algorithms, possibly stochastic and varying, with more choices, more interactions, more exploration, etc… all that is needed to ground them as agents with emotions? What if we took those, and then made them as happy as possible? Would I still argue against that hedonium?

Probably not. But I’m not sure “hedonium” is the best description of that setup. They seem to be agents that have various features, one of which happens to be extremely high happiness, rather than pure happiness algorithms. And that might be a better way of conceiving of them.

Addendum: mind crimes

Nick Bostrom and others have brought up the possibility of AI “mind crimes”, where the AI, simply by virtue of simulating humans in potentially bad situations, causes these humans to exist and, possibly, suffer (and then most likely die as the simulation ends).

This situation seems to be exactly the converse of the above. For hedonium, we want a rich enough interaction to ground all the symbols and leave no ambiguity as to what is going on. To avoid mind crimes, we want the opposite. We’d be fine if the AI’s prediction modules returned something like this, as text:

Stuart was suffering intensely, as he recalled agonising memories and tried to repair his mangled arms.

Then, as long as we get to safely redefine the syntactic tokens “suffering”, “agonising”, and so on, we should be fine. Note that the AI itself must have a good grounding of “suffering” and so on, so that it knows what to avoid. But as long as the prediction module (the part that runs repeatedly) has a simple syntactic definition, there should be no mind crimes.
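
A minimal sketch of that division of labour (the function, the actions and the tokens are all invented): the repeatedly-run prediction module returns only thin syntactic text, while deciding what to avoid relies on the AI’s own, separately grounded reading of which tokens mark bad outcomes.

```python
# Illustrative only: a thin, purely syntactic prediction module.

def prediction_module(action):
    """The part that runs repeatedly: it returns a textual summary that is
    rich enough for planning but far too thin to ground a suffering person."""
    if action == "withhold treatment":
        return "Stuart was suffering intensely, trying to repair his mangled arms."
    return "Stuart was fine."

def choose_actions(actions):
    # The AI still needs its own (grounded, run-once) understanding of which
    # syntactic tokens mark outcomes to avoid.
    bad_tokens = ("suffering", "agonising", "mangled")
    return [a for a in actions
            if not any(t in prediction_module(a) for t in bad_tokens)]

print(choose_actions(["withhold treatment", "provide treatment"]))
```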