Hedonic asymmetries


Creating really good outcomes for humanity seems hard. We get bored. If we don’t get bored, we still don’t like the idea of joy without variety. And joyful experiences only seem good if they are real and meaningful (in some sense we can’t easily pin down). And so on.

On the flip side, creating really bad outcomes seems much easier, running into none of the symmetric “problems.” So what gives?

I’ll argue that nature is basically out to get us, and it’s not a coincidence that making things good is so much harder than making them bad.

First: some other explanations

Two common answers (e.g. see here and comments):

  • The worst things that can quickly happen to an animal in nature are much worse than the best things that can quickly happen.

  • It’s easy to kill or maim an animal, but hard to make things go well, so “random” experiences are more likely to be bad than good.

I think both of these are real, but that the consideration in this post is at least as important.

Main argument: reward errors are asymmetric

Suppose that I’m building an RL agent that I want to achieve some goal in the world. I can imagine two different kinds of errors in its rewards:

  • Pessimism: the rewards are too low. Maybe the agent gets a really low reward even though nothing bad happened.

  • Optimism: the rewards are too high. Maybe the agent gets a really high reward even though nothing good happened, or gets no penalty even though something bad happened.

Pessimistic errors are no big deal. The agent will randomly avoid behaviors that get penalized, but as long as those behaviors are reasonably rare (and aren’t the only way to get a good outcome) then that’s not too costly.

But optimistic errors are catastrophic. The agent will systematically seek out the behaviors that receive the high reward, and will use loopholes to avoid penalties when something actually bad happens. So even if these errors are extremely rare initially, they can totally mess up my agent.
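This asymmetry is easy to demonstrate in a toy experiment. Here is a sketch (my illustration, not from the post): an epsilon-greedy bandit agent learns from the rewards it observes, but one mediocre arm occasionally reports a spurious very low or very high reward. The arm values, error rate, and hyperparameters are all invented for the demonstration.

```python
import random

def run_agent(reward_fn, n_arms=10, steps=3000, eps=0.1, seed=0):
    """Epsilon-greedy bandit agent. Returns the mean TRUE reward it earns,
    even though it only ever observes the (possibly erroneous) reward_fn."""
    rng = random.Random(seed)
    true_value = [a / n_arms for a in range(n_arms)]  # arm a is worth a/10
    est = [0.0] * n_arms   # sample-average estimates of observed reward
    pulls = [0] * n_arms
    earned = 0.0
    for _ in range(steps):
        if rng.random() < eps:
            a = rng.randrange(n_arms)                       # explore
        else:
            a = max(range(n_arms), key=est.__getitem__)     # exploit
        r = reward_fn(a, true_value[a], rng)  # what the agent sees
        pulls[a] += 1
        est[a] += (r - est[a]) / pulls[a]
        earned += true_value[a]               # what actually happened
    return earned / steps

# Hypothetical error models: arm 3 (true value 0.3) misreports 20% of the time.
def exact(a, v, rng):       return v
def pessimistic(a, v, rng): return -20.0 if a == 3 and rng.random() < 0.2 else v
def optimistic(a, v, rng):  return +20.0 if a == 3 and rng.random() < 0.2 else v
```

With pessimistic errors the agent simply learns to avoid arm 3, which it shouldn’t be pulling anyway, so it earns nearly as much true reward as with exact rewards. With optimistic errors it locks onto arm 3 (whose observed average looks better than the genuinely best arm) and forfeits most of the achievable value.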

When we try to create suffering by going off distribution, evolution doesn’t really care: those are just pessimistic errors, and it never built the machinery to be robust against them.

But when we try to create incredibly good stable outcomes, we are fighting an adversarial game against evolution. Every animal that has ever lived has been playing that game with all the tricks it could learn, and evolution has patched every hole they found.

To win this game, evolution implements general strategies like boredom, or an aversion to meaningless pleasures. Each of these measures makes it harder for us to inadvertently find a loophole that gets us high reward.


Overall I think this is a relatively optimistic view: some of our asymmetrical intuitions about pleasure and pain may be miscalibrated for a world where we are able to outsmart evolution. I think evolution’s tricks just mean that creating good worlds is difficult rather than impossible, and that we will be able to create an incredibly good world as we become wiser.

It’s possible that evolution solved the overoptimism problem in a way that is actually universal, such that it is in fact impossible to create outcomes as good as the worst outcomes are bad. But I think that’s unlikely. Evolution’s solution only needed to be good enough to stop our ancestors from finding loopholes, and we are a much more challenging adversary.