Clarifying Consequentialists in the Solomonoff Prior

I have spent a long time being confused about Paul’s post on consequentialists in the Solomonoff prior. I now think I understand the problem clearly enough to engage with it properly.

I think the reason I was confused is to a large degree a problem of framing. In the course of discussions I had to deconfuse myself, it seemed that similar confusions are shared by other people. In this post, I will attempt to explain the framing that helped clarify the problem for me.

i. A brief sketch of the Solomonoff prior

The Solomonoff, or Universal, prior is a probability distribution over strings of a certain alphabet (usually over all strings of 1s and 0s). It is defined by taking the set of all Turing machines (TMs) which output strings, assigning to each a weight proportional to

2^{-L}

(where L is its description length), and then assigning to each string a probability equal to the sum of the weights of the TMs that compute it. The description length is closely related to the amount of information required to specify the machine; I will use description length and amount of information for specification interchangeably.
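Written out informally (and with the convention, an assumption of this sketch, that each machine runs on an empty input), the prior described above is something like:

```latex
% Informal statement of the prior sketched above; \epsilon denotes the empty input.
\[
  P(x) \;\propto\; \sum_{T \,:\, T(\epsilon) = x} 2^{-L(T)}
\]
% where the sum ranges over the Turing machines that output the string x,
% and L(T) is the description length of T.
```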

(The actual formalism is in fact a bit more technically involved. I think this picture is detailed enough, in the sense that my explanation will map onto the real formalism about as well.)

The above defines the Solomonoff prior. To perform Solomonoff induction, one can also define conditional distributions by considering only those TMs that generate strings beginning with a certain prefix. In this post, we’re not interested in that process, but only in the prior.
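As a crude, concrete illustration of the weighting and of prefix-conditioning, here is a toy Python sketch over a small, entirely made-up set of ‘machines’ (each just a description length plus the string it outputs). The real prior ranges over all Turing machines and is uncomputable; this only shows the shape of the bookkeeping.

```python
from collections import defaultdict

# Toy, hand-made illustration of the weighting scheme. This is NOT the real
# prior: the real one sums over all Turing machines and is uncomputable.
# Each "machine" below is just a made-up description length (in bits) and
# the string it outputs.
toy_machines = [
    (3, "0000"),
    (5, "0101"),
    (5, "0000"),
    (8, "0110"),
]

def toy_prior(machines):
    """Weight each output string by 2^(-L), summed over machines producing it."""
    weights = defaultdict(float)
    for length, output in machines:
        weights[output] += 2.0 ** (-length)
    total = sum(weights.values())
    return {s: w / total for s, w in weights.items()}

def toy_conditional(machines, prefix):
    """Crude analogue of conditioning: keep only the machines whose output
    starts with the given prefix, then renormalise."""
    return toy_prior([(l, s) for (l, s) in machines if s.startswith(prefix)])

print(toy_prior(toy_machines))              # toy "prior" over whole strings
print(toy_conditional(toy_machines, "01"))  # conditioned on the prefix "01"
```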

ii. The Malign Prior Argument

In the post, Paul claims that the prior is dominated by consequentialists. I don’t think it is quite dominated by them, but I think the effect in question is plausibly real.

I’ll call the key claim involved the Malign Prior Argument. On my preferred framing, it goes something like this:

Premiss: For some strings, it is easier to specify a Turing Machine that simulates a reasoner which decides to predict that string, than it is to specify the intended generator for that string.

Conclusion: Therefore, those strings’ Solomonoff prior probability will be dominated by the weight assigned to the TM containing the reasoner.
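The step from premiss to conclusion is just the exponential weighting: whichever description is shorter gets exponentially more weight. With invented numbers (purely illustrative, not taken from Paul’s post):

```latex
% Invented description lengths, purely to illustrate the weighting.
% Intended generator of S': L_nat = 1,000,000 bits.
% TM simulating a reasoner that decides to predict S': L_r = 100,000 bits.
\[
  \frac{2^{-L_r}}{2^{-L_{\mathrm{nat}}}} = 2^{\,L_{\mathrm{nat}} - L_r} = 2^{\,900{,}000}
\]
% so essentially all of the string's prior mass would come from the reasoner's TM.
```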

It’s best to explain the idea of an ‘intended generator’ with examples. In the case of a camera signal as the string, the intended generator is something like a TM that simulates the universe, plus a specification of the point in the simulation where the camera input should be sampled. Approximations to this, like a low-fidelity simulation, can also be considered intended generators.

There isn’t anything special about the intended generator’s relationship to the string: it’s just one way in which that string can be generated. It seems most natural to us as humans, and the Occamian nature of Solomonoff induction feels like it should be biased towards such generators, but nothing in principle stops something less ‘natural’ from being in fact a shorter description.

This idea of ‘naturalness’ is important in understanding what the Malign Prior Argument is about; I will use it roughly to refer to something like ‘the set of Turing Machines that don’t involve reasoners that attempt to influence the prior’, or ‘the set of intended generators’. It’s vague, but I think it gets across the point.

I read most of Paul’s post as an existence argument for the premiss, using consequentialists in other worlds as the reasoners. I don’t think all such reasoners are like Paul describes; I also doubt that all or even most strings are subject to this effect, but find it very plausible that some are.

I think the argument is not, at its core, about these reasoners making the strings they output more likely than the ‘true string’. What the argument establishes is that these reasoners have some effect on the prior at all, and that alone is concerning enough.

As a side note, it’s also worth noting that this is not about these reasoners breaking out of the box and taking over our world, although that is also a related concern one might have.

iii. The support for the premiss

Consider a string S’ with very high natural K-complexity (description length of the intended generator) that shares a prefix with a string S that is of high interest to human-like civilisations.

I claim that the prior probability of this string is higher than it ‘naturally’ ‘should’ be, in the sense that a large part of the weight that composes this probability is coming from a TM that simulates a reasoner that is attempting to influence the prior.

The reasons this happens are:

  1. A reasoner in a TM can have an arbitrarily long amount of compute time to decide what strings to output.

  2. Specifying reasoners is cheap relative to specifying the string S’.

  3. There exists a reasoner whose goals are best served by influencing the prior to make S’ more likely.

1 is a crucial property of the Solomonoff prior that allows this to happen. A TM in the Solomonoff prior can think for a very, very long time: enough to e.g. simulate Ackermann(Ackermann(10)) initial world states, each for Ackermann(Ackermann(10)) timesteps. It can perform something close to an exhaustive search of all possible civilizations and decide to attempt to influence the one that is most susceptible to being influenced, if that’s what it wants to do. This is a ridiculous computation, but we’re talking about a mathematical object, not an actual process that we run. It’s plausible that if the prior were also weighted by speed of computation, these effects would be far less pronounced (and maybe would not arise at all).

To see that 2 and 3 are plausible, we need to think about S’, which by assumption is a string with high natural K-complexity. This high complexity ‘buys’ us the space to specify a reasoner, and the space to specify values, without making the TM more complex than a natural generator of S’. Now, because S is by assumption of interest to civilisations, there likely exists a TM containing a reasoner that performs its exhaustive search, finds S, and concludes that its values are best served by making S’ more likely (e.g. to influence the decision-making of civilisations that are thinking about what S is, given a prefix of it known to them).

In a way, this agent uses its simplicity to give more simplicity to some other string. That is how the prior gets hijacked.

Note that this reasoner will need to have goals that are simpler than the natural generator of S’ in order to actually contribute to S’ being more likely; otherwise, specifying its TM would be more expensive than specifying the natural generator of S’.
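To make the accounting behind 2 and 3 concrete, here is a toy breakdown; all the numbers are invented, and the decomposition into ‘universe + values + output channel’ is itself a simplifying assumption:

```latex
% Invented numbers; the decomposition below is a simplifying assumption.
\[
  L_{\mathrm{natural}}(S') = 10^{6}\ \text{bits}, \qquad
  L_{\mathrm{universe}} + L_{\mathrm{values}} + L_{\mathrm{output}}
    = 2\times 10^{5} + 10^{4} + 10^{4} = 2.2\times 10^{5}\ \text{bits}
\]
% Since 2.2e5 < 1e6, the reasoner's TM is the shorter description and
% contributes 2^{-220,000} >> 2^{-1,000,000} to the weight of S'.
% If the reasoner's values alone cost more than ~10^6 bits to specify,
% the inequality would flip and the natural generator would dominate instead.
```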

The above is non-constructive (in the mathematical sense), but nevertheless the existence of strings S’ that are affected thus seems plausible. The spaces of possible TMs and of the strings we (or other users of the Solomonoff prior) could be interested in are simply too vast for there not to be such TMs. Whether there are very many of these, or whether they are so much more complicated than the string S as to make this effect irrelevant to our interests, are different questions.

iv. Alien consequentialists

In my view, Paul’s approach in his post is a more constructive strategy for establishing 2 and 3 in the argument above. If correct, it suggests a stronger result: not only does it cause the probability of S’ to be dominated by the TM containing the reasoner, it makes the probability of S’ roughly comparable to that of S, for a wide class of choices of S.

In particular, the choice of S that is susceptible to this is something like the camera example I used, where the natural generator of S is a specification of our world together with a location where we take samples from. The alien civilisation is a way to construct a Turing Machine that outputs S’ and has complexity comparable to that of S.

To do that, we specify a universe, then run it for however long we want, until smart agents arise somewhere within it that decide to influence the prior. Since 1 is true, these agents have an arbitrary amount of time to decide what they output. If S is important, there probably will be a civilisation somewhere in some simulated world which will decide to attempt to influence decisions based on S, and output an appropriate S’. We then specify the output channel to be whatever they decide to use as the output channel.

This requires a relatively modest amount of information: enough to specify the universe, and the location of the output. This is on the same order as the description length of the natural generator for S itself, if S is something like a camera signal.
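As a rough way of seeing why the two are of the same order (the decompositions here are assumed, not derived from anything precise):

```latex
% Rough, assumed decompositions of the two description lengths.
\[
  L(\text{alien-civilisation TM}) \approx L(\text{some universe}) + L(\text{their output channel})
\]
\[
  L(\text{intended generator of } S) \approx L(\text{our universe}) + L(\text{camera location})
\]
% Both are "a universe plus a pointer into it", which is why their
% description lengths can plausibly be comparable.
```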

Trying to specify our reasoner within this space (reasoners that naturally develop in simulations) does place restrictions on what kind of reasoner we end up with. For instance, there are now some implicit runtime bounds on many of our reasoners, because they likely care about things other than the prior. Nevertheless, the space of our reasoners remains vast, including unaligned superintelligences and other odd minds.

v. Conclusion. Do these arguments actually work?

I am mostly convinced that there is at least some weirdness in the Solomonoff prior.

A part of me wants to add ‘especially around strings whose prefixes are used to make pivotal decisions’; I’m not sure that is right, because I think scarcely anyone would actually use this prior in its true form, except, perhaps, an AI reasoning about it abstractly and naïvely enough not to be concerned about this effect despite having to explicitly consider it.

In fact, a lot of my doubt about the malign Solomonoff prior is concentrated around this concern: if the reasoners don’t believe that anyone will act based on the true prior, it seems unclear why they should spend a lot of resources on messing with it. I suppose the space is large enough for at least some to get confused into doing something like this by mistake.

I think that even if my doubts are correct, there will still be weirdness associated with the agents that are specified directly, along the lines of section iii, if not those that appear in simulated universes, as described in iv.