Clarifying Consequentialists in the Solomonoff Prior

I have spent a long time being confused about Paul’s post on consequentialists in the Solomonoff prior. I now think I understand the problem clearly enough to engage with it properly.

I think the reason I was confused is to a large degree a problem of framing. In the course of the discussions it took to deconfuse myself, it seemed to me that similar confusions are shared by other people. In this post, I will attempt to explain the framing that helped clarify the problem for me.

i. A brief sketch of the Solomonoff prior

The Solomonoff, or Universal, prior is a probability distribution over strings of a certain alphabet (usually over all strings of 1s and 0s). It is defined by taking the set of all Turing machines (TMs) which output strings, assigning to each a weight proportional to $2^{-L}$ (where $L$ is its description length), and then assigning to each string a probability equal to the sum of the weights of the TMs that compute it. The description length is closely related to the amount of information required to specify the machine; I will use description length and amount of information for specification interchangeably.
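Written out schematically (using $L(T)$ for the description length of machine $T$), the weight a string $s$ ends up with is:

$$P(s) \;\propto\; \sum_{T \,:\, T \text{ outputs } s} 2^{-L(T)}.$$

This is just a sketch of the definition above, not the exact formalism.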

(The actual formalism is in fact a bit more technically involved. I think this picture is detailed enough, in the sense that my explanation will map onto the real formalism about as well.)

The above defines the Solomonoff prior. To perform Solomonoff induction, one can also define conditional distributions by considering only those TMs that generate strings beginning with a certain prefix. In this post, we’re not interested in that process, but only in the prior.

ii. The Malign Prior Argument

In the post, Paul claims that the prior is dominated by consequentialists. I don’t think it is quite dominated by them, but I think the effect in question is plausibly real.

I’ll call the key claim involved the Malign Prior Argument. On my preferred framing, it goes something like this:

Premiss: For some strings, it is easier to specify a Turing machine that simulates a reasoner which decides to predict that string than it is to specify the intended generator for that string.

Conclusion: Therefore, those strings’ Solomonoff prior probability will be dominated by the weight assigned to the TM containing the reasoner.
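To make the step from premiss to conclusion concrete, here is a back-of-the-envelope version in symbols; $L_{\text{reasoner}}$ and $L_{\text{intended}}$ are labels I am introducing for illustration, not notation from Paul’s post:

$$L_{\text{reasoner}}(S) \;<\; L_{\text{intended}}(S) \quad\Longrightarrow\quad 2^{-L_{\text{reasoner}}(S)} \;>\; 2^{-L_{\text{intended}}(S)}.$$

Since weights fall off exponentially in description length, even a few bits of savings means the reasoner’s TM supplies most of the combined weight on $S$.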

It’s best to explain the idea of an ‘intended generator’ with examples. In the case of a camera signal as the string, the intended generator is something like a TM that simulates the universe, plus a specification of the point in the simulation where the camera input should be sampled. Approximations to this, like a low-fidelity simulation, can also be considered intended generators.

There isn’t anything special about the intended generator’s relationship to the string—it’s just one way in which that string can be generated. It seems most natural to us as humans, and the Occamian nature of SI feels like it should be biased towards such generators, but nothing in principle stops something less ‘natural’ from being in fact a shorter description.

This idea of ‘naturalness’ is important in understanding what the Malign Prior Argument is about; I will use it roughly to refer to something like ‘the set of Turing machines that don’t involve reasoners that attempt to influence the prior’, or ‘the set of intended generators’. It’s vague, but I think it gets the point across.

I read most of Paul’s post as an existence argument for the premiss, using consequentialists in other worlds as the reasoners. I don’t think all such reasoners are like the ones Paul describes; I also doubt that all or even most strings are subject to this effect, but find it very plausible that some are.

I think the argument is not, at its core, about these reasoners making the strings they output more likely than the ‘true string’. That these reasoners have any effect at all on the prior is concerning enough, and that is the fact this argument establishes.

As a side note, it’s also worth noting that this is not about these reasoners breaking out of the box and taking over our world, although that is a related concern one might have.

iii. The support for the premiss

Consider a string S’ with very high natural K-complexity (description length of the intended generator) that shares a prefix with a string S that is of high interest to human-like civilisations.

I claim that the prior probability of this string is higher than it ‘naturally’ ‘should’ be, in the sense that a large part of the weight that composes this probability is coming from a TM that simulates a reasoner that is attempting to influence the prior.

The reasons this happens are:

  1. A reasoner in a TM can have an arbitrarily large amount of compute time to decide what strings to output.

  2. Specifying reasoners is cheap relative to specifying the string S’.

  3. There exists a reasoner whose goals are best served by influencing the prior to make S’ more likely.

1 is a crucial property of the Solomonoff prior that allows this to happen. A TM in the Solomonoff prior can think for a very, very long time — enough to e.g. simulate Ackermann(Ackermann(10)) initial world states, each for Ackermann(Ackermann(10)) timesteps. It can perform something close to an exhaustive search of all possible civilisations and decide to attempt to influence the one that is most susceptible to being influenced, if that’s what it wants to do. This is a ridiculous computation, but we’re talking about a mathematical object, not an actual process that we run. It’s plausible that if the prior were also weighted by speed of computation, these effects would be far less pronounced (and maybe would not arise at all).
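To illustrate that last point, one form a speed-weighted variant might take (this is purely an illustrative assumption on my part, not a definition used in Paul’s post) is to penalise each machine by its running time $t(T)$ as well as its length:

$$w(T) \;\propto\; 2^{-L(T)} \cdot 2^{-\log_2 t(T)} \;=\; \frac{2^{-L(T)}}{t(T)}.$$

Under a weighting like this, a machine that needs Ackermann-scale amounts of computation to find its target string receives a correspondingly tiny weight, which is why the effect might be much weaker or absent there.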

To see that 2 and 3 are plausible, we need to think about S’, which by assumption is a string with high natural K-complexity. This high complexity ‘buys’ us the space to specify a reasoner, and the space to specify values, without making the TM more complex than a natural generator of S’. Now, because S is by assumption of interest to civilisations, there likely exists a TM containing a reasoner that performs its exhaustive search, finds S, and concludes that its values are best served by making S’ more likely (e.g. to influence the decision-making of civilisations that are thinking about what S is, given a prefix of it known to them).
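Stated as a rough inequality (my own bookkeeping, with the terms named only for illustration), the point is that the reasoner-based description of S’ can come in under the natural one:

$$L(\text{reasoner}) + L(\text{values}) \;<\; L(\text{natural generator of } S').$$

When this holds, the TM that specifies the reasoner and its values, and then lets it compute which string to output, is a shorter description of S’ than the intended generator, so it supplies most of S’’s prior weight.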

In a way, this agent uses its simplicity to give more simplicity to some other string. That is how the prior gets hijacked.

Note that this reasoner will need to have goals that are simpler than the natural generator of S’ in order to actually contribute to S’ being more likely—otherwise, specifying its TM would be more expensive than specifying the natural generator of S’.

The above is non-constructive (in the mathematical sense), but nevertheless the existence of strings S’ that are affected thus seems plausible. The spaces of possible TMs and of the strings we (or other users of the Solomonoff prior) could be interested in are simply too vast for there not to be such TMs. Whether there are very many of these, or whether they are so much more complicated than the string S as to make this effect irrelevant to our interests, are different questions.

iv. Alien consequentialists

In my view, Paul’s approach in his post is a more constructive strategy for establishing 2 and 3 in the argument above. If correct, it suggests a stronger result—not only does it cause the probability of S’ to be dominated by the TM containing the reasoner, it makes the probability of S’ roughly comparable to that of S, for a wide class of choices of S.

In particular, the choice of S that is susceptible to this is something like the camera example I used, where the natural generator of S is a specification of our world together with a location where we take samples from. The alien civilisation is a way to construct a Turing machine that outputs S’ and whose complexity is comparable to that of the natural generator of S.

To do that, we specify a universe, then run it for however long we want, until, somewhere within it, we get smart agents that decide to influence the prior. Since 1 is true, these agents have an arbitrary amount of time to decide what they output. If S is important, there will probably be a civilisation somewhere in some simulated world which will decide to attempt to influence decisions based on S, and output an appropriate S’. We then specify the output channel to be whatever they decide to use as the output channel.

This requires a relatively modest amount of information—enough to specify the universe, and the location of the output. This is on the same order as the natural generator for S itself, if it is like a camera signal.
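Put as a rough comparison (again, the labels are mine and purely illustrative), the description length of this construction is roughly

$$L(\text{universe}) + L(\text{output channel}) \;\approx\; L(\text{universe}) + L(\text{camera location}),$$

where the right-hand side is the description length of the intended generator for a camera-signal string S. Both sides are dominated by the cost of specifying the universe, which is why the two probabilities come out roughly comparable.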

Trying to specify our reasoner within this space (reasoners that naturally develop in simulations) does place restrictions on what kind of reasoner we end up with. For instance, there are now some implicit runtime bounds on many of our reasoners, because they likely care about things other than the prior. Nevertheless, the space of our reasoners remains vast, including unaligned superintelligences and other odd minds.

v. Conclusion. Do these arguments actually work?

I am mostly convinced that there is at least some weirdness in the Solomonoff prior.

A part of me wants to add ‘especially around strings whose prefixes are used to make pivotal decisions’; I’m not sure that is right, because I think scarcely anyone would actually use this prior in its true form—except, perhaps, an AI reasoning about it abstractly and naïvely enough not to be concerned about this effect despite having to explicitly consider it.

In fact, a lot of my doubt about the malign Solomonoff prior is concentrated around this concern: if the reasoners don’t believe that anyone will act based on the true prior, it seems unclear why they should spend a lot of resources on messing with it. I suppose the space is large enough for at least some to get confused into doing something like this by mistake.

I think that even if my doubts are correct, there will still be weirdness associated with the agents that are specified directly, along the lines of section iii, if not with those that appear in simulated universes, as described in iv.