What are Universal Inductors, Again?

[At­ten­tion Con­ser­va­tion No­tice: This just re­caps an old re­sult and patches a hole in it, it doesn’t con­tain sub­stan­tially new ideas.]

Univer­sal In­duc­tors can be thought of as log­i­cal in­duc­tors over bit­strings, which are no­table be­cause they can act as a log­i­cal in­duc­tor over any the­ory you want (PA, ZFC, 2nd or­der ar­ith­metic), by fix­ing some effi­ciently com­putable func­tion from bits to sen­tences in the lan­guage of your the­ory, con­di­tion­ing the uni­ver­sal in­duc­tor (which is a prob­a­bil­ity dis­tri­bu­tion over in­finite bit­strings) on the bits which cor­re­spond to proven the­o­rems, and read­ing out the prob­a­bil­ity of other bits/​sen­tences from the re­sult­ing con­di­tional dis­tri­bu­tion. This trick works by The­o­rem 4.7.2 in the log­i­cal in­duc­tion pa­per, “Clo­sure Un­der Con­di­tion­ing”, which shows that con­di­tion­ing the prob­a­bil­ities of a log­i­cal in­duc­tor on a se­quence of non-in­con­sis­tent sen­tences, yields a log­i­cal in­duc­tor over a de­duc­tive pro­cess where all the con­di­tion­als are true.

Also, since they are a prob­a­bil­ity dis­tri­bu­tion over in­finite bit­strings, it makes sense to think about them as hav­ing a prob­a­bil­ity mea­sure over wor­lds (as­sign­ments of state­ments to true or false, for all state­ments) at all finite stages, in­stead of just in the limit.

How­ever, in Scott’s old post about uni­ver­sal in­duc­tors, their con­struc­tion was never de­scribed.

Note that a Univer­sal In­duc­tor cor­re­sponds to a Log­i­cal In­duc­tor, but the as­so­ci­ated Log­i­cal In­duc­tor will not have finite sup­port, and so will be differ­ent from the one con­structed in the pa­per. Never the less, Univer­sal In­duc­tors can be shown to ex­ist us­ing a similar con­struc­tion.

This post will give that con­struc­tion. To be­gin with, a Univer­sal In­duc­tor must fulfill the fol­low­ing two prop­er­ties:

First, it must be a dis­tri­bu­tion over in­finite bit­strings, such that is com­putable for all and all which are finite bit­strings. This gives the prob­a­bil­ity that an in­finite bit­string starts with as a pre­fix.

Se­cond, the the­ory is one where the th atomic state­ment is of the form “the th bit in this in­finite bit­string is a “, and in­duces a func­tion from these state­ments to prob­a­bil­ities, and the se­quence of prob­a­bil­ity as­sign­ments must form a log­i­cal in­duc­tor over the empty de­duc­tive pro­cess. (ie, the in­duc­tor never sees any ev­i­dence of the form “this bit is a “, it just runs for­ever with­out get­ting any feed­back.)

To be­gin with, the su­per­trader con­struc­tion from Sec­tion 5 of the log­i­cal in­duc­tion pa­per is the ex­act same. Traders look at the prices of boolean com­bi­na­tions of atomic state­ments, and use them in con­tin­u­ous func­tions to buy or sell shares in boolean com­bi­na­tions of atomic state­ments.

The in­ter­est­ing part is the con­struc­tion of the space that we’ll be find­ing a fixed-point in. At timestep , there is a most-dis­tant bit that has ever ap­peared in a boolean com­bi­na­tion of bits that a trader has bought/​sold shares in, or looked at the price of. This is bit . If we as­sign prices to all bit­strings of length , then the prob­a­bil­ity of any finite bit­string that is longer can be given by just us­ing the uniform dis­tri­bu­tion on bits af­ter that point. This cor­re­sponds to as­sign­ing 5050 prob­a­bil­ity to all state­ments which are too large to have thought about them yet. There­fore, our space of in­ter­est is the di­men­sional sim­plex of prob­a­bil­ity dis­tri­bu­tions over all bit­strings of length .

Given a point in this space, it is pos­si­ble to read out the prob­a­bil­ity of any par­tic­u­lar boolean com­bi­na­tion of bits by just sum­ming up the prob­a­bil­ity on all -length bit­strings that fit that con­straint, so we can de­ter­mine the prices that the su­per­trader sees.

We can also as­sign all bit­strings but one to a ba­sis vec­tor for the space, by let­ting the ba­sis vec­tor for a bit­string be the vec­tor point­ing to the cor­re­spond­ing ver­tex of the sim­plex from its cen­ter. A pur­chase of one share in a boolean com­bi­na­tion cor­re­sponds to the vec­tor com­prised of the sum of ba­sis vec­tors that cor­re­spond to the bit­strings fulfilling the con­di­tions of the boolean. Note that a pur­chase of the one bit­string we left out can be re-ex­pressed as a pur­chase of 1 dol­lar and sel­l­ing a share of all other bit­strings.

As a sim­ple case of this re-ex­pres­sion, when , (be­cause it has to be a prob­a­bil­ity dis­tri­bu­tion), so the price of the pur­chase is the same. And as for the pur­chase, both pur­chases have the fea­ture that they are worth dol­lar if the world is and dol­lars if the world is some­thing differ­ent.

There­fore, given a point in the sim­plex, the net pur­chase and sel­l­ing of a bunch of differ­ent boolean com­bi­na­tions of bits can be bro­ken down into a whole bunch of vec­tors, which sum up to give a net trade vec­tor , and find­ing the point in the sim­plex that’s clos­est to is the out­put of the con­tin­u­ous func­tion that we can then find the fixed-point of.

Now, given a fixed-point in the in­te­rior of the sim­plex, , so there is no net trade. Given a fixed-point on the bound­ary of the sim­plex, can be in­ter­preted as be­ing com­posed en­tirely of sel­l­ing shares with a price of , which ob­vi­ously has or nega­tive value, al­though this is quite non­triv­ial to show. Once this is proved, it then im­me­di­ately fol­lows that this pro­cess de­scribed above is a Univer­sal In­duc­tor.

From here on, we will sim­ply take care of this last the­o­rem to be shown.

The­o­rem 1:

For all price vec­tors on the bound­ary of the sim­plex that are a fixed-point, the vec­tor rep­re­sent­ing the net trade can be in­ter­preted as be­ing com­posed en­tirely of ’s for all com­po­nents in which has a nonzero en­try, and non­pos­i­tive for all com­po­nents in which has a en­try.

First, some no­ta­tion will be laid out.

, and it is the di­men­sion of the space the sim­plex is in. de­notes the vec­tor of length that is the fixed-point, and de­notes the unique “canon­i­cal in­ter­pre­ta­tion” of this vec­tor as a vec­tor of length with non­nega­tive co­or­di­nates that sum to , which gives the prob­a­bil­ities of the var­i­ous bit­strings. Similarly, de­notes the vec­tor of length which cor­re­sponds to the net trade. Given a con­stant , will de­note a vec­tor with all co­or­di­nates be­ing , with the length clear from con­text. Given a vec­tor and co­or­di­nate a, with be­ing the value of the ‘th co­or­di­nate of , is the vec­tor that is in all co­or­di­nates ex­cept for , and at the ’th co­or­di­nate.

To be­gin, note that, given , there are mul­ti­ple pos­si­ble vec­tors of length which give the net pur­chase of shares. As an ex­am­ple of this, when , and both cor­re­spond to the same trade, be­cause buy­ing a share 0f is equiv­a­lent to sel­l­ing a share of . We will show that has an in­ter­pre­ta­tion as a vec­tor which is ev­ery­where is pos­i­tive, and non­pos­i­tive ev­ery­where is , be­cause this cor­re­sponds to the net trade be­ing in­ter­pretable as buy­ing no shares, and only sel­l­ing shares of bit­strings with a price of , which can­not lead to any gain no mat­ter which bit­string is the true one.

Now, we will fix our ba­sis vec­tors for the space. Be­cause is on the bound­ary of the sim­plex, must have at least one en­try. Fix that ver­tex that cor­re­sponds to that bit­string as the ori­gin of the co­or­di­nate sys­tem, and let the ba­sis be com­prised of the set of unit vec­tors that point from the cen­ter of the sim­plex to the ver­tices that cor­re­spond to all the other bit­strings. This spans the space that the sim­plex is em­bed­ded in. An­noy­ingly, this is not an or­thonor­mal ba­sis, so com­put­ing the dot product will be more com­pli­cated. The dot product be­tween any two differ­ent ba­sis vec­tors is (hat tip to Vanessa from MIRIxDis­cord)

We can start by ask­ing the ques­tion “In this ba­sis, what con­di­tion on a vec­tor cor­re­sponds to be­ing in­side the sim­plex?” The con­di­tion is that there ex­ist a s.t. has all en­tries in and the en­tries sum up to . Note that for speci­fi­cally, by the way we have defined the ba­sis, .

Also, the co­or­di­nates for can be bro­ken down into co­or­di­nates where has en­tries, and co­or­di­nates where has pos­i­tive en­tries. There­fore, given an ar­bi­trary vec­tor , it can be ex­pressed as , where is the vec­tor of co­or­di­nates where is , and is the vec­tor of co­or­di­nates where is pos­i­tive.

The proof strat­egy that will be used is di­vid­ing into cases, and show­ing that for all cases be­sides the ones that prove the the­o­rem, there is an “al­low­able per­tur­ba­tion vec­tor” s.t. lies in the sim­plex, and is closer to than it is to . This im­plies that is not a fixed-point, be­cause it isn’t the point clos­est to , but is a fixed-point, so we have a con­tra­dic­tion. Note that be­ing closer to than to is equiv­a­lent to , which can be rewrit­ten as which can be rewrit­ten as . This last state­ment will be the proof tar­get in the fol­low­ing cases.

Case 1: There is some pair of co­or­di­nates , , s.t , and .

Let be s.t. , , and all other co­or­di­nates are . Note that since (be­cause and adds up to ), and adds up to , then for suffi­ciently small , has all en­tries lie in and sums up to . Then can be taken as , to yield that lies in the sim­plex.

Define as . Note that is at co­or­di­nates and .

Rewrit­ing as , and us­ing the fact that the dot product be­tween any two differ­ent ba­sis vec­tors of our space is , , so .

Now,

and be­cause , for some pos­i­tive .

A similar anal­y­sis ap­plies to show for some pos­i­tive , and for suffi­ciently small , we get our de­sired re­sult that .

Now, the elimi­na­tion of this case shows that must have all en­tries equal some con­stant , while must have all en­tries if it is nonempty. This pro­duces three more ex­haus­tive cases.

Case 2: , is pos­i­tive.

In this case, note that a pur­chase of all shares but one can be rein­ter­preted as buy­ing of all shares but one, and sel­l­ing the re­main­ing share. There­fore, this trade is equiv­a­lent to sel­l­ing shares in the prob­a­bil­ity-zero bit­string that we took as the ori­gin, buy­ing of all shares with pos­i­tive prob­a­bil­ity, and sel­l­ing a non-nega­tive amount of all shares with prob­a­bil­ity (be­cause has all en­tries ). The­o­rem 1 fol­lows.

Case 3: , is .

This case im­me­di­ately proves The­o­rem 1, be­cause , and is or nega­tive on all en­tries.

Case 4: , is nega­tive.

If we can show that this case is im­pos­si­ble, we will be done. Let be some co­or­di­nate where is pos­i­tive.

Let be s.t. , and all other co­or­di­nates are . Let . has all terms lie in for suffi­ciently small , be­cause the is can­celed out when is added, and the is taken out of a pos­i­tive term. Be­cause has en­tries, sums up to . sums up to 1, and sums up to , so the re­sult­ing vec­tor sums up to , and lies in the sim­plex.

Now we will break down the dot prod­ucts.

And for the next pair,

Note that the sec­ond term com­bines with the term from the first dot product we looked at to yield the fol­low­ing equation

Fi­nally, be­cause , by group­ing terms and fac­tor­ing out , we get

and, be­cause all co­or­di­nates in are nega­tive, this equals for some pos­i­tive con­stant . By the same proof path,

For some pos­i­tive con­stant . There­fore, for suffi­ciently small , and Case 4 is im­pos­si­ble, and the the­o­rem fol­lows be­cause only Case 2 and 3 are left.

No comments.