What are Univer­sal In­duct­ors, Again?

[At­ten­tion Con­ser­va­tion Notice: This just re­caps an old res­ult and patches a hole in it, it doesn’t con­tain sub­stan­tially new ideas.]

Univer­sal In­duct­ors can be thought of as lo­gical in­duct­ors over bit­strings, which are not­able be­cause they can act as a lo­gical in­ductor over any the­ory you want (PA, ZFC, 2nd or­der arith­metic), by fix­ing some ef­fi­ciently com­put­able func­tion from bits to sen­tences in the lan­guage of your the­ory, con­di­tion­ing the uni­ver­sal in­ductor (which is a prob­ab­il­ity dis­tri­bu­tion over in­fin­ite bit­strings) on the bits which cor­res­pond to proven the­or­ems, and read­ing out the prob­ab­il­ity of other bits/​sen­tences from the res­ult­ing con­di­tional dis­tri­bu­tion. This trick works by The­orem 4.7.2 in the lo­gical in­duc­tion pa­per, “Clos­ure Under Condi­tion­ing”, which shows that con­di­tion­ing the prob­ab­il­it­ies of a lo­gical in­ductor on a se­quence of non-in­con­sist­ent sen­tences, yields a lo­gical in­ductor over a de­duct­ive pro­cess where all the con­di­tion­als are true.

Also, since they are a prob­ab­il­ity dis­tri­bu­tion over in­fin­ite bit­strings, it makes sense to think about them as hav­ing a prob­ab­il­ity meas­ure over worlds (as­sign­ments of state­ments to true or false, for all state­ments) at all fi­nite stages, in­stead of just in the limit.

However, in Scott’s old post about uni­ver­sal in­duct­ors, their con­struc­tion was never de­scribed.

Note that a Univer­sal In­ductor cor­res­ponds to a Lo­gical In­ductor, but the as­so­ci­ated Lo­gical In­ductor will not have fi­nite sup­port, and so will be dif­fer­ent from the one con­struc­ted in the pa­per. Never the less, Univer­sal In­duct­ors can be shown to ex­ist us­ing a sim­ilar con­struc­tion.

This post will give that con­struc­tion. To be­gin with, a Univer­sal In­ductor must ful­fill the fol­low­ing two prop­er­ties:

First, it must be a dis­tri­bu­tion over in­fin­ite bit­strings, such that is com­put­able for all and all which are fi­nite bit­strings. This gives the prob­ab­il­ity that an in­fin­ite bit­string starts with as a pre­fix.

Se­cond, the the­ory is one where the th atomic state­ment is of the form “the th bit in this in­fin­ite bit­string is a “, and in­duces a func­tion from these state­ments to prob­ab­il­it­ies, and the se­quence of prob­ab­il­ity as­sign­ments must form a lo­gical in­ductor over the empty de­duct­ive pro­cess. (ie, the in­ductor never sees any evid­ence of the form “this bit is a “, it just runs forever without get­ting any feed­back.)

To be­gin with, the su­per­trader con­struc­tion from Sec­tion 5 of the lo­gical in­duc­tion pa­per is the ex­act same. Traders look at the prices of boolean com­bin­a­tions of atomic state­ments, and use them in con­tinu­ous func­tions to buy or sell shares in boolean com­bin­a­tions of atomic state­ments.

The in­ter­est­ing part is the con­struc­tion of the space that we’ll be find­ing a fixed-point in. At timestep , there is a most-dis­tant bit that has ever ap­peared in a boolean com­bin­a­tion of bits that a trader has bought/​sold shares in, or looked at the price of. This is bit . If we as­sign prices to all bit­strings of length , then the prob­ab­il­ity of any fi­nite bit­string that is longer can be given by just us­ing the uni­form dis­tri­bu­tion on bits after that point. This cor­res­ponds to as­sign­ing 5050 prob­ab­il­ity to all state­ments which are too large to have thought about them yet. There­fore, our space of in­terest is the di­men­sional sim­plex of prob­ab­il­ity dis­tri­bu­tions over all bit­strings of length .

Given a point in this space, it is pos­sible to read out the prob­ab­il­ity of any par­tic­u­lar boolean com­bin­a­tion of bits by just sum­ming up the prob­ab­il­ity on all -length bit­strings that fit that con­straint, so we can de­term­ine the prices that the su­per­trader sees.

We can also as­sign all bit­strings but one to a basis vec­tor for the space, by let­ting the basis vec­tor for a bit­string be the vec­tor point­ing to the cor­res­pond­ing ver­tex of the sim­plex from its cen­ter. A pur­chase of one share in a boolean com­bin­a­tion cor­res­ponds to the vec­tor com­prised of the sum of basis vec­tors that cor­res­pond to the bit­strings ful­filling the con­di­tions of the boolean. Note that a pur­chase of the one bit­string we left out can be re-ex­pressed as a pur­chase of 1 dol­lar and selling a share of all other bit­strings.

As a simple case of this re-ex­pres­sion, when , (be­cause it has to be a prob­ab­il­ity dis­tri­bu­tion), so the price of the pur­chase is the same. And as for the pur­chase, both pur­chases have the fea­ture that they are worth dol­lar if the world is and dol­lars if the world is some­thing dif­fer­ent.

There­fore, given a point in the sim­plex, the net pur­chase and selling of a bunch of dif­fer­ent boolean com­bin­a­tions of bits can be broken down into a whole bunch of vec­tors, which sum up to give a net trade vec­tor , and find­ing the point in the sim­plex that’s closest to is the out­put of the con­tinu­ous func­tion that we can then find the fixed-point of.

Now, given a fixed-point in the in­terior of the sim­plex, , so there is no net trade. Given a fixed-point on the bound­ary of the sim­plex, can be in­ter­preted as be­ing com­posed en­tirely of selling shares with a price of , which ob­vi­ously has or neg­at­ive value, al­though this is quite non­trivial to show. Once this is proved, it then im­me­di­ately fol­lows that this pro­cess de­scribed above is a Univer­sal In­ductor.

From here on, we will simply take care of this last the­orem to be shown.

The­orem 1:

For all price vec­tors on the bound­ary of the sim­plex that are a fixed-point, the vec­tor rep­res­ent­ing the net trade can be in­ter­preted as be­ing com­posed en­tirely of ’s for all com­pon­ents in which has a nonzero entry, and non­posit­ive for all com­pon­ents in which has a entry.

First, some nota­tion will be laid out.

, and it is the di­men­sion of the space the sim­plex is in. de­notes the vec­tor of length that is the fixed-point, and de­notes the unique “ca­non­ical in­ter­pret­a­tion” of this vec­tor as a vec­tor of length with non­neg­at­ive co­ordin­ates that sum to , which gives the prob­ab­il­it­ies of the vari­ous bit­strings. Sim­il­arly, de­notes the vec­tor of length which cor­res­ponds to the net trade. Given a con­stant , will de­note a vec­tor with all co­ordin­ates be­ing , with the length clear from con­text. Given a vec­tor and co­ordin­ate a, with be­ing the value of the ‘th co­ordin­ate of , is the vec­tor that is in all co­ordin­ates ex­cept for , and at the ’th co­ordin­ate.

To be­gin, note that, given , there are mul­tiple pos­sible vec­tors of length which give the net pur­chase of shares. As an ex­ample of this, when , and both cor­res­pond to the same trade, be­cause buy­ing a share 0f is equi­val­ent to selling a share of . We will show that has an in­ter­pret­a­tion as a vec­tor which is every­where is pos­it­ive, and non­posit­ive every­where is , be­cause this cor­res­ponds to the net trade be­ing in­ter­pretable as buy­ing no shares, and only selling shares of bit­strings with a price of , which can­not lead to any gain no mat­ter which bit­string is the true one.

Now, we will fix our basis vec­tors for the space. Be­cause is on the bound­ary of the sim­plex, must have at least one entry. Fix that ver­tex that cor­res­ponds to that bit­string as the ori­gin of the co­ordin­ate sys­tem, and let the basis be com­prised of the set of unit vec­tors that point from the cen­ter of the sim­plex to the ver­tices that cor­res­pond to all the other bit­strings. This spans the space that the sim­plex is em­bed­ded in. An­noy­ingly, this is not an or­thonor­mal basis, so com­put­ing the dot product will be more com­plic­ated. The dot product between any two dif­fer­ent basis vec­tors is (hat tip to Vanessa from MIRIxDis­cord)

We can start by ask­ing the ques­tion “In this basis, what con­di­tion on a vec­tor cor­res­ponds to be­ing in­side the sim­plex?” The con­di­tion is that there ex­ist a s.t. has all entries in and the entries sum up to . Note that for spe­cific­ally, by the way we have defined the basis, .

Also, the co­ordin­ates for can be broken down into co­ordin­ates where has entries, and co­ordin­ates where has pos­it­ive entries. There­fore, given an ar­bit­rary vec­tor , it can be ex­pressed as , where is the vec­tor of co­ordin­ates where is , and is the vec­tor of co­ordin­ates where is pos­it­ive.

The proof strategy that will be used is di­vid­ing into cases, and show­ing that for all cases be­sides the ones that prove the the­orem, there is an “al­low­able per­turb­a­tion vec­tor” s.t. lies in the sim­plex, and is closer to than it is to . This im­plies that is not a fixed-point, be­cause it isn’t the point closest to , but is a fixed-point, so we have a con­tra­dic­tion. Note that be­ing closer to than to is equi­val­ent to , which can be re­writ­ten as which can be re­writ­ten as . This last state­ment will be the proof tar­get in the fol­low­ing cases.

Case 1: There is some pair of co­ordin­ates , , s.t , and .

Let be s.t. , , and all other co­ordin­ates are . Note that since (be­cause and adds up to ), and adds up to , then for suf­fi­ciently small , has all entries lie in and sums up to . Then can be taken as , to yield that lies in the sim­plex.

Define as . Note that is at co­ordin­ates and .

Re­writ­ing as , and us­ing the fact that the dot product between any two dif­fer­ent basis vec­tors of our space is , , so .

Now,

and be­cause , for some pos­it­ive .

A sim­ilar ana­lysis ap­plies to show for some pos­it­ive , and for suf­fi­ciently small , we get our de­sired res­ult that .

Now, the elim­in­a­tion of this case shows that must have all entries equal some con­stant , while must have all entries if it is nonempty. This pro­duces three more ex­haust­ive cases.

Case 2: , is pos­it­ive.

In this case, note that a pur­chase of all shares but one can be re­in­ter­preted as buy­ing of all shares but one, and selling the re­main­ing share. There­fore, this trade is equi­val­ent to selling shares in the prob­ab­il­ity-zero bit­string that we took as the ori­gin, buy­ing of all shares with pos­it­ive prob­ab­il­ity, and selling a non-neg­at­ive amount of all shares with prob­ab­il­ity (be­cause has all entries ). The­orem 1 fol­lows.

Case 3: , is .

This case im­me­di­ately proves The­orem 1, be­cause , and is or neg­at­ive on all entries.

Case 4: , is neg­at­ive.

If we can show that this case is im­possible, we will be done. Let be some co­ordin­ate where is pos­it­ive.

Let be s.t. , and all other co­ordin­ates are . Let . has all terms lie in for suf­fi­ciently small , be­cause the is can­celed out when is ad­ded, and the is taken out of a pos­it­ive term. Be­cause has entries, sums up to . sums up to 1, and sums up to , so the res­ult­ing vec­tor sums up to , and lies in the sim­plex.

Now we will break down the dot products.

And for the next pair,

Note that the second term com­bines with the term from the first dot product we looked at to yield the fol­low­ing equation

Fin­ally, be­cause , by group­ing terms and factor­ing out , we get

and, be­cause all co­ordin­ates in are neg­at­ive, this equals for some pos­it­ive con­stant . By the same proof path,

For some pos­it­ive con­stant . There­fore, for suf­fi­ciently small , and Case 4 is im­possible, and the the­orem fol­lows be­cause only Case 2 and 3 are left.