Bounded Oracle Induction

In­tro­duc­tion:

Be­cause log­i­cal in­duc­tion re­lies on the Brouwer fixed-point the­o­rem, and re­flec­tive or­a­cles rely on the Kaku­tani fixed-point the­o­rem which Brouwer is a spe­cial case of, it’s pos­si­ble that log­i­cal in­duc­tion could have been de­rived for the first time from re­flec­tive or­a­cles. At­tempt­ing to do this in the most ob­vi­ous way by hav­ing traders out­put a cir­cuit that takes a bi­nary-search ap­prox­i­ma­tion of the mar­ket as in­put doesn’t pro­duce any in­sights in par­tic­u­lar. How­ever, by at­tempt­ing to re­de­velop the log­i­cal in­duc­tion al­gorithm from scratch with the aid of a bounded re­flec­tive or­a­cle, we ar­rive at a new way of look­ing at log­i­cal in­duc­tion, with the fol­low­ing in­ter­est­ing fea­tures.

1: It col­lapses the dis­tinc­tion be­tween the al­gorithm out­putting the trad­ing cir­cuit, and the trad­ing cir­cuit it­self.

2: All trades can be nat­u­rally in­ter­preted as prob­a­bil­ity mea­sures over bit­strings, with the re­ward given by a sim­ple bet­ting game, in­stead of shares that pay off if some boolean com­bi­na­tion is true. How­ever, the bet­ting game doesn’t in­cen­tivize true be­lief re­port­ing, just mov­ing the prob­a­bil­ity dis­tri­bu­tion of the mar­ket in the right di­rec­tion. This turns out to be iso­mor­phic to the origi­nal for­mu­la­tion of the value of a trade, sub­ject to one ex­tra re­stric­tion on worst-case trade value.

3: The mar­ket prices are also a prob­a­bil­ity mea­sure over bit­strings/​wor­lds, ex­actly like a uni­ver­sal in­duc­tor.

4: It does the re­flec­tive-solomonoff thing of “draw a tur­ing ma­chine with prob­a­bil­ity , use it to pre­dict fu­ture bits”

First, no­ta­tion will be es­tab­lished, and the al­gorithm will be dis­cussed. Then the con­nec­tion be­tween OI-trader scor­ing and LI-trader scor­ing will be es­tab­lished, and ques­tions of how many of the nice prop­er­ties of LI carry over will be dis­cussed. We will finish with ba­sic dis­cus­sion of what this re­sult means, and then there will be an ap­pendix which con­tains the defi­ni­tions of aux­iliary al­gorithms, and the next post will con­tain the proof that the al­gorithm is an Or­a­cle In­duc­tor.

As a quick re­fresher, a bounded re­flec­tive or­a­cle takes as in­put a query of the form , and re­turns if has bounded run­time, and only makes or­a­cle calls to al­gorithms with the same run­time bound, and out­puts with prob­a­bil­ity greater than , and if has bounded run­time and only makes or­a­cle calls to al­gorithms with the same run­time bound, and out­puts with prob­a­bil­ity less than . If the prob­a­bil­ity of a is ex­actly , or if ex­ceeds run­time bounds, the or­a­cle is al­lowed to ran­dom­ize ar­bi­trar­ily.

No­ta­tion:

and are the sets of all bit­strings of length , and the set of all finite bit­strings, re­spec­tively. Ele­ments of these sets are de­noted by . refers to the empty string.

is the set of all satis­fi­able booleans with all vari­ables hav­ing an in­dex , and is the set of all satis­fi­able booleans. is the set of all booleans (in­clud­ing un­satis­fi­able ones), and is the set of all booleans with all vari­ables hav­ing an in­dex . Ele­ments of these sets are de­noted by . is the triv­ial boolean which is satis­fied by all bit­strings. Note that a boolean in­duces a func­tion .

is a func­tion that is time-con­structible, mono­ton­i­cally in­creas­ing, and . It will give the run­time bound for the traders.

is some func­tion up­per-bounded by , that is time-con­structible, mono­ton­i­cally in­creas­ing, and . It gives the most dis­tant bit the or­a­cle in­duc­tor thinks about on turn .

is the func­tion given by , giv­ing the num­ber of iter­a­tions of bi­nary-search de­ployed on turn .

is the func­tion given by . It gives the pro­por­tion of the in­duc­tor dis­tri­bu­tion that is made up by the uniform dis­tri­bu­tion, to en­sure that all bit­strings have a prob­a­bil­ity high enough for bi­nary-search to ac­cu­rately ap­prox­i­mate their prob­a­bil­ity.

con­sults the bounded re­flec­tive or­a­cle about whether re­turns with prob­a­bil­ity greater than .

rounds down to if it is greater than , and then flips a coin that re­turns with prob­a­bil­ity . In our model of com­pu­ta­tion, this op­er­a­tion takes unit time.

The OI Al­gorithm:

In short, the dis­tri­bu­tion in­duced by the or­a­cle in­duc­tion al­gorithm has a small por­tion of the prob­a­bil­ity mass com­posed of the uniform dis­tri­bu­tion, and oth­er­wise the al­gorithm se­lects a tur­ing ma­chine ac­cord­ing to the uni­ver­sal dis­tri­bu­tion, and a “bud­get”, with of the prob­a­bil­ity mass on a bud­get of , much like the log­i­cal in­duc­tor al­gorithm. In this set­ting, all trades can be in­ter­preted as a mix­ture of prob­a­bil­ity dis­tri­bu­tions of the form “con­di­tion the or­a­cle in­duc­tion dis­tri­bu­tion on some boolean be­ing true”, so the bud­geted trader is con­sulted to get a boolean (note that the trader may be ran­dom­ized!), and is used to ap­prox­i­mately repli­cate the prob­a­bil­ity dis­tri­bu­tion pro­duced by con­di­tion­ing the or­a­cle in­duc­tion dis­tri­bu­tion on the re­sult­ing boolean.

This starts with the empty bit­string and re­peat­edly con­can­tates it with which gets the next bit, un­til a bit­string of length is pro­duced. , and are just passed on to .

This defines a boolean con­straint that says that must be true, and the ini­tial pre­fix of the bit­string must equal . Then two more boolean con­straints are gen­er­ated, which are the same ex­cept they spec­ify that the next bit is a or . If only one of the two is a satis­fi­able boolean, the next bit is forced to be or in com­pli­ance with the satis­fi­a­bil­ity re­quire­ment, oth­er­wise, bi­nary search for iter­a­tions on the prob­a­bil­ity of out­putting a bit­string that satis­fies or is used to figure out the prob­a­bil­ity of the next bit be­ing a .

This uses the or­a­cle to im­ple­ment bi­nary-search for rounds on some al­gorithm, and out­put ei­ther a lower-bound, av­er­age, or up­per-bound es­ti­mate of the prob­a­bil­ity of the al­gorithm out­putting .

This uses the or­a­cle to test whether the trader is pos­si­bly over-bud­get, and if so, re­turns the null boolean, oth­er­wise, it re­turns the boolean that the trader re­turns. ran­domly se­lects a day and a world/​bit­string, and re­turns if the world/​bit­string hasn’t been ruled out by that day and the trader is over-bud­get rel­a­tive to that world/​bit­string, so the or­a­cle call is test­ing whether there’s any com­bi­na­tion of wor­lds and days where the trader is over-bud­get.

This ran­domly picks a day and bit­string, and re­turns if the bit­string is plau­si­ble (hasn’t been ruled out by the de­duc­tive pro­cess) on that day, and the trader might be over-bud­get. Note that the strength of the ap­prox­i­ma­tion rises as a func­tion of . This means that at later days, more ac­cu­rate retroac­tive es­ti­mates are used for the value of a trade on pre­vi­ous days. The mess of paren­the­ses in the nu­mer­a­tor es­sen­tially clips the bit­string to the ap­pro­pri­ate length, and uses bi­nary search to find the prob­a­bil­ity that (an ap­prox­i­ma­tion of (the dis­tri­bu­tion pro­duced by con­di­tion­ing the OI dis­tri­bu­tion on (the boolean out­putted by the trader))) as­signs to the bit­string.

This takes a trader, and re­turns the null con­straint if it times out or makes an “out of bounds” or­a­cle call or out­puts an un­satis­fi­able boolean. Other­wise, it takes the bit­string the trader out­puts, in­ter­prets it as some boolean con­straint, clips the con­straint to the ap­pro­pri­ate length, and out­puts that con­straint. Note that be­cause al­gorithms can be ran­dom­ized, this won’t nec­es­sar­ily out­put the same boolean ev­ery time, so the dis­tri­bu­tion pro­duced by should ac­tu­ally be thought of as a prob­a­bil­is­tic mix­ture of con­di­tional dis­tri­bu­tions.

To put all this to­gether, the in­duc­tor se­lects a tur­ing ma­chine and a bud­get with the ap­pro­pri­ate prob­a­bil­ity, and queries the tur­ing ma­chine about what boolean com­bi­na­tion it thinks the true se­quence of bits fulfills. As the tur­ing ma­chine can only re­port one boolean com­bi­na­tion, mix­tures of var­i­ous booleans are im­ple­mented by the tur­ing ma­chine ran­dom­iz­ing. If the past his­tory of the tur­ing ma­chine has it pos­si­bly go­ing over-bud­get on the next “trade”, or vi­o­lates con­di­tions by run­ning over time, or mak­ing ille­gal or­a­cle calls, or pro­vid­ing an im­pos­si­ble-to-fulfill boolean, then the mar­ket just de­faults to ask­ing (an ap­prox­i­ma­tion of) it­self about its own prob­a­bil­ities. Other­wise, the mar­ket out­puts (an ap­prox­i­ma­tion of) its own prob­a­bil­ity dis­tri­bu­tion, con­di­tioned on be­ing in con­for­mance with the trader. A bounded re­flec­tive or­a­cle is ex­ploited as an NP-or­a­cle, and to effec­tively nar­row down the prob­a­bil­ity of it­self out­putting a spe­cific bit­string.

In­ter­pre­ta­tion of a Trade:

The in­ter­pre­ta­tion of a trade from the log­i­cal in­duc­tion pa­per was that you’d lose dol­lars in or­der to ac­quire a share that would be worth dol­lar in wor­lds where ,and dol­lars if .

The in­ter­pre­ta­tion of a trade in this set­ting is that a trader and a mar­ket spread dol­lar amongst var­i­ous bit­strings/​wor­lds ,(which will be lost), giv­ing a prob­a­bil­ity mea­sure, and if the world is re­vealed to be , the trader earns dol­lars in re­turn. (where is the prob­a­bil­ity the trader as­signed to , and is the prob­a­bil­ity the mar­ket as­signed to ). The value of a trader at time in world is then .

Sur­pris­ingly enough, these two in­ter­pre­ta­tions are ac­tu­ally equiv­a­lent! We can take (some) LI trades, and con­vert them into an OI trade with the same value in all wor­lds, and also do the re­verse.

First, let’s say our OI trader spreads its dol­lar ac­cord­ing to (it copies the mar­ket prob­a­bil­ity dis­tri­bu­tion, con­di­tional on the world fulfilling the con­straint ). Then, in all wor­lds that don’t fulfill the boolean, it will lose dol­lar be­cause it as­signed mea­sure to , and in all that fulfill the boolean, be­cause , it gets dol­lars in re­turn. If we mul­ti­ply these two val­ues by , we get dol­lars when it fails, and dol­lars when it suc­ceeds, which is the ex­act same as a log­i­cal-in­duc­tion trader buy­ing one share in . So, the value of the OI trader out­putting the dis­tri­bu­tion has the ex­act same value in all wor­lds as buy­ing shares of , which is the same as spend­ing dol­lar buy­ing shares of .

We in­ter­pret a ran­dom­ized choice of a boolean as the trader spend­ing ( is an or­di­nary prob­a­bil­ity, not the mar­ket price of any­thing)of the prob­a­bil­ity mass on the con­di­tional dis­tri­bu­tion . This sums up to pro­duce a prob­a­bil­ity mea­sure, where .

Gen­er­al­iz­ing a bit, all OI trades are equiv­a­lent to a LI trade where it spends dol­lars buy­ing shares of the boolean cor­re­spond­ing to , and spends dol­lar in to­tal.

Also, if the trader out­puts the null boolean, , so the value of that trade in all wor­lds is 0 and equiv­a­lent to not buy­ing or sel­l­ing any­thing. This can be used to have the equiv­a­lent LI trade spend less than dol­lar buy­ing shares, if some por­tion of is com­posed of .

Go­ing in the re­verse di­rec­tion, from a LI trade to a OI trade, is more difficult, be­cause there is both buy­ing and sel­l­ing of shares. From be­fore, buy­ing share of is equiv­a­lent to the OI trader dis­tribut­ing of its prob­a­bil­ity mass ac­cord­ing to the dis­tri­bu­tion. It’s slightly more in­volved to show, but sel­l­ing share of has equiv­a­lent value in all wor­lds as dis­tribut­ing of the OI trader mass on the dis­tri­bu­tion.

Now, if, at the end of trans­lat­ing the LI trade into OI terms, less than all of the prob­a­bil­ity mass has been spent on var­i­ous con­di­tional dis­tri­bu­tions, the rest of the prob­a­bil­ity mass can be spent on copy­ing since it has value in all wor­lds. But what if the mea­sure of the re­sult­ing trade sums to greater than ? In that case, the ex­treme-worst-case value of the LI trade (all pur­chased shares were worth , all sold shares were worth )is be­low .

There­fore, if there’s a LI trader with a cir­cuit that’s eval­u­at­able in polyno­mial time (not just writable in poly-time), and it never makes a trade with ex­treme worst-case value be­low (ie, it doesn’t blow too much money at any given time), there’s a cor­re­spond­ing OI trader! This in­cludes al­most all the traders from the proofs of the log­i­cal in­duc­tion pa­per. How­ever, the traders from 4.7.2 (con­di­tion­als on the­o­ries) and 4.5.10 (af­fine un­bi­ased­ness from feed­back) need to be adapted.

Con­jec­ture 1: There is an LI trader with ex­treme-worst-case value on each turn of that ex­ploits the mar­ket if it vi­o­lates Affine Un­bi­ased­ness from Feed­back, and an­other such trader that ex­ploits the mar­ket if the mar­ket con­di­tional on is ex­ploitable by a trader with ex­treme-worst-case value on each turn of .

If this con­jec­ture holds, or­a­cle in­duc­tion would in­herit all the nice prop­er­ties of log­i­cal in­duc­tion.

Ex­ploita­tion is defined in the stan­dard way, as the set of plau­si­ble val­ues ac­cord­ing to wor­lds con­sis­tent with the de­duc­tive pro­cess at time , over all , be­ing bounded be­low and un­bounded above.

Note that if we con­sider the mar­ket as com­posed of all the traders prob­a­bil­ity-dis­tri­bu­tions ag­gre­gated to­gether, the pay­off to ev­ery­body cor­re­sponds to tak­ing ev­ery­one’s money, and dis­tribut­ing that money back to ev­ery­one ac­cord­ing to the frac­tion of the prob­a­bil­ity-mass in the win­ning world that was con­tributed from their dis­tri­bu­tion.

Also note that be­cause the OI-traders can im­ple­ment very long and com­plex com­bi­na­tions of dis­tri­bu­tions by ran­dom­iz­ing, OI-traders are able to make some trades that LI-traders can’t, be­cause they can out­put trades that are too long and com­pli­cated for the LI-trader to write down in polyno­mial time. An OI-trader, con­verted to an LI-trade, may even have pur­chases in ev­ery sin­gle boolean, which LI-traders definitely can’t repli­cate.

Con­jec­ture 2: There is an LI trader that runs in poly-time, and an Or­a­cle In­duc­tor that is in­ex­ploitable by poly-time ad­ver­saries, such that the trader ex­ploits the Or­a­cle In­duc­tor, and there is an OI trader that runs in poly-time and a Log­i­cal In­duc­tor that is in­ex­ploitable by poly-time ad­ver­saries, such that the trader ex­ploits the Log­i­cal In­duc­tor.

Facts About OI:

Just like log­i­cal in­duc­tors, there is no trader that runs in time less than that ex­ploits the mar­ket. This is shown by the fol­low­ing the­o­rems which are analo­gous to the the­o­rems es­tab­lish­ing that the log­i­cal in­duc­tion al­gorithm is in­ex­ploitable.

The­o­rem 1: If a trader that can be simu­lated on a UTM in less than time ex­ploits , there is some finite bud­get such that the bud­geted trader ex­ploits .

The­o­rem 2: If a bud­geted trader ex­ploits , the su­per­trader ex­ploits .

The­o­rem 3: The su­per­trader doesn’t ex­ploit .

The proofs will be deferred to the next post.

As for the strength of the bounded re­flec­tive or­a­cle needed to guaran­tee that all or­a­cle calls are well-defined, it is . Again, the proof will be deferred to the next post.

Fu­ture Direc­tions:

The two con­jec­tures would be in­ter­est­ing to set­tle, al­though only the first is truly crit­i­cal to show­ing that this is as pow­er­ful as a log­i­cal in­duc­tor.

The in­ter­pre­ta­tion of trades as prob­a­bil­ity mea­sures over bit­strings, with pay­off given by the pro­por­tion of prob­a­bil­ity mass in the win­ning string con­tributed by a trader is use­ful, al­though the lack of an in­cen­tive for ac­cu­rate be­lief re­port­ing is slightly wor­ry­ing, and pre­vents us from truly at­tribut­ing be­liefs to in­di­vi­d­ual traders.

The close par­allel be­tween this and Reflec­tive-Or­a­cle Solomonoff In­duc­tion may prove to be fruit­ful, and po­ten­tially lead to a var­i­ant of AIXI in this set­ting, which may be ported back to log­i­cal in­duc­tion.

APPENDIX: Aux­iliary Algorithms

In our model of com­pu­ta­tion, as­sume that this takes unit time.

This gen­er­ates a ran­dom bit­string of length .

This ran­domly se­lects an in­te­ger in , with equal prob­a­bil­ity for each in­te­ger.

This is used to en­sure that the traders out­put boolean con­straints that aren’t about un­rea­son­ably dis­tant digits.

This ap­plies a boolean cir­cuit to a bit­string, and pads it with ran­domly se­lected bits if the bit­string is too short. Note that since traders are al­lowed to call the or­a­cle on this func­tion ap­plied to the or­a­cle in­duc­tor, this im­plic­itly as­signs 50% prob­a­bil­ity to all bits that are too dis­tant for the or­a­cle in­duc­tor to have thought about them yet.

This con­verts a bit­string to a boolean which re­quires that the pre­fix of a bit­string equals . The equal­ity should be un­der­stood as if , and if .

This uses the or­a­cle as a SAT-solver, by us­ing to ran­domly gen­er­ate a bit­string of length . If there is a bit­string which fulfills the boolean, there is a prob­a­bil­ity of gen­er­at­ing that bit­string, so the or­a­cle can perfectly dis­crim­i­nate be­tween satis­fi­able and un­satis­fi­able booleans.

This just turns a bit­string into the boolean it en­codes, given some effi­cient en­cod­ing scheme.

This is the de­duc­tive pro­cess, a black­box de­ter­minis­tic al­gorithm which out­puts booleans of in­dex at most in at most time s.t (ie, the con­straints on the true en­vi­ron­ment get more stringent over time but always stay satis­fi­able).