Toward a New Technical Explanation of Technical Explanation

A New Framework

(Thanks to Valentine for a discussion leading to this post, and thanks to CFAR for running the CFAR-MIRI cross-fertilization workshop. Val provided feedback on a version of this post. Warning: fairly long.)

Eliezer’s A Technical Explanation of Technical Explanation, and moreover the sequences as a whole, used the best technical understanding of practical epistemology available at the time*—the Bayesian account—to address the question of how humans can try to arrive at better beliefs in practice. The sequences also pointed out several holes in this understanding, mainly having to do with logical uncertainty and reflective consistency.

MIRI’s research program has since then made major progress on logical uncertainty. The new understanding of epistemology—the theory of logical induction—generalizes the Bayesian account by eliminating the assumption of logical omniscience. Bayesian belief updates are recovered as a special case, but the dynamics of belief change are non-Bayesian in general. While it might not turn out to be the last word on the problem of logical uncertainty, it has a large number of desirable properties, and solves many problems in a unified and relatively clean framework.

It seems worth asking what consequences this theory has for practical rationality. Can we say new things about what good reasoning looks like in humans, and how to avoid pitfalls of reasoning?

First, I’ll give a shallow overview of logical induction and possible implications for practical epistemic rationality. Then, I’ll focus on the particular question of A Technical Explanation of Technical Explanation (which I’ll abbreviate TEOTE from now on). Put in CFAR terminology, I’m seeking a gears-level understanding of gears-level understanding. I focus on the intuitions, with only a minimal account of how logical induction helps make that picture work.

Logical Induction

There are a number of difficulties in applying Bayesian uncertainty to logic. No computable probability distribution can give non-zero measure to the logical tautologies, since you can’t bound the amount of time you need to think to check whether something is a tautology, so updating on provable sentences always means updating on a set of measure zero. This leads to convergence problems, although there’s been recent progress on that front.

Put another way: Logical consequence is deterministic, but due to Gödel’s first incompleteness theorem, it is like a stochastic variable in that there is no computable procedure which correctly decides whether something is a logical consequence. This means that any computable probability distribution has infinite Bayes loss on the question of logical consequence. Yet, because the question is actually deterministic, we know how to point in the direction of better distributions by doing more and more consistency checking. This puts us in a puzzling situation where we want to improve the Bayesian probability distribution by doing a kind of non-Bayesian update. This was the two-update problem.

You can think of logical induction as supporting a set of hypotheses which are about ways to shift beliefs as you think longer, rather than fixed probability distributions which can only shift in response to evidence.

This introduces a new problem: how can you score a hypothesis if it keeps shifting around its beliefs? As TEOTE emphasises, Bayesians outlaw this kind of belief shift for a reason: requiring predictions to be made in advance eliminates hindsight bias. (More on this later.) So long as you understand exactly what a hypothesis predicts and what it does not predict, you can evaluate its Bayes score and its prior complexity penalty and rank it objectively. How do you do this if you don’t know all the consequences of a belief, and the belief itself makes shifting claims about what those consequences are?

The logical-induction solution is: set up a prediction market. A hypothesis only gets credit for contributing to collective knowledge by moving the market in the right direction early. If the market’s odds on prime numbers are currently worse than those which the prime number theorem can provide, a hypothesis can make money by making bets in that direction. If the market has already converged to those beliefs, though, a hypothesis can’t make any more money by expressing such beliefs—so it doesn’t get any credit for doing so. If the market has moved on to even more accurate rules of thumb, a trader would only lose money by moving beliefs back in the direction of the prime number theorem.
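
Here is a minimal toy sketch of that market dynamic (my own simplification, not the actual logical induction construction; the functions and the wealth-staking rule are assumptions made for illustration): the market price is a wealth-weighted average of the traders’ quotes, traders stake wealth in proportion to how far they push away from the current price, and wealth flows toward whoever pushed in the right direction before everyone else agreed.

```python
# Toy prediction market on a single binary statement (illustrative only).
def market_price(traders):
    """Wealth-weighted average of the probabilities quoted by (wealth, prob) traders."""
    total = sum(w for w, _ in traders)
    return sum(w * p for w, p in traders) / total

def settle(traders, outcome):
    """Move wealth toward traders whose quotes beat the current market price."""
    price = market_price(traders)
    updated = []
    for wealth, p in traders:
        stake = wealth * abs(p - price)   # nothing staked, and nothing earned, if you agree with the market
        payoff = stake if (p > price) == outcome else -stake
        updated.append((wealth + payoff, p))
    return updated

traders = [(1.0, 0.5), (1.0, 0.8)]        # trader B quotes 0.8; the statement turns out true
traders = settle(traders, outcome=True)
print(market_price(traders))              # the price moves toward 0.8 as trader B gains wealth
```

Note that if both traders had already quoted 0.8, the stake, and hence the credit, would be zero, matching the point above that a hypothesis earns nothing for repeating what the market already believes.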

Mathematical Understanding

This provides a framework in which we can make sense of mathematical labor. For example, a common occurrence in combinatorics is that there is a sequence which we can calculate, such as the Catalan numbers, by directly counting the number of objects of some specific type. This sequence is then puzzled over like data in a scientific experiment. Different patterns in the sequence are observed, and hypotheses for the continuation of these patterns are proposed and tested. Often, a significant goal is the construction of a closed-form expression for the sequence.

This looks just like Bayesian empiricism, except for the fact that we already have a hypothesis which entirely explains the observations. The sequence is constructed from a definition which mathematicians made up, and which thus assigns 100% probability to the observed data. What’s going on? It is possible to partially explain this kind of thing in a Bayesian framework by acting as if the true formula were unknown and we were trying to guess where the sequence came from, but this doesn’t explain everything, such as why finding a closed-form expression would be important.

Logical induction explains this by pointing out how different time-scales are involved. Even if all elements of the sequence are calculable, a new hypothesis can get credit for calculating them faster than the brute-force method. Anything which allows one to produce correct answers faster contributes to the efficiency of the prediction market inside the logical inductor, and thus, to the overall mathematical understanding of a subject. This cleans up the issue nicely.
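
As a small illustration (toy code, not part of the logical induction formalism; the function names are mine), here are two ways to compute the Catalan numbers mentioned above: brute-force enumeration and the closed form. Both are provably correct, so in one sense they say nothing new, but the closed form produces answers far faster, and that speed is exactly the kind of contribution a trader could earn credit for.

```python
from math import comb

def catalan_brute(n):
    """Count balanced bracket sequences of length 2n by direct enumeration."""
    def count(opens, closes):
        if opens == n and closes == n:
            return 1
        total = 0
        if opens < n:
            total += count(opens + 1, closes)
        if closes < opens:
            total += count(opens, closes + 1)
        return total
    return count(0, 0)

def catalan_closed(n):
    """Closed form: C(2n, n) / (n + 1)."""
    return comb(2 * n, n) // (n + 1)

assert all(catalan_brute(n) == catalan_closed(n) for n in range(10))
print(catalan_closed(30))  # instant; the brute-force count would take far too long here
```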

What other epistemic phenomena can we now understand better?

Lessons for Aspiring Rationalists

Many of these could benefit from a whole post of their own, but here are some fast-and-loose corrections to Bayesian epistemology which may be useful:

  • Hypotheses need not make predictions about everything. Because hypotheses are about how to adjust your odds as you think longer, they can leave most sentences alone and focus on a narrow domain of expertise. Everyone was already doing this in practice, but the math of Bayesian probability theory requires each hypothesis to make a prediction about every observation, if you actually look at it. Allowing a hypothesis to remain silent on some issues in standard Bayesianism can cause problems: if you’re not careful, a hypothesis can avoid falsification by remaining silent, so you end up incentivising hypotheses to remain mostly silent (and you fail to learn as a result). Prediction markets are one way to solve this problem.

  • Hypotheses buy and sell at the current price, so they take a hit for leaving a now-unpopular position which they initially supported (but less of a hit than if they’d stuck with it) or coming in late to a position of growing popularity. Other stock-market-type dynamics can occur.

  • Hypotheses can be like object-level beliefs or meta-level beliefs: you can have a hypothesis about how you’re overconfident, which gets credit for smoothing your probabilities (if this improves things on average). This allows you to take into account beliefs about your calibration without getting too confused about Hofstadter’s-law-type paradoxes. (A sketch follows below.)
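
Here is a minimal sketch of that last idea, using my own toy assumptions rather than the actual logical-induction machinery: a “calibration trader” earns credit exactly when nudging the market’s probabilities toward 0.5 would have scored better on past questions.

```python
from math import log

def log_loss(p, outcome):
    """Negative log probability assigned to what actually happened."""
    return -log(p if outcome else 1 - p)

def smoothed(p, strength=0.3):
    """Pull a probability part of the way toward 0.5."""
    return (1 - strength) * p + strength * 0.5

# (market probability, actual outcome) pairs; the market is overconfident on average.
history = [(0.95, False), (0.90, True), (0.99, False), (0.85, True)]

market_loss = sum(log_loss(p, o) for p, o in history)
trader_loss = sum(log_loss(smoothed(p), o) for p, o in history)

# The smoothed probabilities score better here, so this meta-level hypothesis gains
# credit and pulls future market prices toward 0.5, without any paradox of
# "knowing you're overconfident but refusing to update".
print(round(market_loss, 2), round(trader_loss, 2))
```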

You may want to be a bit careful and Chesterton-fence existing Bayescraft, though, because some things are still better about the Bayesian setting. I mentioned earlier that Bayesians don’t have to worry so much about hindsight bias. This is closely related to the problem of old evidence.

Old Evidence

Suppose a new scientific hypothesis, such as general relativity, explains a well-known observation such as the perihelion precession of Mercury better than any existing theory. Intuitively, this is a point in favor of the new theory. However, the probability for the well-known observation was already at 100%. How can a previously-known statement provide new support for the hypothesis, as if we are re-updating on evidence we’ve already updated on long ago? This is known as the problem of old evidence, and is usually levelled as a charge against Bayesian epistemology. However, in some sense, the situation is worse for logical induction.

A Bayesian who endorses Solomonoff induction can tell the following story: Solomonoff induction is the right theory of epistemology, but we can only approximate it, because it is uncomputable. We approximate it by searching for hypotheses, and computing their posterior probability retroactively when we find new ones. It only makes sense that when we find a new hypothesis, we calculate its posterior probability by multiplying its prior probability (based on its description length) by the probability it assigns to all evidence so far. That’s Bayes’ Law! The fact that we already knew the evidence is not relevant, since our approximation didn’t previously include this hypothesis.
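
In symbols (notation mine), this story scores a newly discovered hypothesis $H$ against the evidence $E_1, \dots, E_n$ seen so far by

$$P(H \mid E_1, \dots, E_n) \;\propto\; 2^{-\ell(H)} \prod_{i=1}^{n} P(E_i \mid H, E_1, \dots, E_{i-1}),$$

where $\ell(H)$ is the description length of $H$; whether the $E_i$ were observed before or after $H$ was found plays no role in the formula.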

Logical induction speaks against this way of thinking. The hypothetical Solomonoff induction advocate is assuming one way of approximating Bayesian reasoning via finite computing power. Logical induction can be thought of as a different (more rigorous) story about how to approximate intractable mathematical structures. In this new way, propositions are bought or sold at the market prices of the time. If a new hypothesis is discovered, it can’t be given any credit for ‘predicting’ old information. The price of known evidence is already at maximum—you can’t gain any money by investing in it.

There are good reasons to ignore old evidence, especially if the old evidence has biased your search for new hypotheses. Nonetheless, it doesn’t seem right to totally rule out this sort of update.

I’m still a bit puzzled by this, but I think the situation is improved by understanding gears-level reasoning. So, let’s move on to the discussion of TEOTE.

Gears of Gears

As Valentine noted in his article, it is somewhat frustrating how the overall idea of gears-level understanding seems so clear while remaining only heuristic in definition. It’s a sign of a ripe philosophical puzzle. If you don’t feel you have a good intuitive grasp of what I mean by “gears-level understanding”, I suggest reading his post.

Valentine gives three tests which point in the direction of the right concept:

  1. Does the model pay rent? If it does, and if it were falsified, how much (and how precisely) could you infer other things from the falsification?

  2. How incoherent is it to imagine that the model is accurate but that a given variable could be different?

  3. If you knew the model were accurate but you were to forget the value of one variable, could you rederive it?

I already named one near-synonym for “gears”, namely “technical explanation”. Two more are “inside view” and Elon Musk’s notion of reasoning from first principles. The implication is supposed to be that gears-level understanding is in some sense better than other sorts of knowledge, but this is decidedly not supposed to be valued to the exclusion of other sorts of knowledge. Inside-view reasoning is traditionally supposed to be combined with outside-view reasoning (although Elon Musk calls it “reasoning by analogy” and considers it inferior, and much of Eliezer’s recent writing warns of its dangers as well, while allowing for its application to special cases). I suggested the terms gears-level & policy-level in a previous post (which I actually wrote after most of this one).

Although TEOTE gets close to answering Valentine’s question, it doesn’t quite hit the mark. The definition of “technical explanation” provided there is a theory which strongly concentrates the probability mass on specific predictions and rules out others. It’s clear that a model can do this without being “gears”. For example, my model might be that whatever prediction the Great Master makes will come true. The Great Master can make very detailed predictions, but I don’t know how they’re generated. I lack the understanding associated with the predictive power. I might have a strong outside-view reason to trust the Great Master: their track record on predictions is immaculate, their Bayes loss minuscule, their calibration supreme. Yet, I lack an inside-view account. I can’t derive their predictions from first principles.

Here, I’m siding with David Deutsch’s account in the first chapter of The Fabric of Reality. He argues that understanding and predictive capability are distinct, and that understanding is about having good explanations. I may not accept his whole critique of Bayesianism, but that much of his view seems right to me. Unfortunately, he doesn’t give a technical account of what “explanation” and “understanding” could be.

First Attempt: Deterministic Predictions

TEOTE spends a good chunk of time on the issue of making predictions in advance. According to TEOTE, this is a human solution to a human problem: you make predictions in advance so that you can’t make up what predictions you could have made after the fact. This counters hindsight bias. An ideal Bayesian reasoner, on the other hand, would never be tempted into hindsight bias in the first place, and is free to evaluate hypotheses on old evidence (as already discussed).

So, is gears-level reasoning just pure Bayesian reasoning, in which hypotheses have strictly defined probabilities which don’t depend on anything else? Is outside-view reasoning the thing logical induction adds, by allowing the beliefs of a hypothesis to shift over time and to depend on the wider market state?

This isn’t quite right. An ideal Bayesian can still learn to trust the Great Master, based on the reliability of the Great Master’s predictions. Unlike a human (and unlike a logical inductor), the Bayesian will at all times have in mind all the possible ways the Great Master’s predictions could have become so accurate. This is because a Bayesian hypothesis contains a full joint distribution on all events, and an ideal Bayesian reasons about all hypotheses at all times. In this sense, the Bayesian always operates from an inside view—it cannot trust the Great Master without a hypothesis which correlates the Great Master with the world.

However, it is possible that this correlation is introduced in a very simple way, by ruling out cases where the Great Master and reality disagree, without providing any mechanism explaining how this is the case. This may have low prior probability, but gain prominence due to the hit in Bayes score other hypotheses are taking for not taking advantage of this correlation. It’s not a bad outcome given the epistemic situation, but it’s not gears-level reasoning, either. So, being fully Bayesian or not isn’t exactly what distinguishes whether advance predictions are needed. What is it?

I suggest it’s this: whether the hypothesis is well-defined, such that anyone can say what predictions it makes without extra information. In his post on gears, Valentine mentions the importance of “how deterministically interconnected the variables of the model are”. I’m pointing at something close, but importantly distinct: how deterministic the predictions are. You know that a coin is very close to equally likely to land on heads or tails, and from this you can (if you know a little combinatorics) compute things like the probability of getting exactly three heads if you flip the coin five times. Anyone with the same knowledge would compute the same thing. The model includes probabilities inside it, but how those probabilities flow is perfectly deterministic.
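
For instance, anyone holding the fair-coin model would do the same computation for the example above:

$$P(\text{exactly 3 heads in 5 flips}) = \binom{5}{3}\left(\tfrac{1}{2}\right)^5 = \frac{10}{32} = 0.3125.$$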

This is a notion of objectivity: a wide variety of people can agree on what probability the model assigns, despite otherwise varied background knowledge.

If a model is well-defined in this way, it is very easy (Bayesian or no) to avoid hindsight bias. You cannot argue about how you could have predicted some result. Anyone can sit down and calculate.

The hypothesis that the Great Master is always correct, on the other hand, does not have this property. Nobody but the Great Master can say what that hypothesis predicts. If I know what the Great Master says about a particular thing, I can evaluate the accuracy of the hypothesis; but this is special knowledge which I need in order to give the probabilities.

The Bayesian hypothesis which simply forces statements of the Great Master to correlate with the world is somewhat more gears-y, in that there’s a probability distribution which can be written down. However, this probability distribution is a complicated mish-mosh of the Bayesian’s other hypotheses. So, predicting what it would say requires extensive knowledge of the private beliefs of the Bayesian agent involved. This is typical of the category of non-gears-y models.

Objection: Doctrines

Unfortunately, this account doesn’t totally satisfy what Valentine wants.

Suppose that, rather than making announcements on the fly, the Great Master has published a set of fixed Doctrines which his adherents memorize. As in the previous thought experiment, the word of the Great Master is infallible; the application of the Doctrines always leads to correct predictions. However, the contents of the Doctrines appear to be a large mish-mosh of rules with no unifying theme. Despite their apparent correctness, they fail to provide any understanding. It is as if a physicist took all the equations in a physics text, transformed them into tables of numbers, and then transported those tables to the Middle Ages with explanations of how to use the tables (but none of where they come from). Though the tables work, they are opaque; there is no insight as to how they were determined.

The Doctrines are a deterministic tool for making predictions. Yet, they do not seem to be a gears-level model. Going back to Valentine’s three tests, the Doctrines fail test three: we could erase any one of the Doctrines and we’d be unable to rederive it from how it fits together with the rest. Hence, the Doctrines have almost as much of a “trust the Great Master” quality as listening to the Great Master directly—the disciples would not be able to derive the Doctrines for themselves.

Second Attempt: Proofs, Axioms, & Two Levels of Gears

My next proposal is that having a gears-level model is like knowing the proof. You might believe a mathematical statement because you saw it in a textbook, or because you have a strong mathematical intuition which says it must be true. But you don’t have the gears until you can prove it.

This subsumes the “deterministic predictions” picture: a model is an axiomatic system. If we know all the axioms, then we can in theory produce all the predictions ourselves. (Thinking of it this way introduces a new possibility: that the model may be well-defined but we may be unable to find the proofs, due to our own limitations.) On the other hand, we don’t have access to the axioms of the theory embodied by the Great Master, and so we have no hope of seeing the proofs; we can only observe that the Great Master is always right.

How does this help with the example of the Doctrines?

The concept of “axioms” is somewhat slippery. There are many equivalent ways of axiomatizing any given theory. We can often flip views between what’s taken as an axiom vs what’s proved as a theorem. However, the most elegant set of axioms tends to be preferred.

So, we can regard the Doctrines as one long set of axioms. If we look at them that way, then adherents of the Great Master have a gears-level understanding of the Doctrines if they can successfully apply them as instructed.

However, the Doctrines are not an elegant set of axioms. So, viewing them in this way is very unnatural. It is more natural to see them as a set of assertions which the Great Master has produced by some axioms unknown to us. In this respect, we “can’t see the proofs”.

In the same way, we can consider flipping any model between the axiom view and the theorem view. Regarding the model as axiomatic, to determine whether it is gears-level we only ask whether its predictions are well-defined. Regarding it in “theorem view”, we ask if we know how the model itself was derived.

Hence, two of Valentine’s desirable properties of a gears-level model can be understood as the same property applied at different levels:

  • Determinism, which is Val’s property #2, follows from requiring that we can see the derivations within the model.

  • Reconstructability, Val’s property #3, follows from requiring that we can see the derivation of the model.

We might call the first level of gears “made out of gears”, and the second level “made by gears”—the model itself being constructed via a known mechanism.

If we change our view so that a scientific theory is a “theorem”, what are the “axioms”? Well, there are many criteria which are applied to scientific theories in different domains. These criteria could be thought of as pre-theories or meta-theories. They encode the hard-won wisdom of a field of study, telling us what theories are likely to work or fail in that field. But a very basic axiom is: we want a theory to be the simplest theory consistent with all observations. The Great Master’s Doctrines cannot possibly survive this test.

To give a less silly example: if we train up a big neural network to solve a machine learning problem, the predictions made by the model are deterministic, predictable from the network weights. However, someone else who knew all the principles by which the network was created would nonetheless train up a very different neural network—unless they use the very same gradient descent algorithm, data, initial weights, and number and size of layers.
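
A minimal sketch of this point (a toy linear “network”; the specific model, data, and training loop are assumptions made for illustration): the training procedure is perfectly deterministic given the seed, data, architecture, and step size, yet someone who knows the principles but not those exact details ends up with different weights.

```python
import numpy as np

def train(seed, steps=200, lr=0.1):
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(50, 3))                                    # training data (seed-dependent)
    y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=50)  # noisy targets
    w = rng.normal(size=3)                                          # initial weights
    for _ in range(steps):
        grad = X.T @ (X @ w - y) / len(y)                           # gradient of mean squared error
        w -= lr * grad
    return w

print(np.allclose(train(0), train(0)))  # True: identical details reproduce the same network exactly
print(np.allclose(train(0), train(1)))  # False: same principles, different network
```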

Even if they’re the same in all those details, and so reconstruct the same neural network exactly, there’s a significant sense in which they can’t see how the conclusion follows inevitably from the initial conditions. It’s less doctrine-y than being handed a neural network, but it’s more doctrine-y than understanding the structure of the problem and why almost any neural network achieving good performance on the task will have certain structures. Remember what I said about mathematical understanding. There’s always another level of “being able to see why” you can ask for. Being able to reproduce the proof is different from being able to explain why the proof has to be the way it is.

Exact Statement?

Gears-ness is a matter of degree; there are several interconnected things we can point at, and a slippage of levels of analysis which makes everything quite complicated.

In the ontology of math/logic, we can point at whether you can see the proof of a theorem. There are several slippages which make this fuzzier than it may seem. First: do you derive it only from the axioms, or do you use commonly known theorems and equivalences (which you may or may not be able to prove if put on the spot)? There’s a long continuum between what one mathematician might say to another as proof and a formal derivation in logic. Second: how well can you see why the proof has to be the way it is? This is the spectrum between following each proof step individually (but seeing them as almost a random walk) vs seeing the proof as an elementary application of a well-known technique. Third: we can start slipping the axioms. There are small changes to the axioms, in which one thing goes from being an axiom to a theorem and another thing makes the opposite transition. There are also large changes, like formalizing number theory via the Peano axioms vs formalizing it in set theory, where the entire description language changes. You need to translate from statements of number theory to statements of set theory. Also, there is a natural ambiguity between taking something as an axiom vs requiring it as a condition in a theorem.

In the ontology of computation, we can point at knowing the output of a machine vs being able to run it by hand to show the output. This is a little less flexible than the concept of mathematical proof, but essentially the same distinction. Changing the axioms is like translating the same algorithm to a different computational formalism, like going between Turing machines and lambda calculus. Also, there is a natural ambiguity between a program vs an input: when you run program XYZ with input ABC on a universal Turing machine, you input XYZABC to the universal Turing machine; but you can also think of this as running program XY on input ZABC, or XYZA on input BC, et cetera.

In the ontology of ontology, we could say “can you see why this has to be, from the structure of the ontology describing things?” “Ontology” is less precise than the previous two concepts, but it’s clearly the same idea. A different ontology doesn’t necessarily support the same conclusions, just like different axioms don’t necessarily give the same theorems. However, the reductionist paradigm holds that the ontologies we use should all be consistent with one another (under some translation between the ontologies), or at least should aspire to eventual consistency. Analogous to axiom/assumption ambiguity and program/input ambiguity, there is ambiguity between an ontology and the cognitive structure which created and justifies the ontology. We can also distinguish more levels; maybe we would say that an ontology doesn’t make predictions directly, but provides a language for stating models, which make predictions. Even longer chains can make sense, but these are all subjective divisions. However, unlike the situation in logic and computation, we can’t expect to articulate the full support structure for an ontology; it is, after all, a big mess of evolved neural mechanisms which we don’t have direct access to.

Having established that we can talk about the same things in all three settings, I’ll restrict myself to talking about ontologies.

Two-level definition of gears: A conclusion is gears-like with respect to a particular ontology to the extent that you can “see the derivation” in that ontology. A conclusion is gears-like without qualification to the extent that you can also “see the derivation” of the ontology itself. This is contiguous with gears-ness relative to an ontology, because of the natural ambiguity between programs and their inputs, or between axioms and assumptions. For a given example, though, it’s generally more intuitive to deal with the two levels separately.

Seeing the derivation: There are several things to point at by this phrase.

  • As in TEOTE, we might consider it important that a model make precise predictions. This could be seen as a prerequisite of “seeing the derivation”: first, we must be saying something specific; then, we can ask if we can say why we’re saying that particular thing. This implies that models are more gears-like when they are more deterministic, all other things being equal.

  • However, I think it is also meaningful and useful to talk about the gears-ness of models whose predictions are not deterministic; the standard way of assigning probabilities to dice is very gears-like, despite placing wide probabilities. I think these are simply two different important things we can talk about.

  • Either way, being able to see the derivation is like being able to see the proof or execute the program, with all the slippages this implies. You see the derivation less well to the extent that you rely on known theorems, and better to the extent that you can spell out all the details yourself if need be. You see it less well to the extent that you understand the proof only step-by-step, and better to the extent that you can derive the proof as a natural application of known principles. You cannot see the derivation if you don’t even have access to the program which generated the output, or are missing some important inputs for that program.

Seeing the derivation is about explicitness and external objectivity. You can trivially “execute the program” generating any of your thoughts, in that your thinking is the program which generated the thoughts. However, the execution of this program could rely on arbitrary details of your cognition. Moreover, these details are usually not available for conscious access, which means you can’t explain the train of thought to others, and even you may not be able to replicate it later. So, a model is more gears-like the more replicable it is. I’m not sure if this should be seen as an additional requirement, or an explanation of where the requirements come from.

Conclusion, Further Directions

Obviously, we only touched the tip of the iceberg here. I started the post with the claim that I was trying to hash out the implications of logical induction for practical rationality, but secretly, the post was about things which logical inductors can only barely begin to explain. (I think these two directions support each other, though!)

We need the framework of logical induction to understand some things here, such as how you still have degrees of understanding when you already have the proof / already have a program which predicts things perfectly (as discussed in the “mathematical understanding” section). However, logical inductors don’t look like they care about “gears”—it’s not very close to the formalism, in the way that TEOTE gave a notion of technical explanation which is close to the formalism of probability theory.

I mentioned earlier that logical induction suffers from the old evidence problem more than Bayesianism. However, it doesn’t suffer in the sense of losing bets it could be winning. Rather, we suffer, when we try to wrap our heads around what’s going on. Somehow, logical induction is learning to do the right thing—the formalism is just not very explicit about how it does this.

The idea (due to Sam Eisenstat, hopefully not butchered by me here) is that logical inductors get around the old evidence problem by learning notions of objectivity.

A hypothesis you come up with later can’t gain any credibility by fitting evidence from the past. However, if you register a prediction ahead of time that a particular hypothesis-generation process will eventually turn up something which fits the old evidence, you can get credit, and use this credit to bet on what the hypothesis claims will happen later. You’re betting on a particular school of thought, rather than a known hypothesis. “You can’t make money by predicting old evidence, but you may be able to find a benefactor who takes it seriously.”

In order to do this, you need to specify a precise prediction-generation process which you are betting in favor of. For example, Solomonoff induction can’t run as a trader, because it is not computable. However, the probabilities which it generates are well-defined (if you believe that halting bits are well-defined, anyway), so you can make a business of betting that its probabilities will have been good in hindsight. If this business does well, then the whole market of the logical inductor will shift toward trying to make predictions which Solomonoff induction will later endorse.

Similarly for other ideas which you might be able to specify precisely without being able to run right away. For example, you can’t find all the proofs right away, but you could bet that all the theorems which the logical inductor observes have proofs, and you’d be right every time. Doing so allows the market to start betting it’ll see theorems if it sees that they’re provable, even if it hasn’t yet seen this rule make a successful advance prediction. (Logical inductors start out really ignorant of logic; they don’t know what proofs are or how they’re connected to theorems.)

This doesn’t exactly push toward gears-y models as defined earlier, but it seems close. You push toward anything for which you can provide an explicit justification, where “explicit justification” is anything you can name ahead of time (and check later) which pins down predictions of the sort which tend to correlate with the truth.

This doesn’t mean the logical inductor converges entirely to gears-level reasoning. Gears were never supposed to be everything, right? The optimal strategy combines gears-like and non-gears-like reasoning. However, it does suggest that gears-like reasoning has an advantage over non-gears reasoning: it can gain credibility from old evidence. This will often push gears-y models above competing non-gears considerations.

All of this is still terribly informal, but it is the sort of thing which could lead to a formal theory. Hopefully you’ll give me credit later for that advance prediction.