Toolbox-thinking and Law-thinking


I’ve no­ticed a di­chotomy be­tween “think­ing in toolboxes” and “think­ing in laws”.

The toolbox style of think­ing says it’s im­por­tant to have a big bag of tools that you can adapt to con­text and cir­cum­stance; peo­ple who think very toolboxly tend to sus­pect that any­one who goes talk­ing of a sin­gle op­ti­mal way is just ig­no­rant of the uses of the other tools.

The lawful style of think­ing, done cor­rectly, dis­t­in­guishes be­tween de­scrip­tive truths, nor­ma­tive ideals, and pre­scrip­tive ideals. It may talk about cer­tain paths be­ing op­ti­mal, even if there’s no ex­e­cutable-in-prac­tice al­gorithm that yields the op­ti­mal path. It con­sid­ers truths that are not tools.

Within nearly-Eu­clidean mazes, the tri­an­gle in­equal­ity—that the path AC is never spa­tially longer than the path ABC—is always true but only some­times use­ful. The tri­an­gle in­equal­ity has the pre­scrip­tive im­pli­ca­tion that if you know that one path choice will travel ABC and one path will travel AC, and if the only prag­matic path-merit you care about is go­ing the min­i­mum spa­tial dis­tance (rather than say avoid­ing stairs be­cause some­body in the party is in a wheelchair), then you should pick the route AC. But the tri­an­gle in­equal­ity goes on gov­ern­ing Eu­clidean mazes whether or not you know which path is which, and whether or not you need to avoid stairs.
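As a minimal sketch of the "always true" half of that claim, the following Python snippet checks the inequality on randomly drawn points in the plane. The helper names `dist` and `direct_is_never_longer`, the random seed, and the coordinate range are illustrative assumptions rather than anything from the discussion above, and a spot check on sample points is of course not a proof.

```python
import math
import random

def dist(p, q):
    """Euclidean distance between two 2-D points."""
    return math.hypot(q[0] - p[0], q[1] - p[1])

def direct_is_never_longer(a, b, c, tol=1e-9):
    """True when |AC| <= |AB| + |BC|, up to a small floating-point tolerance."""
    return dist(a, c) <= dist(a, b) + dist(b, c) + tol

random.seed(0)  # arbitrary seed, only so the sample is reproducible
points = [(random.uniform(-10, 10), random.uniform(-10, 10)) for _ in range(300)]
triples = [tuple(points[i:i + 3]) for i in range(0, 300, 3)]

# The inequality holds for every sampled triple (A, B, C).
all_hold = all(direct_is_never_longer(a, b, c) for a, b, c in triples)
```

Note what the check does not tell you: whether AC is the route you should take. That still depends on stairs, wheelchairs, and whether you know which corridor is which.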

Toolbox thinkers may be ex­tremely sus­pi­cious of this claim of uni­ver­sal lawful­ness if it is ex­plained less than perfectly, be­cause it sounds to them like “Throw away all the other tools in your toolbox! All you need to know is Eu­clidean ge­om­e­try, and you can always find the short­est path through any maze, which in turn is always the best path.”

If you think that’s an un­re­al­is­tic de­pic­tion of a mi­s­un­der­stand­ing that would never hap­pen in re­al­ity, keep read­ing.

Here’s a re­cent con­ver­sa­tion from Twit­ter which I’d con­sider a nearly perfect illus­tra­tion of the toolbox-vs.-laws di­chotomy:

David Chap­man: “By ra­tio­nal­ism, I mean any claim that there is an ul­ti­mate crite­rion ac­cord­ing to which think­ing and act­ing could be judged to be cor­rect or op­ti­mal… Un­der this defi­ni­tion, ‘ra­tio­nal­ism’ must go be­yond ‘sys­tem­atic meth­ods are of­ten use­ful, hooray!’… A ra­tio­nal­ism claims there is one weird trick to cor­rect think­ing, which guaran­tees an op­ti­mal re­sult. (Some ra­tio­nal­isms spec­ify the trick; oth­ers in­sist there must be one, but that it is not cur­rently know­able.) A ra­tio­nal­ism makes strongly nor­ma­tive judg­ments: ev­ery­one ought to think that way.”
Gra­ham Rowe: “Is it fair to say that ra­tio­nal­ists see the world en­tirely through ra­tio­nal­ity while meta-ra­tio­nal­ists look at ra­tio­nal­ity as one of many tools (that they can use fluently and ap­pro­pri­ately) to be used in ser­vice of a broader pur­pose?”
David Chap­man: “More-or-less, I think! Although I don’t think ra­tio­nal­ists do see the world en­tirely through ra­tio­nal­ity, they just say they think they ought to.”
Ju­lia Galef: “I don’t think the ‘one weird trick’ de­scrip­tion is ac­cu­rate. It’s more like: there’s one cor­rect nor­ma­tive model in the­ory, which can­not pos­si­bly be ap­prox­i­mated by a sin­gle rule in prac­tice, but we can look for col­lec­tions of ‘tricks’ that seem like they bring us closer to the nor­ma­tive model. e.g., ‘On the mar­gin, tak­ing more small risks is likely to in­crease your EV’ is one ex­am­ple.”
David Chap­man: “The el­e­ment that I’d call clearly meta-ra­tio­nal is un­der­stand­ing that ra­tio­nal­ity is not one well-defined thing but a bag of tricks that are more-or-less ap­pli­ca­ble in differ­ent situ­a­tions.”

Ju­lia then quoted a pa­per men­tion­ing “The best pre­scrip­tion for hu­man rea­son­ing is not nec­es­sar­ily to always use the nor­ma­tive model to gov­ern one’s think­ing.” To which Chap­man replied:

“Baron’s dis­tinc­tion be­tween ‘nor­ma­tive’ and ‘pre­scrip­tive’ is one I haven’t seen be­fore. That seems use­ful and maybe key. OTOH, if we’re look­ing for a dis­agree­ment crux, it might be whether a nor­ma­tive the­ory that can’t be achieved, even in prin­ci­ple, is a good thing.”

I’m now go­ing to badly stereo­type this con­ver­sa­tion in the form I feel like I’ve seen it many times pre­vi­ously, in­clud­ing e.g. in the dis­cus­sion of p-val­ues and fre­quen­tist statis­tics. On this stereo­typ­i­cal de­pic­tion, there is a di­chotomy be­tween the think­ing of Msr. Toolbox and Msr. Lawful that goes like this:

Msr. Toolbox: “It’s im­por­tant to know how to use a broad va­ri­ety of statis­ti­cal tools and adapt them to con­text. The many ways of calcu­lat­ing p-val­ues form one broad fam­ily of tools; any par­tic­u­lar tool in the set has good uses and bad uses, de­pend­ing on con­text and what ex­actly you do. Us­ing like­li­hood ra­tios is an in­ter­est­ing statis­ti­cal tech­nique, and I’m sure it has its good uses in the right con­texts. But it would be very sur­pris­ing if that one weird trick was the best calcu­la­tion to do in ev­ery pa­per and ev­ery cir­cum­stance. If you claim it is the uni­ver­sal best way, then I sus­pect you of blind ideal­ism, in­sen­si­tivity to con­text and nu­ance, ig­no­rance of all the other tools in the toolbox, the sheer folly of cal­low youth. You only have a ham­mer and no real-world ex­pe­rience us­ing screw­drivers, so you claim ev­ery­thing is a nail.”

Msr. Lawful: “On com­plex prob­lems we may not be able to com­pute ex­act Bayesian up­dates, but the math still de­scribes the op­ti­mal up­date, in the same way that a Carnot cy­cle de­scribes a ther­mo­dy­nam­i­cally ideal en­g­ine even if you can’t build one. You are un­likely to find a su­pe­rior view­point that makes some other up­date even more op­ti­mal than the Bayesian up­date, not with­out do­ing a great deal of fun­da­men­tal math re­search and maybe not at all. We didn’t choose that for­mal­ism ar­bi­trar­ily! We have a very broad va­ri­ety of co­her­ence the­o­rems all spotlight­ing the same cen­tral struc­ture of prob­a­bil­ity the­ory, say­ing vari­a­tions of ‘If your be­hav­ior can­not be viewed as co­her­ent with prob­a­bil­ity the­ory in sense X, you must be ex­e­cut­ing a dom­i­nated strat­egy and shoot­ing off your foot in sense Y’.”

I cur­rently sus­pect that when Msr. Law talks like this, Msr. Toolbox hears “I pre­scribe to you the fol­low­ing recipe for your be­hav­ior, the Bayesian Up­date, which you ought to ex­e­cute in ev­ery kind of cir­cum­stance.”

This also ap­pears to me to fre­quently turn into one of those awful durable forms of mi­s­un­der­stand­ing: Msr. Toolbox doesn’t see what you could pos­si­bly be tel­ling some­body to do with a “good” or “ideal” al­gorithm be­sides ex­e­cut­ing that al­gorithm.

It would not sur­prise me if there’s a sym­met­ri­cal form of durable mi­s­un­der­stand­ing where a Law­ist has trou­ble pro­cess­ing a Toolboxer’s dis­claimer: “No, you don’t un­der­stand, I am not try­ing to de­scribe the one true perfect op­ti­mal al­gorithm here, I’m try­ing to de­scribe a con­text-sen­si­tive tool that is some­times use­ful in real life.” Msr. Law may not see what you could pos­si­bly be do­ing with a sup­pos­edly “pru­dent” or “ac­tion­able” recipe be­sides say­ing that it’s the cor­rect an­swer, and may feel very sus­pi­cious of some­body try­ing to say ev­ery­one should use an an­swer while dis­claiming that they don’t re­ally think it’s true. Surely this is just the setup for some ab­surd motte-and-bailey where we claim some­thing is the nor­ma­tive an­swer, and then as soon as we’re challenged we walk back and claim it was ‘just one tool in the toolbox’.

And it’s not like those cal­low youths the Toolboxer is try­ing to lec­ture don’t ac­tu­ally ex­ist. The world is full of peo­ple who think they have the One True Recipe (with­out hav­ing a nor­ma­tive ideal by which to prove that this is in­deed the op­ti­mal recipe given their prefer­ences, knowl­edge, and available com­put­ing power).

The only way I see to re­solve this con­fu­sion is by grasp­ing a cer­tain par­tic­u­lar ab­strac­tion and dis­tinc­tion—as a more Lawfully in­clined per­son might put it. Or by be­ing able to de­ploy both kinds of think­ing, de­pend­ing on con­text—as a more Toolbox-in­clined per­son might put it.

It may be that none of my read­ers need the lec­ture at this point, but I’ve learned to be cau­tious about that sort of thing, so I’ll walk through the differ­ence any­ways.

Every traversable maze has a spa­tially short­est path; or if we are to be pre­cise in our claims but not our mea­sure­ments, a set of spa­tially short­est-ish paths that are all nearly the same dis­tance.

We may per­haps call this spa­tially short­est path the “best” or “ideal” or “op­ti­mal” path through the maze, if we think our prefer­ence for walk­ing shorter dis­tances is the only prag­mat­i­cally im­por­tant merit of a path.

That there exists some shortest path, which may even be optimal according to our preferences, doesn’t mean that you can come to an intersection in the maze and “just choose whichever branch is on the shortest path”.

And the fact that you can­not, at an in­ter­sec­tion, just choose the shorter path, doesn’t mean that the con­cepts of dis­tance and greater or lesser dis­tance aren’t use­ful.

It might even be that the maze-owner could truth­fully tell you, “By the way, this right-hand turn here keeps you on the short­est path,” and yet you’d still be wiser to take the left-hand turn… be­cause you’re fol­low­ing the left-hand rule. Where the left-hand rule is to keep your left hand on the wall and go on walk­ing, which works for not get­ting lost in­side a maze whose exit is con­nected to the start by walls. It’s a good rule for agents with sharply bounded mem­o­ries who can’t always re­mem­ber their paths ex­actly.

And if you’re us­ing the left-hand rule it is a ter­rible, ter­rible idea to jump walls and make a differ­ent turn just once, even if that looks like a great idea at the time, be­cause that is an ex­cel­lent way to get stuck travers­ing a dis­con­nected is­land of con­nected walls in­side the labyrinth.

So mak­ing the left-hand turn leads you to walk the short­est ex­pected dis­tance, rel­a­tive to the other rules you’re us­ing. Mak­ing the right-hand turn in­stead, even if it seemed lo­cally smart, might have you travers­ing an in­finite dis­tance in­stead.

But then you may not be on the short­est path, even though you are fol­low­ing the recom­men­da­tions of the wis­est and most pru­dent rule given your cur­rent re­sources. By con­tem­plat­ing the differ­ence, you know that there is in prin­ci­ple room for im­prove­ment. Maybe that in­spires you to write a maze-map­ping, step-count­ing cel­l­phone app that lets you get to the exit faster than the left-hand rule.

And the rea­son that there’s a bet­ter recipe isn’t that “no recipe is perfect”, it isn’t that there ex­ists an in­finite se­quence of ever-bet­ter roads. If the maze-owner gave you a map with the short­est path drawn in a line, you could walk the true short­est path and there wouldn’t be any shorter path than that.

Short­ness is a prop­erty of paths; a ten­dency to pro­duce shorter paths is a prop­erty of recipes. What makes a phone app an im­prove­ment is not that the app is ad­her­ing more neatly to some ideal se­quence of left and right turns, it’s that the path is shorter in a way that can be defined in­de­pen­dently of the app’s al­gorithms.
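To make the path-versus-recipe distinction concrete, here is a small Python sketch under stated assumptions: the grid layout `MAZE`, the starting heading (east), and all function names are hypothetical choices of mine. One function measures a property of the maze, the shortest path length found by breadth-first search, without reference to any walker. The other executes a recipe, the left-hand rule, and counts the steps it actually takes.

```python
from collections import deque

# Hypothetical maze: '#' is wall, '.' is floor, 'S' is start, 'E' is exit.
MAZE = [
    "#########",
    "#.#.....#",
    "#.#.###.#",
    "#S....#E#",
    "#########",
]

# Headings in clockwise order: north, east, south, west (row, column deltas).
HEADINGS = [(-1, 0), (0, 1), (1, 0), (0, -1)]

def find(maze, symbol):
    """Return the (row, column) of the first cell containing `symbol`."""
    for r, row in enumerate(maze):
        c = row.find(symbol)
        if c != -1:
            return (r, c)
    raise ValueError(f"{symbol!r} not found in maze")

def is_open(maze, cell):
    """True when `cell` lies inside the grid and is not a wall."""
    r, c = cell
    return 0 <= r < len(maze) and 0 <= c < len(maze[r]) and maze[r][c] != "#"

def shortest_path_length(maze):
    """Breadth-first search: a fact about the maze, independent of any recipe."""
    start, goal = find(maze, "S"), find(maze, "E")
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        cell, steps = queue.popleft()
        if cell == goal:
            return steps
        for dr, dc in HEADINGS:
            nxt = (cell[0] + dr, cell[1] + dc)
            if is_open(maze, nxt) and nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, steps + 1))
    return None  # exit unreachable

def left_hand_rule_length(maze, heading=1, max_steps=10_000):
    """Follow the left wall from 'S'; return steps taken, or None if the exit is not reached."""
    cell, goal = find(maze, "S"), find(maze, "E")
    steps = 0
    while cell != goal and steps < max_steps:
        # Prefer turning left, then going straight, then right, then back.
        for turn in (-1, 0, 1, 2):
            h = (heading + turn) % 4
            nxt = (cell[0] + HEADINGS[h][0], cell[1] + HEADINGS[h][1])
            if is_open(maze, nxt):
                cell, heading, steps = nxt, h, steps + 1
                break
        else:
            return None  # walled in on all four sides
    return steps if cell == goal else None

ideal = shortest_path_length(MAZE)
walked = left_hand_rule_length(MAZE)
```

In this particular layout the left-hand rule detours into one dead end and walks 14 steps, against a shortest path of 10. The gap between the two numbers is defined without mentioning the recipe's internals, which is what lets you say a different recipe would be an improvement.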

Once you can ad­mit a path can be “shorter” in a way that ab­stracts away from the walker—not bet­ter, which does de­pend on the walker, but shorter—it’s hard not to ad­mit the no­tion of there be­ing a short­est path.

I mean, I sup­pose you could try very hard to never talk about a short­est path and only talk about al­ter­na­tive recipes that yield shorter paths. You could dili­gently make sure to never imag­ine this short­er­ness as a kind of de­creased dis­tance-in-perfor­mance-space from any ‘short­est path’. You could make very sure that in your con­sid­er­a­tion of new recipes, you main­tain your ide­olog­i­cal pu­rity as a toolboxer by only ever ask­ing about laws that gov­ern which of two paths are shorter, and never get­ting any in­spira­tion from any kind of law that gov­erns which path is short­est.

In which case you would have dili­gently elimi­nated a valuable con­cep­tual tool from your toolbox. You would have care­fully made sure that you always had to take longer roads to those men­tal des­ti­na­tions that can be reached the fastest by con­tem­plat­ing prop­er­ties of ideal solu­tions, or dis­tance from ideal solu­tions.

But why? Why would you?

I think at this point the Toolbox re­ply—though I’m not sure I could pass its Ide­olog­i­cal Tur­ing Test—might be that ideal­is­tic think­ing has a great trap and rot­ten­ness at its heart.

It might say:

Some­body who doesn’t wisely shut down all this think­ing about “short­est paths” in­stead of the left-hand rule as a good tool for some mazes—some­one who be­gins to imag­ine some un­reach­able ideal of perfec­tion, in­stead of a se­ries of apps that find shorter paths most of the time—will surely, in prac­tice, be­gin to con­fuse the no­tion of the left-hand rule, or their other cur­rent recipe, with the short­est path.

After all, no­body can see this “short­est path”, and it’s sup­pos­edly a vir­tu­ous thing. So isn’t it an in­evitable con­se­quence of hu­man na­ture that peo­ple will start to use that idea as praise for their cur­rent recipes?

And also in the real world, surely Msr. Law will inevitably forget the extra premise involved with the step from “spatially shortest path” to “best path”: the contextual requirement that our only important preference was shorter spatial distances so defined. Msr. Law will insist that somebody in a wheelchair go down the “best path” of the maze, even though that path involves going up and down a flight of stairs.

And Msr. Law will be un­able to men­tally deal with a he­li­copter overfly­ing the maze that vi­o­lates their on­tol­ogy rel­a­tive to which “the short­est path” was defined.

And it will also never oc­cur to Msr. Law to pedal around the maze in a bi­cy­cle, which is a much eas­ier trip even if it’s not the short­est spa­tial dis­tance.

And Msr. Law will assume that the behavior of mortgage-backed securities is independently Gaussian-random because the math is neater that way, and then derive a definite theorem showing a top-level tranche of MBSs will almost never default, thus bringing down their trading firm…

To all of which I can only re­ply: “Well, yes, that hap­pens some of the time, and there are con­tex­tual oc­ca­sions where it is a use­ful tool to lec­ture Msr. Law on the im­por­tance of hav­ing a di­verse toolbox. But it is not a uni­ver­sal truth that ev­ery­one works like that and needs to be pre­scribed the same lec­ture! You need to be sen­si­tive to con­text here!”

There are definitely ver­sions of Msr. Law who think the uni­ver­sal gen­er­al­iza­tion they’ve been told about is a One Weird Trick That Is All You Need To Know; peo­ple who could in fact benefit from a lec­ture on the im­por­tance of di­verse toolboxes.

There are also extreme toolbox thinkers who could benefit from a lecture on the importance of thinking that considers unreachable ideals, and how to get closer to them, and the obstacles that are moving us away from them.

Not to com­mit the fal­lacy of the golden mean or any­thing, but the two view­points are both meta­tools in the meta­toolbox, as it were. You’re bet­ter off if you can use both in ways that de­pend on con­text and cir­cum­stance, rather than in­sist­ing that only toolbox rea­son­ing is the uni­ver­sally best con­text-in­sen­si­tive met­away to think.

If that’s not putting the point too sharply.

Think­ing in terms of Law is of­ten use­ful. You just have to be care­ful to un­der­stand the con­text and the caveats: when is the right time to think in Law, how to think in Law, and what type of prob­lems call for Lawful think­ing.

Which is not the same as say­ing that ev­ery Law has ex­cep­tions. Ther­mo­dy­nam­ics still holds even at times, like play­ing ten­nis, when it’s not a good time to be think­ing about ther­mo­dy­nam­ics. If you thought that ev­ery Law had ex­cep­tions be­cause it wasn’t always use­ful to think about that Law, you’d be re­ject­ing the meta­tool of Law en­tirely, and think­ing in toolbox terms at a time when it wasn’t use­ful to do so.

Are there Laws of optimal thought governing the optimal way to contextualize and caveat, which might be helpful for finding good executable recipes? The naturally Lawful thinker will immediately suspect so, even if they don’t know what those Laws are. Not knowing these Laws won’t panic a healthy Lawful thinker. Instead they’ll proceed to look around for useful yet chaotic-seeming prescriptions to use now instead of later, without mistaking those chaotic prescriptions for Laws, or treating the chaos of their current recipes as proof that there are no good normative ideals to be had.

In­deed, it can some­times be use­ful to con­tem­plate, in de­tail, that there are prob­a­bly Laws you don’t know. But that’s a more ad­vanced meta­tool in the meta­toolbox, use­ful in nar­rower ways and in fewer con­texts hav­ing to do with the in­ven­tion of new Laws as well as new recipes, and I’d rather not strain Msr. Toolbox’s cre­dulity any fur­ther.

To close out, one recipe I’d pre­scribe to re­duce con­fu­sion in the toolbox-in­clined is to try to see the Laws as de­scrip­tive state­ments, rather than be­ing any kind of nor­ma­tive ideal at all.

The idea that there’s a short­est path through the maze isn’t a “nor­ma­tive ideal” in­stead of a “pre­scrip­tive ideal”, it’s just true. Once you define dis­tance there is in fact a short­est path through the maze.

The triangle inequality might sound very close to a prescriptive rule that you ought to walk along AC instead of ABC. But actually the prescriptive rule applies only if you want to walk shorter distances ceteris paribus, only if you know which turn is which, only if you’re not trying to avoid stairs, and only if you’re not taking an even faster route by getting on a bicycle and riding outside the whole maze to the exit. The prescriptive rule “try walking along AC” isn’t the same as the triangle inequality itself, which goes on being true of spatial distances in Euclidean or nearly-Euclidean geometries, whether or not you know, whether or not you care, whether or not it’s useful to think about at any given moment, even if you own a bicycle.

The state­ment that you can’t have a heat-pres­sure en­g­ine more effi­cient than a Carnot cy­cle isn’t about gath­er­ing in a cultish cir­cle to sing praises of the Carnot cy­cle as be­ing the ideally best pos­si­ble kind of en­g­ine. It’s just a true fact of ther­mo­dy­nam­ics. This true fact might helpfully sug­gest that you think about ob­sta­cles to Carnot-ness as pos­si­ble places to im­prove your en­g­ine—say, that you should try to pre­vent heat loss from the com­bus­tion cham­ber, since heat loss pre­vents an adi­a­batic cy­cle. But even at times when it’s not in fact use­ful to think about Carnot cy­cles, it doesn’t mean your heat en­g­ine is al­lowed on those oc­ca­sions to perform bet­ter than a Carnot en­g­ine.
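For readers who like to see the bound written down, here is a minimal Python sketch of the standard Carnot limit, 1 - T_cold / T_hot with temperatures in kelvin. The function names and the example temperatures and efficiency are illustrative assumptions, not figures from any real engine.

```python
def carnot_efficiency(t_hot_kelvin, t_cold_kelvin):
    """Upper bound on heat-engine efficiency between two reservoirs: 1 - Tc/Th."""
    if not 0 < t_cold_kelvin < t_hot_kelvin:
        raise ValueError("need 0 < t_cold_kelvin < t_hot_kelvin")
    return 1.0 - t_cold_kelvin / t_hot_kelvin

def headroom(actual_efficiency, t_hot_kelvin, t_cold_kelvin):
    """Distance between a real engine's efficiency and the Carnot bound."""
    return carnot_efficiency(t_hot_kelvin, t_cold_kelvin) - actual_efficiency

# Hypothetical engine running between 600 K and 300 K at 35% efficiency.
bound = carnot_efficiency(600.0, 300.0)
gap = headroom(0.35, 600.0, 300.0)
```

The bound is a description of what the reservoirs permit. The `gap` is where a prescriptive question can start: which losses account for it, and which of them are cheap to reduce.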

You can’t ex­tract any more ev­i­dence from an ob­ser­va­tion than is given by its like­li­hood ra­tio. You could see this as be­ing true be­cause Bayesian up­dat­ing is an of­ten-un­reach­able nor­ma­tive ideal of rea­son­ing, so there­fore no­body can do bet­ter than it. But I’d call it a deeper level of un­der­stand­ing to see it as a law say­ing that you can’t get a higher ex­pected score by mak­ing any differ­ent up­date. This is a gen­er­al­iza­tion that holds over both Bayes-in­spired recipes and non-Bayes-in­spired recipes. If you want to as­sign higher prob­a­bil­ity to the cor­rect hy­poth­e­sis, it’s a short step from that prefer­ence to re­gard­ing Bayesian up­dates as a nor­ma­tive ideal; but the idea be­gins life as a de­scrip­tive as­ser­tion, not as a nor­ma­tive as­ser­tion.
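One way to see the descriptive reading is a small numerical sketch. It assumes a logarithmic scoring rule (the text above says only "expected score", so the choice of rule is mine), and it assumes the stated prior and likelihoods are the true ones. All names and numbers are illustrative. The update is computed in odds form, posterior odds equal prior odds times the likelihood ratio, and then a grid of alternative reports is scored against it.

```python
import math

def posterior(prior, p_obs_given_h, p_obs_given_not_h):
    """Bayes update in odds form: posterior odds = prior odds * likelihood ratio."""
    prior_odds = prior / (1.0 - prior)
    likelihood_ratio = p_obs_given_h / p_obs_given_not_h
    post_odds = prior_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)

def expected_log_score(report, true_prob):
    """Expected log score of reporting `report` when H holds with probability `true_prob`."""
    return true_prob * math.log(report) + (1.0 - true_prob) * math.log(1.0 - report)

# Hypothetical numbers: prior 0.2, observation three times likelier under H.
bayes = posterior(prior=0.2, p_obs_given_h=0.9, p_obs_given_not_h=0.3)

# Score every candidate report on a coarse grid, whatever recipe produced it.
candidates = [i / 1000 for i in range(1, 1000)]
best_report = max(candidates, key=lambda q: expected_log_score(q, bayes))
```

The best-scoring report on the grid lands within one grid step of the Bayesian posterior. Nothing in the scoring loop knows or cares whether a candidate came from a Bayes-inspired recipe, which is the sense in which the generalization holds over all recipes.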

It’s a rel­a­tively shal­low un­der­stand­ing of the co­her­ence the­o­rems to say “Well, they show that if you don’t use prob­a­bil­ities and ex­pected util­ities you’ll be in­co­her­ent, which is bad, so you shouldn’t do that.” It’s a deeper un­der­stand­ing to state, “If you do some­thing that is in­co­her­ent in way X, it will cor­re­spond to a dom­i­nated strat­egy in fash­ion Y. This is a uni­ver­sal gen­er­al­iza­tion that is true about ev­ery tool in the statis­ti­cal toolbox, whether or not they are in fact co­her­ent, whether or not you per­son­ally pre­fer to avoid dom­i­nated strate­gies, whether or not you have the com­put­ing power to do any bet­ter, even if you own a bi­cy­cle.”
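A toy instance of "incoherent in way X, dominated in fashion Y" can be sketched as follows. The assumption, which is mine and covers only one direction of incoherence, is an agent who will pay credence times stake for a ticket that pays the stake if its event occurs. Way X is holding credences in A and in not-A that sum to more than 1. Fashion Y is that buying both tickets loses money in every possible world, so simply not betting dominates. The function names are hypothetical.

```python
def net_payoffs(credence_a, credence_not_a, stake=1.0):
    """Net gain, in each possible world, from buying one ticket on A and one on not-A.

    Assumption: the agent pays credence * stake for a ticket that pays `stake`
    if its event occurs and nothing otherwise.
    """
    cost = (credence_a + credence_not_a) * stake
    # Exactly one of the two tickets pays out, whichever world we land in.
    return {"A": stake - cost, "not A": stake - cost}

def is_dominated_by_not_betting(credence_a, credence_not_a):
    """True when the pair of bets loses money in every world (a sure loss)."""
    return all(gain < 0 for gain in net_payoffs(credence_a, credence_not_a).values())

incoherent = is_dominated_by_not_betting(0.6, 0.6)  # credences sum to 1.2
coherent = is_dominated_by_not_betting(0.5, 0.5)    # credences sum to 1.0
```

The check returns the same verdict whether or not the agent prefers to avoid sure losses, and whether or not the agent could compute anything better. It is a statement about the bets, not advice to the bettor.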

I sup­pose that when it comes to the likes of Fun The­ory, there isn’t any deeper fact of na­ture un­der­ly­ing the “nor­ma­tive ideal” of a eu­daimonic uni­verse. But in sim­pler mat­ters of math and sci­ence, a “nor­ma­tive ideal” like the Carnot cy­cle or Bayesian de­ci­sion the­ory is al­most always the man­i­fes­ta­tion of some sim­pler fact that is so closely re­lated to some­thing we want that we are tempted to take one step to the right and view it as a “nor­ma­tive ideal”. If you’re aller­gic to nor­ma­tive ideals, maybe a helpful course would be to dis­card the view of what­ever-it-is as a nor­ma­tive ideal and try to un­der­stand it as a fact.

But that is a more ad­vanced state of un­der­stand­ing than try­ing to un­der­stand what is bet­ter or best. If you’re not aller­gic to ideals, then it’s okay to try to un­der­stand why Bayesian up­dates are of­ten-un­reach­able nor­ma­tive ideals, be­fore you try to un­der­stand how they’re just there.