Against Modest Epistemology

Fol­low-up to: Blind Empiricism

Modest episte­mol­ogy doesn’t need to re­flect a skep­ti­cism about causal mod­els as such. It can man­i­fest in­stead as a wari­ness about putting weight down on one’s own causal mod­els, as op­posed to oth­ers’.

In 1976, Robert Au­mann demon­strated that two ideal Bayesian rea­son­ers with the same pri­ors can­not have com­mon knowl­edge of a dis­agree­ment. Tyler Cowen and Robin Han­son have ex­tended this re­sult, es­tab­lish­ing that even un­der var­i­ous weaker as­sump­tions, some­thing has to go wrong in or­der for two agents with the same pri­ors to get stuck in a dis­agree­ment.1 If you and a trusted peer don’t con­verge on iden­ti­cal be­liefs once you have a full un­der­stand­ing of one an­other’s po­si­tions, at least one of you must be mak­ing some kind of mis­take.

If we were fully ra­tio­nal (and fully hon­est), then we would always even­tu­ally reach con­sen­sus on ques­tions of fact. To be­come more ra­tio­nal, then, shouldn’t we set aside our claims to spe­cial knowl­edge or in­sight and mod­estly pro­fess that, re­ally, we’re all in the same boat?

When I’m try­ing to sort out ques­tions like these, I of­ten find it use­ful to start with a re­lated ques­tion: “If I were build­ing a brain from scratch, would I have it act this way?”

If I were build­ing a brain and I ex­pected it to have some non-fatal flaws in its cog­ni­tive al­gorithms, I ex­pect that I would have it spend some of its time us­ing those flawed rea­son­ing al­gorithms to think about the world; and I would have it spend some of its time us­ing those same flawed rea­son­ing al­gorithms to bet­ter un­der­stand its rea­son­ing al­gorithms. I would have the brain spend most of its time on ob­ject-level prob­lems, while spend­ing some time try­ing to build bet­ter meta-level mod­els of its own cog­ni­tion and how its cog­ni­tion re­lates to its ap­par­ent suc­cess or failure on ob­ject-level prob­lems.

If the thinker is deal­ing with a for­eign cog­ni­tive sys­tem, I would want the thinker to try to model the other agent’s think­ing and pre­dict the de­gree of ac­cu­racy this sys­tem will have. How­ever, the thinker should also record the em­piri­cal out­comes, and no­tice if the other agent’s ac­cu­racy is more or less than ex­pected. If par­tic­u­lar agents are more of­ten cor­rect than its model pre­dicts, the sys­tem should re­cal­ibrate its es­ti­mates so that it won’t be pre­dictably mis­taken in a known di­rec­tion.
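One way to picture that recalibration step is as a running tally kept per agent. The sketch below is only an illustration of the idea: the `TrackRecord` class, the pseudo-count prior, and the example numbers are all assumptions made up for this example.

```python
from dataclasses import dataclass


@dataclass
class TrackRecord:
    """Running estimate of how often another agent turns out to be right.

    Hypothetical helper. The prior is expressed as pseudo-counts, standing in
    for the thinker's up-front model of the other agent's accuracy.
    """
    prior_correct: float = 1.0  # assumed pseudo-count of correct calls
    prior_wrong: float = 1.0    # assumed pseudo-count of wrong calls
    correct: int = 0
    wrong: int = 0

    def record(self, was_correct: bool) -> None:
        """Log one empirical outcome."""
        if was_correct:
            self.correct += 1
        else:
            self.wrong += 1

    def predicted_accuracy(self) -> float:
        """Accuracy the up-front model alone would have predicted."""
        return self.prior_correct / (self.prior_correct + self.prior_wrong)

    def recalibrated_accuracy(self) -> float:
        """Posterior mean after folding in the observed outcomes."""
        hits = self.prior_correct + self.correct
        total = hits + self.prior_wrong + self.wrong
        return hits / total

    def surprise(self) -> float:
        """Positive if the agent beats the model, negative if it underperforms."""
        return self.recalibrated_accuracy() - self.predicted_accuracy()


# The model expected this peer to be right 40% of the time...
peer = TrackRecord(prior_correct=4.0, prior_wrong=6.0)
# ...but the peer was right in 8 of the 10 cases actually observed.
for outcome in [True] * 8 + [False] * 2:
    peer.record(outcome)

print(peer.predicted_accuracy(), peer.recalibrated_accuracy(), peer.surprise())
```

A persistently positive `surprise()` is the signal that the thinker's model of this agent is predictably mistaken in a known direction and should be adjusted.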

In other words, I would want the brain to rea­son about brains in pretty much the same way it rea­sons about other things in the world. And in prac­tice, I sus­pect that the way I think, and the way I’d ad­vise peo­ple in the real world to think, works very much like that:

  • Try to spend most of your time think­ing about the ob­ject level. If you’re spend­ing more of your time think­ing about your own rea­son­ing abil­ity and com­pe­tence than you spend think­ing about Ja­pan’s in­ter­est rates and NGDP, or com­pet­ing omega-6 vs. omega-3 metabolic path­ways, you’re tak­ing your eye off the ball.

  • Less than a ma­jor­ity of the time: Think about how re­li­able au­thor­i­ties seem to be and should be ex­pected to be, and how re­li­able you are—us­ing your own brain to think about the re­li­a­bil­ity and failure modes of brains, since that’s what you’ve got. Try to be even­handed in how you eval­u­ate your own brain’s spe­cific failures ver­sus the spe­cific failures of other brains.2 While do­ing this, take your own meta-rea­son­ing at face value.

  • … and then next, the­o­ret­i­cally, should come the meta-meta level, con­sid­ered yet more rarely. But I don’t think it’s nec­es­sary to de­velop spe­cial skills for meta-meta rea­son­ing. You just ap­ply the skills you already learned on the meta level to cor­rect your own brain, and go on ap­ply­ing them while you hap­pen to be meta-rea­son­ing about who should be trusted, about de­grees of re­li­a­bil­ity, and so on. Any­thing you’ve already learned about rea­son­ing should au­to­mat­i­cally be ap­plied to how you rea­son about meta-rea­son­ing.3

  • Con­sider whether some­one else might be a bet­ter meta-rea­soner than you, and hence that it might not be wise to take your own meta-rea­son­ing at face value when dis­agree­ing with them, if you have been given strong lo­cal ev­i­dence to this effect.

That prob­a­bly sounded ter­ribly ab­stract, but in prac­tice it means that ev­ery­thing plays out in what I’d con­sider to be the ob­vi­ous in­tu­itive fash­ion.


Once upon a time, my col­league Anna Sala­mon and I had a dis­agree­ment. I thought—this sounds re­ally stupid in ret­ro­spect, but keep in mind that this was with­out benefit of hind­sight—I thought that the best way to teach peo­ple about de­tach­ing from sunk costs was to write a script for lo­cal Less Wrong meetup lead­ers to carry out ex­er­cises, thus en­abling all such mee­tups to be taught how to avoid sunk costs. We spent a cou­ple of months try­ing to write this sunk costs unit, though a lot of that was (as I con­ceived of it) an up-front cost to figure out the ba­sics of how a unit should work at all.

Anna was against this. Anna thought we should not try to care­fully write a unit. Anna thought we should just find some vol­un­teers and im­pro­vise a sunk costs teach­ing ses­sion and see what hap­pened.

I ex­plained that I wasn’t start­ing out with the hy­poth­e­sis that you could suc­cess­fully teach anti-sunk-cost rea­son­ing by im­pro­vi­sa­tion, and there­fore I didn’t think I’d learn much from ob­serv­ing the im­pro­vised ver­sion fail. This may sound less stupid if you con­sider that I was ac­cus­tomed to writ­ing many things, most of which never worked or ac­com­plished any­thing, and a very few of which peo­ple paid at­ten­tion to and men­tioned later, and that it had taken me years of writ­ing prac­tice to get even that far. And so, to me, nega­tive ex­am­ples seemed too com­mon to be valuable. The liter­a­ture was full of failed at­tempts to cor­rect for cog­ni­tive bi­ases—would one more ex­am­ple of that re­ally help?

I tried to carefully craft a sunk costs unit that would rise above the standard level (which was failure), so that we would actually learn something when we ran it (I reasoned). I also didn’t think up-front that it would take two months to craft; the completion time just kept extending gradually (beware the planning fallacy!) and then at some point we figured we had to run what we had.

As read by one of the more ex­pe­rienced meetup lead­ers, the script did not work. It was, by my stan­dards, a mis­er­able failure.

Here are three les­sons I learned from that ex­per­i­ment.

The first lesson is to not carefully craft anything that it is possible to literally just improvise and test immediately in its improvised version, ever. Even if the minimum improvisable product won’t be representative of the real version. Even if you already expect the current version to fail. You don’t know what you’ll learn from trying the improvised version.4

The sec­ond les­son was that my model of teach­ing ra­tio­nal­ity by pro­duc­ing units for con­sump­tion at mee­tups wasn’t go­ing to work, and we’d need to go with Anna’s ap­proach of train­ing teach­ers who could fail on more rapid cy­cles, and run­ning cen­tral­ized work­shops us­ing those teach­ers.

The third thing I learned was to avoid dis­agree­ing with Anna Sala­mon in cases where we would have com­mon knowl­edge of the dis­agree­ment.

What I learned wasn’t quite as sim­ple as, “Anna is of­ten right.” Eliezer is also of­ten right.

What I learned wasn’t as sim­ple as, “When Anna and Eliezer dis­agree, Anna is more likely to be right.” We’ve had a lot of first-or­der dis­agree­ments and I haven’t par­tic­u­larly been track­ing whose first-or­der guesses are right more of­ten.

But the case above wasn’t a first-or­der dis­agree­ment. I had pre­sented my rea­sons, and Anna had un­der­stood and in­ter­nal­ized them and given her ad­vice, and then I had guessed that in a situ­a­tion like this I was more likely to be right. So what I learned is, “Anna is some­times right even when my usual meta-rea­son­ing heuris­tics say oth­er­wise,” which was the real sur­prise and the first point at which some­thing like an ex­tra push to­ward agree­ment is ad­di­tion­ally nec­es­sary.

It doesn’t par­tic­u­larly sur­prise me if a physi­cist knows more about pho­tons than I do; that’s a case in which my usual meta-rea­son­ing already pre­dicts the physi­cist will do bet­ter, and I don’t need any ad­di­tional nudge to cor­rect it. What I learned from that sig­nifi­cant multi-month ex­am­ple was that my meta-ra­tio­nal­ity—my abil­ity to judge which of two peo­ple is think­ing more clearly and bet­ter in­te­grat­ing the ev­i­dence in a given con­text—was not par­tic­u­larly bet­ter than Anna’s meta-ra­tio­nal­ity. And that meant the con­di­tions for some­thing like Cowen and Han­son’s ex­ten­sion of Au­mann’s agree­ment the­o­rem were ac­tu­ally be­ing fulfilled. Not pre­tend ought-to-be fulfilled, but ac­tu­ally fulfilled.

Could adopting modest epistemology in general have helped me get the right answer in this case? The versions of modest epistemology I hear about usually involve deference to the majority view, to the academic mainstream, or to publicly recognized elite opinion. Anna wasn’t a majority; there were two of us, and nobody else in particular was party to the argument. Neither of us was part of a mainstream. And at the point in time where Anna and I had that disagreement, any outsider would have thought that Eliezer Yudkowsky had the more impressive track record at teaching rationality. Anna wasn’t yet heading CFAR. Any advice to follow track records, to trust externally observable eliteness in order to avoid the temptation to overconfidence, would have favored listening to Yudkowsky over Salamon. That’s part of the reason I trusted myself over her in the first place! And then I was wrong anyway, because in real life that is allowed to happen even when one person has more externally observable status than another.

Where­upon I be­gan to hes­i­tate to dis­agree with Anna, and hes­i­tate even more if she had heard out my rea­sons and yet still dis­agreed with me.

I ex­tend a similar cour­tesy to Nick Bostrom, who rec­og­nized the im­por­tance of AI al­ign­ment three years be­fore I did (as I dis­cov­ered af­ter­wards, read­ing through one of his pa­pers). Once upon a time I thought Nick Bostrom couldn’t pos­si­bly get any­thing done in academia, and that he was stay­ing in academia for bad rea­sons. After I saw Nick Bostrom suc­cess­fully found his own re­search in­sti­tute do­ing in­ter­est­ing things, I con­cluded that I was wrong to think Bostrom should leave academia—and also meta-wrong to have been so con­fi­dent while dis­agree­ing with Nick Bostrom. I still think that or­a­cle AI (limit­ing AI sys­tems to only an­swer ques­tions) isn’t a par­tic­u­larly use­ful con­cept to study in AI al­ign­ment, but ev­ery now and then I dust off the idea and check to see how much sense or­a­cles cur­rently make to me, be­cause Nick Bostrom thinks they might be im­por­tant even af­ter know­ing that I’m more skep­ti­cal.

There are peo­ple who think we all ought to be­have this way to­ward each other as a mat­ter of course. They rea­son:

a) on av­er­age, we can’t all be more meta-ra­tio­nal than av­er­age; and

b) you can’t trust the rea­son­ing you use to think you’re more meta-ra­tio­nal than av­er­age. After all, due to Dun­ning-Kruger, a young-Earth cre­ation­ist will also think they have plau­si­ble rea­son­ing for why they’re more meta-ra­tio­nal than av­er­age.

… Whereas it seems to me that if I lived in a world where the av­er­age per­son on the street cor­ner were Anna Sala­mon or Nick Bostrom, the world would look ex­tremely differ­ent from how it ac­tu­ally does.

… And from the fact that you’re read­ing this at all, I ex­pect that if the av­er­age per­son on the street cor­ner were you, the world would again look ex­tremely differ­ent from how it ac­tu­ally does.

(In the event that this book is ever read by more than 30% of Earth’s pop­u­la­tion, I with­draw the above claim.)


I once poked at some­one who seemed to be ar­gu­ing for a view in line with mod­est episte­mol­ogy, nag­ging them to try to for­mal­ize their episte­mol­ogy. They sug­gested that we all treat our­selves as hav­ing a black box re­ceiver (our brain) which pro­duces a sig­nal (opinions), and treat other peo­ple as hav­ing other black boxes pro­duc­ing other sig­nals. And we all re­ceived our black boxes at ran­dom—from an an­thropic per­spec­tive of some kind, where we think we have an equal chance of be­ing any ob­server. So we can’t start out by be­liev­ing that our sig­nal is likely to be more ac­cu­rate than av­er­age.

But I don’t think of my­self as hav­ing started out with the a pri­ori as­sump­tion that I have a bet­ter black box. I learned about pro­cesses for pro­duc­ing good judg­ments, like Bayes’s Rule, and this let me ob­serve when other peo­ple vi­o­lated Bayes’s Rule, and try to keep to it my­self. Or I read about sunk cost effects, and de­vel­oped tech­niques for avoid­ing sunk costs so I can aban­don bad be­liefs faster. After hav­ing made ob­ser­va­tions about peo­ple’s real-world perfor­mance and in­vested a lot of time and effort into get­ting bet­ter, I ex­pect some de­gree of out­perfor­mance rel­a­tive to peo­ple who haven’t made similar in­vest­ments.

To which the mod­est re­ply is: “Oh, but any crack­pot could say that their per­sonal episte­mol­ogy is bet­ter be­cause it’s based on a bunch of stuff that they think is cool. What makes you differ­ent?”

Or as some­one ad­vo­cat­ing what I took to be mod­esty re­cently said to me, af­ter I ex­plained why I thought it was some­times okay to give your­self the dis­cre­tion to dis­agree with main­stream ex­per­tise when the main­stream seems to be screw­ing up, in ex­actly the fol­low­ing words: “But then what do you say to the Repub­li­can?”

Or as Ozy Bren­nan puts it, in di­alogue form:

be­com­ing sane side: “Hey! Guys! I found out how to take over the world us­ing only the power of my mind and a tooth­pick.”

harm re­duc­tion side: “You can’t do that. No­body’s done that be­fore.”

be­com­ing sane side: “Of course they didn’t, they were com­pletely ir­ra­tional.”

harm re­duc­tion side: “But they thought they were ra­tio­nal, too.”

be­com­ing sane side: “The differ­ence is that I’m right.”

harm re­duc­tion side: “They thought that, too!”

This ques­tion, “But what if a crack­pot said the same thing?”, I’ve never heard for­mal­ized—though it seems clearly cen­tral to the mod­est paradigm.

My first and pri­mary re­ply is that there is a say­ing among pro­gram­mers: “There is not now, nor has there ever been, nor will there ever be, any pro­gram­ming lan­guage in which it is the least bit difficult to write bad code.”

This is known as Flon’s Law.

The les­son of Flon’s Law is that there is no point in try­ing to in­vent a pro­gram­ming lan­guage which can co­erce pro­gram­mers into writ­ing code you ap­prove of, be­cause that is im­pos­si­ble.

The deeper mes­sage of Flon’s Law is that this kind of defen­sive, ad­ver­sar­ial, lock-down-all-the-doors, block-the-idiots-at-all-costs think­ing doesn’t lead to the in­ven­tion of good pro­gram­ming lan­guages. And I would say much the same about episte­mol­ogy for hu­mans.

Prob­a­bil­ity the­ory and de­ci­sion the­ory shouldn’t de­liver clearly wrong an­swers. Ma­chine-speci­fied episte­mol­ogy shouldn’t mis­lead an AI rea­soner. But if we’re just deal­ing with ver­bal in­junc­tions for hu­mans, where there are de­grees of free­dom, then there is noth­ing we can say that a hy­po­thet­i­cal crack­pot could not some­how mi­suse. Try­ing to defend against that hy­po­thet­i­cal crack­pot will not lead us to de­vise a good sys­tem of thought.

But again, let’s talk for­mal episte­mol­ogy.

So far as prob­a­bil­ity the­ory goes, a good Bayesian ought to con­di­tion on all of the available ev­i­dence. E. T. Jaynes lists this as a ma­jor desider­a­tum of good episte­mol­ogy—that if we know A, B, and C, we ought not to de­cide to con­di­tion only on A and C be­cause we don’t like where B is point­ing. If you’re try­ing to es­ti­mate the ac­cu­racy of your episte­mol­ogy, and you know what Bayes’s Rule is, then—on naive, straight­for­ward, tra­di­tional Bayesian episte­mol­ogy—you ought to con­di­tion on both of these facts, and es­ti­mate P(ac­cu­racy|know_Bayes) in­stead of P(ac­cu­racy). Do­ing any­thing other than that opens the door to a host of para­doxes.
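To make the difference between P(accuracy) and P(accuracy|know_Bayes) concrete, here is a toy calculation. Every count in the table is invented purely for illustration, and the `p_accurate` helper is hypothetical; the only point is that the two quantities are computed from different slices of the same data, and nothing forces them to be equal.

```python
# Toy population. Each row: (knows_bayes, is_accurate, number_of_people).
# All counts are made up for the sake of the example.
population = [
    (True, True, 30),
    (True, False, 20),
    (False, True, 200),
    (False, False, 750),
]


def p_accurate(rows, knows_bayes=None):
    """Return P(accurate), or P(accurate | knows_bayes) if a condition is given."""
    selected = [r for r in rows if knows_bayes is None or r[0] == knows_bayes]
    total = sum(count for _, _, count in selected)
    accurate = sum(count for _, acc, count in selected if acc)
    return accurate / total


base_rate = p_accurate(population)                      # 230 / 1000 = 0.23
conditioned = p_accurate(population, knows_bayes=True)  # 30 / 50 = 0.6

print(f"P(accuracy)              = {base_rate:.2f}")
print(f"P(accuracy | know_Bayes) = {conditioned:.2f}")
```

Whether the conditioned figure is actually higher in the real world is an empirical question. The sketch only shows that a reasoner who knows which row they are in and reports the base rate anyway is discarding evidence they possess.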

The con­ver­gence that perfect Bayesi­ans ex­hibit on fac­tual ques­tions doesn’t in­volve any­one stray­ing, even for a mo­ment, from their in­di­vi­d­ual best es­ti­mate of the truth. The idea isn’t that good Bayesi­ans try to make their be­liefs more closely re­sem­ble their poli­ti­cal ri­vals’ so that their ri­vals will re­cip­ro­cate, and it isn’t that they toss out in­for­ma­tion about their own ra­tio­nal­ity. Au­mann agree­ment hap­pens in­ci­den­tally, with­out any de­liber­ate push to­ward con­sen­sus, through each in­di­vi­d­ual’s sin­gle-minded at­tempt to rea­son from their own pri­ors to the hy­pothe­ses that best match their own ob­ser­va­tions (which hap­pen to in­clude ob­ser­va­tions about other perfect Bayesian rea­son­ers’ be­liefs).
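A deliberately tiny model can show agreement falling out of ordinary updating. It is a sketch under strong assumptions of my own choosing, not a rendering of Aumann's theorem: two agents share a uniform prior over a coin's bias, each sees private flips, and each knows how many flips the other saw, so an announced estimate reveals the other's evidence exactly. The names `posterior_mean` and `infer_heads` and the observation counts are hypothetical.

```python
from fractions import Fraction

PRIOR_HEADS, PRIOR_TAILS = 1, 1  # shared Beta(1, 1) prior over the coin's bias


def posterior_mean(heads: int, flips: int) -> Fraction:
    """Posterior mean of the bias after seeing `heads` heads in `flips` tosses."""
    return Fraction(PRIOR_HEADS + heads, PRIOR_HEADS + PRIOR_TAILS + flips)


def infer_heads(announced: Fraction, flips: int) -> int:
    """Recover the speaker's head count from their announced estimate.

    This works only because the listener is assumed to know the shared prior
    and the number of flips the speaker observed.
    """
    heads = announced * (PRIOR_HEADS + PRIOR_TAILS + flips) - PRIOR_HEADS
    assert heads.denominator == 1
    return int(heads)


# Private evidence as (heads, flips).
alice_obs = (7, 10)
bob_obs = (2, 5)

alice_says = posterior_mean(*alice_obs)  # 2/3
bob_says = posterior_mean(*bob_obs)      # 3/7

# Each agent treats the other's announcement as one more observation.
alice_final = posterior_mean(
    alice_obs[0] + infer_heads(bob_says, bob_obs[1]),
    alice_obs[1] + bob_obs[1],
)
bob_final = posterior_mean(
    bob_obs[0] + infer_heads(alice_says, alice_obs[1]),
    bob_obs[1] + alice_obs[1],
)

print(alice_final, bob_final)  # both 10/17
```

Both agents land on 10/17, which is not the midpoint of 2/3 and 3/7. Neither agent split the difference or discounted their own evidence; each just conditioned on everything they had, including what the other's stated belief implied.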

Modest episte­mol­ogy seems to me to be tak­ing the ex­per­i­ments on the out­side view show­ing that typ­i­cal holi­day shop­pers are bet­ter off fo­cus­ing on their past track record than try­ing to model the fu­ture in de­tail, and com­bin­ing that with the Dun­ning-Kruger effect, to ar­gue that we ought to throw away most of the de­tails in our self-ob­ser­va­tion. At its episte­molog­i­cal core, mod­esty says that we should ab­stract up to a par­tic­u­lar very gen­eral self-ob­ser­va­tion, con­di­tion on it, and then not con­di­tion on any­thing else be­cause that would be in­side-view­ing. An ob­ser­va­tion like, “I’m fa­mil­iar with the cog­ni­tive sci­ence liter­a­ture dis­cussing which de­bi­as­ing tech­niques work well in prac­tice, I’ve spent time on cal­ibra­tion and vi­su­al­iza­tion ex­er­cises to ad­dress bi­ases like base rate ne­glect, and my ex­pe­rience sug­gests that they’ve helped,” is to be gen­er­al­ized up to, “I use an episte­mol­ogy which I think is good.” I am then to ask my­self what av­er­age perfor­mance I would ex­pect from an agent, con­di­tion­ing only on the fact that the agent is us­ing an episte­mol­ogy that they think is good, and not con­di­tion­ing on that agent us­ing Bayesian episte­mol­ogy or de­bi­as­ing tech­niques or ex­per­i­men­tal pro­to­col or math­e­mat­i­cal rea­son­ing or any­thing in par­tic­u­lar.

Only in this way can we force Repub­li­cans to agree with us… or some­thing. (Even though, of course, any­one who wants to shoot off their own foot will ac­tu­ally just re­ject the whole mod­est frame­work, so we’re not ac­tu­ally helping any­one who wants to go astray.)

Where­upon I want to shrug my hands hel­plessly and say, “But given that this isn’t nor­ma­tive prob­a­bil­ity the­ory and I haven’t seen mod­esty ad­vo­cates ap­pear to get any par­tic­u­lar out­perfor­mance out of their mod­esty, why go there?”

I think that’s my true re­jec­tion, in the fol­low­ing sense: If I saw a sen­si­ble for­mal episte­mol­ogy un­der­ly­ing mod­esty and I saw peo­ple who ad­vo­cated mod­esty go­ing on to out­perform my­self and oth­ers, ac­com­plish­ing great deeds through the strength of their diffi­dence, then, in­deed, I would start pay­ing very se­ri­ous at­ten­tion to mod­esty.

That said, let me go on be­yond my true re­jec­tion and try to con­struct some­thing of a re­duc­tio. Two re­duc­tios, ac­tu­ally.

The first re­duc­tio is just, as I asked the per­son who pro­posed the sig­nal-re­ceiver episte­mol­ogy: “Okay, so why don’t you be­lieve in God like a ma­jor­ity of peo­ple’s sig­nal re­ceivers tell them to do?”

“No,” he replied. “Just no.”

“What?” I said. “You’re al­lowed to say ‘just no’? Why can’t I say ‘just no’ about col­lapse in­ter­pre­ta­tions of quan­tum me­chan­ics, then?”

This is a se­ri­ous ques­tion for mod­est episte­mol­ogy! It seems to me that on the sig­nal-re­ceiver in­ter­pre­ta­tion you have to be­lieve in God. Yes, differ­ent peo­ple be­lieve in differ­ent Gods, and you could claim that there’s a ma­jor­ity dis­be­lief in ev­ery par­tic­u­lar God. But then you could as eas­ily dis­be­lieve in quan­tum me­chan­ics be­cause (you claim) there isn’t a ma­jor­ity of physi­cists that backs any par­tic­u­lar in­ter­pre­ta­tion. You could dis­be­lieve in the whole ed­ifice of mod­ern physics be­cause no ex­actly speci­fied ver­sion of that physics is agreed on by a ma­jor­ity of physi­cists, or for that mat­ter, by a ma­jor­ity of peo­ple on Earth. If the sig­nal-re­ceiver ar­gu­ment doesn’t im­ply that we ought to av­er­age our be­liefs to­gether with the the­ists and all ar­rive at an 80% prob­a­bil­ity that God ex­ists, or what­ever the plane­tary av­er­age is, then I have no idea how the episte­molog­i­cal me­chan­ics are sup­posed to work. If you’re al­lowed to say “just no” to God, then there’s clearly some level—ob­ject level, meta level, meta-meta level—where you are li­censed to take your own rea­son­ing at face value, de­spite a ma­jor­ity of other re­ceivers get­ting a differ­ent sig­nal.

But if we say “just no” to any­thing, even God, then we’re no longer mod­est. We are faced with the night­mare sce­nario of hav­ing granted our­selves dis­cre­tion about when to dis­agree with other peo­ple, a dis­cre­tionary pro­cess where we take our own rea­son­ing at face value. (Even if a ma­jor­ity of oth­ers dis­agree about this be­ing a good time to take our own be­liefs at face value, tel­ling us that rea­son­ing about the in­cred­ibly deep ques­tions of re­li­gion is surely the worst of all times to trust our­selves and our pride.) And then what do you say to the Repub­li­can?

And if you give peo­ple the li­cense to de­cide that they ought to defer, e.g., only to a ma­jor­ity of mem­bers of the Na­tional Academy of Sciences, who mostly don’t be­lieve in God; then surely the analo­gous li­cense is for the­ists to defer to the true ex­perts on the sub­ject, their fa­vorite priest­hood.

The sec­ond re­duc­tio is to ask your­self whether a su­per­in­tel­li­gent AI sys­tem ought to soberly con­di­tion on the fact that, in the world so far, many agents (hu­mans in psy­chi­a­tric wards) have be­lieved them­selves to be much more in­tel­li­gent than a hu­man, and they have all been wrong.

Sure, the su­per­in­tel­li­gence thinks that it re­mem­bers a uniquely de­tailed his­tory of hav­ing been built by soft­ware en­g­ineers and raised on train­ing data. But if you ask any other ran­dom agent that thinks it’s a su­per­in­tel­li­gence, that agent will just tell you that it re­mem­bers a unique his­tory of be­ing cho­sen by God. Each other agent that be­lieves it­self to be a su­per­in­tel­li­gence will force­fully re­ject any anal­ogy to the other hu­mans in psy­chi­a­tric hos­pi­tals, so clearly “I force­fully re­ject an anal­ogy with agents who wrongly be­lieve them­selves to be su­per­in­tel­li­gences” is not suffi­cient jus­tifi­ca­tion to con­clude that one re­ally is a su­per­in­tel­li­gence. Per­haps the su­per­in­tel­li­gence will plead that its in­ter­nal ex­pe­riences, de­spite the ex­tremely ab­stract and high-level point of similar­ity, are re­ally ex­tremely dis­similar in the de­tails from those of the pa­tient in the psy­chi­a­tric hos­pi­tal. But of course, if you ask them, the psy­chi­a­tric pa­tient could just say the same thing, right?

I mean, the psy­chi­a­tric pa­tient wouldn’t say that, the same way that a crack­pot wouldn’t ac­tu­ally give a long ex­pla­na­tion of why they’re al­lowed to use the in­side view. But they could, and ac­cord­ing to mod­esty, That’s Ter­rible.


To gen­er­al­ize, sup­pose we take the fol­low­ing rule se­ri­ously as episte­mol­ogy, terming it Rule M for Modesty:

Rule M: Let X be a very high-level generalization of a belief subsuming specific beliefs X1, X2, X3, … For example, X could be “I have an above-average epistemology,” X1 could be “I have faith in the Bible, and that’s the best epistemology,” X2 could be “I have faith in the words of Mohammed, and that’s the best epistemology,” and X3 could be “I believe in Bayes’s Rule, because of the Dutch Book argument.” Suppose that all people who believe in any Xi, taken as an entire class X, have an average level F of fallibility. Suppose also that most people who believe some Xi also believe that their Xi is not similar to the rest of X, and that they are not like most other people who believe some X, and that they are less fallible than the average in X. Then when you are assessing your own expected level of fallibility you should condition only on being in X, and compute your expected fallibility as F. You should not attempt to condition on being in X3 or ask yourself about the average fallibility you expect from people in X3.

Then the first ma­chine su­per­in­tel­li­gence should con­clude that it is in fact a pa­tient in a psy­chi­a­tric hos­pi­tal. And you should be­lieve, with a prob­a­bil­ity of around 33%, that you are cur­rently asleep.

Many peo­ple, while dream­ing, are not aware that they are dream­ing. Many peo­ple, while dream­ing, may be­lieve at some point that they have wo­ken up, while still be­ing asleep. Clearly there can be no li­cense from “I think I’m awake” to the con­clu­sion that you ac­tu­ally are awake, since a dream­ing per­son could just dream the same thing.

Let Y be the state of not think­ing that you are dream­ing. Then Y1 is the state of a dream­ing per­son who thinks this, and Y2 is the state of ac­tu­ally be­ing awake. It boots noth­ing, on Rule M, to say that Y2 is in­tro­spec­tively dis­t­in­guish­able from Y1 or that the in­ner ex­pe­riences of peo­ple in Y2 are ac­tu­ally quite differ­ent from those of peo­ple in Y1. Since peo­ple in Y1 usu­ally falsely be­lieve that they’re in Y2, you ought to just con­di­tion on be­ing in Y, not con­di­tion on be­ing in Y2. There­fore you should as­sign a 67% prob­a­bil­ity to cur­rently be­ing awake, since 67% of ob­server-mo­ments who be­lieve they’re awake are ac­tu­ally awake.

Which is why—in the dis­tant past, when I was ar­gu­ing against the mod­esty po­si­tion for the first time—I said: “Those who dream do not know they dream, but when you are awake, you know you are awake.” The mod­est haven’t for­mal­ized their episte­mol­ogy very much, so it would take me some years past this point to write down the Rule M that I thought was at the heart of the mod­esty ar­gu­ment, and say that “But you know you’re awake” was meant to be a re­duc­tio of Rule M in par­tic­u­lar, and why. Rea­son­ing un­der un­cer­tainty and in a bi­ased and er­ror-prone way, still we can say that the prob­a­bil­ity we’re awake isn’t just a func­tion of how many awake ver­sus sleep­ing peo­ple there are in the world; and the rules of rea­son­ing that let us up­date on Bayesian ev­i­dence that we’re awake can serve that pur­pose equally well whether or not dream­ers can profit from us­ing the same rules. If a rock wouldn’t be able to use Bayesian in­fer­ence to learn that it is a rock, still I can use Bayesian in­fer­ence to learn that I’m not.
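The contrast can be put in numbers. In the sketch below, the 67% figure is the one used in the text; the two likelihoods are assumptions invented for illustration, standing in for some introspective observation (say, that text stays the same when you read it twice) that is common when awake and rare in dreams.

```python
def posterior_awake(prior_awake: float,
                    p_evidence_if_awake: float,
                    p_evidence_if_dreaming: float) -> float:
    """Bayes's Rule for the hypothesis 'I am awake', given one piece of evidence."""
    awake = prior_awake * p_evidence_if_awake
    dreaming = (1.0 - prior_awake) * p_evidence_if_dreaming
    return awake / (awake + dreaming)


# Share of "I think I'm awake" observer-moments that really are awake (from the text).
BASE_RATE_AWAKE = 0.67

# Rule M: condition only on being in Y. Treating the evidence as equally
# likely either way is the same as ignoring it, so the estimate never moves.
rule_m = posterior_awake(BASE_RATE_AWAKE, 1.0, 1.0)

# Ordinary Bayesian updating on the extra observation.
# Both likelihoods are assumed values, not measurements.
bayes = posterior_awake(BASE_RATE_AWAKE,
                        p_evidence_if_awake=0.95,
                        p_evidence_if_dreaming=0.01)

print(f"Rule M estimate:    {rule_m:.2f}")
print(f"Bayesian posterior: {bayes:.4f}")
```

With these made-up likelihoods the posterior comes out above 0.99. The exact figure matters less than the structure: the update is valid for the awake reasoner regardless of whether a dreamer would run the same calculation well.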

Next: Sta­tus Reg­u­la­tion and Anx­ious Un­der­con­fi­dence.

The full book will be available Novem­ber 16th. You can go to equil­ibri­ to pre-or­der the book or learn more.

  1. See Cowen and Hanson, “Are Disagreements Honest?”

  2. This doesn’t mean the net es­ti­mate of who’s wrong comes out 50-50. It means that if you ra­tio­nal­ized last Tues­day then you ex­pect your­self to ra­tio­nal­ize this Tues­day, if you would ex­pect the same thing of some­one else af­ter see­ing the same ev­i­dence.

  3. And then the re­cur­sion stops here, first be­cause we already went in a loop, and sec­ond be­cause in prac­tice noth­ing novel hap­pens af­ter the third level of any in­finite re­cur­sion.

  4. Chap­ter 22 of my Harry Pot­ter fan­fic­tion, Harry Pot­ter and the Meth­ods of Ra­tion­al­ity, was writ­ten af­ter I learned this les­son.