# Positive Bias: Look Into the Dark

I am teach­ing a class, and I write upon the black­board three num­bers: 2-4-6. “I am think­ing of a rule,” I say, “which gov­erns se­quences of three num­bers. The se­quence 2-4-6, as it so hap­pens, obeys this rule. Each of you will find, on your desk, a pile of in­dex cards. Write down a se­quence of three num­bers on a card, and I’ll mark it ‘Yes’ for fits the rule, or ‘No’ for not fit­ting the rule. Then you can write down an­other set of three num­bers and ask whether it fits again, and so on. When you’re con­fi­dent that you know the rule, write down the rule on a card. You can test as many triplets as you like.”

Here’s the record of one stu­dent’s guesses:

| Triplet | Answer |
|---|---|
| 4-6-2 | No |
| 4-6-8 | Yes |
| 10-12-14 | Yes |

At this point the stu­dent wrote down their guess at the rule. What do you think the rule is? Would you have wanted to test an­other triplet, and if so, what would it be? Take a mo­ment to think be­fore con­tin­u­ing.

The challenge above is based on a clas­sic ex­per­i­ment due to Peter Wa­son, the 2-4-6 task. Although sub­jects given this task typ­i­cally ex­pressed high con­fi­dence in their guesses, only 21% of the sub­jects suc­cess­fully guessed the ex­per­i­menter’s real rule, and repli­ca­tions since then have con­tinued to show suc­cess rates of around 20%.

The study was called “On the failure to elimi­nate hy­pothe­ses in a con­cep­tual task.” Sub­jects who at­tempt the 2-4-6 task usu­ally try to gen­er­ate pos­i­tive ex­am­ples, rather than nega­tive ex­am­ples—they ap­ply the hy­po­thet­i­cal rule to gen­er­ate a rep­re­sen­ta­tive in­stance, and see if it is la­beled “Yes.”

Thus, some­one who forms the hy­poth­e­sis “num­bers in­creas­ing by two” will test the triplet 8-10-12, hear that it fits, and con­fi­dently an­nounce the rule. Some­one who forms the hy­poth­e­sis X-2X-3X will test the triplet 3-6-9, dis­cover that it fits, and then an­nounce that rule.

In ev­ery case the ac­tual rule is the same: the three num­bers must be in as­cend­ing or­der.

But to dis­cover this, you would have to gen­er­ate triplets that shouldn’t fit, such as 20-23-26, and see if they are la­beled “No.” Which peo­ple tend not to do, in this ex­per­i­ment. In some cases, sub­jects de­vise, “test,” and an­nounce rules far more com­pli­cated than the ac­tual an­swer.
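
As a rough illustration of the difference between the two kinds of probe, here is a minimal Python sketch. It is not part of Wason's procedure; the function names, the probe lists, and the reading of "ascending order" as strictly ascending are assumptions made for the example.

```python
from typing import Callable, List, Tuple

Triplet = Tuple[int, int, int]


def hidden_rule(t: Triplet) -> bool:
    """The experimenter's rule, read here as strictly ascending (an assumption)."""
    a, b, c = t
    return a < b < c


def increasing_by_two(t: Triplet) -> bool:
    """One subject's hypothesis: each number is two more than the one before."""
    a, b, c = t
    return b == a + 2 and c == b + 2


def surprises(hypothesis: Callable[[Triplet], bool],
              probes: List[Triplet]) -> List[Triplet]:
    """Probes where the oracle's answer differs from what the hypothesis predicts."""
    return [t for t in probes if hidden_rule(t) != hypothesis(t)]


# Probes the hypothesis says should fit (positive tests).
positive_probes = [(8, 10, 12), (4, 6, 8), (10, 12, 14)]
# Probes the hypothesis says should NOT fit (negative tests).
negative_probes = [(20, 23, 26), (4, 6, 2), (1, 2, 3)]

print(surprises(increasing_by_two, positive_probes))
print(surprises(increasing_by_two, negative_probes))
```

Every positive probe comes back "Yes", exactly as the hypothesis predicts, so it teaches nothing new. Only the probes the hypothesis expects to fail can come back with a surprising "Yes".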

This cog­ni­tive phe­nomenon is usu­ally lumped in with “con­fir­ma­tion bias.” How­ever, it seems to me that the phe­nomenon of try­ing to test pos­i­tive rather than nega­tive ex­am­ples, ought to be dis­t­in­guished from the phe­nomenon of try­ing to pre­serve the be­lief you started with. “Pos­i­tive bias” is some­times used as a syn­onym for “con­fir­ma­tion bias,” and fits this par­tic­u­lar flaw much bet­ter.

It once seemed that phlo­gis­ton the­ory could ex­plain a flame go­ing out in an en­closed box (the air be­came sat­u­rated with phlo­gis­ton and no more could be re­leased). But phlo­gis­ton the­ory could just as well have ex­plained the flame not go­ing out. To no­tice this, you have to search for nega­tive ex­am­ples in­stead of pos­i­tive ex­am­ples, look into zero in­stead of one; which goes against the grain of what ex­per­i­ment has shown to be hu­man in­stinct.

For by in­stinct, we hu­man be­ings only live in half the world.

One may be lec­tured on pos­i­tive bias for days, and yet over­look it in-the-mo­ment. Pos­i­tive bias is not some­thing we do as a mat­ter of logic, or even as a mat­ter of emo­tional at­tach­ment. The 2-4-6 task is “cold,” log­i­cal, not af­fec­tively “hot.” And yet the mis­take is sub-ver­bal, on the level of imagery, of in­stinc­tive re­ac­tions. Be­cause the prob­lem doesn’t arise from fol­low­ing a de­liber­ate rule that says “Only think about pos­i­tive ex­am­ples,” it can’t be solved just by know­ing ver­bally that “We ought to think about both pos­i­tive and nega­tive ex­am­ples.” Which ex­am­ple au­to­mat­i­cally pops into your head? You have to learn, word­lessly, to zag in­stead of zig. You have to learn to flinch to­ward the zero, in­stead of away from it.

I have been writing for quite some time now on the notion that the strength of a hypothesis is what it can’t explain, not what it can: if you are equally good at explaining any outcome, you have zero knowledge. So to spot an explanation that isn’t helpful, it’s not enough to think of what it does explain very well; you also have to search for results it couldn’t explain, and this is the true strength of the theory.

So I said all this, and then I challenged the use­ful­ness of “emer­gence” as a con­cept. One com­menter cited su­per­con­duc­tivity and fer­ro­mag­netism as ex­am­ples of emer­gence. I replied that non-su­per­con­duc­tivity and non-fer­ro­mag­netism were also ex­am­ples of emer­gence, which was the prob­lem. But be it far from me to crit­i­cize the com­menter! De­spite hav­ing read ex­ten­sively on “con­fir­ma­tion bias,” I didn’t spot the “gotcha” in the 2-4-6 task the first time I read about it. It’s a sub­ver­bal blink-re­ac­tion that has to be re­trained. I’m still work­ing on it my­self.

So much of a ra­tio­nal­ist’s skill is be­low the level of words. It makes for challeng­ing work in try­ing to con­vey the Art through words. Peo­ple will agree with you, but then, in the next sen­tence, do some­thing sub­de­liber­a­tive that goes in the op­po­site di­rec­tion. Not that I’m com­plain­ing! A ma­jor rea­son I’m writ­ing this is to ob­serve what my words haven’t con­veyed.

Are you search­ing for pos­i­tive ex­am­ples of pos­i­tive bias right now, or spar­ing a frac­tion of your search on what pos­i­tive bias should lead you to not see? Did you look to­ward light or dark­ness?

• I think something else is going on with the 2 4 6 experiment, as described. Many of the students are making an assumption about the set of potential rules. Specifically, the assumption is that most pairs of rules in this set have the following mutual relationship: most of the instances allowed by one rule are disallowed by the other rule. This being the case, the quickest way to test any hypothetical rule is to produce a variety of instances which conform with that rule, to see whether they conform with the hidden rule.

I’ll give you an ex­am­ple. Sup­pose that we are con­sid­er­ing a fam­ily of rules, “the third num­ber is an in­te­ger polyno­mial of the first two num­bers”. The quick­est way to dis­con­firm a hy­po­thet­i­cal rule is to pro­duce in­stances in ac­cor­dance with it and test them. If the rule is wrong, then the chances are good that an in­stance will quickly be dis­cov­ered that does not match the hid­den rule. It is much less effi­cient to pro­ceed by pro­duc­ing in­stances not in ac­cor­dance with it.

I’ll give a specific example. Suppose the hidden rule is c = a + b, and the hypothesized rule being tested is c = a − b. Now pick just one random instance in accordance with the hypothesized rule. I will suppose a = 4, b = 6, so c = −2. So the instance is 4 6 −2. That instance does not match the hidden rule, so the hypothesized rule is immediately disconfirmed. Now try the following: instead of picking a random instance in accordance with the hypothesized rule, pick one not in accordance with it. I’ll pick 4 6 8. This also fails to match the hidden rule, so it fails to tell us whether our hypothesized rule is correct. We see that it was quicker to test an instance that agrees with the hypothetical rule.

Thus we can see that in a cer­tain class of situ­a­tions, the most effi­cient way to test a hy­poth­e­sis is to come up with in­stances that con­form with the hy­poth­e­sis.
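
The arithmetic in this example can be checked with a minimal Python sketch. The function names are made up for illustration; the two rules and the two instances are the ones given above.

```python
def hidden(a: int, b: int, c: int) -> bool:
    """The hidden rule in this example: c = a + b."""
    return c == a + b


def hypothesis(a: int, b: int, c: int) -> bool:
    """The hypothesized rule being tested: c = a - b."""
    return c == a - b


def refutes(instance) -> bool:
    """True when the oracle's answer contradicts the hypothesis's prediction."""
    return hypothesis(*instance) != hidden(*instance)


conforming = (4, 6, -2)     # fits the hypothesis
nonconforming = (4, 6, 8)   # does not fit the hypothesis

print(refutes(conforming))      # hypothesis says yes, oracle says no
print(refutes(nonconforming))   # both say no, so nothing is learned
```

Under these two particular rules, the conforming instance exposes the wrong hypothesis in a single probe, while the nonconforming one is answered "No" by both rules and so cannot separate them.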

Now you can fault people for having made this assumption. But if you do, then it is still a different error from the one described. If the assumption about the kind of problem faced had been correct, then the approach (testing instances that agree with the hypothesis) would have been a good one. The error, if any, lies not in the approach per se but in the assumption.

Fi­nally, I do not think one can rightly fault peo­ple for mak­ing that as­sump­tion. For, it is in­evitable that very large and com­pletely untested as­sump­tions must be made in or­der to come to a con­clu­sion at all. For, in­finitely many rules are con­sis­tent with the ev­i­dence no mat­ter how many in­stances you test. The only way ever to whit­tle this in­finity of rules con­sis­tent with all the ev­i­dence down to one con­cluded rule is to make very large as­sump­tions. The as­sump­tion that I have de­scribed may sim­ply be the as­sump­tion which they made (and they had to make some as­sump­tion).

Fur­ther­more, it doesn’t mat­ter what as­sump­tions peo­ple make (and they must make some, be­cause of the na­ture of the prob­lem), a clever sci­en­tist can learn what as­sump­tions peo­ple tend to make and then vi­o­late those as­sump­tions. So no mat­ter what peo­ple do, some­one can come along, con­struct an ex­per­i­ment in which those as­sump­tions are vi­o­lated, and then say, “gotcha” when the ma­jor­ity of his test sub­jects come to the wrong con­clu­sions (be­cause of the as­sump­tions they were mak­ing which were vi­o­lated by the ex­per­i­ment).

• Another se­ri­ous prob­lem is that the stu­dents must make the nec­es­sary as­sump­tion that the rule be sim­ple. In the con­text of school, sim­ple is gen­er­ally “most triv­ial to figure out”.

This is a nec­es­sary as­sump­tion be­cause there could be rules that would not be pos­si­ble to de­ter­mine by guess­ing. For ex­am­ple, you’d have to spend the life­time of the uni­verse guess­ing triplets to cor­rectly iden­tify that the rule is “As­cend­ing in­te­gers ex­cept se­quences con­tain­ing the 22nd Busy Beaver num­ber”, and then you still wouldn’t know if there’s some other rider.

If it was said, “It will re­quire sev­eral more guesses to figure out the rule, but not more than a cou­ple dozen, and the se­quences you have don’t fully tell you what the rule is”, the ex­er­cise would be a lot more sane. At worst, the only mis­take the stu­dents made was as­sum­ing that the ex­er­cise was sup­posed to be too sim­ple. Which is like ask­ing them to be mind read­ers: I’m think­ing of a prob­lem; on a scale of 1-10, please guess how difficult it is to solve.

• The problem is not that they are trying examples which confirm their hypothesis; it’s that they are trying only those examples which test their hypothesis.

The article focuses on testing examples which don’t work because people don’t do this enough. Searching for positive examples is (as you argue) a necessary part of testing a hypothesis, and people seem to have no problem applying this. What people fail to do is to search for the negative as well.

Both pos­i­tive and nega­tive ex­am­ples are, I’d say, equally im­por­tant, but peo­ple’s fo­cus is com­pletely im­bal­anced.

• In the situ­a­tion you de­scribed, it would be nec­es­sary to test val­ues that did and didn’t match the hy­poth­e­sis, which ends up work­ing an awful lot like ad­just­ing away from an an­chor. Is there a way of solv­ing the 2 4 6 prob­lem with­out com­ing up with a hy­poth­e­sis too early?

• The prob­lem is not that they come up with a hy­poth­e­sis too early, it’s that they stop too early with­out test­ing ex­am­ples that are not sup­posed to work. In most cases peo­ple are given as many op­por­tu­ni­ties to test as they’d like, yet they are con­fi­dent in their an­swer af­ter only test­ing one or two cases (all of which came up pos­i­tive).

The trick is that you should come up with one or more hy­pothe­ses as soon as you can (maybe with­out an­nounc­ing them), but test both cases which do and don’t con­firm it, and be pre­pared to change your hy­poth­e­sis if you are proven wrong.

• If it re­quires a round-trip of hu­man speech through a pro­fes­sor (and thus the req­ui­si­tion of the at­ten­tion of the en­tire class) then you can hardly say they are given as many op­por­tu­ni­ties to test as they’d like. A per­son of func­tion­ing so­cial in­tel­li­gence cer­tainly has no more than 20 such round-trips available con­sec­u­tively, and less con­ser­va­tively even 4 might be push­ing it for many.

Give them a com­puter pro­gram to in­ter­act with and then you can say they have as many op­por­tu­ni­ties to test as they’d like.

• Come up with sev­eral hy­pothe­ses in par­allel, per­haps?

• Sooo many dou­ble posts! This new in­ter­face is buggy as @#\$!

• Fol­low­ing what Con­stant has pointed out, I am won­der­ing if there is, in fact, a way to solve the 2 4 6 prob­lem with­out first guess­ing, and then ad­just­ing your guess.

• I meant that first com­ment to be more spec­u­la­tive than definite. I was spec­u­lat­ing about an al­ter­na­tive ex­pla­na­tion of the ob­served be­hav­ior, which lo­cates the fault el­se­where.

• Build­ing on the pre­vi­ous com­menter:

Through play­ing var­i­ous games of this sort, peo­ple de­velop a prior on the space of rules which has a lot of mass around rules of the type “X,X+2,X+4” or “X,2X,3X”.

• Why is it that I sus­pect Con­stant didn’t guess the rule prop­erly?

Isn’t it the en­tire point of the post that con­fir­ma­tion bias is the ten­dency NOT TO CHECK ASSUMPTIONS?

• Are you search­ing for pos­i­tive ex­am­ples of pos­i­tive bias right now, or spar­ing a frac­tion of your search on what pos­i­tive bias should lead you to not see? Did you look to­ward light or dark­ness?

Your hy­poth­e­sis is that pos­i­tive bi­ases are gen­er­ally bad. It is thus my duty to try and dis­prove your idea, and see what emerges from the re­sult.

Let’s take your example, but now the sequences are ten numbers long and the initial sequence is 2-4-6-10-12-14-16-18-20-22 (the rule is still the same). Picking a sequence at random from a given set of numbers, we have only one chance in 10! = 3628800 of coming up with one that obeys the rule. Someone following the approach you recommended would probably first try one instance of “x,x+2,x+4...” or “x,2x,3x,...”, then start checking a few random sequences (getting “No” on each one, with near certainty). In this instance, disregarding positive bias doesn’t help (unless you do a really brutal amount of testing). This is not just an artifact of “long” sequences: had we stuck with the sequence of three numbers, but the rule was “all in ascending order, or one number above ten trillion”, then finding the right rule would be just as hard. What gives?
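
The 10! figure can be checked with a short Python sketch. It assumes, for the example, that "picking a sequence at random from a given set of numbers" means picking a random ordering of distinct numbers; the helper name is hypothetical.

```python
import math
from itertools import permutations


def ascending_fraction(numbers):
    """Count the strictly ascending orderings among all orderings of `numbers`."""
    orderings = list(permutations(numbers))
    ascending = sum(
        1 for p in orderings if all(x < y for x, y in zip(p, p[1:]))
    )
    return ascending, len(orderings)


# Number of orderings of ten distinct numbers.
print(math.factorial(10))

# Brute-force check on shorter sequences: exactly one ordering is ascending.
print(ascending_fraction([2, 4, 6]))
print(ascending_fraction([2, 4, 6, 10, 12]))
```

For distinct numbers exactly one ordering out of n! is ascending, which is why a randomly ordered ten-number probe almost always gets "No".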

Even worse, suppose you started with two assumptions:

1) the sequence is x, 2x, 3x, 4x, 5x, …, 10x
2) the sequence is x, x+2, x+4, …, x+18

You do one or two (positive) tests of 1). They come up “yes”. You then remember to try and disprove the hypothesis, try a hundred random sequences, coming up with “no” every time. You then accept 1).

How­ever, had you just tried to do some pos­i­tive test­ing of 1) and 2), you would very quickly have found out that some­thing was wrong.

Analysis: Testing is indeed about trying to disprove a hypothesis, and gaining confidence when you fail. But your hypothesis covers uncountably many different cases, and you can test (positively or negatively) only a very few. Unless you have some grounds to assume that this is enough (such as the uniform time and space assumptions of modern science, or some sort of nice ordering or measure on the space of hypotheses or of observations), neither positive nor negative testing is giving you much information.

However, if you have two competing hypotheses about the world, then a little testing is enough to tell which one is correct. This is the easiest way of making progress, and should always be considered.

Ver­dict: Aware­ness of pos­i­tive bias causes us to think “I may be wrong, I should check”. The cor­rect at­ti­tude in front of these sorts of prob­lems is the sub­tly differ­ent “there may be other ex­pla­na­tions for what I see, I should find them”. The two sen­ti­ments feel similar, but lead to very differ­ent ways of tack­ling the prob­lem.

• I think the Wa­son se­lec­tion task with cards is an even more di­rect demon­stra­tion of the ten­dency to seek con­fir­ma­tory, but not dis­con­fir­ma­tory, tests of a hy­poth­e­sis.

• Stu­art, you do have a “nice or­der­ing mea­sure”—sim­pler hy­pothe­ses (“all as­cend­ing”) have a higher prior prob­a­bil­ity than com­plex ones (“all as­cend­ing OR one over ten trillion” or ran­dom­ness). Pos­i­tive test­ing of con­tra­dic­tory, high-prior-prob­a­bil­ity hy­pothe­ses is still nega­tive test­ing of your origi­nal hy­poth­e­sis, no?

• This ex­per­i­ment isn’t up to the usual stan­dards of an eco­nomics ex­per­i­ment. When economists do such an in­for­ma­tion ex­per­i­ment, we give sub­jects some in­di­ca­tion of the dis­tri­bu­tion that the hid­den truth will be drawn from, and then we ac­tu­ally draw from that dis­tri­bu­tion. You can always make sub­jects look like fools if you give them an ex­am­ple that is rare given their prior ex­pec­ta­tions.

• Robin, I ob­serve that Na­ture also fails to live up to the usual stan­dards of an eco­nomics ex­per­i­ment.

Stu­art and Con­stant, in AI/​ma­chine learn­ing we have a for­mal no­tion of “strictly more gen­eral con­cepts” as those with a strictly greater set of pos­i­tive ex­am­ples, and sym­met­ri­cally for strictly more spe­cific con­cepts. (This is not usu­ally what I mean when I say “con­cept” but this is the term of art in ma­chine learn­ing.)

Positive bias implies that people look at a set of examples and a starting concept, and try to envision a strictly more specific concept: for example, “ascending by 2 but all numbers positive”. We seem to focus less on finding a strictly more general concept, such as “separated by equal intervals” or “in ascending order” or “any sequence not ending in 2”.
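
To make “strictly more general” concrete, here is a minimal Python sketch that compares concepts by their sets of positive examples. The finite domain, the function names, and the exact encoding of each concept are assumptions for illustration; a check over a small finite universe is only suggestive of the relationship over all integers.

```python
from itertools import product

# Assumed finite universe of triplets, chosen only to keep the check small.
DOMAIN = range(-3, 13)
TRIPLETS = list(product(DOMAIN, repeat=3))


def by_two(a, b, c):
    return b == a + 2 and c == b + 2


def by_two_positive(a, b, c):
    return by_two(a, b, c) and a > 0


def equal_intervals(a, b, c):
    return b - a == c - b


def ascending(a, b, c):
    return a < b < c


def extension(concept):
    """The set of positive examples of a concept within the finite universe."""
    return {t for t in TRIPLETS if concept(*t)}


def strictly_more_general(general, specific):
    """True if `specific`'s positive examples are a proper subset of `general`'s."""
    return extension(specific) < extension(general)


print(strictly_more_general(by_two, by_two_positive))
print(strictly_more_general(ascending, by_two))
print(strictly_more_general(ascending, equal_intervals))
```

In this encoding “ascending” and “equal intervals” are each strictly more general than “ascending by 2”, but neither is more general than the other: 0-0-0 has equal intervals without ascending, and 1-2-4 ascends without equal intervals.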

Why do we only look in the more-specific direction and see only half the universe of concepts? Instinct, one might simply say, and be done with it. One might try a Bayesian argument that any more general concept would concentrate its probability mass less, and do a poorer job of explaining the positive examples found, for it seems that 10-12-14 is a less likely thing to see if the generator is “any sequence” than if it is “any sequence separated by intervals of 2”. But this is an invalid argument if you are the one generating the examples! As for the initial example being misleadingly specific, heck, people read nonexistent coincidences into Nature all the time. It may not be fair of the experimenter but it is certainly realistic as a test of a rationalist’s skill.

If you are test­ing ex­am­ples in an or­a­cle, “pos­i­tive” and “nega­tive” are sym­met­ri­cal la­bels. This point alone should make it very clear that, from the stand­point of prob­a­bil­ity the­ory, we are deal­ing strictly with a bizarre quirk of hu­man psy­chol­ogy.

• Flynn, you write, “Isn’t it the en­tire point of the post that con­fir­ma­tion bias is the ten­dency NOT TO CHECK ASSUMPTIONS?”

You sim­ply can’t check all your as­sump­tions in finite time in this task, which is a prob­lem, be­cause you must com­plete the task in a finite time. That is not your fault—that is in­trin­sic in the challenge. There­fore some of your as­sump­tions will nec­es­sar­ily go untested—and they will nec­es­sar­ily be enor­mous as­sump­tions. The rea­son for this is that the set of pos­si­ble rules is too large—it’s in­finite—and re­mains in­finite no mat­ter how much test­ing you do.

See also Stu­art’s com­ment and Robin’s com­ment. I think they ex­press ma­jor points I was try­ing to make, more clearly than I did.

• Eliezer, yes some­times na­ture in­cludes rare events, but only rarely. We should eval­u­ate hu­man in­fer­ence abil­ities on av­er­age across the kinds of cases hu­mans face, and not just for rare sur­pris­ing events.

• The plethora of incorrect hypotheses compared to the relatively few correct (so far) theories seems to speak against this.

• I’m not sure I buy the whole ‘sub­ver­bal’ thing—it seems to me that mis­lead­ing phras­ing is a big part of the prob­lem. If asked to find the “rule” which “gov­erns” a se­quence of three num­bers, I’d (in­cor­rectly …) as­sume that the ques­tioner was think­ing of some sim­ple rule that can be used to gen­er­ate all of the valid se­quences. Given the ex­am­ples, I’d guess it was some­thing like ‘x x+2 x+4’ or ‘2x 2(x+1) 2(x+2).’ Now, af­ter I started typ­ing this I re­al­ized that you could map all as­cend­ing 3 in­te­ger se­quences to the whole num­bers, so there is a “rule” that could be used to gen­er­ate the solu­tion, but no­body would look at the solu­tion in these terms nat­u­rally—in­stead, we think of the solu­tion as the set of se­quences with the “prop­erty” of be­ing in as­cend­ing or­der. If the ques­tioner said that he was think­ing of “a prop­erty which se­quences of 3 num­bers ei­ther have or lack,” rather than a “rule” which “gov­erns” the se­quences, I sus­pect more folks would dis­cover the cor­rect solu­tion.

• Robin, I suspect that Eliezer has a different perspective on that, given his line of work. Availability bias on which biases to overcome? The creation of a seed AI is an event so rare that it has never happened (so far as we can tell), but failure to get it right on the first try could eliminate all life in the solar system. There is perhaps room for discussing average and better inference abilities with respect to common and rare events, although we would do well to be clear on exactly what we are arguing.

• Con­stant made an im­por­tant point: in­finitely many rules are con­sis­tent with the ev­i­dence no mat­ter how many in­stances you test. There­fore any guess you make must be in­fluenced by prior ex­pec­ta­tions. And like lu­sispe­dro said, based on ex­pe­rience stu­dents prob­a­bly put a lot more weight on rules based on sim­ple equa­tions than rules based on in­equal­ities.

I’m sure I could get the per­centage of peo­ple who guess cor­rectly down to 0% by sim­ply choos­ing the perfectly valid rule: “se­quences (a,b,c) such that EITHER a less than b less than c OR b is a mul­ti­ple of 73.”

Why? Be­cause rules of that sort are given low weight in sub­jects’ pri­ors.
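
A rough way to see how rarely a probe would expose the difference is the minimal Python sketch below. It assumes probes are triplets of integers from 1 to n, which is an arbitrary choice made for the example, and the function names are hypothetical.

```python
from itertools import product


def ascending(a, b, c):
    return a < b < c


def ascending_or_73(a, b, c):
    """The rule proposed above: a < b < c, or b is a multiple of 73."""
    return a < b < c or b % 73 == 0


def count_disagreements(n):
    """Count triplets drawn from 1..n on which the two rules give different answers."""
    return sum(
        1
        for t in product(range(1, n + 1), repeat=3)
        if ascending(*t) != ascending_or_73(*t)
    )


n = 100
print(count_disagreements(n), "of", n ** 3)
```

With n = 100 the two rules disagree on 8056 of 1,000,000 triplets, under one percent, and every one of those has 73 in the middle position. A subject who never happens to place a multiple of 73 there sees identical answers from both rules.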

• It seems very nor­mal to ex­pect that the rule will be more re­stric­tive or ar­ith­metic in na­ture. But if I am sup­posed to be sure of the rule, then I need to test more than just a few pos­si­bil­ities. Pri­ors are definitely in­volved here.

Part of the problem is that we are trained like monkeys to make decisions on underspecified problems of this form all the time. I’ve hardly ever seen a “guess the next [number|letter|item] in the sequence” problem that didn’t have multiple answers. But most of them have at least one answer that feels “right” in the sense of being simplest, most elegant, most obvious, or within typical bounds given basic assumptions about problems of that type.

I’m the sort of ac­cu­racy-minded prick who would keep test­ing un­til he was very close to cer­tain what the rule was, and would prob­a­bly take for­ever.

An interesting version of this phenomenon is the game “Bang! Who’s dead”. One person starts the game, says “Bang!”, and some number of people are metaphorically dead, based on a rule that the other participants are supposed to figure out (which is, AFAIK, the same every time, but I’m not saying it here). The only information that the starter will give is who is dead each time.

Took me for­ever to solve this, be­cause I tend to have a much weaker ver­sion of the bias you con­sider here. But re­al­is­ti­cally, most of my mates solved this game much faster than I did. I sus­pect that this “jump to con­clu­sions” bias is use­ful in many situ­a­tions.

• After see­ing the four ex­am­ples (in­clud­ing one that didn’t fit) given, it didn’t even oc­cur to me that some­one could think the first one in­di­cated a X-2X-3X pat­tern. It’s hard to tell what will con­firm and what will dis­con­firm in such a broad space of pos­si­bil­ities.

A bit off topic but af­ter nu­mer­ous in­ci­dents of mock­ing Eliezer, Men­cius Mold­bug has launched a full-scale as­sault on Bayesi­anism. He hasn’t shown any in­cli­na­tion to post his cri­tiques here, but per­haps some of the lu­mi­nar­ies here could show him the er­ror of his ways.

• That is a good link, Am­bitwistor. The last para­graph refers to an in­ter­est­ing psy­cholog­i­cal hy­poth­e­sis, which I’d like to ex­pand on in an ex­am­ple re­lated to the “Look Into the Dark” post. Let’s rephrase EY’s propo­si­tion to give it more of a so­cial “plot”.

“You’re a smuggler in a strange foreign land, where they only allow exports of goods in certain combinations of quantities, so as to keep their domestic lobby groups happy. [Yes, it’s a convoluted example, but governments can be convoluted.] Trouble is, everyone knows the rule except you and your gang of smugglers, and if you ask, you become a suspect. Furthermore, you don’t actually know what you’re smuggling, since your fence always seals them in the standard export containers, which are numbered ordinally “First”, “Second”, and “Third”.

Since you’re an amoral smug­gler boss, in charge of a lot of obe­di­ent “mule” un­der­lings, you can send as many peo­ple through cus­toms as you want and no mat­ter how many get ar­rested, you won’t be a sus­pect. Also, you have an in­finite num­ber of empty ex­port con­tain­ers with the usual “First”, “Se­cond” or “Third” la­bels. If your mule gets ar­rested with an empty con­tainer, he’ll be re­leased im­me­di­ately. So ba­si­cally, you can test the rule all you want, since you’ll wit­ness any ar­rest that hap­pens.

Just as you and your team of criminals arrive at the customs checkpoint, a man goes into customs with 2 “First” boxes, 4 “Second” boxes, and 6 “Third” boxes. You can start making a tidy profit as soon as you determine what the rule is. What is your next move?”

Granted, it is a con­voluted ex­am­ple and I’m wor­ried that in its cur­rent form it would just con­fuse too many test sub­jects. Per­haps some­one would think of a more straight­for­ward equiv­a­lent. The point, though, is to make the test sound less like the sort of rule we are fa­mil­iar with from math class. As sev­eral posters have al­luded, usu­ally a rule in math class is much stric­ter and re­quires some ar­ith­metic. A bu­reau­cratic rule, con­voluted though it may be, will of­ten be math­e­mat­i­cally sim­pler. E.g., “The ex­tremely pow­er­ful pineap­ple lobby has pushed through a law re­quiring that no other fruit (pa­paya or mango) be ex­ported in greater num­bers than pineap­ples. Ex­ports from the poli­ti­cally weak mango in­dus­try must not ex­ceed pa­paya ex­ports. Pineap­ples are la­bel­led Third; pa­payas la­bel­led Se­cond; man­gos la­bel­led First.”

My hy­poth­e­sis is that peo­ple will come up with this rule faster than they would when faced with the phras­ing from the origi­nal post. (Of course, the “do­mes­tic lobby groups happy” phras­ing is sort of a give­away … maybe it should be re­placed with a more neu­tral ex­pla­na­tion, or none at all.)

• We’re play­ing a game in which you, the player, start with a num­ber se­quence. There is a rule gov­ern­ing which num­ber comes next, and who­ever de­ter­mines the rule will re­cieve \$10. Any one can play, but I tagged the peo­ple who i think will be most in­ter­ested.

If you guess a num­ber, I will tell you if it is cor­rect, and if so, I will add it to the ex­ist­ing se­quence. Please only guess one num­ber each day. Please only guess one num­ber at a time, dont try and fill in a sec­tion of the se­quence.

If you guess the rule, I will tell you if you are cor­rect or in­cor­rect. If cor­rect, you win \$10. If in­cor­rect, you may not guess the rule again for 3 days.

Origi­nal se­quence:

2, 4, 6

The se­quence so far is:

2, 4, 6, 10, 18, 30, 50, 82, 134, 218, 354, 622, 623, 630

Craig Fleischman (Indiana) wrote at 7:44pm on July 13th, 2007: 10?

Dan Margolis (Japan) wrote at 8:31pm on July 13th, 2007: 7

Jeff Borack wrote at 1:03am on July 14th, 2007: 10 yes, 7 no

Dan Margolis (Japan) wrote at 9:52am on July 14th, 2007: Its like fibonacci sequence except starting at 2. The next digit is the sum of the two previous digits. So it would be 2, 4, 6, 10, 16, 26, 42, 68, 110...

So… X0 = 2, X1 = 4, Xn = (Xn-1 + Xn-2)

Jeff Borack wrote at 12:16pm on July 14th, 2007: Incorrect

Dan Margolis (Japan) wrote at 12:38pm on July 14th, 2007: Worth a shot...I can’t deduce much from so few numbers…

Elliot Alyeshmerni wrote at 7:06pm on July 14th, 2007: im gonna go with 18

Jeff Borack wrote at 8:24pm on July 14th, 2007: a job well done

Yvette Monachino wrote at 8:08pm on July 15th, 2007: 30

Jeff Borack wrote at 10:47am on July 16th, 2007: 30 works

Yvette Monachino wrote at 11:06am on July 16th, 2007: 50

Jeff Borack wrote at 11:35am on July 16th, 2007: good

Elliot Alyeshmerni wrote at 2:25pm on July 16th, 2007: 82, still havent gotten the sequence down so this is a bit of a guess

Jeff Borack wrote at 2:33pm on July 16th, 2007: good

Elliot Alyeshmerni wrote at 3:19pm on July 16th, 2007: i think we all got this sequence now..

Jeff Borack wrote at 3:36pm on July 16th, 2007: i dont think anyone has it. but i welcome you to guess. If your right, \$10. If your wrong, at least you’ll save yvette! Good luck.

Peter Dahlke wrote at 7:31pm on July 16th, 2007: 134 next?

Jeff Borack wrote at 8:07pm on July 16th, 2007: yup

Elliot Alyeshmerni wrote at 10:49pm on July 16th, 2007: 218

Jeff Borack wrote at 11:15pm on July 16th, 2007: 218

Victor Baranowski wrote at 10:16am on July 17th, 2007: IDK where it started, but assuming we started with 2, 4, 6 the sequence is:

Xn = X (n-1) + [(X(n-1) - X(n-2)) + (X(n-2)-X(n-3))]

or something like that…

Jeff Borack wrote at 10:26am on July 17th, 2007: Interesting guess, I thought people were gonna say Xn = X(n-1)+X(n-2)+2, but both are wrong. Sorry Vic. The more interesting question is: why did it take so long for someone to guess? Is the reward for guessing the correct answer to low or is the penalty to high?

Jeff Borack wrote at 10:45am on July 17th, 2007: I’m changing the rule of 1 rule guess/week. You can now guess once every three days. Numbers are still once a day even though elliot broke that rule and i accepted the number.

Elliot Alyeshmerni wrote at 11:43am on July 17th, 2007: this a answer works for every number except 6 and 18, but i’ll put it down anyway

X(n)=2X(n-1)-X(n-3)

Victor Baranowski wrote at 12:22pm on July 17th, 2007: Ya, that was similar to mine. Why the sequence goes from 10 to 18 is the tricky part of this whole thing, which makes me think the equation is going to be pretty ugly or wierd… maybe jeff made a mistake :P

Victor Baranowski wrote at 12:23pm on July 17th, 2007: Oh, and I might as well guess 354…

Jeff Borack wrote at 2:19pm on July 17th, 2007: a) the solution is beutiful b) i didn’t make any mistakes yet c) 354 is good

Victor Baranowski wrote at 2:46pm on July 17th, 2007: Can I cite a) in response to your b) ?

Jeff Borack wrote at 8:20pm on July 17th, 2007: Hmmmm, I’m not sure. It depends on when you think the mistake was made. Technically it did come before b), but i could also argue that the mistake what made when i clicked the “Add your comment” button.

a) the solution is… very nice and good b) i didn’t make any mistakes in the number sequence yet. c) web browsers and AIM should have spell checkers. this isn’t the 20th century anymore.

Tait Kowalski wrote at 3:48pm on July 18th, 2007: Sequence goes x(n) = x(n-1)+2*x(n-3)

so the next number = 354 + 2*134 = 622

next number is 622

Jeff Borack wrote at 4:41pm on July 18th, 2007: Welcome Tait! That is the wrong rule, but ill accept your guess at the next number.

Elliot Alyeshmerni wrote at 6:28pm on July 19th, 2007: the next number is fuck you jeff, just give us the answer lol

Jeff Borack wrote at 6:47pm on July 19th, 2007: Sorry elliot, want me to call the Whaaaaaaaaaaaaaaaambulance?

Victor Baranowski wrote at 4:29pm on July 22nd, 2007: is the next number 620?

Jeff Borack wrote at 7:23pm on July 22nd, 2007: hmm strange guess. 620 is not a number

Victor Baranowski wrote at 8:46am on July 23rd, 2007: howabout 623?

Jeff Borack wrote at 12:40pm on July 23rd, 2007: : ) 623 is the next number

Craig Fleischman (Indiana) wrote at 12:55pm on July 23rd, 2007: 630?

Jeff Borack wrote at 12:59pm on July 23rd, 2007: 630 is good

Victor Baranowski wrote at 1:51pm on July 23rd, 2007: Solution: the next number is whatever number is guessed, as long as it is higher than the previously guessed number.

Jeff Bo­rack wrote at 2:28pm on July 23rd, 2007 ha­haha, yup. it took a lot of time but not a lot of guesses. i ex­pected the guess­ing to to into the hun­dreds of thou­sands. do you ac­cept pay­pal? Delete

Vic­tor Bara­nowski wrote at 2:35pm on July 23rd, 2007 no, i ac­cept shots and beers the next time we hang out. Mes­sage—Delete

Yvette Monach­ino wrote at 4:10pm on July 27th, 2007 that is the dumb­est se­quence i have ever heard of Mes­sage—Delete

Jeff Bo­rack wrote at 5:08pm on July 27th, 2007 It’s about think­ing out­side the box, yvette, some­thing i wouldnt ex­pect most MATH ma­jors to un­der­stand! : p Vic­tory for the en­g­ineers!!! Delete

Yvette Monach­ino wrote at 2:08pm on July 30th, 2007 aw thats a cute re­mark, know­ing that you don’t ac­tu­ally know what real math is i won’t take that as an in­sult, and the only vic­tory you ac­com­plished is adding your­self to the long list of pompous en­g­ineers, so con­grats :) Mes­sage—Delete

Jeff Bo­rack wrote at 2:52pm on July 30th, 2007 While I might be pompous, I un­for­tu­nately can’t be con­sid­ered much of an en­g­ineer. I did bio­eng­ineer­ing, which cer­tainly doesnt count, and i’ve never ac­tu­ally en­g­ineered any­thing. Nei­ther has vic, hes in law school.

It is true that i don’t know what real math is (al­though i would love for you to teach me). How­ever, I would imag­ine that real math does in­volve think­ing out­side the box on oc­ca­sion. In this par­tic­u­lar ex­am­ple, it re­quired you to test a num­ber you thought was not part of the se­quence. If you be­lieved you had found the se­quence, and con­tintued to test num­bers that fit that se­quence, you would never de­rive the an­swer. By sim­ply test­ing a num­ber that does not ap­pear to fall into the se­quence, such as 2 mil­lion, it’s easy to find the solu­tion.

Does this sound like any ‘real’ math prob­lems you have ever en­coun­tered?

• I just want to sum­ma­rize what I learned in this thread in or­der to en­sure that I un­der­stand it. As I un­der­stand, the steps for de­ter­min­ing the rule should be some­thing like this:

1. See se­quence.

2. What re­la­tions do the el­e­ments share? All are num­bers, in­te­gers, even, differ by two, and are in as­cend­ing or­der. The rule is like­lier to con­tain each (but not all) of these as a clause than not to.

3. If any re­la­tion you thought of be­longs to a larger class, add that class.

4. Try to disconfirm each relation by creating sequences that violate only this relation (as well as its descendants, necessarily). Test general attributes first, since if they fail, the descendants can be considered impossible.

5. Create a can­di­date rule which con­sists of all re­la­tions that were not dis­con­firmed.

6. Offer the rule to the ex­am­iner.

Quite a bit more la­bo­ri­ous than blurt­ing out “n[i] = n[i-1]+2”, I have to ad­mit.
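The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `hidden_rule` stands in for the examiner, and the relation names and probe triplets are hypothetical choices, each picked to violate one relation while keeping the others intact where possible.

```python
# Hypothetical stand-in for the examiner's rule (ascending order).
def hidden_rule(triplet):
    a, b, c = triplet
    return a < b < c

# Step 2-4: each candidate relation seen in 2-4-6 gets a probe that breaks it.
probes = {
    "integers": (2.5, 4.5, 6.5),    # still ascending, still differing by two
    "all even": (1, 3, 5),          # integers, ascending, differing by two
    "differ by two": (2, 6, 10),    # even, ascending, equally spaced
    "equally spaced": (2, 4, 10),   # even, ascending, unequal gaps
    "ascending": (6, 4, 2),         # even, gaps of size two, descending
}

def surviving_relations(rule, probes):
    # A relation survives only if violating it makes the examiner say "No".
    return [name for name, triplet in probes.items() if not rule(triplet)]

# Step 5: the candidate rule is whatever was not disconfirmed.
print(surviving_relations(hidden_rule, probes))
```

Every probe here is a triplet the tester expects to be rejected, which is exactly the kind of test the positive-bias subjects tend to skip.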

• But then n[i]=n[i-1]+2 is wrong, so...

• You need to do a lot more than this to demonstrate irrationality. As other commenters have pointed out, there are an infinite number of rules that agree with any given finite sequence of experimental results, so you can never conclusively demonstrate that your rule is indeed the correct one. Moreover, you can’t even be ‘bias free’ in the sense of assigning all possible rules the same probability unless you want to assign each rule probability 0.

Now you might be tempted to just give up at this point, but this is exactly the same problem we face when doing science. We have an infinite number of possible rules that extend the results we have seen so far and we need to guess which is most likely. Amazingly we do it pretty well, but justifying it seems impossible: it’s the classical philosophical problem of induction.

In short it’s not clear anyone is ‘wrong’. Maybe they have a good initial probability distribution for what sorts of rules people normally pick. Heck, it’s not even clear what it means to be ‘wrong’ in this sense, i.e., having an implausible a priori probability distribution.

• I have two ob­ser­va­tions, one per­sonal and one gen­eral:

Once, I tried to apply artificial neural nets to the task of evaluating positions in the game of Go. I made a very basic error, which was to train the net only on positive examples. The net quickly learned to give high scores for these, but when I tested it on bad positions it still reported high scores. Perhaps a naive mistake, but you have to learn sometime.

A very common example is software testing. Usually, people pay a lot of attention to testing the positive cases and verifying that they work as they should. Less time is spent on testing things that should not work, sometimes resulting in programs that generate answers when they should not. The problem here is that the positive cases usually form a limited set, while the negative cases are almost infinite.
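As a small illustration of the testing point, here is a sketch in Python. The function `parse_age` and its accepted range are hypothetical, invented only for this example; the point is that the negative cases get their own list alongside the positive ones.

```python
def parse_age(text):
    """Hypothetical parser: accept whole numbers from 0 to 150, reject the rest."""
    value = int(text)  # raises ValueError on non-numeric input
    if not 0 <= value <= 150:
        raise ValueError(f"age out of range: {value}")
    return value

def rejects(text):
    """Return True if parse_age refuses the input."""
    try:
        parse_age(text)
    except ValueError:
        return True
    return False

# Positive cases: inputs that should work, with their expected values.
positive_cases = {"0": 0, "42": 42, "150": 150}
# Negative cases: inputs that should NOT produce an answer.
negative_cases = ["-1", "151", "", "forty", "4.5"]

for text, expected in positive_cases.items():
    assert parse_age(text) == expected
for text in negative_cases:
    assert rejects(text), text
```

The negative list can never be exhaustive, but sampling from both sides of each boundary, plus a few inputs of the wrong kind, catches the "answers when it should not" failures described above.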

• Any­one who finds the game de­scribed at the top of the ar­ti­cle in­ter­est­ing, check out Zendo, a game based upon a similar idea. I’ve found Zendo handy when ex­plain­ing the con­cept in the OP and the var­i­ous other ideas of ex­per­i­men­tal de­sign and in­duc­tive in­ves­ti­ga­tion. Plus, it’s lots of fun. :-)

• Zendo is my go-to ex­er­cise for ex­plain­ing just about any idea in in­duc­tive in­ves­ti­ga­tion. (But it’s even more use­ful as a tool for re­mind­ing my­self to do bet­ter. After years, the num­ber of Zendo games I lose due to pos­i­tive bias is still far higher than I’d like… even when I think I’ve taken steps to avoid that.)

• As my group’s usual Zendo Master, I have a lot of players fall into this trap. I like to train new players with one easy property like “A Koan Has The Buddha Nature If (and only if) it contains a red piece.” Once they understand the rules, I jump to something like “A Koan Has The Buddha Nature Unless It contains exactly two pieces.”

Switch­ing from a pos­i­tively-marked prop­erty (there is a sim­ple fea­ture which all these things have) to a nega­tively-marked prop­erty (there is a sim­ple fea­ture which all these things lack) can be pretty eye-open­ing.

I showed Zendo to a math pro­fes­sor once who fell smack into the 2-4-6 trap and tried to build as many white-marked koans as pos­si­ble. He even asked why the game didn’t pun­ish peo­ple for just mak­ing the same koan over and over again, since it would be guaran­teed to “fol­low the rule.” I even­tu­ally man­aged to con­vey that the ob­ject of the game is to be able to tell me, in words, what you think the rule is. Since then I’ve been more ex­plicit that “part of the game in­volves liter­ally just say­ing, out loud, what you think defines the prop­erty.” Peo­ple always seem to think that the zendo is a sort of a silent lec­ture, when re­ally it’s more of a lab­o­ra­tory class.

• He even asked why the game didn’t pun­ish peo­ple for just mak­ing the same koan over and over again, since it would be guaran­teed to “fol­low the rule.” I even­tu­ally man­aged to con­vey that the ob­ject of the game is to be able to tell me, in words, what you think the rule is.

Maybe this provides some insight into the nature of positive bias. In the game, the only goal is to find the rule; there is no punishment for testing a wrong sequence. But I guess real life is not like this. In real life, especially in the ancestral environment, making a wrong guess is costly, and our cognitive algorithms were optimized for that.

For example, imagine that the rule is some taboo, punishable by death. It is better to avoid the punishment than to find the boundaries precisely. Avoiding a superset of the taboo also has some cost, but that cost is probably lower than being stoned to death. If you know that the sequence “2-4-6” does not get you killed (unlike some other sequences, not explicitly known which ones), it may be wise to guess “2-4-6” over and over again.

• One thing that helped me re­ally get this one is test­ing soft­ware up­grades. It’s in­sanely te­dious. Most stuff just keeps work­ing. But if you don’t test, you’re just ask­ing for some­thing to come back and bite you in the back­side.

e.g. re­cent work ex­am­ple: up­grad­ing Tom­cat 6.0.16 to 6.0.29. Minor point re­lease from the Apache Soft­ware Foun­da­tion, com­puter sci­en­tists fa­mous for their ded­i­ca­tion to en­g­ineer­ing sta­bil­ity. I so didn’t want to bother test­ing this at all—days and days of te­dium. Then this bit us—some­one de­cided the let­ter of the spec beat moun­tains of real-world code in a sta­ble branch main­te­nance re­lease. And it’s in moun­tains of real-world code be­cause of this. My opinion of Apache slipped some­what. But my sys­tems stayed up.

I still hate lin­ing up test­ing, but a few of these and you start to ex­pand your map of chances large enough to mess you up. Sysad­mins know that com­put­ers are evil and out to get them, and that the only way around this is not to give them the op­por­tu­nity.

• A friend of mine has a similar story in­volv­ing why he never al­lows code-changes af­ter code freeze dates, even if X, even if Y, even if Z. His story, how­ever, in­volves avatars in a video game sort­ing their lay­ers in strange ways on ob­scure video cards to cause breast­plates to un­ex­pect­edly sort be­low breasts, which is why I still re­mem­ber it.

• It’s like back­ups or free­dom 0. Ap­prox­i­mately no-one gets it un­til they’ve been bit­ten in real life. (I am par­tic­u­larly bad at learn­ing with­out di­rect ap­pli­ca­tion of fore­head to con­crete, but am at­tempt­ing to think more clearly.)

• Funny, “three num­bers in as­cend­ing or­der” was the first hy­poth­e­sis that popped in my mind.

• I think most people would come up with the correct answer ‘with extension’, such as ‘increasing by 2 in ascending order’, where the correct answer ‘ascending order’ is the basis that they have then specified further. In my eyes they have then given a partially correct answer and should not strive so hard to ‘avoid this mistake’ in the future. My reasoning is that you might then ‘dismiss out of hand’ a partially correct answer and by default do the same to the ‘fully correct answer’. It is better, then, to make a habit of breaking down a hypothesis before dismissing it. Or you could just use up all your energy on convincing yourself that nothing should be believed, ever, since belief means to know without proof.

• Hey there! Wel­come to Less Wrong!

I’d say you should read the Se­quences, but that’s clearly what you’re do­ing :D. I’d sug­gest go­ing ahead and in­tro­duc­ing your­self over here.

I agree with you that some peo­ple might come up with the rule, but with un­nec­es­sary ad­di­tions. The point of look­ing into the dark is that peo­ple may tend to add on to those ex­ten­sions, when they should re­ally be shav­ing them down to their core. And they can only do so (Or at least do so more effec­tively.) by look­ing into the dark.

Also, that’s not exactly the commonly accepted definition of “Belief” around here. For what most would think of when you refer to “belief” check out here, here, and the related The Simple Truth article, and really the entire Map and Territory sequence.

Again, wel­come!

• Thought ex­per­i­ment. Sup­pose you have two or­a­cles, and your task is to find out whether or not they have the same rule. If each or­a­cle is con­sid­ered as “A lookup table pro­duced by a coin flip for each pos­si­ble in­put, ex­cept that there’s a 50% chance that the sec­ond is just a copy of the first” then of course any in­put is as likely as any other to ex­hibit a differ­ence, and you can eas­ily com­pute the prob­a­bil­ity of no differ­ence af­ter n tests fail to ex­hibit one. But if you have an as­sump­tion that sim­pler rules are more likely (eg. your prior is 2^-com­plex­ity) then what’s your op­ti­mal strat­egy?

A plau­si­ble strat­egy is to fol­low the same strat­egy as you would if you had to find the rule of a sin­gle or­a­cle; you always send the in­put that gives you the most bits about Or­a­cle A’s rule. That way, you max­imise the prob­a­bil­ity of ex­hibit­ing a differ­ence given that one ex­ists. So if you can gen­er­ate an in­put which, un­der your cur­rent model of the space of A’s pos­si­ble rules (and the prob­a­bil­ity of each), has ex­actly a 50% chance of match­ing A, then it also has a 50% chance of match­ing B; more­over these prob­a­bil­ities are in­de­pen­dent, so you have 25%+25%=50% chance of ex­hibit­ing a differ­ence. If in­stead you picked an in­put with a 30% chance of match­ing A, your chance of ex­hibit­ing a differ­ence is 21%+21%=42%.
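The arithmetic in the paragraph above can be checked directly. This is a minimal sketch that takes the comment's own assumption at face value: an input matches each oracle independently with the same probability `p`, so a difference shows up when exactly one of them says yes. The function name is made up for the example.

```python
def chance_of_difference(p):
    """Probability that exactly one of two independent oracles says yes."""
    return p * (1 - p) + (1 - p) * p

for p in (0.5, 0.3, 0.1):
    print(p, round(chance_of_difference(p), 2))
# 0.5 0.5
# 0.3 0.42
# 0.1 0.18
```

The expression simplifies to 2p(1-p), which peaks at p = 0.5, matching the claim that the most informative input for a single oracle is also the one most likely to expose a difference.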

• It seems much of our cognitive architecture was developed in the context of social situations. Indeed, the standard experiments on checking modus ponens and modus tollens understanding show sharp increases in ability when they are presented as social rules (e.g. in the Wason selection task, http://en.wikipedia.org/wiki/Wason_selection_task, checking whether someone is violating the “minor drinking alcohol” rules, rather than cards, gives much higher performance). Testing whether you understand a social rule by deliberately violating your current understanding can be a very, very expensive test. It seems plausible that this cost has led the human default ways of testing implicit rules to avoid seeking out these negatives, even when the cost would be low.

• We’re good at rea­son­ing with so­cial situ­a­tions, and bad with more ab­stract situ­a­tions. As such, we can’t be do­ing them the same way. Some­thing that helps in so­cial situ­a­tions is un­likely to cause a bias in more ab­stract situ­a­tions.

In other words, our cur­rent ar­chi­tec­ture was de­vel­oped in the con­text of so­cial situ­a­tions, and the fact that we do sig­nifi­cantly bet­ter in those situ­a­tions shows that it’s the only time we use it. Other­wise, we use differ­ent, lousy ar­chi­tec­ture that won’t ex­hibit the same bi­ases.

• Are you search­ing for pos­i­tive ex­am­ples of pos­i­tive bias right now, or spar­ing a frac­tion of your search on what pos­i­tive bias should lead you to not see?

Isn’t what pos­i­tive bias should lead you to not see a pos­i­tive ex­am­ple of pos­i­tive bias? Or am I ex­plain­ing the joke?

• I in­tu­itively wanted to see if the com­bi­na­tion 8-6-4 or 6-4-2 would be ac­cept­able, with­out ac­tu­ally mak­ing a guess at the rule. I looked at the two ac­cept­able an­swers and the one un­ac­cept­able an­swer and thought, okay, but that doesn’t prove a rule. The rule the ex­per­i­ment wants you to think about is a pat­tern like 2-4-6-8-10, so let’s see if some­thing dis­proves that pat­tern. Would 6-4-2 be ac­cept­able? Ob­vi­ously, it wouldn’t. If I wasn’t un­der the in­fluence of hind­sight bias I might con­tinue on to try and see if differ­ent in­ter­vals were not ac­cept­able I.e. 2-2-2 un­til I could differ­en­ti­ate be­tween as­cend­ing or­der and the in­ter­vals, but know­ing me and the like­li­hood of any­one ac­tu­ally guess­ing the rule I would put that as a very low prob­a­bil­ity. Still, this strikes me as the kind of thing where it’s best to avoid bring­ing up a solu­tion—get more in­for­ma­tion, and study and dis­cuss the in­for­ma­tion, and then try to solve it. If peo­ple did this per­haps they would come closer to get­ting it right?

Haha… And be­fore I read this blog I thought I was ir­ra­tional. Prob­a­bly still am.

• I wonder why no one cares to mention Ockham’s Razor in this situation. As has already been mentioned a couple of times, there are infinitely many rules that can describe a finite set of numbers. So we can only start at the least restrictive rule possible and work our way further in until, within a certain amount of time, we can no longer find a set of numbers that fits our rule but not the rule we are looking for. So I start by saying it’s all numbers. Obviously I’ll find a couple of sets that don’t match the correct rule. I’ll then try whole numbers. After that I might try ascending numbers, or at least a>b or b>c… The only important thing to do here is to find the simplest solution still possible.

So I actually wouldn’t try finding anything that doesn’t fit my assumptions, since there would be far more sets that fit neither my assumption nor the solution.

• If I were advising an AI on how to solve this question, I might recommend guessing many sets of three random numbers, and just looking at the ratio of ‘yes’ to ‘no’. A result of 1/6 yes could then be matched against various rules and their ratios. This would greatly reduce the solution set, and ordering would likely jump to the front as a likely possibility.
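A rough sketch of this random-sampling idea in Python follows. The candidate rules, the sampling range, and the function names are all assumptions made for illustration. For three independent draws from a continuous range, each of the six orderings is equally likely, so an ascending-order rule should say yes to about one triplet in six.

```python
import random

def yes_fraction(rule, trials=20000, seed=0):
    """Estimate how often a rule accepts a random triplet of real numbers."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    hits = 0
    for _ in range(trials):
        triplet = tuple(rng.uniform(-100, 100) for _ in range(3))
        hits += bool(rule(triplet))
    return hits / trials

# Hypothetical candidate rules to compare against the observed ratio.
candidates = {
    "any numbers": lambda t: True,
    "first is smallest": lambda t: t[0] < min(t[1], t[2]),
    "ascending": lambda t: t[0] < t[1] < t[2],
    "increasing by two": lambda t: t[1] - t[0] == 2 and t[2] - t[1] == 2,
}

for name, rule in candidates.items():
    print(f"{name}: {yes_fraction(rule):.3f}")
```

Rules with very different acceptance rates are easy to tell apart this way; rules that accept almost nothing, such as "increasing by two", would essentially never be hit by random guessing, which is a limit of the approach.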

If I were an­swer­ing the ques­tion for my­self, I would likely try to break it, by that I mean get you to ei­ther add a new rule, or to say ‘I don’t know’. { e, i, pi }

• Soft­ware de­sign: if you are us­ing a logic test, check on ei­ther side of the logic test, and also ran­dom an­swers.

Is X > 5? If X is 4: no; 5: no; 6: yes; 5.00001: no; 5.999999: yes; −1: error error error; “tomato”: error error error.

This taught me to always double-check that the hypothesis is not just a good fit, but a good enough fit for the purpose. If you never encounter a tomato, or decimals, or negative numbers, then the test works fine. If you expect occasional tomatoes, and your test is looking for a positive integer, maybe it’s time for a new test.
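The probing described above might look like this in Python. The comment does not say how the original test was implemented, so `is_greater_than_five` below is purely hypothetical: it assumes a check that rounds to the nearest integer and rejects negatives and non-numbers, which happens to give the same answers as the probes listed above.

```python
def is_greater_than_five(x):
    """Hypothetical check written with non-negative integers in mind."""
    if isinstance(x, bool) or not isinstance(x, (int, float)):
        raise TypeError(f"not a number: {x!r}")
    if x < 0:
        raise ValueError(f"negative input: {x!r}")
    return round(x) > 5  # assumption: rounds to the nearest integer first

def probe(check, value):
    """Report 'yes', 'no', or 'error' for a single probe value."""
    try:
        return "yes" if check(value) else "no"
    except (TypeError, ValueError):
        return "error"

# Both sides of the boundary, near-boundary decimals, and out-of-domain inputs.
for value in [4, 5, 6, 5.00001, 5.999999, -1, "tomato"]:
    print(repr(value), "->", probe(is_greater_than_five, value))
```

Only the decimal and out-of-domain probes reveal that this check answers a slightly different question than "is X > 5?"; the integer probes alone would make it look correct.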

• I think there is a simple approach to handling these problems. First define a number that no one knows anything about, say BB(10), where BB is the busy beaver function. No one knows much about the size of this number, whether it’s odd or even, etc. Then if someone says yes to:

BB(10), BB(10) + 2, BB(10) + 4 you can infer they probably really are using rule: n, n+2, n+4.

If it’s not this rule they may need to say they can’t tell if the sequence follows the rule or not.

Unless they are using very general and hard-to-guess rules, this method seems effective. An example of an absurdly hard-to-guess rule would be “All numbers are less than BB(100)”.

This doesn’t solve the prob­lem. If you think the rule is n,2n,3n you could try BB(10), 2BB(10), 3BB(10) but then the rule might re­ally be: n,kn,(k+1)n for some k. But again this method seems to me like it would give you a way to check most “easy” rules. Or at least some­thing like this is use­ful in test­ing your the­o­ries.

• I won­der if this can not be par­tially ex­plained by peo­ple want­ing to an­swer quickly. The teacher says you can make as many guesses as you like, but we still in­stinc­tively feel like we do bet­ter if we do it faster.

Imagine the same test, but now with the last line reading: “You can make as many guesses as you like, but you get graded on how fast you get the right result”. With that rule it is a lot more rational to not spend too much time on verification of your hypothesized rule. I have no idea what the best strategy is; I guess it depends on your priors about the rule-space, but it probably does not involve spending a lot of questions on falsification.

My guess is that many peo­ple ap­proach the prob­lem as if it is of the above va­ri­ety, even though it isn’t. So while pos­i­tive bias no doubt plays a part, I think a de­sire to an­swer quickly also fac­tors hugely.

This is testable. Give peo­ple a 10 dol­lar re­ward for giv­ing the cor­rect an­swer, and ex­plic­itly tell them that the num­ber of guesses does not af­fect this re­ward. I hy­poth­e­size that the frac­tion of peo­ple get­ting the cor­rect an­swer will go up sig­nifi­cantly.

(I know this is a very old thread, but this se­quence still fea­tures promi­nently on the site, so I have some hopes that peo­ple still read this oc­ca­sion­ally :P)

• 6 Dec 2014 15:13 UTC
0 points

Teacher: ‘In ‘Beast and Man in In­dia’ John Ki­pling de­scribes a cus­tom of how gyp­sies ran­somed crows to Hin­dus. A gypsy would catch a crow, peg it on the ground spread-ea­gled so that it can­not es­cape, and when an­other bird would fly to at­tack it, the first one, defend­ing it­self, catches it with her legs. When the gypsy has enough crows, he goes to a shop of some rich Hindu and offers to let them go, for a price, or eat them for din­ner. The Hindu pays one or two paisas for a bird. Let the p of the crow not fly­ing away when it is pegged down be 90%, p of it catch­ing an­other one be 95%. If by the end of the day the gypsy has 16 paisas, what is the least num­ber of birds that will have been on the ground?′ …and the cor­rect an­swer is zero:)

• Is my idea of why this is in Mysterious Answers correct? Due to positive bias you don’t try to falsify a theory, and if a theory does not predict anything for the negative case, then it does not have any predictive value and thus is a mysterious answer.