The Bat and Ball Problem Revisited

Cross-posted from my personal blog.

In this post, I’m going to assume you’ve come across the Cognitive Reflection Test before and know the answers. If you haven’t, it’s only three quick questions, so go and do it now.

One of the strik­ing early ex­am­ples in Kah­ne­man’s Think­ing, Fast and Slow is the fol­low­ing prob­lem:

(1) A bat and a ball cost $1.10 in to­tal. The bat costs $1.00 more than the ball.

How much does the ball cost? _____ cents

This ques­tion first turns up in­for­mally in a pa­per by Kah­ne­man and Fred­er­ick, who find that most peo­ple get it wrong:

Al­most ev­ery­one we ask re­ports an ini­tial ten­dency to an­swer “10 cents” be­cause the sum $1.10 sep­a­rates nat­u­rally into $1 and 10 cents, and 10 cents is about the right mag­ni­tude. Many peo­ple yield to this im­me­di­ate im­pulse. The sur­pris­ingly high rate of er­rors in this easy prob­lem illus­trates how lightly Sys­tem 2 mon­i­tors the out­put of Sys­tem 1: peo­ple are not ac­cus­tomed to think­ing hard, and are of­ten con­tent to trust a plau­si­ble judg­ment that quickly comes to mind.

In Think­ing Fast and Slow, the bat and ball prob­lem is used as an in­tro­duc­tion to the ma­jor theme of the book: the dis­tinc­tion be­tween fluent, spon­ta­neous, fast ‘Sys­tem 1’ men­tal pro­cesses, and effort­ful, re­flec­tive and slow ‘Sys­tem 2’ ones. The ex­plicit moral is that we are too will­ing to lean on Sys­tem 1, and this gets us into trou­ble:

The bat-and-ball prob­lem is our first en­counter with an ob­ser­va­tion that will be a re­cur­rent theme of this book: many peo­ple are over­con­fi­dent, prone to place too much faith in their in­tu­itions. They ap­par­ently find cog­ni­tive effort at least mildly un­pleas­ant and avoid it as much as pos­si­ble.

This story is very com­pel­ling in the case of the bat and ball prob­lem. I got this prob­lem wrong my­self when I first saw it, and still find the in­tu­itive-but-wrong an­swer very plau­si­ble look­ing. I have to con­sciously re­mind my­self to ap­ply some ex­tra effort and get the cor­rect an­swer.

How­ever, this be­comes more com­pli­cated when you start con­sid­er­ing other tests of this fast-vs-slow dis­tinc­tion. Fred­er­ick later com­bined the bat and ball prob­lem with two other ques­tions to cre­ate the Cog­ni­tive Reflec­tion Test:

(2) If it takes 5 ma­chines 5 min­utes to make 5 wid­gets, how long would it take 100 ma­chines to make 100 wid­gets? _____ minutes

(3) In a lake, there is a patch of lily pads. Every day, the patch dou­bles in size. If it takes 48 days for the patch to cover the en­tire lake, how long would it take for the patch to cover half of the lake? _____ days

Th­ese are de­signed to also have an ‘in­tu­itive-but-wrong’ an­swer (100 min­utes, 24 days), and an ‘effort­ful-but-right’ an­swer (5 min­utes, 47 days). But this time I seem to be im­mune to the wrong an­swers, in a way that just doesn’t hap­pen with the bat and ball:

I always have the same re­ac­tion, and I don’t know if it’s com­mon or I’m just the lone idiot with this prob­lem. The ‘ob­vi­ous wrong an­swers’ for 2. and 3. are com­pletely un­ap­peal­ing to me (I had to look up 3. to check what the ob­vi­ous an­swer was sup­posed to be). Ob­vi­ously the ma­chine-wid­get ra­tio hasn’t changed, and ob­vi­ously ex­po­nen­tial growth works like ex­po­nen­tial growth.

When I see 1., how­ever, I always think ‘oh it’s that bas­tard bat and ball ques­tion again, I know the cor­rect an­swer but can­not see it’. And I have to stare at it for a minute or so to work it out, slowed down dra­mat­i­cally by the fact that Ob­vi­ous Wrong An­swer is jump­ing up and down try­ing to dis­tract me.

If this test was re­ally test­ing my propen­sity for effort­ful thought over spon­ta­neous in­tu­ition, I ought to score zero. I hate effort­ful thought! As it is, I score two out of three, be­cause I’ve trained my in­tu­itions nicely for ra­tios and ex­po­nen­tial growth. The ‘in­tu­itive’, ‘Sys­tem 1’ an­swer that pops into my head is, in fact, the cor­rect an­swer, and the sup­pos­edly ‘in­tu­itive-but-wrong’ an­swers feel bad on a visceral level. (Why the hell would the lily pads take the same amount of time to cover the sec­ond half of the lake as the first half, when the rate of growth is in­creas­ing?)
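For anyone who would rather check the two ‘effortful-but-right’ answers than trust either gut reaction, here is a minimal Python sketch. The function names are my own invention, and it assumes that every machine works at the same constant rate and that the patch exactly doubles each day.

```python
from fractions import Fraction

def minutes_for_widgets(machines, widgets, ref_machines=5, ref_minutes=5, ref_widgets=5):
    """Time needed, assuming every machine works at the same constant rate."""
    # Rate per machine in widgets per minute: 5 widgets / (5 machines * 5 minutes).
    rate = Fraction(ref_widgets, ref_machines * ref_minutes)
    return Fraction(widgets) / (machines * rate)

def day_patch_covers_half(full_day=48):
    """Day on which the patch covers half the lake, assuming it doubles daily."""
    # Work backwards from full coverage: halving the patch undoes one day's growth.
    coverage = Fraction(1)
    day = full_day
    while coverage > Fraction(1, 2):
        coverage /= 2
        day -= 1
    return day

print(minutes_for_widgets(100, 100))  # 5
print(day_patch_covers_half())        # 47
```

The machine-to-widget ratio is unchanged, so the time is unchanged, and a single halving takes you back exactly one day.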

The bat and ball still gets me, though. My gut hasn’t in­ter­nal­ised any­thing use­ful, and it’s su­per keen on shout­ing out the wrong an­swer in a dis­tract­ing way. My dis­like for effort­ful thought is definitely a prob­lem here.

I wanted to see if oth­ers had raised the same ob­jec­tion, so I started do­ing some re­search into the CRT. In the pro­cess I dis­cov­ered a lot of fol­low-up work that makes the story much more com­plex and in­ter­est­ing.

I’ve come nowhere near to do­ing a proper liter­a­ture re­view. Fred­er­ick’s origi­nal pa­per has been cited nearly 3000 times, and dredg­ing through that for the good bits is a lot more work than I’m will­ing to put in. This is just a sum­mary of the in­ter­est­ing stuff I found on my limited, par­tial dig through the liter­a­ture.

Think­ing, in­her­ently fast and in­her­ently slow

Fred­er­ick’s origi­nal Cog­ni­tive Reflec­tion Test pa­per de­scribes the Sys­tem 1/​Sys­tem 2 di­vide in the fol­low­ing way:

Recognizing that the face of the person entering the classroom belongs to your math teacher involves System 1 processes — it occurs instantly and effortlessly and is unaffected by intellect, alertness, motivation or the difficulty of the math problem being attempted at the time. Conversely, finding $\sqrt{19163}$ to two decimal places without a calculator involves System 2 processes — mental operations requiring effort, motivation, concentration, and the execution of learned rules.

I find it in­ter­est­ing that he frames men­tal pro­cesses as be­ing in­her­ently effortless or effort­ful, in­de­pen­dent of the per­son do­ing the think­ing. This is not quite true even for the ex­am­ples he gives — face­blind peo­ple and calcu­lat­ing prodi­gies ex­ist.

This fram­ing is im­por­tant for in­ter­pret­ing the CRT. If the prob­lem in­her­ently has a wrong ‘Sys­tem 1 solu­tion’ and a cor­rect ‘Sys­tem 2 solu­tion’, the CRT can work as in­tended, as an effi­cient tool to split peo­ple by their propen­sity to use one strat­egy or the other. If there are ‘Sys­tem 1’ ways to get the cor­rect an­swer, the whole thing gets much more mud­dled, and it’s hard to dis­en­tan­gle nat­u­ral propen­sity to re­flec­tion from prior ex­po­sure to the right math­e­mat­i­cal con­cepts.

My tentative guess is that the bat and ball problem is close to being this kind of efficient tool. Although in some ways it’s the simplest of the three problems, solving it in a ‘fast’, ‘intuitive’ way relies on seeing the problem in a way that most people’s education won’t have provided. (I think this is true, anyway; I’ll go into more detail later.) I suspect that this is less true of the other two problems: ratios and exponential growth are topics that a mathematical or scientific education is more likely to build intuition for.

(Aside: I’d like to know how these other two prob­lems were cho­sen. The pa­per just states the fol­low­ing:

Mo­ti­vated by this re­sult [the an­swers to the bat and ball ques­tion], two other prob­lems found to yield im­pul­sive er­ro­neous re­sponses were in­cluded with the “bat and ball” prob­lem to form a sim­ple, three-item “Cog­ni­tive Reflec­tion Test” (CRT), shown in Figure 1.

I have a vague sus­pi­cion that Fred­er­ick trawled through some­thing like ‘The Bumper Book of An­noy­ing Rid­dles’ to find some brain­teasers that don’t re­quire too much in the way of math­e­mat­i­cal pre­req­ui­sites. The lily­pads one has a fam­ily re­sem­blance to the clas­sic grains-of-wheat-on-a-chess­board puz­zle, for in­stance.)

How­ever, I haven’t found any great ev­i­dence ei­ther way for this guess. The origi­nal pa­per doesn’t break down par­ti­ci­pants’ scores by ques­tion – it just gives mean scores on the test as a whole. I did how­ever find this meta-anal­y­sis of 118 CRT stud­ies, which shows that the bat and ball ques­tion is the most difficult on av­er­age – only 32% of all par­ti­ci­pants get it right, com­pared with 40% for the wid­gets and 48% for the lily­pads. It also has the biggest jump in suc­cess rate when com­par­ing uni­ver­sity stu­dents with non-stu­dents. That looks like bet­ter math­e­mat­i­cal ed­u­ca­tion does help on the bat and ball, but it doesn’t clear up how it helps. It could im­prove par­ti­ci­pants’ abil­ity to in­tu­itively see the an­swer. Or it could im­prove abil­ity to come up with an ‘un­in­tu­itive’ solu­tion, like solv­ing the cor­re­spond­ing si­mul­ta­neous equa­tions by a rote method.

What I’d re­ally like is some in­sight into what in­di­vi­d­ual peo­ple ac­tu­ally do when they try to solve the prob­lems, rather than just this ag­gre­gate statis­ti­cal in­for­ma­tion. I haven’t found ex­actly what I wanted, but I did turn up a few in­ter­est­ing stud­ies on the way.

No, se­ri­ously, the an­swer isn’t ten cents

My favourite thing I found was this (ap­par­ently un­pub­lished) ‘ex­tremely rough draft’ by Meyer, Spunt and Fred­er­ick from 2013, re­vis­it­ing the bat and ball prob­lem. The in­tu­itive-but-wrong an­swer turns out to be ex­tremely sticky, and the pa­per is ba­si­cally a se­ries of in­creas­ingly des­per­ate at­tempts to get peo­ple to ac­tu­ally think about the ques­tion.

One con­jec­ture for what peo­ple are do­ing when they get this ques­tion wrong is the at­tribute sub­sti­tu­tion hy­poth­e­sis. This was sug­gested early on by Kah­ne­man and Fred­er­ick, and is a fancy way of say­ing that they are in­stead solv­ing the fol­low­ing sim­pler prob­lem:

(1) A bat and a ball cost $1.10 in to­tal. The bat costs $1.00.

How much does the ball cost? _____ cents

No­tice that this is miss­ing the ‘more than the ball’ clause at the end, turn­ing the ques­tion into a much sim­pler ar­ith­metic prob­lem. This sim­ple prob­lem does have ‘ten cents’ as the an­swer, so it’s very plau­si­ble that peo­ple are get­ting con­fused by it.

Meyer, Spunt and Fred­er­ick tested this hy­poth­e­sis by get­ting re­spon­dents to re­call the prob­lem from mem­ory. This showed a clear differ­ence: 94% of ‘five cent’ re­spon­dents could re­call the cor­rect ques­tion, but only 61% of ‘ten cent’ re­spon­dents. It’s pos­si­ble that there is a differ­ent com­mon cause of both the ‘ten cent’ re­sponse and mis­re­mem­ber­ing the ques­tion, but it at least gives some sup­port for the sub­sti­tu­tion hy­poth­e­sis.

How­ever, get­ting peo­ple to ac­tu­ally an­swer the ques­tion cor­rectly was a much more difficult prob­lem. First they tried bold­ing the words more than the ball to make this clause more salient. This made sur­pris­ingly lit­tle im­pact: 29% of re­spon­dents solved it, com­pared with 24% for the origi­nal prob­lem. Print­ing both ver­sions was slightly more suc­cess­ful, bump­ing up the cor­rect re­sponse to 35%, but it was still a small effect.

After this, they ditched sub­tlety and re­sorted to past­ing these huge warn­ings above the ques­tion:

Computation warning: 'Be careful! Many people miss the following problem because they do not take the time to check their answer.'

Comprehension warning: 'Be careful! Many people miss the following problem because they read it too quickly and actually answer a different question than the one that was asked.'

These were still only mildly effective, with the rate of correct solutions rising from 45% to 50%. People just really like the answer ‘ten cents’, it seems.

At this point they com­pletely gave up and just flat out added “HINT: 10 cents is not the an­swer.” This worked rea­son­ably well, though there was still a hard core of 13% who per­sisted in writ­ing down ‘ten cents’.

That’s where they left it. At this point there’s not re­ally any room to es­ca­late be­yond con­fis­cat­ing the re­spon­dents’ pens and pre­filling in the an­swer ‘five cents’, and I worry that some­body would still try and scratch in ‘ten cents’ in their own blood. The wrong an­swer is just in­cred­ibly com­pel­ling.

So, what are peo­ple do­ing when they solve this prob­lem?

Unfortunately, it’s hard to tell from the published literature (or at least what I found of it). What I’d really like is lots of transcripts of individuals talking through their problem solving process. The closest I found was this paper by Szaszi et al, who did carry out this sort of interview, but it doesn’t include any examples of individual responses. Instead, it gives an aggregated overview of types of responses, which doesn’t go into the kind of detail I’d like.

Still, the ex­am­ples given for their re­sponse cat­e­gories give a few clues. The cat­e­gories are:

  • Cor­rect an­swer, cor­rect start. Ex­am­ple given: ‘I see. This is an equa­tion. Thus if the ball equals to x, the bat equals to x plus 1… ’

  • Cor­rect an­swer, in­cor­rect start. Ex­am­ple: ‘I would say 10 cents… But this can­not be true as it does not sum up to €1.10...’

  • In­cor­rect an­swer, re­flec­tive, i.e. some effort was made to re­con­sider the an­swer given, even if it was ul­ti­mately in­cor­rect. Ex­am­ple: ‘… but I’m not sure… If to­gether they cost €1.10, and the bat costs €1 more than the ball… the solu­tion should be 10 cents. I’m done.’

  • No re­flec­tion. Ex­am­ple: ‘Ok. I’m done.’

Th­ese demon­strate one way to rea­son your way to the cor­rect an­swer (solve the si­mul­ta­neous equa­tions) and one way to be wrong (just blurt out the an­swer). They also demon­strate one way to re­cover from an in­cor­rect solu­tion (think about the an­swer you blurted out and see if it ac­tu­ally works). Still, it’s all rather ab­stract and high level.
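The recovery step, checking whether the answer you blurted out actually works, is easy to spell out. Here is a small Python sketch; the function name and the choice to work in whole cents are my own assumptions, not anything from the paper.

```python
def check_ball_price(ball_cents, total_cents=110, difference_cents=100):
    """Return True if this ball price satisfies both conditions of the puzzle."""
    # The bat costs `difference_cents` more than the ball.
    bat_cents = ball_cents + difference_cents
    # Together they must add up to the stated total.
    return bat_cents + ball_cents == total_cents

for guess in (10, 5):
    bat = guess + 100
    print(f"ball = {guess}c, bat = {bat}c, total = {guess + bat}c, ok = {check_ball_price(guess)}")
```

With a ten cent ball the bat has to cost 110 cents, so the pair comes to 120 cents rather than 110, which is exactly the mismatch the ‘incorrect start’ respondent noticed.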

How To Solve It

How­ever, I did man­age to stum­ble onto an­other source of in­sight. While re­search­ing the prob­lem I came across this ar­ti­cle from the on­line mag­a­z­ine of the As­so­ci­a­tion for Psy­cholog­i­cal Science, which dis­cusses a var­i­ant ‘Ford and Fer­rari prob­lem’. This is quite in­ter­est­ing in it­self, but I was most ex­cited by the com­ments sec­tion. Fi­nally some ex­am­ples of how the prob­lem is solved in the wild!

The simplest ‘analytical’, ‘System 2’ solution is to rewrite the problem as two simultaneous linear equations and plug-and-chug your way to the correct answer. For example, writing $B$ for the price of the bat and $b$ for the price of the ball, both in cents, we get the two equations

$B + b = 110$, $B - b = 100$,

which we could then solve in various standard ways, e.g. by adding the two equations together:

$2B = 210$, $B = 105$,

which then gives

$b = 110 - B = 5$.

There are a cou­ple of var­i­ants of this ex­plained in the com­ments. It’s a very re­li­able way to tackle the prob­lem: if you already know how to do this sort of rote method, there are no sur­prises. This sort of method would work for any similar prob­lem in­volv­ing lin­ear equa­tions.
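To illustrate the ‘no surprises’ point, here is a rough Python sketch of the rote method for any problem of this sum-and-difference shape. The function name is made up for illustration, and I use exact fractions so that odd totals don’t pick up rounding errors.

```python
from fractions import Fraction

def solve_sum_and_difference(total, difference):
    """Solve bat + ball = total and bat - ball = difference by elimination."""
    # Adding the two equations eliminates the ball: 2 * bat = total + difference.
    bat = Fraction(total + difference, 2)
    # Substitute back into the first equation to recover the ball.
    ball = total - bat
    return bat, ball

bat, ball = solve_sum_and_difference(110, 100)
print(bat, ball)  # 105 5
```

Nothing here depends on the particular numbers, which is the strength of the rote approach: it needs no insight into this specific puzzle.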

How­ever, it’s pretty ob­vi­ous that a lot of peo­ple won’t have ac­cess to this method. Plenty of peo­ple noped out of math­e­mat­ics long be­fore they got to si­mul­ta­neous equa­tions, so they won’t be able to solve it this way. What might be less ob­vi­ous, at least if you mostly live in a high-maths-abil­ity bub­ble, is that these peo­ple may also be miss­ing the sort of tacit math­e­mat­i­cal back­ground that would even al­low them to frame the prob­lem in a use­ful form in the first place.

That sounds a bit ab­stract, so let’s look at some re­sponses (I’ll paste all these straight in, so any ty­pos are in the origi­nal). First, we have these two con­fused com­menters:

The thing is, why does the ball have to be $.05? It could have been .04 0r.03 and the bat would still cost more than $1.


This is ex­actly what both­ers me and re­sulted in me want­ing to look up the ques­tion on­line. On the quiz the other 2 ques­tions were defini­tive. This one tech­ni­cally could have more than one an­swer so this is where phy­col­o­gists ac­tu­ally mess up when try­ing to give us a trick ques­tion. The ball at .4 and the bat at 1.06 doesn’t break the rule ei­ther.

Th­ese com­menters don’t au­to­mat­i­cally see two equa­tions in two vari­ables that to­gether are enough to con­strain the prob­lem. In­stead they seem to fo­cus mainly on the first con­di­tion (adding up to $1.10) and just use the sec­ond one as a vague check at best (‘the bat would still cost more than $1’). This means that they are un­able to im­me­di­ately tell that the prob­lem has a unique solu­tion.

In response, another commenter, Tony, suggests a correct solution which is an interesting mix of writing the problem out formally and then figuring out the answer by trial and error:

I hear your pain. I feel as though psy­chol­o­gists and psy­chi­a­trists get to­gether ev­ery now and then to prove how stoopid I am. How­ever, af­ter more than a lit­tle head scratch­ing I’ve gained an un­der­stand­ing of this puz­zle. It can be ex­pressed as two facts and a ques­tion A=100+B and A+B=110, so B=? If B=2 then the solu­tion would be 100+2+2 and A+B would be 104. If B=6 then the solu­tion would be 100+6+6 and A+B would be 112. But as be KNOW A+B=110 the only num­ber for B on it’s own is 5.

This sug­gests enough half-re­mem­bered math­e­mat­i­cal knowl­edge to find a sen­si­ble ab­stract fram­ing, but not enough to solve it the stan­dard way.
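Tony’s approach translates quite directly into a brute-force search. The sketch below is my own paraphrase in Python rather than anything Tony wrote; it keeps his labels (A for the bat, B for the ball) in the comments and assumes prices are whole numbers of cents.

```python
def ball_price_by_trial(total_cents=110, difference_cents=100):
    """Try whole-cent ball prices in turn until bat + ball hits the total."""
    for ball in range(total_cents + 1):
        bat = difference_cents + ball       # fact 1: A = 100 + B
        if bat + ball == total_cents:       # fact 2: A + B = 110
            return ball
    return None  # no whole-cent solution exists

print(ball_price_by_trial())  # 5
```

Like Tony’s version, this needs the formal framing to set up the two facts, but no algebra at all to finish the job.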

Fi­nally, com­menter Marlo Eu­gene pro­vides an in­ge­nious way of solv­ing the prob­lem with­out writ­ing all the alge­braic steps out:

Lin­guis­tics makes all the differ­ence. The con­cep­tual em­pha­sis seems to lie within the word MORE.

X + Y = $1.10. If X = $1 MORE then that leaves $0.10 TO WORK WITH rather than au­to­mat­i­cally as­sign to Y

So you di­vide the re­main­der equally (as­sum­ing nega­tive val­ues are dis­qual­ified) and get 0.05.

So even this small sam­ple of com­ments sug­gests a wide di­ver­sity of prob­lem-solv­ing meth­ods lead­ing to the two com­mon an­swers. Fur­ther, these solu­tions don’t all split neatly into ‘Sys­tem 1’ ‘in­tu­itive’ and ‘Sys­tem 2’ ‘an­a­lytic’. Marlo Eu­gene’s solu­tion, for in­stance, is a mixed solu­tion of writ­ing the equa­tions down in a for­mal way, but then find­ing a clever way of just see­ing the an­swer rather than solv­ing them by rote.

I’d still ap­pre­ci­ate more de­tailed tran­scripts, in­clud­ing the time taken to solve the prob­lem. My sus­pi­cion is still that very few peo­ple solve this prob­lem with a fast in­tu­itive re­sponse, in the way that I rapidly see the cor­rect an­swer to the lily­pad ques­tion. Even the more ‘in­tu­itive’ re­sponses, like Marlo Eu­gene’s, seem to rely on some ini­tial care­ful re­flec­tion and a good ini­tial fram­ing of the prob­lem.

If I’m cor­rect about this lack of fast re­sponses, my ten­ta­tive guess for the rea­son is that it has some­thing to do with the way most of us learn si­mul­ta­neous equa­tions in school. We gen­er­ally learn ar­ith­metic as young chil­dren in a fairly con­crete way, with the for­mal nu­mer­i­cal prob­lems sup­ple­mented with lots of spe­cific ex­am­ples of adding up ap­ples and ba­nanas and so forth.

But then, for some rea­son, this goes com­pletely out of the win­dow once the un­known quan­tity isn’t sit­ting on its own on one side of the equals sign. This is in­stead hived off into its own sep­a­rate sub­ject, called ‘alge­bra’, and the rules are taught much later in a much more for­mal­ised style, with­out much at­tempt to build up in­tu­ition first.

(One exception is the sort of puzzle sheets that are often given to young kids, where the unknowns are just empty boxes to be filled in. Sometimes you get 2+3=□, sometimes it’s 2+□=5, but either way you go about the same process of using your wits to figure out the answer. Then, for some reason I’ll never understand, the worksheets get put away and the poor kids don’t see the subject again until years later, when the box is now called $x$ for some reason and you have to find the answer by defined rules. Anyway, this is a separate rant.)

This lack of a rich back­ground in puz­zling out the an­swer to spe­cific con­crete prob­lems means most of us lean hard on for­mal rules in this do­main, even if we’re rel­a­tively math­e­mat­i­cally so­phis­ti­cated. Only a few build up the nec­es­sary reper­toire of tricks to solve the prob­lem quickly by in­sight. I’m re­minded of a story in Feyn­man’s The Plea­sure of Find­ing Things Out:

Around that time my cousin, who was three years older, was in high school. He was hav­ing con­sid­er­able difficulty with his alge­bra, so a tu­tor would come. I was al­lowed to sit in a cor­ner while the tu­tor would try to teach my cousin alge­bra. I’d hear him talk­ing about x.

I said to my cousin, “What are you try­ing to do?”

“I’m try­ing to find out what x is, like in 2x + 7 = 15.”

I say, “You mean 4.”

“Yeah, but you did it by ar­ith­metic. You have to do it by alge­bra.”

I learned alge­bra, for­tu­nately, not by go­ing to school, but by find­ing my aunt’s old school­book in the at­tic, and un­der­stand­ing that the whole idea was to find out what x is—it doesn’t make any differ­ence how you do it.

I think this re­li­ance on for­mal meth­ods might be some­what less true for ex­po­nen­tial growth and ra­tios, the sub­jects un­der­pin­ning the lily­pad and wid­get ques­tions. Cer­tainly I seem to have bet­ter in­tu­ition there, with­out hav­ing to re­sort to rote calcu­la­tion. But I’m not sure how gen­eral this is.

How To Vi­su­al­ise It

If you wanted to solve the bat and ball prob­lem with­out hav­ing to ‘do it by alge­bra’, how would you go about it?

My origi­nal post on the prob­lem was a pretty quick, throw­away job, but over time it picked up some truly ex­cel­lent com­ments by an­ders and Kyzen­tun, which re­ally start to dig into the struc­ture of the prob­lem and sug­gest ways to ‘just see’ the an­swer. The thread with an­ders in par­tic­u­lar goes into lots of other ex­am­ples of how we think through solv­ing var­i­ous prob­lems, and is well worth read­ing in full. I’ll only sum­marise the bat-and-ball-re­lated parts of the com­ments here.

We all used some variant of the method suggested by Marlo Eugene in the comments above. Writing out the basic problem again, with $B$ for the price of the bat and $b$ for the price of the ball in cents, we have:

$B + b = 110$, $B - b = 100$.

Now, instead of immediately jumping to the standard method of eliminating one of the variables, we can just look at what these two equations are saying and solve it directly ‘by thinking’. We have a bat, $B$. If you add the price of the ball, $b$, you get 110 cents. If you instead remove the same quantity $b$ you get 100 cents. So the bat’s price must be exactly halfway between these two numbers, at 105 cents. That leaves five for the ball.

Now that I’m think­ing of the prob­lem in this way, I di­rectly see the equa­tions as be­ing ‘about a bat that’s halfway be­tween 100 and 110 cents’, and the an­swer is in­cred­ibly ob­vi­ous.
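Written out as a worked equation, the ‘halfway’ observation looks like this. The labels $B$ (bat) and $b$ (ball), both prices in cents, are just my shorthand here:

```latex
\begin{align*}
B &= \frac{(B + b) + (B - b)}{2} = \frac{110 + 100}{2} = 105, \\
b &= 110 - B = 5.
\end{align*}
```

The first line is nothing more than the statement that the bat sits at the midpoint of the sum and the difference.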

Kyzen­tun sug­gests a var­i­ant on the prob­lem that is much less coun­ter­in­tu­itive than the origi­nal:

A cen­tered piece of text and its mar­gins are 110 columns wide. The text is 100 columns wide. How wide is one mar­gin?

Same num­bers, same math­e­mat­i­cal for­mula to reach the solu­tion. But less mis­lead­ing be­cause you know there are two mar­gins, and thus know to di­vide by two af­ter sub­tract­ing.

In the origi­nal prob­lem, the 110 units and 100 units both re­fer to some­thing ab­stract, the sum and differ­ence of the bat and ball. In Kyzen­tun’s ver­sion these be­come much more con­crete ob­jects, the width of the text and the to­tal width of the mar­gins. The work of see­ing the equa­tions as re­lat­ing to some­thing con­crete has mostly been done for you.

Similarly, an­ders works the prob­lem by ‘get­ting rid of the 100 cents’, and split­ting the re­main­der in half to get at the price of the ball:

I just had an easy time with #1 which I haven’t be­fore. What I did was take away the differ­ence so that all the items are the same (sub­tract 100), evenly di­vide the re­main­der among the items (di­vide 10 by 2) and then add the resi­d­u­als back on to get 105 and 5.

The heuris­tic I seem to be us­ing is to treat ob­jects as made up of a value plus a resi­d­ual. So when they gave me the resi­d­ual my next thought was “now all the ob­jects are the same, so what­ever I do to one I do to all of them”.
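The ‘value plus residual’ heuristic that anders describes can be sketched in a few lines of Python. This is my own reading of the comment, with a hypothetical function name, and it assumes you are told each item’s residual relative to a shared base value.

```python
def split_value_plus_residual(total, residuals):
    """Share `total` among items where item i costs base + residuals[i]."""
    # Take away the residuals so that all the items are the same.
    remainder = total - sum(residuals)
    # Evenly divide what is left among the items.
    base = remainder / len(residuals)
    # Add each residual back on.
    return [base + r for r in residuals]

print(split_value_plus_residual(110, [100, 0]))  # [105.0, 5.0]
```

One appealing feature of this framing is that it carries over unchanged to more than two items, which the ‘halfway between’ picture does not obviously do.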

I think that af­ter rea­son­ing my way through all these per­spec­tives, I’m fi­nally at the point where I have a quick, ‘in­tu­itive’ un­der­stand­ing of the prob­lem. But it’s sur­pris­ing how much work it was for such a sim­ple bit of alge­bra.

Fi­nal thoughts

Rather than mak­ing any big con­clu­sions, the main thing I wanted to demon­strate in this post is how com­pli­cated the story gets when you look at one prob­lem in de­tail. I’ve writ­ten about close read­ing re­cently, and this has been some­thing like a close read­ing of the bat and ball prob­lem.

Fred­er­ick’s origi­nal pa­per on the Cog­ni­tive Reflec­tion Test is in that generic so­cial sci­ence style where you define a new met­ric and then see how it cor­re­lates with a bunch of other macroscale fac­tors (ei­ther big so­cial cat­e­gories like gen­der or ed­u­ca­tion level, or the re­sults of other statis­ti­cal tests that try to mea­sure fac­tors like time prefer­ence or risk prefer­ence). There’s a strange in­differ­ence to the de­tails of the test it­self – at no point does he dis­cuss why he picked those spe­cific three ques­tions, and there’s no at­tempt to model what was mak­ing the in­tu­itive-but-wrong an­swer ap­peal­ing.

The later pa­per by Meyer, Spunt and Fred­er­ick is much more in­ter­est­ing to me, be­cause it re­ally starts to pick apart the speci­fics of the bat and ball prob­lem. Is an eas­ier ques­tion get­ting sub­sti­tuted? Can par­ti­ci­pants re­pro­duce the cor­rect ques­tion from mem­ory?

I learned the most from the in­di­vi­d­ual re­sponses, though. This is where you re­ally get to see the va­ri­ety of ways that peo­ple tackle the prob­lem. Care­ful re­flec­tion definitely seems to im­prove the chance of a cor­rect an­swer in gen­eral, but many of the re­sponses don’t re­ally fit the neat ‘fast vs slow’ di­vi­sion of the origi­nal setup.


I’m in­ter­ested in any com­ments on the post, but here are a few spe­cific things I’d like to get your an­swers to:

  • My rapid, in­tu­itive an­swer for the bat and ball ques­tion is wrong (at least un­til I re­trained it by think­ing about the prob­lem way too much). How­ever, for the other two I ‘just see’ the cor­rect an­swer. Is this com­mon for other peo­ple, or do you have a differ­ent split?

  • If you’re able to rapidly ‘just see’ the an­swer to the bat and ball ques­tion, how do you do it?

  • How do peo­ple go about de­sign­ing tests like these? This isn’t at all my field and I’d be in­ter­ested in any good sources. I’d kind of as­sumed that there’d be some kind of se­ri­ous-busi­ness Test Creation Method­ol­ogy, but for the CRT at least it looks like peo­ple just no­ticed they got sur­pris­ing an­swers for the bat and ball ques­tion and looked around for similar ques­tions. Is that un­usual com­pared to other psy­cholog­i­cal tests?