# The Bat and Ball Problem Revisited

Cross posted from my per­sonal blog.

In this post, I’m go­ing to as­sume you’ve come across the Cog­ni­tive Reflec­tion Test be­fore and know the an­swers. If you haven’t, it’s only three quick ques­tions, go and do it now.

One of the strik­ing early ex­am­ples in Kah­ne­man’s Think­ing, Fast and Slow is the fol­low­ing prob­lem:

(1) A bat and a ball cost $1.10 in total. The bat costs $1.00 more than the ball.

How much does the ball cost? _____ cents

This ques­tion first turns up in­for­mally in a pa­per by Kah­ne­man and Fred­er­ick, who find that most peo­ple get it wrong:

Almost everyone we ask reports an initial tendency to answer “10 cents” because the sum $1.10 separates naturally into $1 and 10 cents, and 10 cents is about the right magnitude. Many people yield to this immediate impulse. The surprisingly high rate of errors in this easy problem illustrates how lightly System 2 monitors the output of System 1: people are not accustomed to thinking hard, and are often content to trust a plausible judgment that quickly comes to mind.

In Think­ing Fast and Slow, the bat and ball prob­lem is used as an in­tro­duc­tion to the ma­jor theme of the book: the dis­tinc­tion be­tween fluent, spon­ta­neous, fast ‘Sys­tem 1’ men­tal pro­cesses, and effort­ful, re­flec­tive and slow ‘Sys­tem 2’ ones. The ex­plicit moral is that we are too will­ing to lean on Sys­tem 1, and this gets us into trou­ble:

The bat-and-ball prob­lem is our first en­counter with an ob­ser­va­tion that will be a re­cur­rent theme of this book: many peo­ple are over­con­fi­dent, prone to place too much faith in their in­tu­itions. They ap­par­ently find cog­ni­tive effort at least mildly un­pleas­ant and avoid it as much as pos­si­ble.

This story is very com­pel­ling in the case of the bat and ball prob­lem. I got this prob­lem wrong my­self when I first saw it, and still find the in­tu­itive-but-wrong an­swer very plau­si­ble look­ing. I have to con­sciously re­mind my­self to ap­ply some ex­tra effort and get the cor­rect an­swer.

How­ever, this be­comes more com­pli­cated when you start con­sid­er­ing other tests of this fast-vs-slow dis­tinc­tion. Fred­er­ick later com­bined the bat and ball prob­lem with two other ques­tions to cre­ate the Cog­ni­tive Reflec­tion Test:

(2) If it takes 5 ma­chines 5 min­utes to make 5 wid­gets, how long would it take 100 ma­chines to make 100 wid­gets? _____ minutes

(3) In a lake, there is a patch of lily pads. Every day, the patch dou­bles in size. If it takes 48 days for the patch to cover the en­tire lake, how long would it take for the patch to cover half of the lake? _____ days

Th­ese are de­signed to also have an ‘in­tu­itive-but-wrong’ an­swer (100 min­utes, 24 days), and an ‘effort­ful-but-right’ an­swer (5 min­utes, 47 days). But this time I seem to be im­mune to the wrong an­swers, in a way that just doesn’t hap­pen with the bat and ball:

I always have the same re­ac­tion, and I don’t know if it’s com­mon or I’m just the lone idiot with this prob­lem. The ‘ob­vi­ous wrong an­swers’ for 2. and 3. are com­pletely un­ap­peal­ing to me (I had to look up 3. to check what the ob­vi­ous an­swer was sup­posed to be). Ob­vi­ously the ma­chine-wid­get ra­tio hasn’t changed, and ob­vi­ously ex­po­nen­tial growth works like ex­po­nen­tial growth.

When I see 1., how­ever, I always think ‘oh it’s that bas­tard bat and ball ques­tion again, I know the cor­rect an­swer but can­not see it’. And I have to stare at it for a minute or so to work it out, slowed down dra­mat­i­cally by the fact that Ob­vi­ous Wrong An­swer is jump­ing up and down try­ing to dis­tract me.

If this test was re­ally test­ing my propen­sity for effort­ful thought over spon­ta­neous in­tu­ition, I ought to score zero. I hate effort­ful thought! As it is, I score two out of three, be­cause I’ve trained my in­tu­itions nicely for ra­tios and ex­po­nen­tial growth. The ‘in­tu­itive’, ‘Sys­tem 1’ an­swer that pops into my head is, in fact, the cor­rect an­swer, and the sup­pos­edly ‘in­tu­itive-but-wrong’ an­swers feel bad on a visceral level. (Why the hell would the lily pads take the same amount of time to cover the sec­ond half of the lake as the first half, when the rate of growth is in­creas­ing?)

The bat and ball still gets me, though. My gut hasn’t in­ter­nal­ised any­thing use­ful, and it’s su­per keen on shout­ing out the wrong an­swer in a dis­tract­ing way. My dis­like for effort­ful thought is definitely a prob­lem here.
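For reference, the arithmetic behind the two ‘effortful-but-right’ answers can be checked mechanically. What follows is a minimal Python sketch, not anything taken from Frederick’s paper: the function names `minutes_needed` and `day_patch_covers` are made up for illustration, and it assumes that each machine works at a constant rate and that the patch exactly doubles every day.

```python
from fractions import Fraction


def minutes_needed(machines, widgets):
    """Time for `machines` machines to make `widgets` widgets.

    Assumes the rate implied by question 2: 5 machines make 5 widgets
    in 5 minutes, so one machine makes 1/5 of a widget per minute.
    """
    rate_per_machine = Fraction(5, 5 * 5)  # widgets per machine per minute
    return Fraction(widgets) / (machines * rate_per_machine)


def day_patch_covers(fraction, full_day=48):
    """Walk back from the day the lake is full, halving the patch each day."""
    coverage, day = Fraction(1), full_day
    while coverage > fraction:
        coverage /= 2
        day -= 1
    return day


print(minutes_needed(100, 100))          # 5
print(day_patch_covers(Fraction(1, 2)))  # 47
```

The machine-to-widget ratio is unchanged in the first function, and the second only ever takes a single step back from day 48, which is roughly what a trained intuition for ratios and doubling ‘just sees’.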

I wanted to see if oth­ers had raised the same ob­jec­tion, so I started do­ing some re­search into the CRT. In the pro­cess I dis­cov­ered a lot of fol­low-up work that makes the story much more com­plex and in­ter­est­ing.

I’ve come nowhere near to do­ing a proper liter­a­ture re­view. Fred­er­ick’s origi­nal pa­per has been cited nearly 3000 times, and dredg­ing through that for the good bits is a lot more work than I’m will­ing to put in. This is just a sum­mary of the in­ter­est­ing stuff I found on my limited, par­tial dig through the liter­a­ture.

# Think­ing, in­her­ently fast and in­her­ently slow

Fred­er­ick’s origi­nal Cog­ni­tive Reflec­tion Test pa­per de­scribes the Sys­tem 1/​Sys­tem 2 di­vide in the fol­low­ing way:

Recognizing that the face of the person entering the classroom belongs to your math teacher involves System 1 processes – it occurs instantly and effortlessly and is unaffected by intellect, alertness, motivation or the difficulty of the math problem being attempted at the time. Conversely, finding $\sqrt{19163}$ to two decimal places without a calculator involves System 2 processes – mental operations requiring effort, motivation, concentration, and the execution of learned rules.

I find it in­ter­est­ing that he frames men­tal pro­cesses as be­ing in­her­ently effortless or effort­ful, in­de­pen­dent of the per­son do­ing the think­ing. This is not quite true even for the ex­am­ples he gives — face­blind peo­ple and calcu­lat­ing prodi­gies ex­ist.

This fram­ing is im­por­tant for in­ter­pret­ing the CRT. If the prob­lem in­her­ently has a wrong ‘Sys­tem 1 solu­tion’ and a cor­rect ‘Sys­tem 2 solu­tion’, the CRT can work as in­tended, as an effi­cient tool to split peo­ple by their propen­sity to use one strat­egy or the other. If there are ‘Sys­tem 1’ ways to get the cor­rect an­swer, the whole thing gets much more mud­dled, and it’s hard to dis­en­tan­gle nat­u­ral propen­sity to re­flec­tion from prior ex­po­sure to the right math­e­mat­i­cal con­cepts.

My tentative guess is that the bat and ball problem is close to being this kind of efficient tool. Although in some ways it’s the simplest of the three problems, solving it in a ‘fast’, ‘intuitive’ way relies on seeing the problem in a way that most people’s education won’t have provided. (I think this is true, anyway – I’ll go into more detail later.) I suspect that this is less true of the other two problems: ratios and exponential growth are topics that a mathematical or scientific education is more likely to build intuition for.

(Aside: I’d like to know how these other two prob­lems were cho­sen. The pa­per just states the fol­low­ing:

Mo­ti­vated by this re­sult [the an­swers to the bat and ball ques­tion], two other prob­lems found to yield im­pul­sive er­ro­neous re­sponses were in­cluded with the “bat and ball” prob­lem to form a sim­ple, three-item “Cog­ni­tive Reflec­tion Test” (CRT), shown in Figure 1.

I have a vague sus­pi­cion that Fred­er­ick trawled through some­thing like ‘The Bumper Book of An­noy­ing Rid­dles’ to find some brain­teasers that don’t re­quire too much in the way of math­e­mat­i­cal pre­req­ui­sites. The lily­pads one has a fam­ily re­sem­blance to the clas­sic grains-of-wheat-on-a-chess­board puz­zle, for in­stance.)

How­ever, I haven’t found any great ev­i­dence ei­ther way for this guess. The origi­nal pa­per doesn’t break down par­ti­ci­pants’ scores by ques­tion – it just gives mean scores on the test as a whole. I did how­ever find this meta-anal­y­sis of 118 CRT stud­ies, which shows that the bat and ball ques­tion is the most difficult on av­er­age – only 32% of all par­ti­ci­pants get it right, com­pared with 40% for the wid­gets and 48% for the lily­pads. It also has the biggest jump in suc­cess rate when com­par­ing uni­ver­sity stu­dents with non-stu­dents. That looks like bet­ter math­e­mat­i­cal ed­u­ca­tion does help on the bat and ball, but it doesn’t clear up how it helps. It could im­prove par­ti­ci­pants’ abil­ity to in­tu­itively see the an­swer. Or it could im­prove abil­ity to come up with an ‘un­in­tu­itive’ solu­tion, like solv­ing the cor­re­spond­ing si­mul­ta­neous equa­tions by a rote method.

What I’d re­ally like is some in­sight into what in­di­vi­d­ual peo­ple ac­tu­ally do when they try to solve the prob­lems, rather than just this ag­gre­gate statis­ti­cal in­for­ma­tion. I haven’t found ex­actly what I wanted, but I did turn up a few in­ter­est­ing stud­ies on the way.

# No, se­ri­ously, the an­swer isn’t ten cents

My favourite thing I found was this (ap­par­ently un­pub­lished) ‘ex­tremely rough draft’ by Meyer, Spunt and Fred­er­ick from 2013, re­vis­it­ing the bat and ball prob­lem. The in­tu­itive-but-wrong an­swer turns out to be ex­tremely sticky, and the pa­per is ba­si­cally a se­ries of in­creas­ingly des­per­ate at­tempts to get peo­ple to ac­tu­ally think about the ques­tion.

One con­jec­ture for what peo­ple are do­ing when they get this ques­tion wrong is the at­tribute sub­sti­tu­tion hy­poth­e­sis. This was sug­gested early on by Kah­ne­man and Fred­er­ick, and is a fancy way of say­ing that they are in­stead solv­ing the fol­low­ing sim­pler prob­lem:

(1) A bat and a ball cost $1.10 in total. The bat costs $1.00.

How much does the ball cost? _____ cents

No­tice that this is miss­ing the ‘more than the ball’ clause at the end, turn­ing the ques­tion into a much sim­pler ar­ith­metic prob­lem. This sim­ple prob­lem does have ‘ten cents’ as the an­swer, so it’s very plau­si­ble that peo­ple are get­ting con­fused by it.

Meyer, Spunt and Fred­er­ick tested this hy­poth­e­sis by get­ting re­spon­dents to re­call the prob­lem from mem­ory. This showed a clear differ­ence: 94% of ‘five cent’ re­spon­dents could re­call the cor­rect ques­tion, but only 61% of ‘ten cent’ re­spon­dents. It’s pos­si­ble that there is a differ­ent com­mon cause of both the ‘ten cent’ re­sponse and mis­re­mem­ber­ing the ques­tion, but it at least gives some sup­port for the sub­sti­tu­tion hy­poth­e­sis.

How­ever, get­ting peo­ple to ac­tu­ally an­swer the ques­tion cor­rectly was a much more difficult prob­lem. First they tried bold­ing the words more than the ball to make this clause more salient. This made sur­pris­ingly lit­tle im­pact: 29% of re­spon­dents solved it, com­pared with 24% for the origi­nal prob­lem. Print­ing both ver­sions was slightly more suc­cess­ful, bump­ing up the cor­rect re­sponse to 35%, but it was still a small effect.

After this, they ditched sub­tlety and re­sorted to past­ing these huge warn­ings above the ques­tion:

Th­ese were still only mildly effec­tive, with a cor­rect solu­tion jump­ing to 50% from 45%. Peo­ple just re­ally like the an­swer ‘ten cents’, it seems.

At this point they com­pletely gave up and just flat out added “HINT: 10 cents is not the an­swer.” This worked rea­son­ably well, though there was still a hard core of 13% who per­sisted in writ­ing down ‘ten cents’.

That’s where they left it. At this point there’s not re­ally any room to es­ca­late be­yond con­fis­cat­ing the re­spon­dents’ pens and pre­filling in the an­swer ‘five cents’, and I worry that some­body would still try and scratch in ‘ten cents’ in their own blood. The wrong an­swer is just in­cred­ibly com­pel­ling.

# So, what are peo­ple do­ing when they solve this prob­lem?

Unfortunately, it’s hard to tell from the published literature (or at least what I found of it). What I’d really like is lots of transcripts of individuals talking through their problem solving process. The closest I found was this paper by Szaszi et al, who did carry out this sort of interview, but it doesn’t include any examples of individual responses. Instead, it gives an aggregated overview of types of responses, which doesn’t go into the kind of detail I’d like.

Still, the ex­am­ples given for their re­sponse cat­e­gories give a few clues. The cat­e­gories are:

• Cor­rect an­swer, cor­rect start. Ex­am­ple given: ‘I see. This is an equa­tion. Thus if the ball equals to x, the bat equals to x plus 1… ’

• Cor­rect an­swer, in­cor­rect start. Ex­am­ple: ‘I would say 10 cents… But this can­not be true as it does not sum up to €1.10...’

• In­cor­rect an­swer, re­flec­tive, i.e. some effort was made to re­con­sider the an­swer given, even if it was ul­ti­mately in­cor­rect. Ex­am­ple: ‘… but I’m not sure… If to­gether they cost €1.10, and the bat costs €1 more than the ball… the solu­tion should be 10 cents. I’m done.’

• No re­flec­tion. Ex­am­ple: ‘Ok. I’m done.’

Th­ese demon­strate one way to rea­son your way to the cor­rect an­swer (solve the si­mul­ta­neous equa­tions) and one way to be wrong (just blurt out the an­swer). They also demon­strate one way to re­cover from an in­cor­rect solu­tion (think about the an­swer you blurted out and see if it ac­tu­ally works). Still, it’s all rather ab­stract and high level.

# How To Solve It

How­ever, I did man­age to stum­ble onto an­other source of in­sight. While re­search­ing the prob­lem I came across this ar­ti­cle from the on­line mag­a­z­ine of the As­so­ci­a­tion for Psy­cholog­i­cal Science, which dis­cusses a var­i­ant ‘Ford and Fer­rari prob­lem’. This is quite in­ter­est­ing in it­self, but I was most ex­cited by the com­ments sec­tion. Fi­nally some ex­am­ples of how the prob­lem is solved in the wild!

The simplest ‘analytical’, ‘System 2’ solution is to rewrite the problem as two simultaneous linear equations and plug-and-chug your way to the correct answer. For example, writing $B$ for the bat and $b$ for the ball (both prices in cents), we get the two equations

$$B + b = 110, \quad B - b = 100,$$

which we could then solve in various standard ways, e.g.

$$2B = 210, \quad B = 105,$$

which then gives

$$b = 110 - B = 5.$$

There are a cou­ple of var­i­ants of this ex­plained in the com­ments. It’s a very re­li­able way to tackle the prob­lem: if you already know how to do this sort of rote method, there are no sur­prises. This sort of method would work for any similar prob­lem in­volv­ing lin­ear equa­tions.
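The rote method can be written down once and reused for any problem of this ‘sum and difference’ shape. Here is a minimal Python sketch under that assumption; the function name `solve_sum_and_difference` and its parameters are hypothetical, chosen just for this illustration, and prices are in cents.

```python
from fractions import Fraction


def solve_sum_and_difference(total, difference):
    """Solve bat + ball = total and bat - ball = difference by elimination.

    Adding the two equations gives 2 * bat = total + difference;
    subtracting them gives 2 * ball = total - difference.
    """
    bat = Fraction(total + difference, 2)
    ball = Fraction(total - difference, 2)
    return bat, ball


bat, ball = solve_sum_and_difference(110, 100)
print(bat, ball)  # 105 5
```

Exact fractions are used so that variants with odd totals still come out exactly rather than as rounded floats.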

How­ever, it’s pretty ob­vi­ous that a lot of peo­ple won’t have ac­cess to this method. Plenty of peo­ple noped out of math­e­mat­ics long be­fore they got to si­mul­ta­neous equa­tions, so they won’t be able to solve it this way. What might be less ob­vi­ous, at least if you mostly live in a high-maths-abil­ity bub­ble, is that these peo­ple may also be miss­ing the sort of tacit math­e­mat­i­cal back­ground that would even al­low them to frame the prob­lem in a use­ful form in the first place.

That sounds a bit ab­stract, so let’s look at some re­sponses (I’ll paste all these straight in, so any ty­pos are in the origi­nal). First, we have these two con­fused com­menters:

The thing is, why does the ball have to be $.05? It could have been .04 0r.03 and the bat would still cost more than $1.

and

This is ex­actly what both­ers me and re­sulted in me want­ing to look up the ques­tion on­line. On the quiz the other 2 ques­tions were defini­tive. This one tech­ni­cally could have more than one an­swer so this is where phy­col­o­gists ac­tu­ally mess up when try­ing to give us a trick ques­tion. The ball at .4 and the bat at 1.06 doesn’t break the rule ei­ther.

These commenters don’t automatically see two equations in two variables that together are enough to constrain the problem. Instead they seem to focus mainly on the first condition (adding up to $1.10) and just use the second one as a vague check at best (‘the bat would still cost more than $1’). This means that they are unable to immediately tell that the problem has a unique solution.
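To make the point about the second condition concrete, here is a small Python sketch that checks a candidate pair of prices against both conditions at once. The helper `satisfies_both` is hypothetical, prices are in cents, and reading the first commenter’s ‘.04’ as a 4 cent ball (with a 106 cent bat to make up the total) is my assumption.

```python
def satisfies_both(bat, ball, total=110, difference=100):
    """True only if the prices add up to `total` AND differ by exactly `difference`."""
    return bat + ball == total and bat - ball == difference


# A 4 cent ball with a 106 cent bat adds up to 110, but the bat is
# 102 cents more than the ball, not exactly 100 more.
print(satisfies_both(106, 4))  # False
print(satisfies_both(105, 5))  # True
```

Treating the second condition as ‘more than $1’ rather than ‘exactly $1 more’ is what lets several answers through.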

In response, another commenter, Tony, suggests a correct solution which is an interesting mix of writing the problem out formally and then figuring out the answer by trial and error:

I hear your pain. I feel as though psy­chol­o­gists and psy­chi­a­trists get to­gether ev­ery now and then to prove how stoopid I am. How­ever, af­ter more than a lit­tle head scratch­ing I’ve gained an un­der­stand­ing of this puz­zle. It can be ex­pressed as two facts and a ques­tion A=100+B and A+B=110, so B=? If B=2 then the solu­tion would be 100+2+2 and A+B would be 104. If B=6 then the solu­tion would be 100+6+6 and A+B would be 112. But as be KNOW A+B=110 the only num­ber for B on it’s own is 5.

This sug­gests enough half-re­mem­bered math­e­mat­i­cal knowl­edge to find a sen­si­ble ab­stract fram­ing, but not enough to solve it the stan­dard way.
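Tony’s approach translates quite directly into a brute-force search. The following is a rough Python sketch of that idea, with prices in cents; the function name `find_ball_by_trial` is made up for illustration, and restricting the search to whole-cent prices is my assumption rather than anything in the comment.

```python
def find_ball_by_trial(total=110, difference=100):
    """Try each whole-cent ball price and keep the ones that fit both facts.

    For each candidate ball price, the bat is fixed at `difference` more
    than the ball (Tony's A = 100 + B); the pair is kept only if the two
    prices then add up to `total` (A + B = 110).
    """
    matches = []
    for ball in range(total + 1):
        bat = difference + ball
        if bat + ball == total:
            matches.append((bat, ball))
    return matches


print(find_ball_by_trial())  # [(105, 5)]
```

A ball price of 2 gives a total of 104 and a price of 6 gives 112, as in the comment; only 5 lands exactly on 110.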

Fi­nally, com­menter Marlo Eu­gene pro­vides an in­ge­nious way of solv­ing the prob­lem with­out writ­ing all the alge­braic steps out:

Lin­guis­tics makes all the differ­ence. The con­cep­tual em­pha­sis seems to lie within the word MORE.

X + Y = $1.10. If X = $1 MORE then that leaves $0.10 TO WORK WITH rather than automatically assign to Y So you divide the remainder equally (assuming negative values are disqualified) and get 0.05.

So even this small sample of comments suggests a wide diversity of problem-solving methods leading to the two common answers. Further, these solutions don’t all split neatly into ‘System 1’ ‘intuitive’ and ‘System 2’ ‘analytic’. Marlo Eugene’s solution, for instance, is a mixed solution of writing the equations down in a formal way, but then finding a clever way of just seeing the answer rather than solving them by rote.

I’d still appreciate more detailed transcripts, including the time taken to solve the problem. My suspicion is still that very few people solve this problem with a fast intuitive response, in the way that I rapidly see the correct answer to the lilypad question. Even the more ‘intuitive’ responses, like Marlo Eugene’s, seem to rely on some initial careful reflection and a good initial framing of the problem.

If I’m correct about this lack of fast responses, my tentative guess for the reason is that it has something to do with the way most of us learn simultaneous equations in school. We generally learn arithmetic as young children in a fairly concrete way, with the formal numerical problems supplemented with lots of specific examples of adding up apples and bananas and so forth. But then, for some reason, this goes completely out of the window once the unknown quantity isn’t sitting on its own on one side of the equals sign. This is instead hived off into its own separate subject, called ‘algebra’, and the rules are taught much later in a much more formalised style, without much attempt to build up intuition first.

(One exception is the sort of puzzle sheets that are often given to young kids, where the unknowns are just empty boxes to be filled in. Sometimes you get 2+3=□, sometimes it’s 2+□=5, but either way you go about the same process of using your wits to figure out the answer. Then, for some reason I’ll never understand, the worksheets get put away and the poor kids don’t see the subject again until years later, when the box is now called $x$ for some reason and you have to find the answer by defined rules. Anyway, this is a separate rant.)

This lack of a rich background in puzzling out the answer to specific concrete problems means most of us lean hard on formal rules in this domain, even if we’re relatively mathematically sophisticated. Only a few build up the necessary repertoire of tricks to solve the problem quickly by insight. I’m reminded of a story in Feynman’s The Pleasure of Finding Things Out:

Around that time my cousin, who was three years older, was in high school. He was having considerable difficulty with his algebra, so a tutor would come. I was allowed to sit in a corner while the tutor would try to teach my cousin algebra. I’d hear him talking about x. I said to my cousin, “What are you trying to do?” “I’m trying to find out what x is, like in 2x + 7 = 15.” I say, “You mean 4.” “Yeah, but you did it by arithmetic. You have to do it by algebra.” I learned algebra, fortunately, not by going to school, but by finding my aunt’s old schoolbook in the attic, and understanding that the whole idea was to find out what x is – it doesn’t make any difference how you do it.

I think this reliance on formal methods might be somewhat less true for exponential growth and ratios, the subjects underpinning the lilypad and widget questions. Certainly I seem to have better intuition there, without having to resort to rote calculation. But I’m not sure how general this is.

# How To Visualise It

If you wanted to solve the bat and ball problem without having to ‘do it by algebra’, how would you go about it?

My original post on the problem was a pretty quick, throwaway job, but over time it picked up some truly excellent comments by anders and Kyzentun, which really start to dig into the structure of the problem and suggest ways to ‘just see’ the answer. The thread with anders in particular goes into lots of other examples of how we think through solving various problems, and is well worth reading in full. I’ll only summarise the bat-and-ball-related parts of the comments here.

We all used some variant of the method suggested by Marlo Eugene in the comments above. Writing out the basic problem again, with $B$ for the bat and $b$ for the ball (both in cents), we have:

$$B + b = 110, \quad B - b = 100.$$

Now, instead of immediately jumping to the standard method of eliminating one of the variables, we can just look at what these two equations are saying and solve it directly ‘by thinking’. We have a bat, $B$. If you add the price of the ball, $b$, you get 110 cents. If you instead remove the same quantity $b$ you get 100 cents. So the bat’s price must be exactly halfway between these two numbers, at 105 cents. That leaves five for the ball.

Now that I’m thinking of the problem in this way, I directly see the equations as being ‘about a bat that’s halfway between 100 and 110 cents’, and the answer is incredibly obvious.

Kyzentun suggests a variant on the problem that is much less counterintuitive than the original:

A centered piece of text and its margins are 110 columns wide. The text is 100 columns wide. How wide is one margin? Same numbers, same mathematical formula to reach the solution. But less misleading because you know there are two margins, and thus know to divide by two after subtracting.

In the original problem, the 110 units and 100 units both refer to something abstract, the sum and difference of the bat and ball. In Kyzentun’s version these become much more concrete objects, the width of the text and the total width of the margins. The work of seeing the equations as relating to something concrete has mostly been done for you.

Similarly, anders works the problem by ‘getting rid of the 100 cents’, and splitting the remainder in half to get at the price of the ball:

I just had an easy time with #1 which I haven’t before. What I did was take away the difference so that all the items are the same (subtract 100), evenly divide the remainder among the items (divide 10 by 2) and then add the residuals back on to get 105 and 5. The heuristic I seem to be using is to treat objects as made up of a value plus a residual. So when they gave me the residual my next thought was “now all the objects are the same, so whatever I do to one I do to all of them”.

I think that after reasoning my way through all these perspectives, I’m finally at the point where I have a quick, ‘intuitive’ understanding of the problem. But it’s surprising how much work it was for such a simple bit of algebra.

# Final thoughts

Rather than making any big conclusions, the main thing I wanted to demonstrate in this post is how complicated the story gets when you look at one problem in detail. I’ve written about close reading recently, and this has been something like a close reading of the bat and ball problem.

Frederick’s original paper on the Cognitive Reflection Test is in that generic social science style where you define a new metric and then see how it correlates with a bunch of other macroscale factors (either big social categories like gender or education level, or the results of other statistical tests that try to measure factors like time preference or risk preference). There’s a strange indifference to the details of the test itself – at no point does he discuss why he picked those specific three questions, and there’s no attempt to model what was making the intuitive-but-wrong answer appealing.

The later paper by Meyer, Spunt and Frederick is much more interesting to me, because it really starts to pick apart the specifics of the bat and ball problem. Is an easier question getting substituted? Can participants reproduce the correct question from memory?

I learned the most from the individual responses, though. This is where you really get to see the variety of ways that people tackle the problem. Careful reflection definitely seems to improve the chance of a correct answer in general, but many of the responses don’t really fit the neat ‘fast vs slow’ division of the original setup.

# Questions

I’m interested in any comments on the post, but here are a few specific things I’d like to get your answers to:

• My rapid, intuitive answer for the bat and ball question is wrong (at least until I retrained it by thinking about the problem way too much). However, for the other two I ‘just see’ the correct answer. Is this common for other people, or do you have a different split?

• If you’re able to rapidly ‘just see’ the answer to the bat and ball question, how do you do it?

• How do people go about designing tests like these? This isn’t at all my field and I’d be interested in any good sources. I’d kind of assumed that there’d be some kind of serious-business Test Creation Methodology, but for the CRT at least it looks like people just noticed they got surprising answers for the bat and ball question and looked around for similar questions. Is that unusual compared to other psychological tests?

• I’ve referenced the cognitive reflection test as one of those litmus tests of rationality, where I feel like any decent practice of rationality should get people to reliably answer the questions on that test. I found this to actually be the best coverage of the whole test, and its analysis of people’s reasoning to be a significant step up from what I’ve seen in other coverages of the test.

• Seconding Habryka. I’d really like to see this reviewed.

• I haven’t thought about the bat and ball question specifically very much since writing this post, but I did get a lot of interesting comments and suggestions that have sort of been rolling around my head in background mode ever since. Here’s a few I wanted to highlight:

Is the bat and ball question really different to the others? First off, it was interesting to see how much agreement there was with my intuition that the bat and ball question was interestingly different to the other two questions in the CRT. Reading through the comments I count four other people who explicitly agree with this (1, 2, 3, 4) and three who either explicitly disagree or point out that they find the widget problem hardest (5, 6, 7). I’d be intrigued to know if other people also disagree that the bat and ball feels different to them.

Concrete vs abstract quantities. Out of the people who agreed that the bat and ball is different, this comment from @awbery does a particularly good job of giving a potential explanation for why:

The problem is a ‘two things’ problem. The first sentence presents two things, a bat and a ball. The language correctly reflects there are two things we should consider. The first sentence is ‘this plus that equals $1.10’. It correctly sounds like a + b; two things. The first sentence presents the state of affairs, not the problem itself. The second sentence presents the problem. The language of the second sentence reinforces the two things idea because there’s still the bat and the ball and they’re compared against each other: ‘there’s this one and it’s more than that one’. The trickiness is that it is a two things problem, but the two things we need to consider are not the most object level single units, but the bat, and the bat-plus-ball. Our brains are pulled toward the object level division of things by the language and the visual nature of the problem. We have to think really hard to understand that the abstract construct of the problem is the same shape as the state of affairs – there are two things to consider in relation to each other – but while the bat and the ball are still involved, they’re reconfigured by a non-intuitive/non-object-like division.

There’s no ob­ject level mir­ror trick in the other two prob­lems, they’re straight for­ward maths map­ping an ob­ject level vi­sual rep­re­sen­ta­tion. The wid­get prob­lem pre­sents a pro­cess which doesn’t change how the ma­chines and wid­gets re­late to each other in its solu­tion. Our brains don’t have to mash up the pond and the lilies to sep­a­rate the vi­sual pre­sen­ta­tion to an ab­stract level. We can see that the pond is the same pond, half cov­ered with lilies then fully cov­ered with lilies at the next step. We don’t sud­denly have some new ab­stract un­real con­figu­ra­tion of lilies and pond to con­tend with.

I think this is why Kyzen­tun and An­der’s meth­ods help get at the bat and ball prob­lem in­tu­itively – be­cause they by­pass the con­flict be­tween ob­ject level and ab­stract and trans­late it into the for­mal alge­bra realm. The prob­lem as pre­sented is non-in­tu­itive be­cause the ob­jects vi­su­al­iza­tion it sug­gests doesn’t re­flect the shape of the for­mal solu­tion.

So I think this is a par­tic­u­lar type of prob­lem, one in which vi­sual shape and lan­guage of the pre­sen­ta­tion col­lude to obfus­cate the vi­su­al­iza­tion of the solu­tion at an ab­stract/​for­mal level. It’s a differ­ent type of prob­lem to the other two in this sense, be­cause the ob­jects they pre­sent can be used as given in the solu­tion.

Close­ness to cor­rect an­swer. Another in­ter­est­ing pos­si­bil­ity is in TheManxLoiner’s com­ment—that the bat and ball prob­lem is difficult be­cause the in­cor­rect an­swer is ‘close to the real one’, whereas for the other two prob­lems the in­cor­rect an­swer is ‘wildly off’. I’ve writ­ten a com­ment in re­sponse but I need to think about this more.

Eth­nomethod­ol­ogy. David Chap­man pointed out that these in­tro­spec­tive ac­counts of what peo­ple are think­ing when they solve maths prob­lems are very un­re­li­able, and that I’d prob­a­bly be bet­ter con­cen­trat­ing strictly on what peo­ple do, as in eth­nomethod­ol­ogy:

Yes, the fun­da­men­tal prin­ci­ple of eth­nomethod­olog­i­cal method­ol­ogy is “look at what peo­ple say and do, and don’t ever spec­u­late about what’s hap­pen­ing in their head, be­cause we can’t know.” At first that seems like a strait­jacket, and highly un­in­tu­itive; but it forces you to re­ally look, and then you see what is go­ing on.

This sounds promis­ing. I’m only just get­ting round to read­ing some eth­nomethod­ol­ogy, and I haven’t got my bear­ings yet.

Cog­ni­tive de­cou­pling. There’s a link with cog­ni­tive de­cou­pling (in Stanovich’s origi­nal sense) that could be worth ex­plor­ing fur­ther. Suc­cess in the bat and ball prob­lem seems to in­volve de­cou­pling from the noisy wrong an­swer. David Chap­man recom­mended For­mal Lan­guages in Logic by Du­tilh No­vaes for more back­ground on this. So far I’ve read maybe a third of it. I’ve also writ­ten a bit more about cog­ni­tive de­cou­pling and the his­tory of the term here.

Next steps. I’m not sure where I’m go­ing to take this next. Prob­a­bly nowhere much for a while, as I have other pri­ori­ties. But some op­tions are:

• An­ders came up with a load of similar prob­lems in the com­ments. Th­ese are de­signed to be cog­ni­tively un­pleas­ant in the same way as the bat and ball, so I keep putting them off. I should ac­tu­ally go through them!

• I’m go­ing to con­tinue read­ing Du­tilh No­vaes and some eth­nomethod­ol­ogy.

• Con­nect more speci­fi­cally to Stanovich’s idea of cog­ni­tive de­cou­pling.

Test­ing the­o­ries? Fur­ther out, it could be in­ter­est­ing to ac­tu­ally test some the­o­ries by try­ing al­ter­na­tive, dis­guised ver­sions of the ques­tion, on Me­chan­i­cal Turk or some­thing. Right now I’ve barely con­sid­ered this, be­cause I haven’t thought through what I’d want care­fully enough yet, but it might be in­ter­est­ing to test vari­a­tions in:

• how concrete the things the quantities refer to are (e.g. really concrete like ‘the price of the bat’, or more abstract like ‘the difference between the price of the bat and ball’). Some of Anders’ variant questions might fit the bill.

• how close in mag­ni­tude the in­tu­itive-but-wrong an­swer is, as in TheManxLoiner’s comment

I’m very ignorant about experiment design, so to do this I’d need to get help from someone more knowledgeable. And psych research sounds like a gigantic minefield even if you are knowledgeable, so I’d probably end up wasting my time. But probably I’d learn something from going through the process, and it’s something that could maybe happen in the future.

• It’s nice to see such an in-depth anal­y­sis of the CRT ques­tions. I don’t re­ally share dross­bucket’s in­tu­ition—for me the 100 wid­get ques­tion feels coun­ter­in­tu­itive the same way as the ball and bat ques­tion, but nei­ther feels re­ally aver­sive, so it was hard for me to ap­pre­ci­ate the feel­ings that gen­er­ated this post. But this gives a good ex­am­ple of an idea of “train­ing math­e­mat­i­cal in­tu­itions” I hadn’t thought about be­fore.

• My daughter is just starting to learn subtraction. She was very frustrated by it, and if I verbally asked “What’s seven minus five?” she was about 50% likely to give the right answer. I asked her a sequence of simple subtraction problems and she consistently performed at about that level. In the course of our back and forth I switched my phrasing to the form “You have seven apples and you take away five, how many left?” and she immediately started answering the questions 100% correctly, very rapidly too. Experimentally I switched back to the prior form and she started getting them wrong again. It was apparent to me that simply phrasing the problem in terms of concrete objects was activating something like visualization which made the problems easy, and just phrasing it as abstract numbers was failing to activate this switch. So as you say, for more tricky arithmetic problems, it may be the case that which mental circuits are “activated automatically” determines the first answer you arrive at, and you can exploit that effect with edge cases like this.

• Strangely, it can some­times also go the other way!

One of my most eye-open­ing teach­ing ex­pe­riences oc­curred when I was helping a six-year-old who was strug­gling with ba­sic ad­di­tion – or so it ap­peared. She was try­ing to work through a book that helped her to the con­cept of ad­di­tion via var­i­ous ex­am­ples such as “If Nel­lie has three ap­ples and is then given two more, how many ap­ples does she have?” The poor lit­tle girl didn’t have a clue.
How­ever, af­ter spend­ing a short time with her I dis­cov­ered that she could do 3+2 with no prob­lem what­so­ever. In fact, she had no trou­ble with ad­di­tion. She just couldn’t get her head around all these wretched ap­ples, cakes, mon­keys etc that were be­ing used to “ex­plain” the con­cept of ad­di­tion to her. She needed to work through the book al­most “back­wards” – I had to help her un­der­stand that adding up ap­ples was just an ex­am­ple of an ab­stract ad­di­tion she could do perfectly well! Her prob­lem was that all the books for six-year-olds went the other way round.

I think this is un­usual though.

• Ooh, I’d for­got­ten about that test, and how the beer ver­sion was much eas­ier—that would be an­other good one to read up on.

• I suspect that this is less true of the other two problems: ratios and exponential growth are topics that a mathematical or scientific education is more likely to build intuition for.

This seems to be con­tra­dicted by:

the bat and ball ques­tion is the most difficult on av­er­age – only 32% of all par­ti­ci­pants get it right, com­pared with 40% for the wid­gets and 48% for the lily­pads. It also has the biggest jump in suc­cess rate when com­par­ing uni­ver­sity stu­dents with non-stu­dents.
• Ah yeah, I meant to make this bit clearer and for­got.

I’m not re­ally sure what to make of that state­ment you put in ital­ics. The jump in suc­cess rate could be down to bet­ter trained in­tu­ition. It could also be due to bet­ter ac­cess to for­mal meth­ods. I don’t re­ally see it as good ev­i­dence for my guess ei­ther way.

If I get more time later I’ll edit the post.

• I didn’t “just see” the answers to the questions the first time I saw them, but neither would I say that I had to solve them entirely formally. It was more like docking a boat – the river keeps tugging at the tail end, until you feel the boat’s side touch the berth and know it has stopped. There’s a kind of natural inertia to this kind of puzzle.

Also, there is a kind of problem like “one wallet contains ten coins, another one contains twice more, and the total is twenty; explain” that gets asked much earlier than kids learn algebra, if I remember right. But it gets dismissed, in favour of cases where you must learn not to count the same bits of evidence twice (cough Bayes cough). I like to think this dismissal bites people in the backside when they learn Mendelian genetics (more easily seen when the genes in question interact hierarchically) or, Merlin forbid, mass-spectrometry, where the math difficulty is complicated by the chem difficulty of molecules not dividing into usual subunits.

Whew, I was think­ing to write a sep­a­rate post on this, but now I don’t have to! Profit!

• I have the same ex­pe­rience as you, dross­bucket: my rapid an­swer to (1) was the com­mon in­cor­rect an­swer, but for (2) and (3) my in­tu­ition is well-honed.

A pos­si­ble rea­son for this is that the in­tu­itive but in­cor­rect an­swer in (1) is a de­cent ap­prox­i­ma­tion to the cor­rect an­swer, whereas the com­mon in­cor­rect an­swers in (2) and (3) are wildly off the cor­rect an­swer. For (1) I have to ex­plic­itly do a calcu­la­tion to ver­ify the in­cor­rect­ness of the rapid an­swer, whereas in (2) and (3) my un­der­stand­ing of the situ­a­tion im­me­di­ately rules out the in­cor­rect an­swers.

Here are questions which might be similar to (1):

(4a) I booked seats J23 to J29 in a cin­ema. How many seats have I booked?

(4b) There is a 20m fence in which the fence posts are 2m apart. How many fence posts are there?

(4c) How many num­bers are there in this list: 200,201,202,203,204,...,300.

(5) In 24 hours, how many times do the hour-hand and minute-hand of a stan­dard clock over­lap?

(6) You are in a race and you just over­take sec­ond place. What is your new po­si­tion in the race?
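As a rough check on questions (4a) to (5), here is a minimal Python sketch (spoiler warning: running it prints the answers). It assumes a straight fence with a post at each end, and a clock watched over a 24-hour window that starts at 0:00 and excludes the final instant; `inclusive_count` is a helper name invented for this illustration.

```python
from fractions import Fraction


def inclusive_count(first, last, step=1):
    """Number of values in first, first + step, ..., last (both ends counted)."""
    return (last - first) // step + 1


seats = inclusive_count(23, 29)          # (4a) seats J23..J29
posts = inclusive_count(0, 20, step=2)   # (4b) posts at 0 m, 2 m, ..., 20 m (assumed layout)
numbers = inclusive_count(200, 300)      # (4c) 200, 201, ..., 300

# (5) The minute hand gains 11 laps on the hour hand every 12 hours,
# so the hands coincide every 12/11 hours, starting at 0:00.
gap = Fraction(12, 11)
overlaps = sum(1 for k in range(100) if k * gap < 24)

print(seats, posts, numbers, overlaps)
```

In each of the first three cases the count is one more than the plain subtraction suggests, which is the same "nearly right" flavour as the 10 cents answer.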

• A pos­si­ble rea­son for this is that the in­tu­itive but in­cor­rect an­swer in (1) is a de­cent ap­prox­i­ma­tion to the cor­rect an­swer, whereas the com­mon in­cor­rect an­swers in (2) and (3) are wildly off the cor­rect an­swer. For (1) I have to ex­plic­itly do a calcu­la­tion to ver­ify the in­cor­rect­ness of the rapid an­swer, whereas in (2) and (3) my un­der­stand­ing of the situ­a­tion im­me­di­ately rules out the in­cor­rect an­swers.

I must have missed this com­ment be­fore, sorry. This is a re­ally in­ter­est­ing point. Just to write it out ex­plic­itly,

(1) cor­rect an­swer: 5, in­cor­rect an­swer: 10
(2) cor­rect an­swer: 5, in­cor­rect an­swer: 100
(3) correct answer: 47, incorrect answer: 24

Now, for both (1) and (3) the wrong an­swer is off by roughly a fac­tor of two. But I also share your sense that the an­swer to (3) is ‘wildly off’, whereas the an­swer to (1) is ‘close enough’.

There are a cou­ple of pos­si­ble rea­sons for this. One is that 5 cents and 10 cents both just reg­ister as ‘some small change’, whereas 24 days and 47 days feel mean­ingfully differ­ent.

But also, it could be to do with rel­a­tive size com­pared to the other num­bers that ap­pear in the prob­lem setup. In (1), 5 and 10 are both similarly small com­pared to 100 and 110. In (3), 24 is small com­pared to 48, but 47 isn’t.
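To make the two comparisons concrete, here is a small Python sketch. It computes the factor by which each wrong answer is off, and the gap between right and wrong answers as a share of the largest number in the problem setup. Using the largest setup number as the yardstick is an assumption made for this illustration, not something taken from the CRT literature.

```python
# (problem, correct answer, intuitive-but-wrong answer, largest number in the setup)
problems = [
    ("bat and ball (cents)", 5, 10, 110),
    ("widgets (minutes)", 5, 100, 100),
    ("lily pads (days)", 47, 24, 48),
]

results = {}
for name, correct, wrong, scale in problems:
    # How many times too big or too small the wrong answer is.
    factor = max(correct, wrong) / min(correct, wrong)
    # Size of the error relative to the numbers the reader has just seen.
    gap_share = abs(correct - wrong) / scale
    results[name] = (factor, gap_share)
    print(f"{name}: off by x{factor:.2f}, gap is {gap_share:.0%} of {scale}")
```

On the first measure (1) and (3) look alike, but on the second the bat and ball error is tiny compared with the other two, which fits the 'close enough' feeling.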

There’s a var­i­ant ‘Ford and Fer­rari’ prob­lem that is some­what re­lated:

> A Ferrari and a Ford together cost $190,000. The Ferrari costs $100,000 more than the Ford. How much does the Ford cost?

So here we have cor­rect an­swer: 45000, in­cor­rect an­swer: 90000

Here the in­cor­rect an­swer feels some­what wrong, as the Ford is im­prob­a­bly close in price to the Fer­rari. Peo­ple ap­peared to do bet­ter on this mod­ified prob­lem than the bat and ball, but I haven’t looked into the de­tails.
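Both puzzles have the same shape: two prices with a known sum and a known exact difference. A minimal Python sketch of the algebra follows, working in whole units (cents for the bat and ball, dollars for the cars); the function name `solve_sum_difference` is made up for this example.

```python
def solve_sum_difference(total, difference):
    """Return (cheaper, dearer) given their sum and their exact difference.

    From  cheaper + dearer = total  and  dearer - cheaper = difference,
    subtracting the second from the first gives 2 * cheaper = total - difference.
    """
    doubled = total - difference
    if doubled < 0 or doubled % 2:
        raise ValueError("no whole-number solution for these inputs")
    cheaper = doubled // 2
    return cheaper, cheaper + difference


# Bat and ball, in cents.
ball, bat = solve_sum_difference(110, 100)
# Ford and Ferrari, in dollars.
ford, ferrari = solve_sum_difference(190_000, 100_000)
print(ball, bat, ford, ferrari)
```

The intuitive-but-wrong answers (10 and 90000) are what you get by subtracting the difference from the total and forgetting the final halving.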

• I do the bat and the ball prob­lem with low effort, and the ex­po­nen­tial growth one with no effort, but I find the ma­chines one a bit con­fus­ing.

For the bat and the ball, I do something similar to the margins example. I visualise an amount (represented by a length on the number line), I visualise a dollar higher than that amount on that number line, and then move them around so that they’re not overlapping, then see that the sum is $1.10. Then I realise there’s two identical bits that are added to the $1, which means they’re 10/2 each.

(Btw, the bit about them adding a hint and there still be­ing peo­ple who wrote 10 cents made me laugh out loud, that’s hilar­i­ous.)

I’m not sure how to visualise machines taking 5 mins to make 5 things. Do they all do a different bit of the job? Can they all work on one widget simultaneously, speeding that one up? I guess you’re expected to assume they each work on one widget. Okay, I guess that kind of makes sense, and is intuitive if that’s true.
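Under that reading (every machine works independently at the same constant rate, which is an assumption the puzzle leaves implicit), the arithmetic can be sketched in a few lines of Python. The names `RATE` and `minutes_needed` are invented for this illustration.

```python
from fractions import Fraction

# Rate implied by "5 machines take 5 minutes to make 5 widgets":
# widgets per machine per minute.
RATE = Fraction(5, 5 * 5)


def minutes_needed(machines, widgets):
    """Minutes for `machines` identical, independent machines to make `widgets` widgets."""
    return widgets / (machines * RATE)


print(minutes_needed(5, 5), minutes_needed(100, 100))
```

Scaling machines and widgets by the same factor leaves the time unchanged, which is why the tempting "100" is wrong.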

• I just wanted to say this was a re­ally fun read. I hadn’t con­sid­ered the mul­ti­ple ways peo­ple could get to the right or wrong an­swer.

• It’s been many years since I first saw this question, so my memories may not be accurate, but I think my internal thoughts went something like this: ‘Well 1.10 minus 1 is .10, but wait I know this is a trick question so … Ah! I also need to divide by 2. The answer is .05.’ And then I checked my answer by doing 1.05 + .05 and 1.05 - .05. Introspecting now on why I leaped to the idea of dividing by two, I think what I was seeing was something like: In this context “costs $1.00 more than” means Exactly $1 more than, so it’s saying that without the $1 the two things are equal and you need to divide the cost between them. This makes me think of ordinary real life contexts where I would say “costs $1.00 (or $20 or $100) more than.” It seems possible it might be clear to both me and my listener I meant ‘at least x more than,’ ‘as much as x more than,’ or ‘approximately x more than.’ I wonder if changing the wording to “The bat costs exactly $1.00 more than the ball” would help any.

• This is exactly what bothers me and resulted in me wanting to look up the question online. On the quiz the other 2 questions were definitive. This one technically could have more than one answer so this is where psychologists actually mess up when trying to give us a trick question. The ball at .04 and the bat at 1.06 doesn’t break the rule either.

Interesting: these could cover a couple of misunderstandings, one is that B>=100, the other that “The bat costs $1.00 more than the ball” does not mean B-b=100, but that B-b>=100.

In ordinary language, “that costs $1.00 more than the other one” is not incorrect if the difference is $1.01.

I sus­pect that per­son would have been cor­rected by say­ing “the bat costs pre­cisely one dol­lar more than the ball”
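The two readings can be compared by brute force. The Python sketch below lists every (ball, bat) pair in whole cents that adds up to $1.10 under the strict reading (B-b=100) and under the looser "at least" reading (B-b>=100). It assumes prices are whole cents and that the ball costs at least one cent; `solutions` is a name made up for this example.

```python
TOTAL = 110  # cents


def solutions(at_least):
    """(ball, bat) price pairs in whole cents that satisfy the puzzle.

    at_least=False: the bat costs exactly 100 cents more than the ball.
    at_least=True:  the bat costs 100 cents or more above the ball.
    """
    pairs = []
    for ball in range(1, TOTAL):
        bat = TOTAL - ball
        diff = bat - ball
        if diff == 100 or (at_least and diff > 100):
            pairs.append((ball, bat))
    return pairs


exact = solutions(at_least=False)
loose = solutions(at_least=True)
print(exact)
print(loose)
```

The strict reading has a single solution, while the loose reading admits several. Neither list contains a 10 cent ball, so the looser wording does not rescue the intuitive answer.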

• The bat and ball prob­lem I an­swer in what I’ll call one con­scious time-step with the cor­rect “five cents”, but it hap­pens too fast for me to ver­ify how (be­yond the usual trou­ble with ver­ify­ing in­ter­nal re­flec­tion). I would spec­u­late, in de­creas­ing or­der of in­tu­itive prob­a­bil­ity, that in or­der to get the an­swer, ei­ther (a) I’ve seen an ex­actly analo­gous “trick” prob­lem be­fore and am pat­tern-match­ing on that or (b) I’m do­ing the alge­bra quickly us­ing my seem­ingly well-de­vel­oped math­e­mat­i­cal in­tu­ition. I can also imag­ine (c) I’m leap­ing to the “wrong” an­swer, then try­ing to ver­ify it, notic­ing it’s wrong, and cor­rect­ing it, all in the same sub­con­scious flash, but that feels off. Imag­in­ing the “ten cents” an­swer doesn’t ac­tu­ally feel com­pel­ling; it just feels wrong. (It feels like a similar emo­tion to notic­ing I’ve got­ten the wrong amount of change, in fact.)

The wid­gets prob­lem I do a no­tice­able dou­ble-take on, but it’s rapidly cor­rected within one con­scious time-step; the “100” is a mo­men­tary flicker be­fore my brain set­tles on the cor­rect an­swer. Imag­in­ing “100” af­ter­wards feels wrong, but less im­me­di­ately so than “ten cents” did. It feels like I have a bias there to­ward an­swer­ing “how many wid­gets can you pro­duce in a fixed time” ques­tions, so I might have an echo of the mis­read­ing “how many wid­gets can 100 ma­chines pro­duce in [as­sumed to be the same amount of time as be­fore, since no con­trary time value is pre­sented to over­ride this]”.

The lily pads ques­tion takes me a con­scious time-step longer to an­swer than ei­ther of the other two; the ini­tial flash is “in­con­clu­sive”, and then I see my­self recheck­ing the part where the quan­tity dou­bles ev­ery step be­fore an­swer­ing “47”. (I no­tice I didn’t re­mem­ber that the steps were days, only re­mem­ber­ing that there was a time unit; I don’t know if that’s rele­vant.) Imag­in­ing “24” af­ter­wards feels some in­ter­me­di­ate level of wrong be­tween “ten cents” and “100”; my men­tal graph of the growth curve puts the ex­pected value 24 at “way too low” in­tu­itively be­fore I can com­pute the ac­tual ex­po­nent.
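The 'way too low' feeling about day 24 can be checked directly. This is a minimal Python sketch assuming the usual setup, where the patch doubles every day and covers the whole lake on day 48; `coverage` is a helper name invented here.

```python
from fractions import Fraction

FULL_DAY = 48  # day on which the patch covers the whole lake


def coverage(day):
    """Fraction of the lake covered on a given day, assuming daily doubling."""
    return Fraction(1, 2 ** (FULL_DAY - day))


# First day on which at least half the lake is covered.
half_day = next(d for d in range(1, FULL_DAY + 1) if coverage(d) >= Fraction(1, 2))
print(half_day, float(coverage(24)))
```

Working backwards one step from day 48 halves the coverage, so half the lake is covered on day 47. On day 24 the covered fraction is 1/2^24, far below a millionth of the lake.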

• How­ever, for the other two I ‘just see’ the cor­rect an­swer. Is this com­mon for other peo­ple, or do you have a differ­ent split?

For all three ques­tions, the wrong an­swer comes to my mind first*. But es­pe­cially in the con­text of ex­pect­ing a trick ques­tion, I sec­ond-guess it and come up with the cor­rect an­swer fairly quickly.

*In the third ques­tion, the ac­tual an­swer “24” does not come to mind first, but the gen­eral sense of “half that num­ber” does. My mind does not ac­tu­ally calcu­late what half of 48 is be­fore finish­ing think­ing through the prob­lem.

• I just saw the answer to the bat and ball problem within a few seconds. As I remember, my thought process was something like: Could it be 10 cents? No, that adds up to $1.20. So there’s an extra 10 cents – oh, of course, the difference between $1 and $1.10 has to be distributed evenly between both items, so the answer is 5 cents. I’ve taken a course that covered simultaneous equations, but my memory of it is hazy enough that I’m sure that method would’ve taken me much longer.

• I’m going to pull a reverse true scotsman here and say that is simultaneous equations. (When we think of ‘solving simultaneous equations’ we imagine people pulling the answer out, rather than pushing the solution in and seeing if it fits – solving versus checking as it were.)

• However, for the other two I ‘just see’ the correct answer. Is this common for other people, or do you have a different split?

I think I figured out and verified the answer to all 3 questions in 5-10 seconds each, when I first heard them (though I was exposed to them in the context of “Take the cognitive reflection test which people fail because the obvious answer is wrong”, which always felt like cheating to me). If I recall correctly, the third question was easier than the second question, which was easier than bat & ball: I think I generated the correct answer as a suggestion for 2 and 3 pretty much immediately (alongside the supposedly obvious answers), and I just had to check them. I can’t quite remember my strategy for bat & ball, but I think I generated the $0.1 ball, $1 bat answer, saw that the difference was $0.9 instead of $1, adjusted to $0.05, $1.05, and found that that one was correct.

• This is pretty much the same for me. I think the solution to bat and ball of “10 cents, oh no, that doesn’t work. Split the difference evenly for 5 cents? yup that’s better” is all done on system 1.

Kahneman’s examples of system 1 thinking include (I think) a Chess Grandmaster seeing a good chess move, so he includes the possibility of training your system 1 to be able to do more things. In the case of the OP, system 1 has been trained to really understand exponential growth and ratios. I think that for me “quickly check that your answer is right” and “try something vaguely sensible and see what happens” are both ingrained as general principles, so I don’t have to exert effort to apply them to simple problems.

A prob­lem which I would vol­un­teer for a CRT is the snail climb­ing out of a well. Here there’s an ob­vi­ous but wrong an­swer but I think if you re­al­ise that it’s wrong then the cor­rect an­swer isn’t too hard to figure out.
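The comment doesn’t give numbers for the snail puzzle, so the Python sketch below uses a commonly quoted version as a purely hypothetical example: a 30-foot well, with the snail climbing 3 feet each day and slipping back 2 feet each night. The function name and all three parameters are assumptions for illustration.

```python
def days_to_escape(depth, up_by_day, down_by_night):
    """Day on which the snail first reaches the top of the well."""
    if up_by_day <= down_by_night and up_by_day < depth:
        raise ValueError("the snail never makes net progress")
    height, day = 0, 0
    while True:
        day += 1
        height += up_by_day
        if height >= depth:
            return day  # out before the nightly slip happens
        height -= down_by_night


# Hypothetical numbers: 30-foot well, +3 feet by day, -2 feet by night.
print(days_to_escape(30, 3, 2))
```

The obvious answer divides the depth by the net gain of one foot per day; the simulation comes out lower because the snail reaches the top during a daytime climb and never slips back on the last night.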