Steven Byrnes comments on Rationality Research Report: Towards 10x OODA Looping?

Steven Byrnes 7 Mar 2024 2:57 UTC
4 points
0
something that’s really weirded me out about the literature on IQ, transfer learning, etc, is that… it seems like it’s just really hard to transfer learn. We’ve basically failed to increase g, and the “transfer learning demonstrations” I’ve heard of seemed pretty weaksauce.
You might be referring to the skeptical take on transfer learning, summarized as follows in Surfaces and Essences by Hofstadter & Sander:
Experimental studies have indeed demonstrated that subjects who are shown a source situation and who are then given a target situation are usually unable to see any connection between the two unless they share surface-level traits. Furthermore, in such experiments, when two situations have a superficial resemblance, then the second one invariably brings the first one to mind, no matter whether it is appropriate or not (that is, irrespective of whether there are deeper reasons to connect the two cases). For instance, if subjects first tackle an arithmetic problem concerning items bought in a store, then any other problem concerning purchases will instantly remind them of the initial problem. But if the theme of the first problem is experimentally manipulated (say it becomes a visit to a doctor’s office instead of a store), then the participants will almost surely see no link between the two stories, even if the solution method for the first problem applies perfectly to the second problem.
But then the authors argue that this skeptical take is misleading:
Unfortunately, the source–target [experimental] paradigm [in the studies above] has a serious defect that undermines the generality of the conclusions that experiments based upon it produce. This defect stems from the fact that the knowledge acquired about the source situation during the twenty minutes or so of a typical experiment is perforce very limited — often consisting merely in the application of a completely unfamiliar formula to a word problem. By contrast, when in real life we are faced with a new situation and have to decide what to do, the source situations we retrieve spontaneously and effortlessly from our memories are, in general, extremely familiar. We all depend implicitly on knowledge deeply rooted in our experiences over a lifetime, and this knowledge, which has been confirmed and reconfirmed over and over again, has also been generalized over time, allowing it to be carried over fluidly to all sorts of new situations. It is very rare that, in real life, we rely on an analogy to a situation with which we are barely familiar at all. To put it more colorfully, when it comes to understanding novel situations, we reach out to our family and our friends rather than to the first random passerby. But in the source–target paradigm, experimental subjects are required to reach out to a random passerby—namely, the one that was imposed on them as a source situation by the experimenter.
And so, what do the results obtained in the framework of this paradigm really demonstrate? What they show is that when people learn something superficially, they wind up making superficial analogies to it.
To rephrase: The problem is that, in the experimental protocol, the subjects only ever wind up with a crappy surface-level understanding of the source situation, not a deep mental model of the source situation reflective of true familiarity / expertise. When people do have real comfort and familiarity with the source situation, then they find deep structural analogies all over the place.
For example (these are my examples), if you talk to an economist about some weird situation, they will easily notice that there’s a supply-and-demand way to look at it, and ditto gains-from-trade and so on. And physicists will analogize random things to superpositions and Fourier space and so on. Of course, the main thing that everyone is an “expert” in is “intuitive everyday life stuff”, and hence our thinking and speech are full of constant non-surface-level analogies to traveling, seasons, ownership, arguments, etc.
I’m not sure if this is relevant to what you were saying, just thought I’d share.
What links here?
- Steven Byrnes's comment on Transfer Learning in Humans by niplav (22 Apr 2024 13:58 UTC; 10 points)
- Steven Byrnes's comment on Surprising LLM reasoning failures make me think we still need qualitative breakthroughs for AGI by Kaj_Sotala (17 Apr 2025 20:41 UTC; 6 points)
- Kaj_Sotala 23 Apr 2025 9:35 UTC
  4 points
  0
  Parent
  I’ve seen claims that the failure of transfer also goes in the direction of people with extensive practical experience and familiarity with math failing to apply it in a more formal context. From pp. 64-67 of Cognition in Practice:
  Like the AMP, the Industrial Literacy Project began with intensive observational work in everyday settings. From these observations (e.g. of preloaders assembling orders in the icebox warehouse) hypotheses were developed about everyday math procedures, for example, how preloaders did the arithmetic involved in figuring out when to assemble whole or partial cases, and when to take a few cartons out of a case or add them in, in order to efficiently gather together the products specified in an order. Dairy preloaders, bookkeepers and a group of junior high school students took part in simulated case loading experiments. Since standardized test data were available from the school records of the students, it was possible to infer from their performance roughly the grade-equivalent of the problems. Comparisons were made of both the performances of the various experimental groups and the procedures employed for arriving at problem solutions.
  A second study was carried out by cognitive psychologists investigating arithmetic practices among children selling produce in a market in Brazil (Carraher et al. 1982; 1983; Carraher and Schliemann 1982). They worked with four boys and a girl, from impoverished families, between 9 and 15 years of age, third to eighth grade in school. The researchers approached the vendors in the marketplace as customers, putting the children through their arithmetic paces in the course of buying bananas, oranges and other produce.
  M. is a coconut vendor, 12 years old, in the third grade. The interviewer is referred to as ‘customer.’
  Customer: How much is one coconut?
  M: 35.
  Customer: I’d like ten. How much is that?
  M: (Pause.) Three will be 105; with three more, that will be 210. (Pause) I need four more. That is … (pause) 315 … I think it is 350.
  The problem can be mathematically represented in several ways. 35 x 10 is a good representation of the question posed by the interviewer. The subject’s answer is better represented by 105 + 105 + 105 + 35, which implies that 35 x 10 was solved by the subject as (3 x 35) + 105 + 105 + 35 … M. proved to be competent in finding out how much 35 x 10 is, even though he used a routine not taught in 3rd grade, since in Brazil 3rd graders learn to multiply any number by ten simply by placing a zero to the right of that number. (Carraher, Carraher and Schliemann 1983: 8-9)
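  To make the arithmetic in the exchange above concrete, here is a minimal sketch of the chunked-addition routine M. seems to be using, next to the school rule of appending a zero. The function name `vendor_total` and the `chunk` parameter are hypothetical, introduced only for illustration; the source says nothing about how M. chose groups of three.

```python
def vendor_total(unit_price: int, quantity: int, chunk: int = 3):
    """Add up `quantity` items in groups of `chunk`, keeping a running total.

    Illustrative assumption: the vendor knows the price of one group
    (3 coconuts = 105) and repeats it, then adds the leftover items.
    """
    chunk_price = unit_price * chunk  # "Three will be 105"
    total = 0
    running_totals = []
    remaining = quantity
    # Add whole groups while enough items remain.
    while remaining >= chunk:
        total += chunk_price
        remaining -= chunk
        running_totals.append(total)
    # Add whatever is left over (here, the tenth coconut).
    if remaining:
        total += unit_price * remaining
        running_totals.append(total)
    return total, running_totals


total, running_totals = vendor_total(35, 10)
school_rule = int(str(35) + "0")  # "placing a zero to the right"
print(running_totals, total, school_rule)
```

  Under these assumptions the running totals come out as 105, 210, 315, 350, which are the numbers M. says aloud, and the final total agrees with the append-a-zero rule.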
  The conversation with each child was taped. The transcripts were analyzed as a basis for ascertaining what problems should appear on individually constructed paper and pencil arithmetic tests. Each test included all and only the problems the child attempted to solve in the market. The formal test was given about one week after the informal encounter in the market.
  Herndon, a teacher who has written eloquently about American schooling, described (1971) his experiences teaching a junior high class whose students had failed in mainstream classrooms. He discovered that one of them had a well-paid, regular job scoring for a bowling league. The work demanded fast, accurate, complicated arithmetic. Further, all of his students engaged in relatively extensive arithmetic activities while shopping or in after-school jobs. He tried to build a bridge between their practice of arithmetic outside the classroom and school arithmetic lessons by creating “bowling score problems,” “shopping problems,” and “paper route problems.” The attempt was a failure, the league scorer unable to solve even a simple bowling problem in the school setting. Herndon provides a vivid picture of the discontinuity, beginning with the task in the bowling alley:
  … eight bowling scores at once. Adding quickly, not making any mistakes (for no one was going to put up with errors), following the rather complicated process of scoring in the game of bowling. Get a spare, score ten plus whatever you get on the next ball, score a strike, then ten plus whatever you get on the next two balls; imagine the man gets three strikes in a row and two spares and you are the scorer, plus you are dealing with seven other guys all striking or sparing or neither one. I figured I had this particular dumb kid now. Back in eighth period I lectured him on how smart he was to be a league scorer in bowling. I pried admissions from the other boys, about how they had paper routes and made change. I made the girls confess that when they went to buy stuff they didn’t have any difficulty deciding if those shoes cost $10.95 or whether it meant $109.50 or whether it meant $1.09 or how much change they’d get back from a twenty. Naturally I then handed out bowling-score problems, and naturally everyone could choose which ones they wanted to solve, and naturally the result was that all the dumb kids immediately rushed me yelling, “Is this right? I don’t know how to do it! What’s the answer? This ain’t right, is it?” and “What’s my grade?” The girls who bought shoes for $10.95 with a $20 bill came up with $400.15 for change and wanted to know if that was right? The brilliant league scorer couldn’t decide whether two strikes and a third frame of eight amounted to eighteen or twenty-eight or whether it was one hundred eight and a half. (Herndon 1971: 94-95)
  People’s bowling scores, sales of coconuts, dairy orders and best buys in the supermarket were correct remarkably often; the performance of AMP participants in the market and simulation experiment has already been noted. Scribner comments that the dairy preloaders made virtually no errors in a simulation of their customary task, nor did dairy truck drivers make errors on simulated pricing of delivery tickets (Scribner and Fahrmeier 1982: 10, 18). In the market in Recife, the vendors generated correct arithmetic results 99% of the time.
  All of these studies show consistent discontinuities between individuals’ performances in work situations and in school-like testing ones. Herndon reports quite spectacular differences between math in the bowling alley and in a test simulating bowling score “problems.” The shoppers’ average score was in the high 50s on the math test. The market sellers in Recife averaged 74% on the pencil and paper test which had identical math problems to those each had solved in the market. The dairy loaders who did not make mistakes in the warehouse scored on average 64% on a formal arithmetic test.
  Both Claude and Perplexity claimed that these results have been consistently replicated, e.g. Perplexity’s answer included a link to a Nature paper from February:
  Children’s arithmetic skills do not transfer between applied and academic mathematics
  Many children from low-income backgrounds worldwide fail to master school mathematics¹; however, some children extensively use mental arithmetic outside school²,³. Here we surveyed children in Kolkata and Delhi, India, who work in markets (n = 1,436), to investigate whether maths skills acquired in real-world settings transfer to the classroom and vice versa. Nearly all these children used complex arithmetic calculations effectively at work. They were also proficient in solving hypothetical market maths problems and verbal maths problems that were anchored to concrete contexts. However, they were unable to solve arithmetic problems of equal or lesser complexity when presented in the abstract format typically used in school. The children’s performance in market maths problems was not explained by memorization, access to help, reduced stress with more familiar formats or high incentives for correct performance. By contrast, children with no market-selling experience (n = 471), enrolled in nearby schools, showed the opposite pattern. These children performed more accurately on simple abstract problems, but only 1% could correctly answer an applied market maths problem that more than one third of working children solved (β = 0.35, s.e.m. = 0.03; 95% confidence interval = 0.30–0.40, P < 0.001). School children used highly inefficient written calculations, could not combine different operations and arrived at answers too slowly to be useful in real-life or in higher maths. These findings highlight the importance of educational curricula that bridge the gap between intuitive and formal maths.
  - Raemon 23 Apr 2025 16:51 UTC
    2 points
    0
    Parent
    This sounds like it might be a Paper Trauma-ish thing, which might have a different specific mechanism.
    - Kaj_Sotala 23 Apr 2025 18:56 UTC
      2 points
      0
      Parent
      The bit in the Nature paper saying that the formal → practical direction goes about as badly as the practical → formal direction would suggest that it’s at least not only that. (I only read the abstract of it, though.)
      - Raemon 24 Apr 2025 0:55 UTC
        2 points
        0
        Parent
        I do agree it’s suggestive; I’d be interested to see practical → different practical.
- Raemon 8 Mar 2024 1:15 UTC
  4 points
  0
  Parent
  I was going off a vague sense from having talked to a few people who had scanned the literature more than I.
  Right now I’m commissioning a lit review about “transfer learning”, “meta learning”, and things similar to that. My sense so far is that there aren’t a lot of super impressive results, but part of that looks like it’s because it’s hard to teach people relevant stuff in a “laboratory”-esque setting.
