This type of learning does not necessarily seem like augmenting a car for space travel (maybe it is).
In case you misunderstood, the analogy I was making was air travel : space travel :: in-distribution learning : out-of-distribution learning. I was not claiming that getting an LLM to learn an OOD thing is like tuning an airplane into a space rocket. But, as I said, it’s not a great analogy.
They are not having to learn all about games and dice and boards and pieces from scratch. They mostly have to map existing learned models into a slightly novel combination for a slightly new domain. I’m not saying that’s a trivial thing to do, because it’s a hard open problem that many, many smart people have been trying to crack for decades.
This seems more like a within-distribution problem: the player is encountering a game composed of pieces that are very much like the pieces of the games they’ve previously encountered, and the rules follow a similar logic. I expect that if you invent some game with simple rules that is a not-very-well-thought-through mash-up of checkers, chess, shogi, go, and the game of Ur, Claude 4.6 will get it.
A better example might be going from normal board games to Baba Is You or something. The ontology (or meta-ontology?) of Baba Is You is very different from that of the vast majority of board games. It’s not like you’re inventing everything from scratch. Old stuff transfers. Someone who has played some games will generally have an easier time learning to play Baba Is You than someone who has never played any. But some of it transfers in a non-straightforward way, and if you don’t do it right, it breaks.
But it does not seem as daunting as you are portraying it. Yes, out-of-distribution is a very large space. But there’s an awful lot of that space that we’re simply not interested in learning anyway, so that narrows it down quite a lot.
I wouldn’t call it “daunting”. It’s just … a meaningfully different kind of beast?
But I also don’t see how us not caring about most of the space is supposed to make it easier.
If you want to figure out which one of 1000 hypotheses is the correct one (in some classification problem, say), you don’t care about the other 999, but that doesn’t help.
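To gesture at why (just a back-of-the-envelope sketch with made-up numbers, nothing rigorous): singling out the one correct hypothesis among N roughly-equally-plausible ones takes on the order of log2(N) bits of evidence, and being indifferent to the wrong hypotheses doesn’t reduce that cost, because you still have to rule them out.

```python
import math

# Toy illustration: the evidence needed to single out the 1 correct
# hypothesis among N roughly-equally-plausible ones is ~log2(N) bits.
# "Not caring" about the other N-1 hypotheses doesn't shrink this;
# they still have to be ruled out before the correct one is identified.
N = 1000
print(f"bits of evidence needed: {math.log2(N):.1f}")  # ~10.0
```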
If you mean that we only need to extrapolate to some nearby-ish regions of the training distribution, and that we don’t care about most of those nearby-ish regions, then it seems to me like you’re looking for some specialized hacks, and I don’t think specialized hacks will work in general / take you “far”. (Feel free to correct me if I’m misinterpreting you.)
This seems more like a within-distribution problem: the player is encountering a game composed of pieces that are very much like the pieces of the games they’ve previously encountered, and the rules follow a similar logic.
Well, that’s one of the big questions, isn’t it? Seems fairly clear there’s no hard boundary between in-distribution and out-of-distribution. Is the cure for cancer and the way to discover it going to be completely OOD? Or is it going to lean heavily on existing knowledge of cell biology, genetics, and all previous cancer research? The common phrasing is ‘standing on the shoulders of giants’. This is pretty well accepted as the way new inventions and discoveries happen: not as radically alien knowledge that emerges from a vacuum, but as an incremental step up using a mountain of existing knowledge (analogous to a game composed of pieces very much like ones they’ve previously encountered). Very large discoveries or paradigm shifts are likely more OOD, but the vast bulk of new science is fairly incremental and, I would think, the sort of problem you’d consider within-distribution. No?
Seems fairly clear there’s no hard boundary between in-distribution and out-of-distribution.
Yeah, this is a vague description of the most salient failure mode of LLMs’ capabilities, but its vagueness (or maybe: the low resolution of our understanding of this phenomenon) doesn’t make it less real, less significant, or easier to overcome.
Is the cure for cancer and the way to discover it going to be completely OOD? Or is it going to lean heavily on existing knowledge of cell biology, genetics, and all previous cancer research?
A mosaic of both. But I also expect that OOD-ish reasoning is common in normal humans, and that if you somehow stuck Claude 4.6 in a human body and tasked it with leading a normal human life, it would start doing something weirdly stupid by human standards within the first 1-2 hours, and that, if uncorrected (be it by whoever is overseeing that LLM in a human body or by other social forces taking care of a weirdly behaving cyborg), those stupid things would cascade over time.
The common phrasing is ‘standing on the shoulders of giants’. This is pretty well accepted as the way new inventions and discoveries happen: not as radically alien knowledge that emerges from a vacuum, but as an incremental step up using a mountain of existing knowledge (analogous to a game composed of pieces very much like ones they’ve previously encountered).
Never did I claim that “OOD-ish reasoning”/“true creativity” is about summoning new knowledge from a vacuum. In my previous comment, I wrote: “Old stuff transfers. [...] But some of it transfers in a non-straightforward way, and if you don’t do it right, it breaks.”
Very large discoveries or paradigm shifts are likely more OOD, but the vast bulk of new science is fairly incremental and, I would think, the sort of problem you’d consider within-distribution. No?
Sure. AlphaFold and LLMs solving open math problems are examples of this.
I sense that you’re intending this comment to imply/suggest something, but I don’t know what.