If someone asks me “what’s the least impressive thing you think AI won’t be able to do by 20XX”, I give answers like “make lots of original interesting math concepts”. (See https://www.lesswrong.com/posts/sTDfraZab47KiRMmT/views-on-when-agi-comes-and-on-strategy-to-reduce#comments) People sometimes say “well that’s a pretty impressive thing, you’re talking about the height of intellectual labor”.
A main reason I give examples like this is that math is an area where it’s feasible for there to be a legible absence of legibilization. (By legible, I mean interpersonal explicitness: https://www.lesswrong.com/posts/KuKaQEu7JjBNzcoj5/explicitness .) Mathematicians are supposed to make interesting novel definitions that legibilize inexplicit ideas / ways of thinking. If they don’t, you can tell that their publications are not so interesting. It is legible that they failed to legibilize.
In fact I suspect there will be many much “easier” feats that AI won’t be able to do for a while (some decades). Easier, in the sense that many more humans are able to do those feats. Much harder, in the sense that they require creativity, and therefore require having the algorithm-pieces for creativity. That’s easy for humans because it’s our birthright, but “hard” for AI because it doesn’t have that yet. Lots of little novel “knacks”, of the sort people can pick up; surprising connections or analogies; solving lots of little problems, or coming up with a way of seeing some sort of situation that makes it make sense.
But knacks, insights, predictions, inventions—these are not categories. I’m not saying “AI can’t make predictions” or “AI can’t learn knacks”, because that would be nonsensical, because those aren’t categories. I’m saying that some of the things humans do require broad-spectrum creativity; but it’s hard to describe those things as a task, and so I don’t give them as an answer to the question about what AI can’t do. (If a task is stereotyped, not that hard, has clear feedback, and has lots of demonstrations—all things correlated with legibility—then probably AI will be able to do it soon!)
Definition: Let n be a positive integer. We define an n-cohesive ring to be a commutative ring S such that, for every prime p dividing the characteristic of S, p^n divides the order of the multiplicative group S^×. We define an n-cohesive ideal of a ring R to be an ideal I of R such that the quotient ring R/I is an n-cohesive ring.
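To make the definition concrete: Z/8Z has characteristic 8 and unit group of order φ(8) = 4 = 2^2, so it is 2-cohesive but not 3-cohesive, and correspondingly the ideal 8Z is a 2-cohesive ideal of Z. Here is a minimal sketch of that check for rings of the form Z/mZ (the function name and the brute-force totient are just illustrative):

```python
from math import gcd

def is_n_cohesive_zmod(m: int, n: int) -> bool:
    """Check whether Z/mZ is an n-cohesive ring (m >= 2).

    Z/mZ has characteristic m and its unit group has order phi(m),
    so the definition asks: does p**n divide phi(m) for every prime
    p dividing m?
    """
    # Euler's totient of m, by brute force (fine for small m).
    phi = sum(1 for k in range(1, m + 1) if gcd(k, m) == 1)
    # Primes dividing the characteristic m, by trial division.
    primes, d, rest = set(), 2, m
    while d * d <= rest:
        if rest % d == 0:
            primes.add(d)
            while rest % d == 0:
                rest //= d
        d += 1
    if rest > 1:
        primes.add(rest)
    return all(phi % (p ** n) == 0 for p in primes)

# Z/8Z: phi(8) = 4 = 2**2, so 2-cohesive but not 3-cohesive;
# equivalently, 8Z is a 2-cohesive ideal of Z.
assert is_n_cohesive_zmod(8, 2)
assert not is_n_cohesive_zmod(8, 3)
```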
IIRC, GPT-3 invented this concept. It is at least non-trivial, and somewhat interesting according to @cohenmacaulay, who shared this definition with us in “Bad at Arithmetic, Promising at Math”.
That was almost 3 years ago.
If there’s not a better example by now, it was probably a fluke.
Then again, even humans aren’t super reliable at making big insights, and it’s not trivial even for very top-tier humans to actually be both creative and have significant impact in the world.
To be clear, I think GPT-n is worse than humans in this regard. But it’s not generally good practice to compare humans and AIs by trying to show that an AI or a human can or can’t do something at all, and treating capabilities as very discrete, such that an AI either has a capability or doesn’t have it at all, has done harm to AI discourse.
There’s some reason to think in terms of thresholds (for example, long tails requiring high reliability), but in general I’m much more skeptical of the need to attribute a deep reason/cause for why current AI might fail to automate AI research or take over the world, and I think that if LLMs stall out, a lot of the reason will be pretty prosaic.
Link below:
https://www.lesswrong.com/posts/Nbcs5Fe2cxQuzje4K/value-of-the-long-tail
No, humans do this all the time, constantly, originarily (https://www.lesswrong.com/posts/5tqFT3bcTekvico4d/do-confident-short-timelines-make-sense#Creativity___Originariness) when they are kids. They keep using roughly the same set of faculties on harder and harder problems, including sometimes making globally novel insights. Gippities learn in a different way which does not go on to do that. You can be helped in noticing that it’s a different way via sample complexity.
I am quite surprised that this happened 3 years ago! This seems really impressive for the GPT series of 3 years ago? And I expect the models to get better? Yes, it might be a fluke, but wouldn’t we expect current models to have a higher chance of producing a fluke this good?
Then why isn’t there a better example from a year ago?
I think this puts a lot of weight on a bespoke definition of “interesting”, and that kind of obscures what you’re saying. I feel similarly about your use of the concept of creativity.
I think that current LLMs are extremely “creative” for many plausible definitions of that word, so I guess it doesn’t really carve things at the joints for me. Visual art, stories, plausible baby names, what sorts of recipes you can try with x ingredients. All written-on-the-tin use cases for these things.
I do not believe that LLMs think very much in the manner that we do at all. I just don’t think I would pitch that as lacking some true spark of creativity or something. It is too opaque to me what you’re saying.
So basically you just don’t think creativity is a thing? That’s one impasse we could be at. What I mean is gestured at here:
https://tsvibt.blogspot.com/2022/08/structure-creativity-and-novelty.html
More discussion here:
https://tsvibt.blogspot.com/2023/01/the-voyage-of-novelty.html
https://tsvibt.blogspot.com/2023/01/endo-dia-para-and-ecto-systemic-novelty.html
https://tsvibt.blogspot.com/2023/01/a-strong-mind-continues-its-trajectory.html
Hey, thanks for engaging.
I read what I thought were the relevant excerpts in what you linked there. I hadn’t really crossed paths with you before, but you seem to have a rich ontology and lexicon when it comes to theory of mind.
I am not sure if that pinpoints the disagreement or not. We might just be talking past each other. I’ll tell you what I think creativity is and then I’ll restate my objection to your prediction.
I do think “creativity” is a useful word, just maybe not a load-bearing one in my ontology.
Like, if I really like a story and it has a lot of unexpected elements that I think it uses really well, that is what I might call creative. Or anything like that, if it feels novel, exciting, clever, that sort of thing… Maybe if someone were giving something high praise and wanted to say it is very deep and clever, they could say it is “very creative”. Especially if it was artistic or novel.
Also sometimes when a lot of wacky things are just thrown together, even if it’s not that clever. Like, when a kid combines a lot of elements into their pretend world or story.
Ya, I know it has something to do with a mind’s ability to keep learning and improving. Your “trajectory of creativity” concept is about a mind’s ability to continue to improve beyond the minds around it. I don’t resonate with those usages as much, but I can also kind of understand where it’s coming from and how you’re using the word.
I think my original objection / pushback was partly that it feels hard to operationalize this, because what you find interesting is kind of just your thing and it doesn’t seem like a meaningful proxy for intelligence or something. I guess I would add that surely some people are already impressed and interested with some math ideas that chatbots can come up with. Also, perhaps if you could visualize extremely high-dimensional spaces, you would think that AlphaEvolve’s proofs were beautiful, elegant, and crisp. I’m not saying there’s no information/signal in what you’re saying; I just found it left a lot unclear for me when I first read it, I guess.
I get that LLMs clearly aren’t as good at publishing new top-tier math papers or whatever. I guess, gun to my head, I would put most of that down to, like… lacking many of the cognitive abilities needed to independently execute on large-scale, messy tasks. Or some mix of attributes like that. I would also expect them to have really bad vibes-based planning abilities… Plus, by the time off-the-shelf AIs can write math papers of a given quality tier, the goalposts for interestingness will move accordingly… Maybe there is something to the idea that they are not generally inclined towards effing the ineffable and carving structure from reality, but also I doubt they’d have trouble with e.g. neologisms.
I agree that it’s hard to operationalize; that’s part of what my OP was saying. And then I think it’s relatively easier to operationalize in mathematics, where it is in large part explicitly about creativity in my sense (but maybe not especially much in your sense). So that’s where I’m getting my prediction; if you don’t see what I mean by creativity, or wouldn’t make the same prediction, then fair enough, we’ll have to agree to disagree.