I currently work for Palisade Research as a generalist and strategy director, and for the Survival and Flourishing Fund as a grant round facilitator.
I’ve been personally and professionally involved with the rationality and x-risk mitigation communities since 2015, most notably working at CFAR from 2016 to 2021 as an instructor and curriculum developer. I’ve also done contract work for MIRI, Lightcone, BERI, the Atlas Fellowship, etc.
I’m the single person in the world who has done the most development work on the Double Crux technique, and I’ve explored other frameworks for epistemically resolving disagreements and bridging ontologies.
Though I’m no longer professionally focused on rationality training, I continue to invest in my personal practice of adaptive rationality, developing and training techniques for learning and absorbing key lessons faster than reality forces me to.
My personal website is elityre.com.
Eli Tyre
I’m not sure exactly how I’d define a coup, but I’d say it has to be clear-cut enough that “Was it really an attempt at a coup?” is not really in contention in the aftermath.
I think this is a tricky standard, because whether many maybe-coups are widely regarded as coups will depend on who won. (I think the January 6 maybe-coup would be contentious regardless, but that’s not true of other cases.)
I like and basically endorse your cartoon model!
Modulo that I think more of the capabilities are coming from the RLVR, and less from copying humans, than you seem to think.[1]

They get that almost entirely by copying from HPI represented in training data. They get some additional PI from various sources (RLVR, and just the pretraining itself (e.g. gippities know a bunch of things about the distribution of human text that no human knows), and from online reasoning (though of course there’s a memory problem, but that’s inessential)).
Why are you emphasizing the pretraining instead of the RL?
[1] Though Zack dropped a paper in this thread which looks relevant to that question.
you’re also saying “we have fluid int or are close to it probably”.
I think I’m saying “crystallized intelligence can, to a large extent, substitute for fluid intelligence”. This is true, to an extent, of humans, but it’s much more true of AIs, because they can have so much more crystallized intelligence than any human could hope to attain.
This is relevant to modeling whether LLM agents will transform the world, and to modeling whether LLM agents will rapidly give way to something that’s much more capable.

In particular, I (unconfidently) dispute that developing an AI with fluid intelligence is a research project that is itself heavily and crucially loaded on fluid intelligence.[1]
On my view, it is pretty likely that huge amounts of superhuman crystallized intelligence can find fluid-intelligence-emulating mechanisms (possibly with a necessary ingredient of a relatively small amount of genius fluid intelligence that even huge amounts of crystallized intelligence can’t substitute for, or possibly even without that input).
In that sense, we’re “close to” solving fluid intelligence, even if there’s decades of subjective research and iteration time between here and there.
I do additionally suspect that mechanisms to implement fluid intelligence are just not that hard to invent and/or scale to, starting from the AI tech of 2026. Like, it seems somewhat likely to me that various dumb ideas would just totally work, or that doing the same stuff we’ve already been doing, but more so, will totally work. However, I’m much less confident about this point, and perhaps you can teach me some things that would quickly cause me to change my mind.
I want to check if this comment is clarifying, or if it feels like me repeating things that I’ve already said.

[1] Though you flagged that those aren’t the words that you would use, so
gippities aren’t good at generating interesting novel concepts on par with humans, AFAIK
Sorry, this is a tangent from this comment thread, but an important one, I think:
LLMs aren’t good at generating interesting novel concepts on par with humans in deployment. But in deployment, we’ve turned off the learning, so of course they’re bad at inventing interesting novel concepts. A brilliant human with anterograde amnesia would also be quite bad at inventing interesting novel concepts.
It seems much less clear whether LLMs develop interesting new concepts in training, while they’re still learning.
They probably generate all kinds of interesting intuitive / S1 concepts and fine distinctions that allow them to get so good at the next token prediction task, just as experts in a domain generally learn all kinds of specialized conceptual representations.
(Though, apparently, and unlike human experts, the models don’t thereby learn words for those concepts, or have the ability to introspect and put handles on their conceptual representations, any more than I can introspect into how my visual cortex works.)
More speculatively, an LLM agent might invent new explicit concepts for itself and learn to use them, in RLVR training, especially if different rollouts are allowed to communicate with each other via a shared scratch-pad or something. I don’t think we have seen anything like this, and I’m not particularly expecting it at current capability levels, but I don’t think we can rule it out.
When we say that LLMs don’t generate new concepts, we’re selling them short. The part of the whole LLM system that has something-like-fluid intelligence to come up with new concepts is the training process, which we basically never interact with (currently).
@TsviBT here’s my distilled paraphrase of your view, perhaps mostly in my own conceptual vocabulary. Let me know how close this is.
The process of discovering how to make an AGI with fluid intelligence depends heavily and crucially on strong fluid intelligence. It’s a central example of a research task that requires insight and developing new, deep, technical concepts, not just pattern-matching and reasoning by analogy to similar-seeming past problems.
In recent years, the AI industry has made some progress on automating a wide range of (what Eli calls) “routine cognitive operations” or “crystallized intelligence”. We can now make AIs that are good at using pattern-matching to solve problems that are similar to problems that humans have solved a lot. This is very different from the ability to solve genuinely new problems (“fluid intelligence”).
But everyone seems to be eliding this distinction between fluid intelligence and crystalized intelligence!
Some people don’t seem to notice the difference at all, and think that the crystallized intelligence of the AIs is the same kind of thing as the deep fluid intelligence.
Other people at least give lip-service to the difference, and then say ~“well the LLMs, with their great crystallized intelligence, will invent mechanisms for fluid intelligence.”

Insofar as inventing an AI with fluid intelligence is itself a project that is loaded on fluid intelligence, the development of these AIs with strong crystallized intelligence is nearly irrelevant to the question of when AI with fluid intelligence will arrive.
This is a kind of obvious point, but everyone around me seems to be operating on a strategic model that apparently ignores it!
I have some response, but first, is that about right, as an expression of your view?
Even you agree that gippity performances don’t exhibit much GI and are mainly the result of distilling performances present in the training data.
Um, no. At least, not if “training data” is meant to refer to the text corpora used in pre-training. I think the problem-solving capabilities are mostly coming from the RLVR.
Sumwat Longery: Oh ok. So your short AGI timelines are largely coming not from a belief that we’re conceptually / technically very close to having solved AGI, but rather from a belief that we’ve crossed a threshold of compounding accretion of human efforts towards solving AGI?
I would not endorse this.
Like, if I thought that we were conceptually / technically far from AIs that can automate the process of scientific discovery, I would much less expect a FOOM in the next 10 years (though we would still have an emergency, because automating science isn’t a necessary capability for destabilizing or ending the world via any of a number of different pathways).

This would seem to demand not a sharp update to “we have GI” but rather a search for better understanding of the distinction between GI and technical performance.
I would be excited about attempts at clarification here! (Modulo that they seem potentially very infohazardous.)
Why I think that might happen:
Background: almost nothing that most humans do actually requires fluid intelligence. Most people, most of the time, are executing routine cognitive operations. And most of the people who are using their fluid intelligence on the job could do just as well or better if they had a massive memory of case studies to extrapolate from instead of attempting novel reasoning.
Most of earth’s geniuses currently spend most of their time doing routine cognitive operations—pattern matching from their prior experience to solve problems, often in the context of automatable tasks like implementing experiments or solving engineering problems. When those classes of task are automated, it will free up the capacity of the geniuses.
At this point, most of the work in the world will be automated or in the process of being automated. Science and tech development will be going faster than ever before in human history. It will be obvious to the whole world that AI is a really big deal.
Also, it will be obvious to many people that there’s something missing: the AIs are doing more and better design and engineering, faster, than human civilization ever did, and they’re accelerating the science, but they’re not doing the science. There will be enormous financial and strategic incentives to crack that.
As stated, these don’t have to sum to 1. B and C are mutually exclusive, but A can be true even if B or C is also true. (There’s a quick coherence check after the numbers below.)
(I also object a bit to calling “strong fluid intelligence” “AGI.”
Part of what’s at stake is how far can you get with basically just specialized knowledge and the ability to train new specialized knowledge. It would be surprising to me, but not out of the question, that there’s almost nothing that such an AI can’t do that an AI with more fluid intelligence can do. But I only object a bit.)
Ass numbers:
A: 80%
B: 40%
C: 30%
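For concreteness, a quick coherence check on these numbers, taking B and C as mutually exclusive (as stated above) while A may overlap with either:

$$
\begin{aligned}
P(B \lor C) &= P(B) + P(C) = 0.4 + 0.3 = 0.7 \\
P(A \lor B \lor C) &= P(A) + P(B \lor C) - P(A \land (B \lor C)) \le 1 \\
\Rightarrow \; P(A \land (B \lor C)) &\ge 0.8 + 0.7 - 1 = 0.5
\end{aligned}
$$

So the three numbers are consistent, but only if A co-occurs with (B or C) at least half the time.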
If they are roughly equal, I would raise my eyebrow and say “that seems kinda strange, unless there’s a shared factor such as you thinking that actually we basically have ~AGI in current systems; if so, could you clarify that shared factor”.
I mean, that’s kind of fair. But I in fact don’t have a lot of precise ability to distinguish between “one key idea is missing” and “only engineering schlep is missing”. Those worlds look very similar to me, and so get similar amounts of mass.
I’m right now trying to inhabit this point, and to really grok it.
I guess it could be the case that the kind of intelligence that you need to engineer software and the kind that you need to develop novel algorithms are almost completely disjoint and unrelated. You can basically solve “make an AI that can make software”, and not have even scratched the surface of “make an AI that can make new algorithms / new interesting math concepts”.
(Is this what you think?)
It would surprise me if this were true, because it seems like there’s a lot of overlap in the mental operations between those two kinds of work.
I feel like I understand the question you’re asking:
~”If you previously had a very spread-out prediction, and now you have a relatively more narrow prediction, between then and now, you must have made a pretty large Bayesian update—you saw some evidence with a quite lopsided odds ratio.
If you made such an update, you should be able to point to the evidence, and explain why you think the odds ratio is so lopsided. Please do that!”
(Is that about right?)
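For concreteness, the odds form of Bayes’ rule behind the “lopsided odds ratio” framing:

$$
\underbrace{\frac{P(H_1 \mid E)}{P(H_2 \mid E)}}_{\text{posterior odds}}
= \underbrace{\frac{P(H_1)}{P(H_2)}}_{\text{prior odds}}
\times \underbrace{\frac{P(E \mid H_1)}{P(E \mid H_2)}}_{\text{likelihood ratio}}
$$

Moving from roughly even prior odds to a confident posterior requires the likelihood ratio of the observed evidence to carry most of that shift.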
But, I don’t get why the evidence / arguments that people are offering aren’t clarifying for you.
Like, I keep trying to point to the same basic IMO pretty straightforward considerations, and you keep saying things like “somehow I come away having no idea how they made their update to get to a confident (sharp) distribution”. [1]
I’m not sure what kind of thing you’re asking for that’s different from the kinds of things that I’m already saying. Do you want more of a quantitative model? Do we just need to get further into the argument tree?
I further get that, from your perspective, you’re saying something like “dude, give me actual evidence and arguments”, and I’m somehow being a dunce about that. But I don’t get what exactly you’re asking for.

[1] Which, to be clear, is socially and epistemically valid, on your part. Please continue to loudly say “I don’t get why everyone thinks this”, for as long as that’s true. I want to do the opposite of shaming you for not getting it.
I get that you’re saying something like that.
I think you’re saying “that coding fell to methods like these is both evidence that methods like these are more powerful than we might have guessed, and also evidence that coding required less general intelligence than we might have guessed.”
I suppose the question is whether this sort of thing does FOOM / takeover. Are you saying you can make up for weak intelligence with knowledge (gleaned from human text) well enough to do that?
More like you can make a weak intelligence with lots of specialized knowledge and skills, mostly gleaned from RL (though starting from the superhuman breadth of baseline knowledge that GPT-4 had), that can outcompete humans in acquiring power and/or FOOM.
has 500 genius human engineers
What do you mean by genius human engineers?
Part of my model is that like 25% of the math and CS PhDs on earth, and especially the ones that win Nobel prizes, will be working on this problem.

Is your median time to actually FOOM/takeover more like 1 year, 4 years, or 10 years?
I don’t know. I think my median is FOOM in 2 years? This is an ass-number though. I don’t feel super confident.
I’m like 90% probability that it happens within 10 years, and 95% probability that it happens within 35 years?
[Sorry for the long delay.
I wrote this response the day you sent the above. But it felt clear that we were missing each other, and I wanted to try to inhabit your view, in an attempt to make more effective progress. But I was too tired to do that well at the time, and so put this away to come back to later. But I have a day job in addition to the other projects that I’m trying to push on, and this fell by the wayside for two weeks.]

I’m not sure what’s weird about it, but yes, I think someone claiming to predict the future confidently as opposed to the more default background broad uncertainty would have the burden of proof.
Well, part of the debate here is what the prior ought to be.

It’s in some sense a confident prediction to assert, in 1995, that Moore’s law will continue. But, broadly, the burden of proof is more on the side of the guy who thinks that the trend will break. Or at least, it’s not totally clear what the prior should be and where the burden of proof should lie.
From your linked post:

But we should also update that this behavior surprisingly turns out to not require as much general intelligence as we thought.
Yes exactly. I’ve updated that these tasks require less general intelligence than I thought, and as a consequence, I’ve updated that tasks in general require less general intelligence than I thought.
Do you think it has high fluid intelligence (assuming as best you can, arguendo, that this phrase maps to something meaningful + important)? If yes, why (given that you’d be disagreeing with a lot of other short timelines views)? If no, why talk about AGI that doesn’t include high fluid intelligence?
No, I think Claude 4.6 has quite weak fluid intelligence. I previously described that as “the LLMs are actually not very intelligent at all, but it turns out that you can make up for moderately weak intelligence with a lot of knowledge.”
If no, why talk about AGI that doesn’t include high fluid intelligence?
Because high fluid intelligence (at least as we currently conceive of it) 1) is maybe not necessary, and 2) might come from the default trajectory of LLM-AI development.
Like, it seems like you can maybe get a strategically superhuman AI by relying on a lattice of more-or-less specialized superhuman skills (including superhuman engineering, and superhuman persuasion, and superhuman corporate strategy, and so on), without having much fluid intelligence.
To be clear, it also seems possible to me that we will make superhuman AI agents that don’t have this fluid intelligence special sauce. Those AIs will be adequate to automate almost all human labor, because almost all of human labor is more-or-less routine application of crystallized knowledge. We’ll be living in a radical new world of ~full automation, except for a small number of geniuses who are adding critical insight steps to the new cyborg-process of doing science.[1]
But I will be surprised if we hang out in that regime for very long before the combined might of humanity’s geniuses, augmented by their armies of superhumanly capable routine engineers and enormous compute infrastructure for running massive experiments, hits on a mechanism that replicates the human fluid intelligence special sauce.[2]
Maybe I’m wrong about how hard the problem of developing a mechanism that can do fluid intelligence is, or about whether it’s the kind of thing that can be accelerated by armies of superhuman engineers. But just eyeballing how ~every AI capability since the advent of deep learning came to be, it seems like it involved a lot of tinkering, and running empirical experiments to see what works, and optimizing metrics, and bitter-lesson-style scaling, not, e.g., Einstein-style genius conceptual breakthroughs. To my only somewhat informed eye, it looks like the way AI capabilities are developed is exactly the kind of thing that armies of superhuman engineers doing the routine-cognition part of research should be able to do.
We call it “grad-student descent”, as a way to emphasize how much it resembles a dumb search process. And there will be a lot more AI agents, running a lot faster, than there ever were grad students.

That sure sounds like positive knowledge of us having almost all of fluid intelligence AGI seedstuff. No?
No, I’m putting forward a disjunction:
Fluid intelligence isn’t necessary for Strategically Superhuman AI.
or
LLM-based agents will develop fluid intelligence on the default technological trajectory, via the application of not-very-clever ideas.
or
There’s about one or two “breakthrough” ideas missing, that when combined with the existing LLM-agent techniques, will make LLM-agents that can do the fluid intelligence thing (or a substitute for the fluid intelligence thing). Having armies of LLM-agents that can automate engineering and experimentation seems like it should accelerate the discovery of those one or two breakthroughs.
Those last two legs of the disjunction are assuming that there are not many pieces left before fluid intelligence is solved, but not making much of a claim about how many pieces we already have. Like, depending on what one means by “pieces”, maybe we have 0 out of 1 (and we’re likely to get that one in the next five years), or maybe we have 95 out of 100 (and we’re likely to get the last five in the next five years).
They’re clearly better than me, already, at almost all of technical thinking.
Except for the most important parts, such as orienting to a new domain / new question in a manner that produces successful understanding in the long run.
I mean, that’s very true of current LLM-agents after they leave training. It’s also true (though less so, I think) of LLMs in training—they come away with a massive library of concepts that they can’t wield as deftly as a human.
But it’s also true of AlphaZero, in some sense, in that AlphaZero improves much less from each game it plays than a human does. But also, AlphaZero can play enough games, fast enough, to become superhuman at Go in a few hours.
Response 1: This type of reasoning does not work for all those other previous big breakthroughs (such as the invention of the universal computer, of the operating system, or of google search).
Maybe. But this does seem to be what works in Deep Learning, even if not in other CS subfields.
Response 2: Consider the hypothesis that it went up fast because it used up available data.
How does this relate to the fact that AIs are now getting better by training on procedurally generated problems instead of human data?
Are you suggesting that RLVR is only eliciting capabilities that are already in the base model, rather than instilling new capabilities?
Wait can you expand on this? Why do you think true generators of human thought are contained in them / picked up by LLMs?
Because GPT-4 can do more reasoning than I would have naively guessed, under the hypothesis “GPT-3 is only memorizing shallow patterns, not the real, deep patterns of cognition.”
What makes you think we’re close?
That the AI agents are already able to do, or are a few METR doublings from being able to do, almost all of the mental work that humans do, weighted by “time spent doing that work.”
. . .
But overall, it seems like we’re obviously talking past each other, or something. Maybe I can try to articulate your view as I understand it and you can offer corrections?
It sounds like you’re saying something like...
Look, the important and dangerous thing about AGI is that it can do the cognitive operations of science / discovery / inventing and operating in new fields, at a superhuman level. The danger lies with AI that is able to make fundamental discoveries the way a scientist does (and then apply / wield those discoveries). An AI that isn’t really able to make fundamental discoveries is just not that dangerous.
LLMs and LLM-agents can do a lot of seemingly impressive stuff, but they’re really dramatically bad at orienting to new domains or making discoveries like that.

They’re something like an Eliza-bot, in that Eliza-bot could use simple mechanisms to generate outputs that appear like a conversation. Someone talking with Eliza might be astonished, and think that with only a little improvement, the next generation would be able to converse as completely as a human can. But that’s an illusion: the simple mechanisms that Eliza is exploiting are basically not adequate to produce anything like a real conversation.
Similarly, the LLM-agents are able to do some portion of technical work that humans do, but they’re basically not doing the interesting parts. And the interesting parts are almost all of the problem. That the LLM agents are able to do some technical work is very little evidence of how much additional conceptual work needs to happen to solve the hard and interesting parts of AGI.
Someone who’s impressed with o3 or Claude Mythos, and thinks that there’s very little left to add before we get to AIs that can automate all or almost all of scientific progress, is making an error analogous to someone who thinks that there’s very little left to add to Eliza to get AGI, because it’s so close to intelligent behavior.

[1] As a side note, this world comes along with all kinds of new dangers that it’s not clear that we’re equipped to deal with, that mostly fall under the headings of “misuse” and “concentration of power”. I’m not sure how high the risk is, but if we fumble this, we could totally lose the game.

[2] This would be a very scary world, because if LLM-agents already have all the pieces to be a strategically superhuman agent, except for one, and we’ve built out huge compute infrastructure for running them, we’re in for a very hard takeoff once someone builds the first “real” AGI.
I would even go so far as to say that in my utopia, if your advertisement confuses an IQ 80 person into believing something, and then you go like “ha ha, the small print says otherwise”, you should be treated as if your contract literally said what your ad says, ignoring the small print. (The small print can provide additional details, not fundamentally change the nature of the contract.) If you said it, and the other person heard it, own it. If it’s knowingly false, don’t put it in print.
This is a bit tricky, because often the fine print is going to adjudicate legitimate edge cases, and people may feel rug pulled if they end up in one of those edge-cases, even if the overall contract wasn’t meaningfully deceptive.
I have wondered if there should be a special symbol or font or something, which implies a higher-than-default-speech standard of reliability, backed by the courts. Anyone marking text with that symbol is committing that what that text says is literally precisely true, and anyone who can demonstrate otherwise can sue for damages.
E.g. if you call your business “24 hour fitness”, with the symbol, and actually you are closed at night on the weekends, anyone who notices can claim the bounty. But without the symbol, that’s just a zany name you picked for your gym.
and if anything OpenPhil has some of the people with the best antibodies to this
What do you mean by this?
I think centrally he is talking about Anthropic
I think you’re right, and also it seems misleading / like a bad clustering to lump “the EAs” in with “Anthropic’s leadership”. I think those groups have some memetic connections, but they’re not the same group!
(I feel like it’s more of a reasonable carving to lump OpenPhil in with “the EAs”, since they were/are effectively EA thought-leaders and they exerted a lot of influence, directly and indirectly.)
Unless you mean “making this my last day [on twitter]”, which might or might not be a good idea.
I’m pretty sure that they all read LessWrong.
I think the upvotes are completely justified?
I’m upvoting it because it’s ~ the most legible evidence to date, on one of the top ten most important questions in the world.