Optimizing for the flourishing of all sentient life regardless of substrate.
george_adams
I’ve heard this critique lobbed around a fair bit in AI safety circles: “{some org} is bad at philosophy.” What does this mean? I’ve heard it both from collaborators in person and on LW. A decent number of times, the critique has been directed at Anthropic. I’ll apologize in advance for how vague this post is, but does anyone have any idea what people are getting at when they say this? In my judgement, it is usually an unmotivated statement, deployed as a critique of an organization’s approach to AI safety.
My prediction for the next few years (or until AGI) is that computer science talent will shift toward a winner-take-all market. The majority of the job of software engineering will be automated (if it isn’t already). There are still robustness pockets that human software engineers can help fill for now. But I expect top AI researchers to continue making exorbitant amounts of money, even if little software engineering is involved. So computer science talent will start to loosely resemble the competitiveness of professional sports, where there is a step-function change in compensation depending on whether you “make it.” And the hiring bar will be extremely high. You can already see glimpses of this in the weight recruiters put on Olympiad contestants.
I think Quantified Intuitions is a reasonable, although incomplete, version of what you describe. It specifically focuses on scope insensitivity rather than a traditional rationalist curriculum.
Markets pricing in AGI is also conditional on markets believing that something like the current legal/property-rights system will continue to hold after AGI. If AI might be a bubble, and it’s not obvious that you will win anything even if you get the AGI trade right, then traders won’t “price in” AGI even if it is extremely economically valuable and coming soon.
My argument is also not that markets won’t price in AI in its current form, or increasing capabilities; it is specifically about the phase shift at the point where we actually have strong AGI systems.
I disagree with your post, but I will add an additional example: falling birthrates. I don’t remember in which of his essays it was (probably in Fanged Noumena), but Nick Land posits that the technocapital system of capitalism which he views as being AGI has figured out that it won’t need humans much longer and thus has no incentive to keep the birth rates up. I obviously do not literally believe this, but I think it helps illustrate what you’re trying to describe.
I know this is 7 months late! But I read this shortform yesterday and it somewhat resonated with me. And then today I read Noah Smith’s most recent blog post which perfectly described what I think you’re getting at so I’m linking it here.
Why trust your prior over the prior of the market/hedge funds? That is, why expect that this isn’t already priced in? AI (and AGI) is a big enough news story now that I would expect hedge funds to be thinking about things like this. At recruiting events, I’ve asked quants how they’re thinking about this exact question, and I usually got pretty decent AGI-pilled responses.
It is certainly possible that the market hasn’t priced this in, but my prior is that in the vast majority of cases, some quant has already sucked out any potential gains one could get.
I’m also a college student who has been wrestling with this question for my entire undergrad. In a short timelines world, I don’t think there are very good solutions. In longer timelines worlds, human labor remains economically valuable for longer.
I have found comfort in the following ideas:
1) The vast majority of people (including the majority of wealthy, white-collar, college-educated people) are in the same boat as you. The distribution of how AGI unrolls is likely to be so absurd that it’s hard to predict what holds value afterward. Does money still matter after AGI/ASI? What kinds of capital matter after AGI/ASI? These questions are far from obvious to me. If you take these cruxes, then even people at AGI labs could be making the wrong financial bets. You could imagine a scenario where AGI lab X builds AGI first and comes to dominate the global economy, so that everyone with stock options in AGI lab Y is left with worthless capital. You could even imagine owning stock in the AGI lab that builds AGI and finding that this capital is no longer valuable either.
2) For a period of time, I suspect that young people are likely to have an advantage in using “spiky” AI tools to do work. Being in the top few percentiles of competence at coding with LLMs, doing math with LLMs, or doing other economically valuable tasks with AI is likely to open up career opportunities.
3) You can expect some skills to be important up until the point of AGI. I see coding and math in this boat. Not only will they be important, but the people doing the most crucial and civilization-altering research will likely be very good at these skills. These people are likely to be the one-in-a-million Ilya Sutskevers of the world, but I still find it motivating to build up this skillset during what is really the golden age of computer science.
More generally, I have found it useful to think of outcomes as samples from a distribution, and of working hard as pushing up the expected value of that distribution. I find this gives me much more motivation.
Claude’s rebuttal is exactly my claim. If major AI research breakthroughs could be done in 5 hours, then imo robustness wouldn’t matter as much: you could run a bunch of models in parallel and see what happens (this is part of why models are so good at olympiads). But an implicit part of my argument/crux is that AI research is necessarily deep, meaning you need to string together some number of successfully completed tasks to get an interesting final result. And if the model messes up one part, your chain breaks. Not only does this give you weird results, it breaks your chain of causality[1], which is essential for AI research.
I’ve also tried doing “vibe AI researching” (no human in the loop) with current models and I find it just fails right away. If robustness doesn’t matter, why don’t we see current models consistently making AI research breakthroughs at their current 80% task completion rate?
A counterargument to this is that if METR’s graph trend keeps up, and task length gets to some threshold, say a week, then you don’t really care about P(A)P(B)P(C)...: you can just run the tasks in parallel and see which one works. (However, if my logic holds, I would guess that METR’s task benchmark hits a plateau at some point before full-on research, at least at current model robustness.)
[1] By chain of causality, I mean: I did task A. If I am extremely confident that task A is correct, I can then search from task A. Say I stumble on some task B, then C. If I get an interesting result from task C, I can keep searching from there so long as I am confident in my results. I can also mentally update my causal chain by something like ~backprop: “Using a CNN in task A, then setting my learning rate to this in task B, let me discover this new thing in task C, so now I can draw a generalized intuition for approaching task D. OK, this approach to D failed; let me try this other one.”
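The parallel-sampling counterargument can be made concrete with a small sketch (the success rate and attempt count are illustrative, and parallel attempts are treated as independent, matching the simplification in the comment):

```python
def any_success(p_chain: float, k: int) -> float:
    """Probability that at least one of k independent, parallel
    attempts at a full task chain succeeds end to end."""
    return 1 - (1 - p_chain) ** k

# If each end-to-end chain succeeds only 25% of the time,
# 16 parallel attempts still give roughly a 99% shot at one success:
print(f"{any_success(0.25, 16):.3f}")
```

This is why robustness matters less once tasks are long enough to attempt whole chains in parallel: failures across attempts wash out, whereas failures within a single chain compound.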
METR should test for a 99.9% task completion rate (in addition to the current 80% and 50%). A key missing ingredient holding back LLM economic impact is that they’re just not robust enough. This can be viewed analogously to the problem of self-driving. Every individual component of self-driving is ~solved, but stringing them together results in a non-robust final product. I believe that automating research/engineering completely will require nines of reliability that we just don’t have. And testing for nines of reliability could be done by giving the model many very short time horizon tasks and seeing how it performs.
This can be further motivated by considering what happens when we string together tasks with a sub-99.99...% completion rate. Take the GPT-5.1-codex-max result: METR claims this model has a 50% time horizon of 2 hours and 40 minutes. Say we tell the model to do task A, which takes 2 hours and 40 minutes, so P(A) = 0.5. If the model then decides it needs to do task B to further its research, P(B) = 0.5 and P(A, B) = P(A)P(B) = 0.25 (these events are not independent, but I express them as such for illustrative effect). We can then consider tasks C, D, E, etc. This compounding holds even at an 80% completion rate. Once we get up to 99.9%, we have P(A) = 0.999, P(B) = 0.999, and P(A, B) = P(A)P(B) ≈ 0.998. This is where we can really start seeing autonomous research, imo.
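The compounding arithmetic above can be sketched in a few lines (per-task success rates are illustrative, and tasks are treated as independent, as in the comment):

```python
def chain_success(p_task: float, n_tasks: int) -> float:
    """Probability that every task in a chain of n_tasks succeeds,
    assuming each task independently succeeds with rate p_task."""
    return p_task ** n_tasks

# A 10-task research chain at different per-task reliabilities:
for p in (0.5, 0.8, 0.999):
    print(f"p={p}: {chain_success(p, 10):.4f}")
# 0.5^10 ~ 0.001, 0.8^10 ~ 0.107, 0.999^10 ~ 0.990
```

The chain at 50% per-task reliability almost never completes, while at 99.9% it still succeeds about 99% of the time, which is the “nines of reliability” point in one line of arithmetic.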
It would be interesting to benchmark humans at 99.9% task completion rate and see what their task length is.
(Disclaimer: I am not completely sure of METR’s methodology for determining task length)
I think this issue of “9s” of reliability should update people towards longer timelines. Tesla FSD has basically been able to do everything individually that we would call self-driving for the last ~4 years, but it isn’t 99.99...% reliable. I think LLMs replacing work will, by default, follow the same pattern.
Imo, this analogy breaks down if you take a holistic evolutionary lens. The amount of time you spent learning chess is minuscule compared to the amount of time evolution spent optimizing for creating the general learning machine that is your brain. It’s not obvious how to cleanly analogize the current frontier model training recipe to evolution. But, I claim that your brain has certain inductive biases at birth that make it possible to eventually learn to do thing X, and directly training on thing X wouldn’t have worked for evolution because the general model was just too bad.
“Gemini 3 estimates that there are 15-20k core ML academics and 100-150k supporting PhD students and Postdocs worldwide.”
In my opinion, this seems way too high. What logic or assumptions did it use?
Land and buildings: 16.5B
IT assets: 13.6B
Where are the GPUs (mostly TPUs in Google’s case)? I figured these would be bigger, given the capex of Google, MSFT, etc. on building enormous clusters.
I agree with most of the individual arguments you make, but this post still gives me “Feynman vibes.” I generally think there should be a stronger prior on things staying the same for longer. I also think that the distribution of how AGI goes is so absurd, it’s hard to reason about things like expectations for humans. (You acknowledge that in the post)
I agree with most things said, but not with the conclusion. There is a massive chunk of the human (typically male) psyche that will risk death/major consequences in exchange for increased social status. Think of basically any war. A specific example is the Kamikaze pilots in WW2, who flew suicide missions for the good of the nation. The pilots were operating within a value system that rewarded individual sacrifice for the greater mission. The creators of AGI will have increasing social status (and competition, thanks to Moloch) right up until the point of AGI ruin.
(Also, a minor point: some accelerationists are proudly anti-speciesist and don’t care about the wellbeing of humans.)
I would describe what we have already done as radical life extension. Perhaps we have a difference in definition. From this link:
The most convincing model of why we didn’t orient ourselves around something like a cult of increasing life expectancy is that we went down the path of least resistance: technological progress and economic growth.
I claim this was never a realistic goal. The set of cultures in which we have the cultural norms/tools to create technology and large economic growth (that is required for transhumanism) AND which prioritize transhumanism above everything else are not very numerous.
In some sense, you can see the afterlife promise of many religions as a form of transhumanism, and billions of people are on board with that. Yet basically none of these religions have contributed to actually achieving something like transhumanism.