This feels kind of backwards, in the sense that I think something like 2032-2037 is probably the period that most people I know who have reasonably short timelines consider most likely.
AI 2027 is a particularly aggressive timeline compared to the median, so if you choose 2028 as some kind of Schelling time to decide whether things are markedly slower than expected then I think you are deciding on a strategy that doesn’t make sense by like 80% of the registered predictions that people have.
Even the AI Futures team themselves have timelines that put more probability mass on 2029 than 2027, IIRC.
Of course, I agree that in some worlds AI progress has substantially slowed down, and we have received evidence that things will take longer, but “are we alive and are things still OK in 2028?” is a terrible way to operationalize that. Most people do not expect anything particularly terrible to have happened by 2028!
My best guess, though I am far from confident, is that things will mostly get continuously more crunch-like from here, as things continue to accelerate. The key decision-point in my model at which things might become a bit different is if we hit the end of the compute overhang, and you can’t scale up AI further simply by more financial investment, but instead now need to substantially ramp up global compute production, and make algorithmic progress, which might markedly slow down progress.
I agree with a bunch of other things you say about it being really important to have some faith in humanity, and to be capable of seeing what a good future looks like even if it’s hard, and that this is worth spending a lot of effort and attention on, but just the “I propose 2028 as the time to re-evaluate things, and I think we really want to change things if stuff still looks fine” feels to me like it fails to engage with people’s actually registered predictions.
Situational Awareness and AI 2027 have been signal-boosted to normal people more than other predictions, though. Like, they both have their own website, AI 2027 has a bunch of fancy client-side animation and Scott Alexander collaborated, and someone made a YouTube video on AI 2027.
While AI safety people have criticized the timeline predictions to some extent, there hasn’t been much in-depth criticism (aside from the recent very long post on AI 2027), and the general sentiment on their timelines seems positive (although Situational Awareness has been criticized for contributing to arms race dynamics).
I get that someone who looks at AI safety people’s timelines in more detail would get a different impression. Though, notably, Metaculus lists Jan 2027 as a “community prediction” of “weakly general AI”. Sure, someone could argue that weakly general AI doesn’t imply human-level AGI soon after, but mostly when I see AI safety people point to this Metaculus market, it’s as evidence that experts believe human-level AGI will arrive in the next few years; there is no emphasis on the delta between weakly general AI and human-level AGI.
So I see how an outsider would see more 2027-2029 timelines from AI safety people and assume that’s what they consider a reasonable median, if they aren’t looking into it closely. This is partially due to internet/social media attention dynamics, where surprising predictions get more attention.
What I think we could agree on is that, if 2028 rolls around and things seem pretty much like today, then whatever distributed attention algorithm promotes things like Situational Awareness and AI 2027 to people’s attention in practice leads to flawed predictions, and people who were worried because of listening to this algorithm should chill out and re-evaluate. (We probably also agree they should chill out today because this popular content isn’t reflective of AI safety people’s general opinions.)
Yep, agree that coverage is currently biased towards very short timelines. I think this makes sense in that the worlds where things are happening very soon are the worlds that, from the perspective of a reasonable humanity, require action now.[1]
I think despite the reasonable justification for focusing on the shorter-timeline worlds for decision-making reasons, I do expect this to overall cause a bunch of people to walk away with the impression that people confidently predicted short timelines, and this in turn will cause a bunch of social conflict and unfortunate social accounting in most worlds.
On the margin, I would be excited to collaborate with people who want to do things similar to AI 2027 or Situational Awareness for longer timelines.
I.e. in as much as you model the government as making reasonable risk-tradeoffs in the future, the short timeline worlds are the ones that require intervention to cause changes in decision-making now.
I am personally more pessimistic about humanity doing reasonable things, and think we might just want to grieve over short timeline worlds, but I sure don’t feel comfortable telling other people to not ring the alarm bell on potentially very large risks happening very soon, which seems plausible enough to me that absolutely it should be among the top considerations for most decision-makers out there.
Even if it does make sense strategically to put more attention on shorter timelines, that sure does not seem to be what actually drives the memetic advantage of short forecasts over long forecasts. If you want your attention to be steered in strategically-reasonable ways, you should probably first fully discount for the apparent memetic biases, and then go back and decide how much is reasonable to re-upweight short forecasts. Whatever bias the memetic advantage yields is unlikely to be the right bias, or even the right order of magnitude of relative attention bias.
I mean, I am not even sure it’s strategic given my other beliefs, and I was indeed saying that on the margin more longer-timeline coverage is worth it, so I think we agree.
What’s the longest timeline that you could still consider a short timeline by your own metric, and therefore a world “we might just want to grieve over”? I ask because, in your original comment you mentioned 2037 as a reasonably short timeline, and personally if we had an extra decade I’d be a lot less worried.
About 15 years, I think?
Edit: Oops, I responded to the first part of your question, not the second. My guess is timelines with less than 5 years seem really very hard, though we should still try. I think there is lots of hope in the 5-15 year timeline worlds. 15 years is just roughly the threshold of when I would stop considering someone’s timelines “short”, as a category.
I admit, it’s pretty disheartening to hear that, even if we had until 2040 (which seems less and less likely to me anyway), you’d still think there’s not much we could do but grieve in advance.
…people who were worried because of listening to this algorithm should chill out and re-evaluate.
And communication strategies based on appealing to such people’s reliance on those algorithms should also re-evaluate.
E.g., why did folk write AI 2027? Did they honestly think the timeline was that short? Were they trying to convey a picture that would scare people with something on a short enough timeline that they could feel it?
If the latter, we might be doing humanity a disservice, both by exhausting people from something akin to adrenal fatigue, and also as a result of crying wolf.
Yes, I honestly thought the timeline was that short. I now think it’s 50% by end of 2028; over the last year my timelines have lengthened by about a year.
Well extrapolating that it sounds like things are fine. :P
It has indeed been really nice, psychologically, to have timelines that are lengthening again. 2020 to 2024 that was not the case.
You wrote AI 2027 in April… what changed in such a short amount of time?
If your timelines lengthened over the last year, do you think writing AI 2027 was an honest reflection of your opinions at the time?
The draft of AI 2027 was done in December, then we had months of editing and rewriting in response to feedback. For more on what changed, see various comments I made online such as this one: https://www.lesswrong.com/posts/cxuzALcmucCndYv4a/daniel-kokotajlo-s-shortform?commentId=dq6bpAHeu5Cbbiuyd
We said right on the front page of AI 2027 in a footnote that our actual AGI timelines medians were somewhat longer than 2027:
I also mentioned my slightly longer timelines in various interviews about it, including the first one with Kevin Roose.
OpenAI researcher Jason Wei recently stated that there will be many bottlenecks to recursive self improvement (experiments, data), thoughts?
https://x.com/_jasonwei/status/1939762496757539297z
He makes some obvious points everyone already knows about bottlenecks etc. but then doesn’t explain why all that adds up to a decade or more, instead of a year, or a month, or a century. In our takeoff speeds forecast we try to give a quantitative estimate that takes into account all the bottlenecks etc.
E.g., why did folk write AI 2027? Did they honestly think the timeline was that short?
Isn’t it more like “I think there’s a 10% chance of transformative AI by 2027, and that is like 100x higher than what it looks like most people think, so people really need to think thru that timeline”?
Like, I generally put my median year at 2030-2032; if we make it to 2028, the situation will still feel like “oh jeez we probably only have a few years left”, unless we made it to 2028 thru a mechanism that clearly blocks transformative AI showing up in 2032. (Like, a lot is hinging on what “feels basically like today” means.)
I think Daniel also just has shorter timelines than most (which is correlated with wanting to more urgently communicate that knowledge).
Isn’t it more like “I think there’s a 10% chance of transformative AI by 2027, and that is like 100x higher than what it looks like most people think, so people really need to think thru that timeline”?
That might be. It sounds really plausible. I don’t know why they wrote it!
But all the same: I don’t think most people know what 10% likelihood of a severe outcome is like or how to think about it sensibly. My read is that the vast majority of people need to treat 10% likelihood of doom as either “It’s not going to happen” (because 10% is small) or “It’s guaranteed to happen” (because it’s a serious outcome if it does happen, and it’s plausible). So, amplifying the public awareness of this possibility seems more to me like moving awareness of the scenario from “Nothing existential is going to happen” to “This specific thing is the default thing to expect.”
So I expect that unless something is done to… I don’t know, magically educate the population on statistical thinking, or propagate a public message that it’s roughly right but its timeline is wrong? then the net effect will be that either (a) AI 2027 will have been collectively forgotten by 2028 in roughly the same way that, say, Trudeau’s use of the Emergencies Act has been forgotten; or (b) the predictions failing to pan out will be used as reason to dismiss other AI doom predictions that are apparently considered more likely.
The main benefit I see is if some key folk are made to think about AI doom scenarios in general as a result of AI 2027, and start to work out how to deal with other scenarios.
But I don’t know. That’s been part of this community’s strategy for over two decades. Get key people thinking about AI risk. And I’m not too keen on the results I’ve seen from that strategy so far.
Though, notably, Metaculus lists Jan 2027 as a “community prediction” of “weakly general AI”. Sure, someone could argue that weakly general AI doesn’t imply human-level AGI soon after
it does imply that, but i’m somewhat loath to mention this at all, because i think the predictive quality you get from one question to another varies astronomically, and this is not something the casual reader will be able to glean
I think something like 2032-2037 is probably the period that most people I know who have reasonably short timelines consider most likely.
I honestly didn’t know that. Thank you for mentioning it. Almost everything I hear is people worrying about AGI in the next few years, not AGI a decade from now.
if you choose 2028 as some kind of Schelling time to decide whether things are markedly slower than expected then I think you are deciding on a strategy that doesn’t make sense by like 80% of the registered predictions that people have.
Just to check: you’re saying that by 2028 something like 80% of the registered predictions still won’t have relevant evidence against them?
As both a pragmatic matter and a moral one, I really hope we can find ways of making more of those predictions more falsifiable sooner. If anything in the vague family of what I’m saying is right, but folk won’t halt, melt, & catch fire for at least a decade, then that’s an awful lot of pointless suffering and wasted talent. I’m also skeptical that there’ll actually be a real HMC in a decade; if the metacognitive blindspot type I’m pointing at is active, then waiting a decade is part of the strategy for not having to look, and it’ll come up with yet more reasons not to look when AGI doesn’t happen ten years from now either.
“are we alive and are things still OK in 2028?” is a terrible way to operationalize that. Most people do not expect anything particularly terrible to have happened by 2028!
Cool. Noted. So, what can we observe by 2028 that’d cause us pause about whether there’s a collective confusion/projection playing a significant role?
The key decision-point in my model at which things might become a bit different is if we hit the end of the compute overhang, and you can’t scale up AI further simply by more financial investment, but instead now need to substantially ramp up global compute production, and make algorithmic progress, which might markedly slow down progress.
This angle misses my point. Your analysis makes sense, but it’s talking about something different.
I think you’re trying to suggest what observations would cause you to update about AI timelines being longer than you currently think they are. Yes?
I’m asking what we should observe if it turns out that the degree of focus on narrating doom is more due to an underlying emotional distress than due to correct calibration to the situation we’re in.
One effect of that might be that AI timelines bias toward the pessimistic, which means that on average we should keep finding that they’re longer than the consensus keeps converging on. But I think that’s both slow and noisy as a way of detecting it.
just the “I propose 2028 as the time to re-evaluate things, and I think we really want to change things if stuff still looks fine” feels to me like it fails to engage with people’s actually registered predictions.
Noted. Thanks for pointing it out. I do think I failed to engage this way.
I still think my overall point stands though. I didn’t think things would look fine; my expectation is that things will continue to look ever more dire. I’m just hoping that if we can distinguish between (a) dire because things are in fact getting worse versus (b) dire because that’s what the emotional Shepard tone does, then in 2.5 years we could pause and check which one seems to be responsible for the increase in doom, and if it’s the latter then HMC.
So, again: what could we observe at the start of 2028 that would create pause this way?
As you implied above, pessimism is driven only secondarily by timelines. If things in 2028 don’t look much different than they do now, that’s evidence for longer timelines (maybe a little longer, maybe a lot). But it’s inherently not much evidence about how dangerous superintelligence will be when it does arrive. If the situation is basically the same, then our state of knowledge is basically the same.
So what would be good evidence that worrying about alignment was unnecessary? The obvious one is if we get superintelligence and nothing very bad happens, despite the alignment problem remaining unsolved. But that’s like pulling the trigger to see if the gun is loaded. Prior to superintelligence, personally I’d be more optimistic if we saw AI progress requiring even more increasing compute than the current trend—if the first superintelligences were very reliant on massive pools of tightly integrated compute, and had very limited inference capacity, that would make us less vulnerable and give us more time to adapt to them. Also, if we saw a slowdown in algorithmic progress despite widespread deployment of increasingly capable coding software, that would be a very encouraging sign that recursive self-improvement might happen slowly.
So, again: what could we observe at the start of 2028 that would create pause this way?
Very little. I’ve been seriously thinking about ASI since the early 00s. Around 2004-2007, I put my timeline around 2035-2045, depending on the rate of GPU advancements. Given how hardware and LLM progress actually played out, my timeline is currently around 2035.
I do expect LLMs (as we know them now) to stall before 2028, if they haven’t already. Something is missing. I have very concrete guesses as to what is missing, and it’s an area of active research. But I also expect the missing piece adds less than a single power of 10 to existing training and inference costs. So once someone publishes it in any kind of convincing way, then I’d estimate better than an 80% chance of uncontrolled ASI within 10 years.
Now, there are lots of things I could see in 2035 that would cause me to update away from this scenario. I did, in fact, update away from my 2004-2007 predictions by 2018 or so, largely because nothing like ChatGPT 3.5 existed by that point. GPT 3 made me nervous again, and 3.5 Instruct caused me to update all the way back to my original timeline. And if we’re still stalled in 2035, then sure, I’ll update heavily away from ASI again. But I’m already predicting the LLM S-curve to flatten out around now, resulting in less investment in Chinchilla scaling and more investment in algorithmic improvement. But since algorithmic improvement is (1) hard to predict, and (2) where I think the actual danger lies, I don’t intend to make any near-term updates away from ASI.
The key decision-point in my model at which things might become a bit different is if we hit the end of the compute overhang, and you can’t scale up AI further simply by more financial investment, but instead now need to substantially ramp up global compute production, and make algorithmic progress, which might markedly slow down progress.
I think compute scaling will slow substantially by around 2030 (edit: if we haven’t seen transformative AI). (There is some lag, so I expect the rate at which capex is annually increasing to already have slowed by mid 2028 or so, but this will take a bit before it hits scaling.)
Also, it’s worth noting that most algorithmic progress AI companies are making is driven by scaling up compute (because scaling up labor in an effective way is so hard: talented labor is limited, humans parallelize poorly, and you can’t pay more to make them run faster). So, I expect algorithmic progress will also slow around this point.
All these factors make me think that something like 2032 or maybe 2034 could be a reasonable Schelling time (I agree that 2028 is a bad Schelling time), but IDK if I see that much value in having a Schelling time (I think you probably agree with this).
In practice, we should be making large updates (in expectation) over the next 5 years regardless.
I think compute scaling will slow substantially by around 2030
There will be signs if it slows down earlier; it’s possible that in 2027-2028 we are already observing that there is no resolve to start building 5 GW Rubin Ultra training systems (let alone the less efficient but available-a-year-earlier 5 GW non-Ultra Rubin systems), so that we can update then, without waiting for 2030.
This could result from some combination of underwhelming algorithmic progress, RLVR scaling not working out, and the 10x compute scaling from 100K H100 chips to 400K GB200 chips not particularly helping, so that AIs of 2027 fail to be substantially more capable than AIs of 2025.
But sure, this doesn’t seem particularly likely. And there will be even earlier signs that the scaling slowdown isn’t happening before 2027-2028 if the revenues of companies like OpenAI and Anthropic keep sufficiently growing (in 2025-2026), though most of these revenues might also be indirectly investment-fueled, threatening to evaporate if AI stops improving substantially.
my synthesis is I think people should chill out more today and sprint harder as the end gets near (ofc, some fraction of people should always be sprinting as if the end is near, as a hedge, but I think it should be less than now. also, if you believe the end really is <2 years away then disregard this). the burnout thing is real and it’s a big reason for me deciding to be more chill. and there’s definitely some weird fear driven action / negative vision thing going on. but also, sprinting now and chilling in 2028 seems like exactly the wrong policy
As a datapoint, none of this chilling out or sprinting hard discussion resonates with me. Internally I feel that I’ve been going about as hard as I know how to since around 2015, when I seriously got started on my own projects. I think I would be working about similarly hard if my timelines shortened by 5 years or lengthened by 15. I am doing what I want to do, I’m doing the best I can, and I’m mostly focusing on investing my life into building truth-seeking and world-saving infrastructure. I’m fixing all my psychological and social problems insofar as they’re causing friction to my wants and intentions, and as a result I’m able to go much harder today than I was in 2015. I don’t think effort is really a substantially varying factor in how good my output is or impact on the world. My mood/attitude is not especially dour and I’m not pouring blind hope into things I secretly know are dead ends. Sometimes I’ve been more depressed or had more burnout, but it’s not been much to do with timelines and more about the local environment I’ve been working in or internal psychological mistakes. To be clear, I try to take as little vacation time at work as I psychologically can (like 2-4 weeks per year), but that’s because there’s so much great stuff for me to build over the next decade(s), and that’d be true if I had 30-year timelines.
I am sure other people are doing differently-well, but I would like to hear from such people about their experience of things (or for people here to link to others’ writing). (I might also be more interested in the next Val post being an interview with someone, rather than broad advice.)
Added: I mean, I do sometimes work 70 hour weeks, and I sometimes work 50 hour weeks, but this isn’t a simple internal setting I can adjust; it’s way more a fact about what the work demands of me. I could work harder, but primarily by picking projects that require it and where the external world sets deadlines for me, not by “deciding” to work harder. (I’ve never really been able to make that decision; as far as I can quickly recall it’s always failed whenever I’ve tried.)
I would strongly, strongly argue that essentially “take all your vacation” is a strategy that would lead to more impact for you on your goals, almost regardless of what they are.
Humans need rest, and humans like the folks on LW tend not to take enough.
Naively, working more will lead to more output and if someone thinks they feel good while working a lot, I think the default guess should be that working more is improving their output. I would be interested in the evidence you have for the claim that people operating similar to Ben described should take more vacation.
I think there is some minimum amount of breaks and vacation that people should strongly default to taking and it also seems good to take some non-trivial amount of time to at least reflect on their situation and goals in different environments (you can think of this as a break, or as a retreat).
But, 2-4 weeks per year of vacation combined with working more like 70 hours a week seems like a non-crazy default if it feels good. This is only working around 2⁄3 of waking hours (supposing 9 hours for sleep and getting ready for sleep) and working ~95% of weeks. (And Ben said he works 50-70 hours, not always 70.)
It’s worth noting that “humans perform better with more rest” isn’t a sufficient argument for thinking more rest is impactful: you need to argue this effect overwhelms the upsides of additional work. (Including things like returns to being particularly fast and possible returns to scale on working hours.)
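As a quick sanity check of the “2⁄3 of waking hours” and “~95% of weeks” figures above (a back-of-the-envelope sketch, assuming the stated 9 hours/day for sleep and taking roughly 3 vacation weeks as the midpoint of the 2-4 mentioned):

$$\frac{70\ \text{h/week}}{(24-9)\ \text{h/day}\times 7\ \text{days}}=\frac{70}{105}=\frac{2}{3},\qquad \frac{52-3}{52}\approx 94\%\ \text{of weeks}.$$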
I mean, two points: 1. We all work too many hours; working 70 hours a week persistently is definitely too many to maximize output. You get dumb fast after hour 40 and dive into negative productivity. There’s a robust organizational psych literature on this, I’m given to understand, that we all choose to ignore, because for the first ~12 weeks or so you can push beyond and get more done, but then it backfires.
2. You’re literally saying statements that I used to say before burning out, and that the average consultant or banker says as part of their path to burnout. And we cannot afford to lose either of you to burnout, especially not right now.
If you’re taking a full 4 weeks, great. 2 weeks a year is definitely not enough at a 70 hours a week pace, based on the observed long term health patterns of everyone I’ve known who works that pace for a long time. I’m willing to assert that you working 48/50ths of the hours a year you’d work otherwise is worth it, assuming fairly trivial speedups in productivity of literally just over 4% from being more refreshed, getting new perspectives from downing tools, etc.
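A quick check of that break-even figure, under the simple assumption that output is just hours worked times average per-hour productivity:

$$\frac{48}{50}\times p \ge 1 \;\Rightarrow\; p \ge \frac{50}{48}\approx 1.042,$$

i.e. cutting hours to 48/50ths comes out ahead so long as the remaining hours are a bit over 4% more productive.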
Burnout is not a result of working a lot; it’s a result of work not feeling like it pays out in ape-enjoyableness[citation needed]. So they very well could be having a grand ol’ time working a lot, if their attitude towards intended amount of success matches up comfortably with actual success and they find this to pay out in a felt currency which is directly satisfying. I get burned out when effort ⇒ results ⇒ natural rewards gets broken, e.g. because of being unable to succeed at something hard, or forgetting to use money to buy things my body would like to be paid with.
If someone did a detailed literature review or had relatively serious evidence, I’d be interested. By default, I’m quite skeptical of your level of confidence in these claims given that they directly contradict my experience and the experience of people I know. (E.g., I’ve done similar things for way longer than 12 weeks.)
To be clear, I think I currently work more like 60 hours a week depending on how you do the accounting, I was just defending 70 hours as reasonable and I think it makes sense to work up to this.
That said, I do think there’s enough evidence that I would bet (not at extreme odds) that it is bad for productivity to have organizational cultures that emphasize working very long hours (say > 60 hours / week), unless you are putting in special care to hire people compatible with that culture. Partly this is because I expect organizations to often be unable to overcome weak priors even when faced with blatant evidence.
i think there’s a lot of variance. i personally can only work in unpredictable short intense bursts, during which i get my best work done; then i have to go and chill for a while. if i were 1 year away from the singularity i’d try to push myself past my normal limits and push chilling to a minimum, but doing so now seems like a bad idea. i’m currently trying to fix this more durably in the long run but this is highly nontrivial
Oh that makes sense, thanks. That seems more like a thing for people whose work comes from internal inspiration / is more artistic, and also for people who have personal or psychological frictions that cause them to burn out a lot when they do this sort of burst-y work.
I think a lot of my work is heavily pulled out of me by the rest of the world setting deadlines (e.g. users making demands, people arriving for an event, etc), and I can cause those sorts of projects to pull lots of work out of me more regularly. I also think I don’t take that much damage from doing it.
it still seems bad to advocate for the exactly wrong policy, especially one that doesn’t make sense even if you turn out to be correct (as habryka points out in the original comment, many think 2028 is not really when most people expect agi to have happened). it seems very predictable that people will just (correctly) not listen to the advice, and in 2028 both sides on this issue will believe that their view has been vindicated—you will think of course rationalists will never change their minds and emotions on agi doom, and most rationalists will think obviously it was right not to follow the advice because they never expected agi to definitely happen before 2028.
i think you would have much more luck advocating for chilling today and citing past evidence to make your case..
it still seems bad to advocate for the exactly wrong policy, especially one that doesn’t make sense even if you turn out to be correct (as habryka points out in the original comment, many think 2028 is not really when most people expect agi to have happened).
I’m super sensitive to framing effects. I notice one here. I could be wrong, and I’m guessing that even if I’m right you didn’t intend it. But I want to push back against it here anyway. Framing effects don’t have to be intentional!
It’s not that I started with what I thought was a wrong or bad policy and tried to advocate for it. It’s that given all the constraints, I thought that preregistering a possibility as a “pause and reconsider” moment might be the most effective and respectful. It’s not what I’d have preferred if things were different. But things aren’t different from how they are, so I made a guess about the best compromise.
I then learned that I’d made some assumptions that weren’t right, and that determining such a pause point that would have collective weight is much more tricky. Alas.
But it was Oliver’s comment that brought this problem to my awareness. At no point did I advocate for what I thought at the time was the wrong policy. I had hope because I thought folk were laying down some timeline predictions that could be falsified soon. Turns out, approximately nope.
i think you would have much more luck advocating for chilling today and citing past evidence to make your case..
Empirically I disagree. That demonstrably has not been within the reach of my skill to do effectively. But it’s a sensible thing to consider trying again sometime.
to be clear, I am not intending to claim that you wrote this post believing that it was wrong. I believe that you are trying your best to improve the epistemics and I commend the effort.
I had interpreted your third sentence as still defending the policy of the post even despite now agreeing with Oliver, but I understand now that this is not what you meant, and that you are no longer in favor of the policy advocated in the post. my apologies for the misunderstanding.
I don’t think you should just declare that people’s beliefs are unfalsifiable. certainly some people’s views will be. but finding a crux is always difficult and imo should be done through high bandwidth talking to many people directly to understand their views first (in every group of people, especially one that encourages free thinking among its members, there will be a great diversity of views!). it is not effective to put people on blast publicly and then backtrack when people push back saying you misunderstood their position.
I realize this would be a lot of work to ask of you. unfortunately, coordination is hard. it’s one of the hardest things in the world. I don’t think you have any moral obligation to do this beyond any obligation you feel to making AI go well / improving this community. I’m mostly saying this to lay out my view of why I think this post did not accomplish its goals, and what I think would be the most effective course of action to find a set of cruxes that truly captures the disagreement. I think this would be very valuable if accomplished and it would be great if someone did it.
Of course, I agree that in some worlds AI progress has substantially slowed down, and we have received evidence that things will take longer, but “are we alive and are things still OK in 2028?” is a terrible way to operationalize that. Most people do not expect anything particularly terrible to have happened by 2028!
Sure, but to the extent that we put probability mass on AGI as early as 2027, we correspondingly should update from not having seen it, and especially not having seen the precursors we expect to see, by then.
If I haven’t seen an AI produce a groundbreaking STEM paper by 2027, my probability that LLMs + RL will scale to superintelligence drops from about 80% to about 70%.
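A minimal sketch of that kind of update, with made-up numbers (the buckets and probabilities below are illustrative, not anyone’s actual forecast): zero out the mass on the timeline bucket that failed to materialize and renormalize what’s left.

```python
# Toy sketch (made-up numbers) of conditioning a timeline distribution on
# "no AGI observed by end of 2027" and renormalizing the remaining mass.
prior = {"by 2027": 0.10, "2028-2032": 0.40, "2033-2040": 0.30, "later / never": 0.20}

# Observe: 2027 passes without AGI, so that bucket's mass goes to zero.
posterior = {k: v for k, v in prior.items() if k != "by 2027"}
total = sum(posterior.values())  # 0.90
posterior = {k: round(v / total, 3) for k, v in posterior.items()}

print(posterior)  # {'2028-2032': 0.444, '2033-2040': 0.333, 'later / never': 0.222}
# Later buckets grow proportionally; the update by itself says nothing about
# how dangerous AGI will be if and when it does arrive.
```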
My best guess, though I am far from confident, is that things will mostly get continuously more crunch-like from here, as things continue to accelerate
I think on one hand, things will totally get more crunch-like as time goes on, but also, I think “working hard now” is more leveraged than “working hard later”, because now is when the world is generally waking up and orienting; the longer you wait, the more the entrenched powers-that-be will be dominating the space and controlling the narrative.
I actually was planning to write a post that was sort-of-the-complement of Val’s, i.e. “Consider ‘acting as if short timelines’ for the next year or so, then re-evaluate” (I don’t think you need to wait till 2028 to see how prescient the 2027 models are looking)
I’m not sure what to think in terms of “how much to chill out”, which is probably a wrong question. I think if there was a realistic option to “work harder and burn out more afterwards” this year, I would, but I sort of tried that and then immediately burned out more than seemed useful even as a short-term tradeoff.
Most people do not expect anything particularly terrible to have happened by 2028!
Yup. I think we’re missing ~1 key breakthrough, followed by a bunch of smaller tweaks, before we actually hit AGI. But I also suspect that the road from AGI to ASI is very short, and that the notion of “aligned” ASI is straight-up copium. So if an ASI ever arrives, we’ll get whatever future the ASI chooses.
In other words, I believe that:
LLMs alone won’t quite get us to AGI.
But there exists a single, clever insight which would close at least half the remaining distance to AGI.
That insight is likely a “recipe for ruin”, in the sense that once published, it can’t be meaningfully controlled. The necessary training steps could be carried out in secret by many organizations, and a weak AGI might be able to run on a 2028 Mac Studio.
(No, I will not argue for the above points. I have a few specific candidates for the ~1 breakthrough between us and AGI, and yes, those candidates are being very actively researched by serious people.)
But this makes it hard for me to build an AGI timeline. It’s possible someone has already had the key insight, and that they’re training a weak, broken AGI even as we speak. And it’s possible that as soon as they publish, the big labs will know enough to start training runs for a real AGI. But it’s also possible that we’re waiting on a theoretical breakthrough. And breakthroughs take time.
So I am… resigned. Que será, será. I won’t do capabilities work. I will try to explain to people that if we ever build an ASI, the ASI will very likely be the one making all the important decisions. But I won’t fool myself into thinking that “alignment” means anything more than “trying to build a slightly kinder pet owner for the human race.” Which is, you know, a worthy goal! If we’re going to lose control over everything, better to lose control to something that’s more-or-less favorably disposed.
I do agree that 2028 is a weird time to stop sounding the alarm. If I had to guess, 2026-2028 might be years of peak optimism, when things still look like they’re going reasonably well. If I had to pick a time period where things go obviously wrong, I’d go with 2028-2035.
there exists a single, clever insight which would close at least half the remaining distance to AGI
By my recollection, this specific possibility (and neighboring ones, like “two key insights” or whatever) has been one of the major drivers of existential fear in this community for at least as long as I’ve been part of it. I think Eliezer expressed something similar. Something like “For all we know, we’re just one clever idea away from AGI, and some guy in his basement will think of it and build it. That could happen at any time.”
I don’t know your reasons for thinking we’re just one insight away, and you explicitly say you don’t want to present the arguments here. Which makes sense to me!
I just want to note that from where I’m standing, this kind of thinking and communicating sure looks like a possible example of the type of communication pattern I’m talking about in the OP. I’m actually not picky about the trauma model specifically. But it totally fits the bill of “Models spread based significantly on how much doom they seem to plausibly forecast.” Which makes some sense if there really is a severe doom you’re trying to forecast! But it also puts a weird evolutionary incentive on the memes: if they can, they’ll develop mutations designed to seem very plausible and amplify the feeling of doom, decoupling from that pesky reality that slows down how effectively the memes can mutate content designed to encourage their spread.
I can’t know whether or not that’s what’s going on with your “one clever insight away” model, or with why you’re sharing it the way you are. I’d have to see the reasoning. I don’t mean to dismiss what you’re saying as “just trauma”; that’d be a doubly unkind way of oversimplifying what I’m saying.
But at the same time, I find myself skeptical of any naked AI doom model that sounds like “Look, we’re basically guaranteed to be screwed, but this margin is too small for me to explain. But the conclusion is very very bad.”
I cannot distinguish between that being an honest report of a good model that’s too big or gnarly to explain, versus a virulent meme riding on folks’ subconscious fixation (for whatever reason) on doom.
And thus I put it in my “Well, maybe.” box, and mostly ignore it.
and neighboring ones, like “two key insights” or whatever
I… kinda feel like there’s been one key insight since you were in the community? Specifically I’m thinking of transformers, or whatever it is that got us from pre-GPT era to GPT era.
Depending on what counts as “key” of course. My impression is there’s been significant algorithmic improvements since then but not on the same scale. To be fair it sounds like Random Developer has a lower threshold than I took the phrase to mean.
But I do think someone guessing “two key insights away from AGI” in say 2010, and now guessing “one key insight away from AGI”, might just have been right then and be right now?
(I’m aware that you’re not saying they’re not, but it seemed worth noting.)
(Re the “missed the point” reaction, I claim that it’s not so much that I missed the point as that I wasn’t aiming for the point. But I recognize that reactions aren’t able to draw distinctions that finely.)
By my recollection, this specific possibility (and neighboring ones, like “two key insights” or whatever) has been one of the major drivers of existential fear in this community for at least as long as I’ve been part of it.
I work with LLMs professionally, and my job currently depends on accurate capabilities evaluation. To give you an idea of the scale, I sometimes run a quarter million LLM requests a day. Which isn’t that much, but it’s something.
A year ago, I would have vaguely guesstimated that we were about “4-5 breakthroughs” away. But those were mostly unknown breakthroughs. One of those breakthroughs actually occurred (reasoning models and mostly coherent handling of multistep tasks).
But I’ve spent a lot of time since then experimenting with reasoning models, running benchmarks, and reading papers.
When I predict that “~1 breakthrough might close half the remaining distance to AGI,” I now have something much more specific in mind. There are multiple research groups working hard on it, including at least one frontier lab. I could sketch out a concrete research plan and argue in fairly specific detail why this is the right place to look for a breakthrough. I have written down very specific predictions (and stored them somewhere safe), just to keep myself honest.
If I thought getting close to AGI was a good thing, then I believe in this idea enough to spend, oh, US$20k out of pocket renting GPUs. I’ll accept that I’m likely wrong on the details, but I think I have a decent chance of being in the ballpark. I could at least fail interestingly enough to get a job offer somewhere with real resources.
But I strongly suspect that AGI leads almost inevitably to ASI, and to loss of human control over our futures.
And thus I put it in my “Well, maybe.” box, and mostly ignore it.
Good. I am walking a very fine line here. I am trying to be just credible and specific enough to encourage a few smart people to stop poking the demon core quite so enthusiastically, but not so specific and credible that I make anyone say, “Oh, that might work! I wonder if anyone working on that is hiring.”
I am painfully aware that OpenAI was founded to prevent a loss of human control, and that it has arguably done more than any other human organization to cause what it was founded to prevent.
(And please note—I have updated away from AI doom in the past, and there are conditions under which I would absolutely do so again. It’s just 2028 is a terrible year for making updates on my model, since my models for “AI Doom” and “AI fizzle” make many of the same predictions for the next few years.)
I don’t appreciate the local discourse norm of “let’s not mention the scary ideas but rest assured they’re very very scary”. It’s not healthy. If you explained the idea, we could shoot it down! But if it’s scary and hidden then we can’t.
Also, multiple frontier labs are currently working on it and you think your lesswrong comment is going to make a difference?
You should at least say by when you will consider this specific single breakthrough thing to be falsified.
There’s quite a difference between a couple frontier labs achieving AGI internally and the whole internet being able to achieve AGI on a llama/deepseek base model, for example.
(1) Do the currently missing LLM abilities scale like pre-training, where each improvement requires spending 10x as much money?
(2) Or do the currently missing abilities scale more like “reasoning”, where individual university groups could fine-tune an existing model for under $5,000 in GPU costs, and give it significant new abilities?
(3) Or is the real situation somewhere in between?
Category (2) is what Bostrom described as a “vulnerable world”, or a “recipe for ruin.” Also, not everyone believes that “alignment” will actually work for ASI. Under these assumptions, widely publishing detailed proposals in category (2) would seem unwise?
Also, even if I believed that someone would figure out the necessary insights to build AGI, it still matters how quickly they do it. Given a choice of dying of cancer in 6 months or 12 (all other things being equal), I would pick 12.
(I really ought to make an actual discussion post on the right way to handle even “recipes for small-scale ruin.” After September 11th, this was a regular discussion among engineers and STEM types. It turns out that there are some truly nasty vulnerabilities that are known to experts, but that are not widely known to the public. If these vulnerabilities can be fixed, it’s usually better to publicize them. But what should you do if a vulnerability is fundamentally unfixable?)
Exactly! The frontier labs have the compute and incentive to push capabilities forward, while randos on lesswrong are instead more likely to study alignment in weak open source models
I think we have both the bitter lesson (transformers will continue to gain capabilities with scale) and also optimizations that will apply to intelligent models generally, orthogonally to compute scale. The latter details seem dangerous to publicize widely, in case we happen to be in the world of a hardware overhang allowing AGI or RSI (which I think could be achieved easier/sooner by a “narrower” coding agent and then lead rapidly to AGI) on smaller-than-datacenter clusters of machines today.
This feels kind of backwards, in the sense that I think something like 2032-2037 is probably the period that most people I know who have reasonably short timelines consider most likely.
AI 2027 is a particularly aggressive timeline compared to the median, so if you choose 2028 as some kind of Schelling time to decide whether things are markedly slower than expected then I think you are deciding on a strategy that doesn’t make sense by like 80% of the registered predictions that people have.
Even the AI Futures team themselves have timelines that put more probability mass on 2029 than 2027, IIRC.
Of course, I agree that in some worlds AI progress has substantially slowed down, and we have received evidence that things will take longer, but “are we alive and are things still OK in 2028?” is a terrible way to operationalize that. Most people do not expect anything particularly terrible to have happened by 2028!
My best guess, though I am far from confident, is that things will mostly get continuously more crunch-like from here, as things continue to accelerate. The key decision-point in my model at which things might become a bit different is if we hit the end of the compute overhang, and you can’t scale up AI further simply by more financial investment, but instead now need to substantially ramp up global compute production, and make algorithmic progress, which might markedly slow down progress.
I agree with a bunch of other things you say about it being really important to have some faith in humanity, and to be capable of seeing what a good future looks like even if it’s hard, and that this is worth spending a lot of effort and attention on, but just the “I propose 2028 as the time to re-evaluate things, and I think we really want to change things if stuff still looks fine” feels to me like it fails to engage with people’s actually registered predictions.
Situational Awareness and AI 2027 have been signal boosted to normal people more than other predictions, though. Like, they both have their own website, AI 2027 has a bunch of fancy client-side animation and Scott Alexander collaborated, and someone made a Youtube video on AI 2027.
While AI safety people have criticized the timeline predictions to some extent, there hasn’t been much in-depth criticism (aside from the recent very long post on AI 2027), the general sentiment on their timelines seems positive (although Situational Awareness has been criticized for contributing to arms race dynamics).
I get that someone who looks at AI safety people’s timelines in more detail would get a different impression. Though, notably, Metaculus lists Jan 2027 as a “community prediction” of “weakly general AI”. Sure, someone could argue that weakly general AI doesn’t imply human-level AGI soon after, but mostly when I see AI safety people point to this Metaculus market, it’s as evidence that experts believe human-level AGI will arrive in the next few years, there is not emphasis on the delta between weakly general AI and human-level AGI.
So I see how an outsider would see more 2027-2029 timelines from AI safety people and assume that’s what they consider a reasonable median, if they aren’t looking into it closely. This is partially due to internet/social media attention dynamics, where surprising predictions get more attention.
What I think we could agree on is that, if 2028 rolls around and things seem pretty much like today, then whatever distributed attention algorithm promotes things like Situational Awareness and AI 2027 to people’s attention in practice leads to flawed predictions, and people who were worried because of listening to this algorithm should chill out and re-evaluate. (We probably also agree they should chill out today because this popular content isn’t reflective of AI safety people’s general opinions.)
Yep, agree that there is currently a biased coverage towards very short timelines. I think this makes sense in that the worlds where things are happening very soon are the worlds that from the perspective of a reasonable humanity require action now.[1]
I think despite the reasonable justification for focusing on the shorter timelines worlds for decision-making reasons, I do expect this to overall cause a bunch of people to walk away with the impression that people confidently predicted short timelines, and this in turn will cause a bunch of social conflict and unfortunate social accounting to happen in most worlds.
I on the margin would be excited to collaborate with people who would want to do similar things to AI 2027 or Situational Awareness for longer timelines.
I.e. in as much as you model the government as making reasonable risk-tradeoffs in the future, the short timeline worlds are the ones that require intervention to cause changes in decision-making now.
I am personally more pessimistic about humanity doing reasonable things, and think we might just want to grieve over short timeline worlds, but I sure don’t feel comfortable telling other people to not ring the alarm bell on potentially very large risks happening very soon, which seems plausible enough to me that absolutely it should be among the top considerations for most decision-makers out there.
Even if it does make sense strategically to put more attention on shorter timelines, that sure does not seem to be what actually drives the memetic advantage of short forecasts over long forecasts. If you want your attention to be steered in strategically-reasonable ways, you should probably first fully discount for the apparent memetic biases, and then go back and decide how much is reasonable to re-upweight short forecasts. Whatever bias the memetic advantage yields is unlikely to be the right bias, or even the right order of magnitude of relative attention bias.
I mean, I am not even sure it’s strategic given my other beliefs, and I was indeed saying that on the margin more longer-timeline coverage is worth it, so I think we agree.
What’s the longest timeline that you could still consider a short timeline by your own metric, and therefore a world “we might just want to grieve over”? I ask because, in your original comment you mentioned 2037 as a reasonably short timeline, and personally if we had an extra decade I’d be a lot less worried.
About 15 years, I think?
Edit: Oops, I responded to the first part of your question, not the second. My guess is timelines with less than 5 years seem really very hard, though we should still try. I think there is lots of hope in the 5-15 year timeline worlds. 15 years is just roughly the threshold of when I would stop considering someone’s timelines “short”, as a category.
I admit, it’s pretty disheartening to hear that, even if we had until 2040 (which seems less and less likely to me anyway), you’d still think there’s not much we could do but grieve in advance.
And communication strategies based on appealing to such people’s reliance on those algorithms should also re-evaluate.
E.g., why did folk write AI 2027? Did they honestly think the timeline was that short? Were they trying to convey a picture that would scare people with something on a short enough timeline that they could feel it?
If the latter, we might be doing humanity a disservice, both by exhausting people from something akin to adrenal fatigue, and also as a result of crying wolf.
Yes, I honestly thought the timeline was that short. I now think it’s 50% by end of 2028; over the last year my timelines have lengthened by about a year.
Well extrapolating that it sounds like things are fine. :P
It has indeed been really nice, psychologically, to have timelines that are lengthening again. 2020 to 2024 that was not the case.
You wrote AI 2027 in April… what changed in such a short amount of time?
If your timelines lengthened over the last year, do you think writing AI 2027 was an honest reflection of your opinions at the time?
The draft of AI 2027 was done in December, then we had months of editing and rewriting in response to feedback. For more on what changed, see various comments I made online such as this one: https://www.lesswrong.com/posts/cxuzALcmucCndYv4a/daniel-kokotajlo-s-shortform?commentId=dq6bpAHeu5Cbbiuyd
We said right on the front page of AI 2027 in a footnote that our actual AGI timelines medians were somewhat longer than 2027:
I also mentioned my slightly longer timelines in various interviews about it, including the first one with Kevin Roose.
OpenAI researcher Jason Wei recently stated that there will be many bottlenecks to recursive self improvement (experiments, data), thoughts?
https://x.com/_jasonwei/status/1939762496757539297z
He makes some obvious points everyone already knows about bottlenecks etc. but then doesn’t explain why all that adds up to a decade or more, instead of of a year, or a month, or a century. In our takeoff speeds forecast we try to give a quantitative estimate that takes into account all the bottlenecks etc.
Isn’t it more like “I think there’s a 10% chance of transformative AI by 2027, and that is like 100x higher than what it looks like most people think, so people really need to think thru that timeline”?
Like, I generally put my median year at 2030-2032; if we make it to 2028, the situation will still feel like “oh jeez we probably only have a few years left”, unless we made it to 2028 thru a mechanism that clearly blocks transformative AI showing up in 2032. (Like, a lot is hinging on what “feels basically like today” means.)
I think Daniel also just has shorter timelines than most (which is correlated for wanting to more urgently communicate that knowledge).
That might be. It sounds really plausible. I don’t know why they wrote it!
But all the same: I don’t think most people know what 10% likelihood of a severe outcome is like or how to think about it sensibly. My read is that the vast majority of people need to treat 10% likelihood of doom as either “It’s not going to happen” (because 10% is small) or “It’s guaranteed to happen” (because it’s a serious outcome if it does happen, and it’s plausible). So, amplifying the public awareness of this possibility seems more to me like moving awareness of the scenario from “Nothing existential is going to happen” to “This specific thing is the default thing to expect.”
So I expect that unless something is done to… I don’t know, magically educate the population on statistical thinking, or propagate a public message that it’s roughly right but its timeline is wrong? then the net effect will be that either (a) AI 2027 will have been collectively forgotten by 2028 in roughly the same way that, say, Trudeau’s use of the Emergencies Act has been forgotten; or (b) the predictions failing to pan out will be used as reason to dismiss other AI doom predictions that are apparently considered more likely.
The main benefit I see is if some key folk are made to think about AI doom scenarios in general as a result of AI 2027, and start to work out how to deal with other scenarios.
But I don’t know. That’s been part of this community’s strategy for over two decades. Get key people thinking about AI risk. And I’m not too keen on the results I’ve seen from that strategy so far.
it does imply that, but i’m somewhat loathe to mention this at all, because i think the predictive quality you get from one question to another varies astronomically, and this is not something the casual reader will be able to glean
I honestly didn’t know that. Thank you for mentioning it. Almost everything I hear is people worrying about AGI in the next few years, not AGI a decade from now.
Just to check: you’re saying that by 2028 something like 80% of the registered predictions still won’t have relevant evidence against them?
As both a pragmatic matter and a moral one, I really hope we can find ways of making more of those predictions more falsifiable sooner. If anything in the vague family of what I’m saying is right, but if folk won’t halt, melt, & catch fire for at least a decade, then that’s an awful lot of pointless suffering and wasted talent. I’m also skeptical that there’ll actually be a real HMC in a decade; if the metacognitive blindspot type I’m pointing at is active, then waiting a decade is part of the strategy for not having to look, and it’ll come up with yet more reasons not to look when AGI doesn’t happen ten years from now too.
Cool. Noted. So, what can we observe by 2028 that’d cause us pause about whether there’s a collective confusion/projection playing a significant role?
This angle misses my point. Your analysis makes sense, but it’s talking about something different.
I think you’re trying to suggest what observations would cause you to update about AI timelines being longer than you currently think they are. Yes?
I’m asking what we should observe if it turns out that the degree of focus on narrating doom is more due to an underlying emotional distress than due to correct calibration to the situation we’re in.
One effect of that might be that AI timelines bias toward the pessimistic, which means that on average we should keep finding that they’re longer than the consensus keeps converging on. But I think that’s both slow and noisy as a way of detecting it.
Noted. Thanks for pointing it out. I do think I failed to engage this way.
I still think my overall point stands though. I didn’t think things would look fine; my expectation is that things will continue to look ever more dire. I’m just hoping that if we can distinguish between (a) dire because things are in fact getting worse versus (b) dire because that’s what the emotional Shepard tone does, then in 2.5 years we could pause and check which one seems to be responsible for the increase in doom, and if it’s the latter then HMC.
So, again: what could we observe at the start of 2028 that would create pause this way?
As you implied above, pessimism is driven only secondarily by timelines. If things in 2028 don’t look much different than they do now, that’s evidence for longer timelines (maybe a little longer, maybe a lot). But it’s inherently not much evidence about how dangerous superintelligence will be when it does arrive. If the situation is basically the same, then our state of knowledge is basically the same.
So what would be good evidence that worrying about alignment was unnecessary? The obvious one is if we get superintelligence and nothing very bad happens, despite the alignment problem remaining unsolved. But that’s like pulling the trigger to see if the gun is loaded. Prior to superintelligence, I’d personally be more optimistic if we saw AI progress requiring even faster compute growth than the current trend—if the first superintelligences were very reliant on massive pools of tightly integrated compute, and had very limited inference capacity, that would make us less vulnerable and give us more time to adapt to them. Also, if we saw a slowdown in algorithmic progress despite widespread deployment of increasingly capable coding software, that would be a very encouraging sign that recursive self-improvement might happen slowly.
Very little. I’ve been seriously thinking about ASI since the early 00s. Around 2004-2007, I put my timeline around 2035-2045, depending on the rate of GPU advancements. Given how hardware and LLM progress actually played out, my timeline is currently around 2035.
I do expect LLMs (as we know them now) to stall before 2028, if they haven’t already. Something is missing. I have very concrete guesses as to what is missing, and it’s an area of active research. But I also expect the missing piece adds less than a single power of 10 to existing training and inference costs. So once someone publishes it in any kind of convincing way, then I’d estimate better than an 80% chance of uncontrolled ASI within 10 years.
Now, there are lots of things I could see in 2035 that would cause me to update away from this scenario. I did, in fact, update away from my 2004-2007 predictions by 2018 or so, largely because nothing like ChatGPT 3.5 existed by that point. GPT-3 made me nervous again, and 3.5 Instruct caused me to update all the way back to my original timeline. And if we’re still stalled in 2035, then sure, I’ll update heavily away from ASI again. But I’m already predicting the LLM S-curve to flatten out around now, resulting in less investment in Chinchilla scaling and more investment in algorithmic improvement. But since algorithmic improvement is (1) hard to predict, and (2) where I think the actual danger lies, I don’t intend to make any near-term updates away from ASI.
I think compute scaling will slow substantially by around 2030 (edit: if we haven’t seen transformative AI). (There is some lag, so I expect the rate at which capex is annually increasing to already have slowed by mid 2028 or so, but this will take a bit before it hits scaling.)
Also, it’s worth noting that most algorithmic progress AI companies are making is driven by scaling up compute (because scaling up labor in an effective way is so hard: talented labor is limited, humans parallelize poorly, and you can’t pay more to make them run faster). So, I expect algorithmic progress will also slow around this point.
All these factors make me think that something like 2032 or maybe 2034 could be a reasonable Schelling time (I agree that 2028 is a bad Schelling time), but IDK if I see that much value in having a Schelling time (I think you probably agree with this).
In practice, we should be making large updates (in expectation) over the next 5 years regardless.
There will be signs if it slows down earlier: it’s possible that in 2027-2028 we are already observing that there is no resolve to start building 5 GW Rubin Ultra training systems (let alone the less efficient but available-a-year-earlier 5 GW non-Ultra Rubin systems), so that we can already update then, without waiting for 2030.
This could result from some combination of underwhelming algorithmic progress, RLVR scaling not working out, and the 10x compute scaling from 100K H100 chips to 400K GB200 chips not particularly helping, so that AIs of 2027 fail to be substantially more capable than AIs of 2025.
But sure, this doesn’t seem particularly likely. And there will be even earlier signs that the scaling slowdown isn’t happening before 2027-2028 if the revenues of companies like OpenAI and Anthropic keep sufficiently growing (in 2025-2026), though most of these revenues might also be indirectly investment-fueled, threatening to evaporate if AI stops improving substantially.
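For what it’s worth, a quick back-of-envelope on the implied per-chip factor in that 10x scaling step; this just takes the ~10x figure and the chip counts mentioned above at face value, and isn’t a claim about actual GB200 specs:

```python
# Back-of-envelope on the scaling step mentioned above, taking the stated
# chip counts and the ~10x overall compute figure at face value.
h100_count = 100_000
gb200_count = 400_000
overall_compute_ratio = 10                      # the ~10x figure from the comment
count_ratio = gb200_count / h100_count          # 4x as many chips
implied_per_chip_ratio = overall_compute_ratio / count_ratio
print(f"implied per-chip compute factor: ~{implied_per_chip_ratio:.1f}x")  # ~2.5x
```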
my synthesis is I think people should chill out more today and sprint harder as the end gets near (ofc, some fraction of people should always be sprinting as if the end is near, as a hedge, but I think that fraction should be smaller than it is now. also, if you believe the end really is <2 years away then disregard this). the burnout thing is real and it’s a big reason for me deciding to be more chill. and there’s definitely some weird fear-driven action / negative vision thing going on. but also, sprinting now and chilling in 2028 seems like exactly the wrong policy
I agree. To be honest I didn’t think chilling out now was a real option. I hoped to encourage it in a few years with the aid of preregistration.
As a datapoint, none of this chilling out or sprinting hard discussion resonates with me. Internally I feel that I’ve been going about as hard as I know how to since around 2015, when I seriously got started on my own projects. I think I would be working about similarly hard if my timelines shortened by 5 years or lengthened by 15. I am doing what I want to do, I’m doing the best I can, and I’m mostly focusing on investing my life into building truth-seeking and world-saving infrastructure. I’m fixing all my psychological and social problems insofar as they’re causing friction to my wants and intentions, and as a result I’m able to go much harder today than I was in 2015. I don’t think effort is really a substantially varying factor in how good my output is or in my impact on the world. My mood/attitude is not especially dour and I’m not pouring blind hope into things I secretly know are dead ends. Sometimes I’ve been more depressed or had more burnout, but that hasn’t had much to do with timelines, and more to do with the local environment I’ve been working in or internal psychological mistakes. To be clear, I try to take as little vacation time at work as I psychologically can (like 2-4 weeks per year), but that’s because there’s so much great stuff for me to build over the next decade(s), and that’d be true if I had 30-year timelines.
I am sure other people are doing differently-well, but I would like to hear from such people about their experience of things (or for people here to link to others’ writing). (I might also be more interested in the next Val post being an interview with someone, rather than broad advice.)
Added: I mean, I do sometimes work 70-hour weeks, and I sometimes work 50-hour weeks, but this isn’t a simple internal setting I can adjust; it’s much more a fact about what the work demands of me. I could work harder, but primarily by picking projects that require it and where the external world sets deadlines for me, not by “deciding” to work harder. (I’ve never really been able to make that decision; as far as I can quickly recall, it’s always failed whenever I’ve tried.)
I would strongly, strongly argue that essentially “take all your vacation” is a strategy that would lead to more impact for you on your goals, almost regardless of what they are.
Humans need rest, and humans like the folks on LW tend not to take enough.
Naively, working more will lead to more output, and if someone thinks they feel good while working a lot, I think the default guess should be that working more is improving their output. I would be interested in the evidence you have for the claim that people operating similarly to how Ben described should take more vacation.
I think there is some minimum amount of breaks and vacation that people should strongly default to taking and it also seems good to take some non-trivial amount of time to at least reflect on their situation and goals in different environments (you can think of this as a break, or as a retreat).
But, 2-4 weeks per year of vacation combined with working more like 70 hours a week seems like a non-crazy default if it feels good. This is only working around 2⁄3 of waking hours (supposing 9 hours for sleep and getting ready for sleep) and working ~95% of weeks. (And Ben said he works 50-70 hours, not always 70.)
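For concreteness, here’s the arithmetic behind those fractions as a quick sketch, using only the numbers stated above (9 hours a day for sleep and getting ready, a 70-hour week, and 2-4 weeks of vacation):

```python
# Rough check of the fractions above, using only the assumptions stated in this comment:
# 9 hours/day for sleep and getting ready for sleep, a 70-hour work week,
# and 2-4 weeks of vacation per year.
waking_hours_per_week = 7 * (24 - 9)       # 105 waking hours per week
work_share = 70 / waking_hours_per_week    # ~0.67, i.e. about 2/3

weeks_worked_min = (52 - 4) / 52           # ~0.92 with 4 weeks off
weeks_worked_max = (52 - 2) / 52           # ~0.96 with 2 weeks off

print(f"share of waking hours worked: {work_share:.0%}")
print(f"share of weeks worked: {weeks_worked_min:.0%}-{weeks_worked_max:.0%}")
```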
It’s worth noting that “humans perform better with more rest” isn’t a sufficient argument for thinking more rest is impactful: you need to argue this effect overwhelms the upsides of additional work. (Including things like returns to being particularly fast and possible returns to scale on working hours.)
I mean, two points:
1. We all work too many hours; working 70 hours a week persistently is definitely too many to maximize output. You get dumb fast after hour 40 and dive into negative productivity. There’s a robust organizational psych literature on this, I’m given to understand, that we all choose to ignore, because for the first ~12 weeks or so you can push beyond and get more done, but then it backfires.
2. You’re literally saying statements that I used to say before burning out, and that the average consultant or banker says as part of their path to burnout. And we cannot afford to lose either of you to burnout, especially not right now.
If you’re taking a full 4 weeks, great. 2 weeks a year is definitely not enough at a 70 hours a week pace, based on the observed long term health patterns of everyone I’ve known who works that pace for a long time. I’m willing to assert that you working 48/50ths of the hours a year you’d work otherwise is worth it, assuming fairly trivial speedups in productivity of literally just over 4% from being more refreshed, getting new perspectives from downing tools, etc.
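To spell out where that “just over 4%” comes from, a minimal breakeven calculation, assuming for simplicity that output per working week is otherwise constant:

```python
# Breakeven productivity gain needed for working 48 weeks instead of 50
# to be output-neutral, assuming constant output per working week otherwise.
weeks_full = 50
weeks_rested = 48
breakeven_gain = weeks_full / weeks_rested - 1
print(f"required speedup: {breakeven_gain:.1%}")   # ~4.2%, i.e. "just over 4%"
```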
Burnout is not a result of working a lot; it’s a result of work not feeling like it pays out in ape-enjoyableness[citation needed]. So they could very well be having a grand ol’ time working a lot, if their attitude towards the intended amount of success matches up comfortably with actual success and they find this to pay out in a felt currency which is directly satisfying. I get burned out when effort ⇒ results ⇒ natural rewards gets broken, e.g. because of being unable to succeed at something hard, or forgetting to use money to buy things my body would like to be paid with.
If someone did a detailed literature review or had relatively serious evidence, I’d be interested. By default, I’m quite skeptical of your level of confidence in these claims given that they directly contradict my experience and the experience of people I know. (E.g., I’ve done similar things for way longer than 12 weeks.)
To be clear, I think I currently work more like 60 hours a week depending on how you do the accounting, I was just defending 70 hours as reasonable and I think it makes sense to work up to this.
I think the evidence is roughly at “this should be a weakly held prior easily overturned by personal experience”: https://www.lesswrong.com/posts/c8EeJtqnsKyXdLtc5/how-long-can-people-usefully-work
That said, I do think there’s enough evidence that I would bet (not at extreme odds) that it is bad for productivity to have organizational cultures that emphasize working very long hours (say > 60 hours / week), unless you are putting in special care to hire people compatible with that culture. Partly this is because I expect organizations to often be unable to overcome weak priors even when faced with blatant evidence.
but most of my work is very meaningful and what i want to be doing
i don’t want to see paris or play the new zelda game more than i want to make lessonline happen
i think there’s a lot of variance. i personally can only work in unpredictable short intense bursts, during which i get my best work done; then i have to go and chill for a while. if i were 1 year away from the singularity i’d try to push myself past my normal limits and push chilling to a minimum, but doing so now seems like a bad idea. i’m currently trying to fix this more durably in the long run but this is highly nontrivial
Oh that makes sense, thanks. That seems more like a thing for people whose work comes from internal inspiration / is more artistic, and also for people who have personal or psychological frictions that cause them to burn out a lot when they do this sort of burst-y work.
I think a lot of my work is heavily pulled out of me by the rest of the world setting deadlines (e.g. users making demands, people arriving for an event, etc), and I can cause those sorts of projects to pull lots of work out of me more regularly. I also think I don’t take that much damage from doing it.
it still seems bad to advocate for the exactly wrong policy, especially one that doesn’t make sense even if you turn out to be correct (as habryka points out in the original comment, many think 2028 is not really when most people expect agi to have happened). it seems very predictable that people will just (correctly) not listen to the advice, and in 2028 both sides on this issue will believe that their view has been vindicated—you will think of course rationalists will never change their minds and emotions on agi doom, and most rationalists will think obviously it was right not to follow the advice because they never expected agi to definitely happen before 2028.
i think you would have much more luck advocating for chilling today and citing past evidence to make your case.
I’m super sensitive to framing effects. I notice one here. I could be wrong, and I’m guessing that even if I’m right you didn’t intend it. But I want to push back against it here anyway. Framing effects don’t have to be intentional!
It’s not that I started with what I thought was a wrong or bad policy and tried to advocate for it. It’s that given all the constraints, I thought that preregistering a possibility as a “pause and reconsider” moment might be the most effective and respectful. It’s not what I’d have preferred if things were different. But things aren’t different from how they are, so I made a guess about the best compromise.
I then learned that I’d made some assumptions that weren’t right, and that determining such a pause point that would have collective weight is much more tricky. Alas.
But it was Oliver’s comment that brought this problem to my awareness. At no point did I advocate for what I thought at the time was the wrong policy. I had hope because I thought folk were laying down some timeline predictions that could be falsified soon. Turns out, approximately nope.
Empirically I disagree. That demonstrably has not been within the reach of my skill to do effectively. But it’s a sensible thing to consider trying again sometime.
to be clear, I am not intending to claim that you wrote this post believing that it was wrong. I believe that you are trying your best to improve the epistemics and I commend the effort.
I had interpreted your third sentence as still defending the policy of the post even despite now agreeing with Oliver, but I understand now that this is not what you meant, and that you are no longer in favor of the policy advocated in the post. my apologies for the misunderstanding.
I don’t think you should just declare that people’s beliefs are unfalsifiable. certainly some people’s views will be. but finding a crux is always difficult and imo should be done through high bandwidth talking to many people directly to understand their views first (in every group of people, especially one that encourages free thinking among its members, there will be a great diversity of views!). it is not effective to put people on blast publicly and then backtrack when people push back saying you misunderstood their position.
I realize this would be a lot of work to ask of you. unfortunately, coordination is hard. it’s one of the hardest things in the world. I don’t think you have any moral obligation to do this beyond any obligation you feel to making AI go well / improving this community. I’m mostly saying this to lay out my view of why I think this post did not accomplish its goals, and what I think would be the most effective course of action to find a set of cruxes that truly captures the disagreement. I think this would be very valuable if accomplished and it would be great if someone did it.
Sure, but to the extent that we put probability mass on AGI as early as 2027, we correspondingly should update from not having seen it, and especially not having seen the precursors we expect to see, by then.
If I haven’t seen an AI produce a groundbreaking STEM paper by 2027, my probability that LLMs + RL will scale to superintelligence drops from about 80% to about 70%.
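For what it’s worth, here’s one way to make that update concrete as a simple Bayes calculation; the two likelihoods below are purely illustrative numbers chosen to produce roughly that 80% to 70% drop, not claims about the actual strength of the evidence:

```python
# Illustrative Bayes update matching the ~80% -> ~70% drop described above.
# The two likelihoods are made-up numbers chosen only to show how such a drop
# could arise; they are not claims about the actual strength of the evidence.
prior_scales = 0.80               # P(LLMs + RL scale to superintelligence)
p_no_paper_if_scales = 0.35       # assumed P(no groundbreaking STEM paper by 2027 | scales)
p_no_paper_if_not = 0.60          # assumed P(no such paper by 2027 | does not scale)

posterior = (prior_scales * p_no_paper_if_scales) / (
    prior_scales * p_no_paper_if_scales + (1 - prior_scales) * p_no_paper_if_not
)
print(f"posterior: {posterior:.0%}")   # 70%
```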
Not quite responding to your main point, but:
I think on one hand, things will totally get more crunch-like as time goes on, but also, I think “working hard now” is more leveraged than “working hard later”, because now is when the world is generally waking up and orienting; the longer you wait, the more the entrenched powers-that-be will be dominating the space and controlling the narrative.
I actually was planning to write a post that was sort-of-the-complement of Val’s, i.e. “Consider ‘acting as if short timelines’ for the next year or so, then re-evaluate” (I don’t think you need to wait till 2028 to see how prescient the 2027 models are looking)
I’m not sure what to think in terms of “how much to chill out”, which is probably a wrong question. I think if there was a realistic option to “work harder and burn out more afterwards” this year, I would, but I sort of tried that and then immediately burned out more than seemed useful even as a short-term tradeoff.
Yup. I think we’re missing ~1 key breakthrough, followed by a bunch of smaller tweaks, before we actually hit AGI. But I also suspect that the road from AGI to ASI is very short, and that the notion of “aligned” ASI is straight-up copium. So if an ASI ever arrives, we’ll get whatever future the ASI chooses.
In other words, I believe that:
LLMs alone won’t quite get us to AGI.
But there exists a single, clever insight which would close at least half the remaining distance to AGI.
That insight is likely a “recipe for ruin”, in the sense that once published, it can’t be meaningfully controlled. The necessary training steps could be carried out in secret by many organizations, and a weak AGI might be able to run on a 2028 Mac Studio.
(No, I will not argue for the above points. I have a few specific candidates for the ~1 breakthrough between us and AGI, and yes, those candidates are being very actively researched by serious people.)
But this makes it hard for me to build an AGI timeline. It’s possible someone has already had the key insight, and that they’re training a weak, broken AGI even as we speak. And it’s possible that as soon as they publish, the big labs will know enough to start training runs for a real AGI. But it’s also possible that we’re waiting on a theoretical breakthrough. And breakthroughs take time.
So I am… resigned. Que sera, sera. I won’t do capabilities work. I will try to explain to people that if we ever build an ASI, the ASI will very likely be the one making all the important decisions. But I won’t fool myself into thinking that “alignment” means anything more than “trying to build a slightly kinder pet owner for the human race.” Which is, you know, a worthy goal! If we’re going to lose control over everything, better to lose control to something that’s more-or-less favorably disposed.
I do agree that 2028 is a weird time to stop sounding the alarm. If I had to guess, 2026-2028 might be years of peak optimism, when things still look like they’re going reasonably well. If I had to pick a time period where things go obviously wrong, I’d go with 2028-2035.
By my recollection, this specific possibility (and neighboring ones, like “two key insights” or whatever) has been one of the major drivers of existential fear in this community for at least as long as I’ve been part of it. I think Eliezer expressed something similar. Something like “For all we know, we’re just one clever idea away from AGI, and some guy in his basement will think of it and build it. That could happen at any time.”
I don’t know your reasons for thinking we’re just one insight away, and you explicitly say you don’t want to present the arguments here. Which makes sense to me!
I just want to note that from where I’m standing, this kind of thinking and communicating sure looks like a possible example of the type of communication pattern I’m talking about in the OP. I’m actually not picky about the trauma model specifically. But it totally fits the bill of “Models spread based significantly on how much doom they seem to plausibly forecast.” Which makes some sense if there really is a severe doom you’re trying to forecast! But it also puts a weird evolutionary incentive on the memes: if they can, they’ll develop mutations designed to seem very plausible and amplify the feeling of doom, decoupling from that pesky reality that slows down how effectively the memes can mutate content designed to encourage their spread.
I can’t know whether or not that’s what’s going on with your “one clever insight away” model, or with why you’re sharing it the way you are. I’d have to see the reasoning. I don’t mean to dismiss what you’re saying as “just trauma”; that’d be a doubly unkind way of oversimplifying what I’m saying.
But at the same time, I find myself skeptical of any naked AI doom model that sounds like “Look, we’re basically guaranteed to be screwed, but this margin is too small for me to explain. But the conclusion is very very bad.”
I cannot distinguish between that being an honest report of a good model that’s too big or gnarly to explain, versus a virulent meme riding on folks’ subconscious fixation (for whatever reason) on doom.
And thus I put it in my “Well, maybe.” box, and mostly ignore it.
I… kinda feel like there’s been one key insight since you were in the community? Specifically I’m thinking of transformers, or whatever it is that got us from pre-GPT era to GPT era.
Depending on what counts as “key” of course. My impression is there’s been significant algorithmic improvements since then but not on the same scale. To be fair it sounds like Random Developer has a lower threshold than I took the phrase to mean.
But I do think someone guessing “two key insights away from AGI” in say 2010, and now guessing “one key insight away from AGI”, might just have been right then and be right now?
(I’m aware that you’re not saying they’re not, but it seemed worth noting.)
(Re the “missed the point” reaction, I claim that it’s not so much that I missed the point as that I wasn’t aiming for the point. But I recognize that reactions aren’t able to draw distinctions that finely.)
I work with LLMs professionally, and my job currently depends on accurate capabilities evaluation. To give you an idea of the scale, I sometimes run a quarter million LLM requests a day. Which isn’t that much, but it’s something.
A year ago, I would have vaguely guesstimated that we were about “4-5 breakthroughs” away. But those were mostly unknown breakthroughs. One of those breakthroughs actually occurred (reasoning models and mostly coherent handling of multistep tasks).
But I’ve spent a lot of time since then experimenting with reasoning models, running benchmarks, and reading papers.
When I predict that “~1 breakthrough might close half the remaining distance to AGI,” I now have something much more specific in mind. There are multiple research groups working hard on it, including at least one frontier lab. I could sketch out a concrete research plan and argue in fairly specific detail why this is the right place to look for a breakthrough. I have written down very specific predictions (and stored them somewhere safe), just to keep myself honest.
If I thought getting close to AGI was a good thing, I believe in this idea enough that I’d spend, oh, US$20k out of pocket renting GPUs. I’ll accept that I’m likely wrong on the details, but I think I have a decent chance of being in the ballpark. I could at least fail interestingly enough to get a job offer somewhere with real resources.
But I strongly suspect that AGI leads almost inevitably to ASI, and to loss of human control over our futures.
Good. I am walking a very fine line here. I am trying to be just credible and specific enough to encourage a few smart people to stop poking the demon core quite so enthusiastically, but not so specific and credible that I make anyone say, “Oh, that might work! I wonder if anyone working on that is hiring.”
I am painfully aware that OpenAI was founded to prevent a loss of human control, and that it has arguably done more than any other human organization to cause what it was founded to prevent.
(And please note—I have updated away from AI doom in the past, and there are conditions under which I would absolutely do so again. It’s just 2028 is a terrible year for making updates on my model, since my models for “AI Doom” and “AI fizzle” make many of the same predictions for the next few years.)
I don’t appreciate the local discourse norm of “let’s not mention the scary ideas but rest assured they’re very very scary”. It’s not healthy. If you explained the idea, we could shoot it down! But if it’s scary and hidden then we can’t.
Also, multiple frontier labs are currently working on it and you think your lesswrong comment is going to make a difference?
You should at least say by when you will consider this specific single breakthrough thing to be falsified.
The universe isn’t obligated to cooperate with our ideals for discourse norms.
Exactly
The universe doesn’t care if you try to hide your oh so secret insights; multiple frontier labs are working on those insights
The only people who care are the people here getting more doomy and having worse norms for conversations.
There’s quite a difference between a couple frontier labs achieving AGI internally and the whole internet being able to achieve AGI on a llama/deepseek base model, for example.
One of my key concerns is the question of:
1. Do the currently missing LLM abilities scale like pre-training, where each improvement requires spending 10x as much money?
2. Or do the currently missing abilities scale more like “reasoning”, where individual university groups could fine-tune an existing model for under $5,000 in GPU costs and give it significant new abilities?
3. Or is the real situation somewhere in between?
Category (2) is what Bostrom described as a “vulnerable world”, or a “recipe for ruin.” Also, not everyone believes that “alignment” will actually work for ASI. Under these assumptions, widely publishing detailed proposals in category (2) would seem unwise?
Also, even if I believed that someone would figure out the necessary insights to build AGI, it still matters how quickly they do it. Given a choice of dying of cancer in 6 months or 12 (all other things being equal), I would pick 12.
(I really ought to make an actual discussion post on the right way to handle even “recipes for small-scale ruin.” After September 11th, this was a regular discussion among engineers and STEM types. It turns out that there are some truly nasty vulnerabilities that are known to experts, but that are not widely known to the public. If these vulnerabilities can be fixed, it’s usually better to publicize them. But what should you do if a vulnerability is fundamentally unfixable?)
Exactly! The frontier labs have the compute and incentive to push capabilities forward, while randos on lesswrong are instead more likely to study alignment in weak open source models
I think we have both the bitter lesson that transformers will continue to gain capabilities with scale, and also that there are optimizations that will apply to intelligent models generally and orthogonally to compute scale. The latter details seem dangerous to publicize widely, in case we happen to be in a world where a hardware overhang allows AGI or RSI (which I think could be achieved more easily/sooner by a “narrower” coding agent and then lead rapidly to AGI) on smaller-than-datacenter clusters of machines today.