This is an awesome post. I’ve read it before, but hadn’t fully internalized it.
My timelines on TAI / HLMI / 10x GDP growth are a bit longer than the BioAnchors report, but a lot of my objections to short timelines are specifically objections to short timelines for rapid GDP growth. It’s obvious after reading this that what we care about is x-risk timelines, not GDP timelines. Forecasting when x-risk might spike is more difficult because it requires focusing on specific risk scenarios, like persuasion tools or fast takeoff, rather than general growth in AI capabilities and applications. I’m not immediately convinced that x-risk timelines are shorter than GDP timelines, but you make some good arguments and I’d like to think about it more.
This strengthens an argument I’ve been making career decisions based on: People should work on what will be valuable in specific, high-risk, short-timelines scenarios, rather than what’s valuable in the most likely scenario. For example, persuasion tools might not be where the bulk of AI risk over the next 100 years comes from, but if that risk is real, its PONR could be very soon, meaning people today are the only ones who can work on it. This wouldn’t make sense if you think we’re already doomed in short-timelines scenarios, or if you think the bulk of risk comes from problems a few decades away that will take a few decades to solve. But that’s not me.
(Of course, it probably won’t literally be a day; probably it will be an extended period where we gradually lose influence over the future.)
I’d like to think about this as a distribution. Maybe we care about the probability of x-risk over time, and the most important time is not the date of human extinction, but the span of time during which x-risk is most rapidly rising. This probably runs into messy problems with probability and belief: it somewhat assumes a true underlying probability of x-risk at any given point, not subject to an individual observer’s beliefs but depending only on uncertainty about future actions we could choose to take. Is there a way to make this work, framing PONR as a distribution rather than a point in time?
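As a minimal sketch of what I mean (my own formalization, not something from the post): let F(t) be our credence that the point of no return has already passed by time t. Then the object of interest is the PONR density f(t), not any single date:

$$
F(t) \;=\; \Pr[\text{PONR has occurred by time } t], \qquad
f(t) \;=\; \frac{dF}{dt}, \qquad
t^{\ast} \;=\; \arg\max_t f(t)
$$

On this framing, the most important span is where f(t) is large (x-risk rising fastest), not where F(t) finally approaches 1; the messy part is whether F should be read as an objective chance or merely as a credence conditional on the interventions still open to us.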
I’d like to publicly preregister an opinion. It’s not worth making a full post because it doesn’t introduce any new arguments, so this seems like a fine place to put it.
I’m open to the possibility of short timelines on risks from language models. Language is a highly generalizable domain that has seen rapid progress shatter expectations of slower timelines for several years in a row now. The self-supervised pretraining objective means that data is not a constraint (though it could be for language agents; TBD), and the market seems optimistic about business applications of language models.
While I would bet against (~80%) language models pushing annual GDP growth above 20% in the next 10 years, I strongly expect (~80%) risks from AI persuasion to materialize (e.g. it becomes a mainstream topic of discussion or influences major political outcomes in the next 10 years), and I’m concerned (~20%) about tail risks from power-seeking LM agents (mainly hacking, but also financial trading, impersonation, or others). I’d be interested in (and should spend some time on) making clear, falsifiable predictions here.
Credit to “What 2026 Looks Like” and “It Looks Like You’re Trying To Take Over The World” for making this case well before I believed it was possible. I’m also influenced by the widespread interest in LMs from AI safety grantmakers and researchers. This has been my belief for a few months, as I noted here, and I’ve taken action by working on LM truthfulness, which I expect to be most useful in scenarios of fast LM growth. (Though I don’t think it will substantially combat power-seeking LM agents, and I’m still learning about other research directions that might be more valuable.)