Some versions of the METR time horizon paper from alternate universes:
Measuring AI Ability to Take Over Small Countries (idea by Caleb Parikh)
Abstract: Many are worried that AI will take over the world, but extrapolation from existing benchmarks suffers from a large distributional shift that makes it difficult to forecast the date of world takeover. We rectify this by constructing a suite of 193 realistic, diverse countries with territory sizes from 0.44 to 17 million km^2. Taking over most countries requires acting over a long time horizon, with the exception of France. Over the last 6 years, the land area that AI can successfully take over with a 50% success rate has increased from 0 to 0 km^2, at a rate of 0 km^2 per year (95% CI 0.0-0.0 km^2/year); extrapolation suggests that AI world takeover is unlikely to occur in the near future. To address concerns about the narrowness of our distribution, we also study AI ability to take over small planets and asteroids, and find similar trends.
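For the spirit of the extrapolation described above, here is a minimal sketch in Python. The yearly data points, the numpy linear fit, and the forecasting step are all illustrative assumptions, not anything from the (fictional) paper itself:

```python
import numpy as np

# Toy version of the trend extrapolation: area taken over per year, all zeros.
years = np.arange(2019, 2025)                   # the last 6 years
area_km2 = np.zeros_like(years, dtype=float)    # km^2 taken over at a 50% success rate

slope, intercept = np.polyfit(years, area_km2, 1)
print(f"trend: {slope:.1f} km^2/year")          # 0.0 km^2/year, as reported

world_land_km2 = 148.9e6                        # approximate total land area of Earth
if slope <= 0:
    print("Forecast: AI world takeover unlikely in the near future.")
else:
    print(f"Forecast takeover year: {(world_land_km2 - intercept) / slope:.0f}")
```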
Measuring AI Ability to Worry About AI
Abstract: Since 2019, the amount of time LW has spent worrying about AI has doubled every seven months, and now constitutes the primary bottleneck to AI safety research. Automation of worrying would be transformative to the research landscape, but worrying includes several complex behaviors, ranging from simple fretting to concern, anxiety, perseveration, and existential dread, and so is difficult to measure. We benchmark the ability of frontier AIs to worry about common topics like disease, romantic rejection, and job security, and find that current frontier models such as Claude 3.7 Sonnet already outperform top humans, especially in existential dread. If these results generalize to worrying about AI risk, AI systems will be capable of autonomously worrying about their own capabilities by the end of this year, allowing us to outsource all our AI concerns to the systems themselves.
Estimating Time Since The Singularity
Early work on the time horizon paper used a hyperbolic fit, which predicted that AGI (AI with an infinite time horizon) was reached last Thursday. [1] We were skeptical at first because the R^2 was extremely low, but recent analysis by Epoch suggested that AI already outperformed humans at a 100-year time horizon by about 2016. We have no choice but to infer that the Singularity has already happened, and therefore the world around us is a simulation. We construct a Monte Carlo estimate over dates since the Singularity and simulator intentions, and find that the simulation will likely be turned off in the next three to six months.
[1]: This is true
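For the Monte Carlo estimate mentioned above, here is a minimal sketch; the priors over time since the Singularity and over simulator intentions are entirely made up for illustration, not taken from the post:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Draw (made-up) priors: years elapsed since the inferred Singularity,
# and how long the simulators intend to keep the simulation running.
years_since_singularity = rng.exponential(scale=9.0, size=n)       # roughly "since ~2016"
intended_runtime_years = rng.lognormal(mean=2.2, sigma=0.5, size=n)

months_left = 12 * (intended_runtime_years - years_since_singularity)
p_off_in_3_to_6_months = np.mean((months_left >= 3) & (months_left <= 6))
print(f"P(simulation turned off in the next 3-6 months) ≈ {p_off_in_3_to_6_months:.2f}")
```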
A few months ago, I accidentally used France as an example of a small country that it wouldn’t be that catastrophic for AIs to take over, while giving a talk in France 😬
Didn’t watch the video, but is there a short version of this argument? France is at the 90th percentile of population sizes and also has the 4th-most nukes.
Would the takeover of small countries also cover humans using an advanced AI to take over? (Or would a human using an advanced AI to take over happen faster?)