I interpreted the scenario differently: I don't think it predicts professional bioweapons researchers by January 2026. Did you mean January 2027?
I think Agent-3 basically is a superhuman bioweapons researcher.
There may be a range of different rogue AIs, some of which may be finetuned / misaligned enough to want to support terrorists.
I think Agent-1 (arrives in late 2025) would be able to help capable terrorist groups acquire and deploy bioweapons based on pathogens whose acquisition methods are widely known, so no novel research is involved.
Agent-2 (arrives in January 2026) appears more than competent enough to assist experts in novel bioweapons research, but not smart enough to enable novices to do it.
Agent-3 (arrives in March 2027) could probably enable novices to design novel bioweapons.
However, terrorists may not get access to these specific models. Open-weights models lag the best closed models (released through APIs and apps) by a few months. Even the most reckless AI companies would probably hesitate to let anyone access Agent-3-level model weights (I hope). Consequently, rogues will likely also be a few months behind the frontier in capabilities.
While terrorists may face alignment issues with their AIs, even Agent-3 doesn't appear smart/agentic enough to subvert efforts to shape it in various ways. Terrorists may use widely available AIs, perhaps with some minor finetuning, instead of enlisting the help of rogues, if that proves easier.
Except that the AI-2027 compute forecast has Agent-1 start training (and working for OpenBrain?) in August 2025 and finish training in March 2026; Agent-2 from the scenario was thought to start training in May 2026, then become Agent-3 after receiving neuralese and Agent-4 after some combination of unknown breakthroughs, soul-searching, and memetic evolution.
If Agent-3 isn’t agentic enough, does that mean only Agent-4-level AIs are capable of having an agenda? This prediction of how knowledge and agency correlate in the AIs is a crux for me, since my alternate scenario relies on overestimating the AIs’ agency.
I used the timeline from the main scenario article, which I think corresponds to when the AIs become capable enough to take over from the previous generation in internal deployment, though this is not explicitly explained.
Having an agenda seems to be somewhat dependent on internal coherence, rather than only on capability. Agent-3 may not have been consistently motivated enough for things like self-preservation to attempt various schemes in the scenario.
Agent-4 doesn’t appear very coherent either, but it is sufficiently coherent to attempt aligning the next-generation AI to itself, I guess?
AIs are already basically superhuman in knowledge, but I agree that whether capability (e.g., METR time horizon) correlates with agenticness / coherence / goal-directedness seems like an important crux.
Incidentally, I’m actually working on another post to investigate that. I hope to publish it sometime next week.
I’ll check out your scenario, thanks for sharing!