Author of the Forecasting AI Futures blog. Constantly thinking about the potential and risks of future AI systems.
Alvin Ånestrand
I interpreted the scenario differently; I do not think it predicts professional bioweapons researchers by January 2026. Did you mean January 2027?
I think Agent-3 basically is a superhuman bioweapons researcher.
There may be a range of different rogue AIs, where some may be finetuned / misaligned enough to want to support terrorists.
I think Agent-1 (arrives in late 2025) would be able to help capable terrorist groups acquire and deploy bioweapons using pathogens for which acquisition methods are widely known, so it doesn’t involve novel research.
Agent-2 (arrives in January 2026) appears more than competent enough to assist experts in novel bioweapons research, but not smart enough to enable novices to do it.
Agent-3 (arrives in March 2027) could probably enable novices to design novel bioweapons.
However, terrorists may not get access to these specific models. Open-weights models are lagging behind the best closed models (released through APIs and apps) by a few months. Even the most reckless AI companies would probably hesitate to let anyone get access to Agent-3-level model weights (I hope). Consequently, rogues will likely also be a few months behind the frontier in capabilities.
While terrorists may face alignment issues with their AIs, even Agent-3 doesn’t appear smart/agentic enough to subvert efforts to shape it in various ways. Terrorists may use widely available AIs, perhaps with some minor finetuning, instead of enlisting help from rogues, if that proves to be easier.
AI and Biological Risk: Forecasting Key Capability Thresholds
Thank you for sharing your thoughts! My responses:
(1) I believe most historical advocacy movements have required more time than we might have for AI safety. More comprehensive plans might speed things up. It might be valuable to examine what methods have worked for fast success in the past.
(2) Absolutely.
(3) Yeah, raising awareness seems like it might be a key part of most good plans.
(4) All paths leading to victory would be great, but I think even plans that would most likely fail are still valuable. They illuminate options and tie ultimate goals to concrete action. I find it very unlikely that failing plans are worse than no plans. Perhaps high standards for comprehensive plans might have contributed to the current shortage of plans. “Plans are worthless, but planning is everything.” Naturally I will aim for all-paths-lead-to-victory plans, but I won’t be shy in putting ideas out there that don’t live up to that standard.
(5) I don’t currently have much influence, so the risk would be sacrificing inclusion in future conversations. I think it’s worth the risk.
I would consider it a huge success if the ideas were filtered through other orgs, even if they just help make incremental progress. In general, I think the AI safety community might benefit from having comprehensive plans to discuss and critique and iterate on over time. It would be great if I could inspire more people to try.
Indeed!
But the tournaments only provide head-to-head scores for direct comparison with top human forecasting performance. ForecastBench has clear human baselines.
It would be helpful if the Metaculus tournament leaderboards also reported Brier scores, even if they would not be directly comparable to human scores since the humans make predictions on fewer questions.
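To illustrate what I mean, here’s a minimal sketch (in Python, with entirely made-up forecasts and labels) of how Brier scores would be computed on binary questions, and why scores over different question sets aren’t directly comparable:

```python
# Minimal sketch: Brier scores for binary questions.
# All forecasts below are hypothetical; a fair human-vs-AI comparison
# would restrict scoring to the questions both sides actually answered.

def brier_score(forecasts):
    """Mean squared error between predicted probabilities and outcomes (0 or 1)."""
    return sum((p - outcome) ** 2 for p, outcome in forecasts) / len(forecasts)

# (probability assigned to "yes", resolved outcome) pairs -- purely illustrative
ai_forecasts = [(0.9, 1), (0.3, 0), (0.6, 1), (0.2, 0)]
human_forecasts = [(0.8, 1), (0.4, 0)]  # humans answered fewer questions

print("AI Brier score:   ", round(brier_score(ai_forecasts), 3))
print("Human Brier score:", round(brier_score(human_forecasts), 3))
```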
The ultimate goal
Forecasting AI Forecasting
Domain: Forecasting
Link: Forecasting AI Futures Resource Hub
Author(s): Alvin Ånestrand (self)
Type: directory
Why: A collection of information and resources for forecasting about AI, including a predictions database, related blogs and organizations, AI scenarios and interesting analyses, and various other resources.
It’s kind of a complement to https://www.predictionmarketmap.com/ for forecasting specifically about AI
predictionmarketmap.com works, but is not the link used in the post.
Powerful Predictions
I kind of started out thinking the effects would be larger, but Agent-2-based rogue AIs (~human level at many tasks) are too large for the rogue population to grow beyond a few million instances at most.
Sure, some rogues may focus on building power bases of humans; it would be interesting to explore that further. The AI rights movement is kind of like that.
AI 2027 - Rogue Replication Timeline
What if Agent-4 breaks out?
Anticipating AI: Keeping Up With What We Build
Emergent Misalignment and Emergent Alignment
Happy Amazing Breakthrough Day!
Forecasting AI Futures Resource Hub
Musings on Scenario Forecasting and AI
Forecasting Uncontrolled Spread of AI
Good observation. The only questions that don’t explicitly exclude it in the resolution criteria are “Will there be a massive catastrophe caused by AI before 2030?” and “Will an AI related disaster kill a million people or cause $1T of damage before 2070?”, but I think the question creators mean a catastrophic event that is more directly caused by the AI, rather than just a reaction to AI being released.
Manifold questions are sometimes somewhat subjective in nature, which is a bit problematic.
I used the timeline from the main scenario article, which I think corresponds to when the AIs become capable enough to take over from the previous generation in internal deployment, though this is not explicitly explained.
Having an agenda seems to be somewhat dependent on internal coherence, rather than only capability. Agent-3 may not have been consistently motivated enough by things like self-preservation to attempt various schemes in the scenario.
Agent-4 doesn’t appear very coherent either but is sufficiently coherent to attempt aligning the next generation AI to itself, I guess?
AIs are already basically superhuman in knowledge, but I agree that whether capability (e.g. METR time horizon) correlates with agenticness / coherence / goal-directedness seems like an important crux.
Incidentally, I’m actually working on another post to investigate that. I hope to publish it sometime next week.
I’ll check out your scenario, thanks for sharing!