What AI safety plans are there?
I’m trying to make a list of every written plan for how to make AGI go well. I am interested in hearing about any plans that I missed, or any lists of plans that have already been compiled.
Here are the plans I’ve found so far, in no particular order:
Google https://deepmind.google/discover/blog/taking-a-responsible-path-to-agi/
OpenAI https://openai.com/index/updating-our-preparedness-framework/
Anthropic https://www.anthropic.com/news/announcing-our-updated-responsible-scaling-policy
https://www.lesswrong.com/posts/8vgi3fBWPFDLBBcAx/planning-for-extreme-ai-risks
https://www.lesswrong.com/posts/bb5Tnjdrptu89rcyY/what-s-the-short-timeline-plan
MIRI https://time.com/6266923/ai-eliezer-yudkowsky-open-letter-not-enough/ or https://intelligence.org/2024/05/29/miri-2024-communications-strategy/ (note: these both touch on MIRI’s plan, but neither is a complete breakdown of a plan)
https://www.lesswrong.com/posts/vZzg8NS7wBtqcwhoJ/nearcast-based-deployment-problem-analysis
https://drive.google.com/file/d/1LbQgcz_aehcnhFMupCtXqoR-BV5dX5Rv/view
How much detail something needs in order to qualify as “a plan” is somewhat arbitrary. I would define a plan as “if we do these things and they all work, then we will be okay.” By this definition, Google is the only company with an actual plan (as far as I know), but I included the closest thing I could find from OpenAI and Anthropic.
I think this is mostly composed of partial technical plans, but maybe you’ll find it useful: https://ai-plans.com/.
Also, I really like these high-level scenarios and plans described by the MIRI technical governance team: https://www.lesswrong.com/posts/WkCfvqyjCzvRrwkaQ/ai-governance-to-avoid-extinction-the-strategic-landscape.
You might find this document I created to be interesting: Proposed Alignment Solutions.
The only other one I saw was from Microsoft. One other to watch would be Amazon (no explicit plan yet), and I guess Meta, but is “open source” a plan?