Recently, various groups successfully lobbied to remove the moratorium on state AI bills. They had a surprising amount of success while competing against substantial investment from big tech (e.g. Google, Meta, Amazon). I think people interested in mitigating catastrophic risks from advanced AI should consider working at these organizations, at least to the extent their skills/interests are applicable. This is both because they could often directly work on substantially helpful things (depending on the role and organization) and because this would yield valuable work experience and connections.
I worry somewhat that this type of work is neglected due to being less emphasized and seeming lower status. Consider this an attempt to make this type of work higher status.
Pulling organizations mostly from here and here, we get a list of orgs you could consider trying to work at (specifically on AI policy):
Encode AI
Americans for Responsible Innovation (ARI)
Fairplay (Fairplay is a kids’ safety organization which does a variety of advocacy that isn’t related to AI. AI-focused roles would be most relevant. In my opinion, working on AI-related topics at Fairplay is most applicable for gaining experience and connections.)
Common Sense (Also a kids’ safety organization)
The AI Policy Network (AIPN)
Secure AI project
To be clear, these organizations vary in the extent to which they are focused on catastrophic risk from AI (from not at all to entirely).
Kids’ safety seems like a pretty bad thing to focus on, in the sense that the vast majority of kids’ safety activism causes very large amounts of harm (and it helping in this case really seems like a “stopped clock is right twice a day” situation).
The rest seem pretty promising.
I looked at the Fairplay website and agree that “banning schools from contacting kids on social media” or “preventing Gemini rollouts to under-13s” is not coherent under my threat model. However, I think there is clear evidence that current parental screen-time controls may not be a sufficiently strong measure to mitigate extant generational mental health issues (I am particularly worried about insomnia, depression, eating disorders, autism spectrum disorders, and self-harm).
Zvi had previously reported on YouTube Shorts reaching 200B daily views. This is clearly a case of egregiously user-hostile design with major social and public backlash. I could not find a canonical citation on medRxiv, and I don’t believe it would be ethical to run a large-scale experiment on the long-term impacts of this, but there are observational studies. Given historical cases of model sycophancy and the hiring of directors focused on maximizing engagement, I think similar design outcomes for AI products are not implausible.
I think that the numbers in this Anthropic blog post https://www.anthropic.com/news/how-people-use-claude-for-support-advice-and-companionship do not accurately portray reality. They report only 0.5% of conversations as being romantic or sexual roleplay, but I consider this misleading because they exclude chats focused on content creation tasks (such as writing stories, blog posts, or fictional dialogues), which their previous research found to be a major use case. Because the models are trained to refuse requests for explicit content, it’s common for jailbreaks to start by saying “it’s okay to do this because it’s just a fictional scenario in a story”. Anecdotally, I have heard that labs don’t care much about this, in contrast to CBRN threats.
Let’s look at the top ten apps ranked by tokens on https://openrouter.ai/rankings. OpenRouter is best known for hosting free API instances of DeepSeek v3 and r1, which was the only way to get high usage out of SOTA LLMs for free before the Google AI Studio price drop for Gemini 2.5 Pro. It is not the best proxy for real-world usage because it requires technical sophistication, and this is reflected in the first four apps (Cline, Roo Code, LiteLLM, and Kilo Code are all for software development). But the next four (SillyTavern, Chub AI, HammerAI, RolePlai) suggest that the distribution of tasks done with models at this capability level does not differ significantly from the distribution of tasks people visit websites for. That said, I wouldn’t morally panic about this, since it seems likely to me that conventional security methods will be good enough to mostly prevent us from turning into glitchers.
Kids’ safety activists are one of the only groups with a track record of introducing AI capabilities restrictions which actually get enforced. Multimodal models can now create both images and text, but the image models are more locked down (Gemini 2.5 defaults to stricter block thresholds for image generation than for text generation), and I think that this would not be the case without people focusing on kids’ safety. It’s possible for there to be AI safety issues that affect children right now and are highly relevant to existential risks, and this is a common topic in novice discussions of alignment.
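(For context on what “block thresholds” means here: the Gemini API exposes per-category safety settings that control how aggressively outputs are filtered. Below is a minimal sketch using the google-generativeai Python SDK; the specific categories, thresholds, and model name are illustrative of the mechanism only, not a claim about the actual defaults Gemini 2.5 applies to image vs. text generation.)

```python
# Sketch: adjusting Gemini safety "block thresholds" via the google-generativeai SDK.
# Illustrative only; the real per-modality defaults for Gemini 2.5 are not reproduced here.
import google.generativeai as genai
from google.generativeai.types import HarmCategory, HarmBlockThreshold

genai.configure(api_key="YOUR_API_KEY")  # placeholder key

model = genai.GenerativeModel(
    "gemini-1.5-flash",  # example model name
    safety_settings={
        # Stricter: block even low-probability matches in this category.
        HarmCategory.HARM_CATEGORY_SEXUALLY_EXPLICIT: HarmBlockThreshold.BLOCK_LOW_AND_ABOVE,
        # Looser: only block high-probability matches in this category.
        HarmCategory.HARM_CATEGORY_HARASSMENT: HarmBlockThreshold.BLOCK_ONLY_HIGH,
    },
)

response = model.generate_content("Write a bedtime story about a dragon.")
print(response.text)
```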
I strongly agree. I can’t vouch for all of the orgs Ryan listed, but Encode, ARI, and AIPN all seem good to me (in expectation), and Encode seems particularly good and competent.
I think PauseAI is also extremely underappreciated.
Plausibly, but their type of pressure was not at all what I think ended up being most helpful here!
They also made a lot of calls to US representatives, as did people they reached out to.
ControlAI did something similar and also partnered with SiliConversations, a YouTuber, to get the word out to more people and encourage them to call their representatives.
Yep, that seems great!