Similarly to Leo, I think racing to AGI is bad and it would be good to coordinate not to do that. I support proposals for AI regulations that would make this easier. I signed various open letters to this effect, on AI red lines, an AI treaty, SB1047, and others.
I’m pretty uncertain whether pushing for an AI pause now is an effective way to achieve this, and I think it’s quite plausibly better to pause later rather than now. In the next few years we will have more solid evidence of misalignment, and we will be able to make better use of a pause period (which is likely to be finite), e.g. with automated alignment researchers. I don’t think calling for a pause/ban now is a costless action: early calls for a pause risk crying wolf and using up the political will that could be spent on a pause later. I signed the FLI pause letter in 2023, but looking back it seems a bit premature. A conditional pause in the future seems much easier to get adopted than a hard pause now.
I agree with everything Neel said in his top-level comment, and I’m puzzled by the number of disagreement votes on it.
I think that signing sufficiently clear open letters (or similar things) is enough to count as “taking a public stance”.
[Here is a first rough attempt at expressing an idea:] I don’t think pushing for an AI pause now is what most people have in mind (it definitely doesn’t match what I had in mind when writing the post, keeping in mind that the post isn’t about what kind of public stance it would be effective for experts, inside and outside the frontier labs, to take). Instead, what matters imo is to have a legible thing (e.g. an open letter/statement) that says clearly that you think coordination to stop the current race dynamic would be good if feasible (because the current race carries serious risks of extinction/disempowerment). Let me try to make the distinction between that and the pause letter clearer:
The CAIS statement “Mitigating the risk of extinction from AI should be a global priority alongside other societal-scale risks such as pandemics and nuclear war.” clearly establishes that extinction from AI is a thing many experts worry about. It was signed in 2023. It didn’t cry wolf. It is still useful today (probably more useful than when it came out). If you need to convince a policy maker or a member of the general public that extinction is not a fringe concern, there is a clear statement you can quote, a simple link you can share, and a list of names that speaks for itself. In practice, this info is evergreen, I think.
When a policy maker considers whether they should pay attention to or discuss a “pause”, they have their own concerns, and they are capable of worrying about whether this is “crying wolf” without our help. But maybe they wonder “would experts actually oppose me if I spoke about this?” or “do experts actually think some international coordination is required?” Especially if they have experience with international coordination, I think they will suspect it is very hard to pull off, so they might just conclude that pushing for alignment project X to get marginally more funding is a better use of their time.
> I signed the FLI pause letter in 2023, but looking back it seems a bit premature.
I basically agree. But expressing the fact that plan A would be (a lot) better than plan B (along with some clarity on why) is useful even if it doesn’t cause an immediate shift from plan B to plan A. There is a strong sense in which expressing “plan A >> plan B, because xrisks” is much more your job as an expert than timing a pause. In other words, I think experts should make future cooperation easier rather than try to unilaterally create a Schelling point.
If we fully ignore the internal pressures within frontier labs not to say such things, and focus purely on the efficacy of taking a public stance, I think there is a way to take a public stance without incurring the cost you mention above.
If in the future there is some serious discussion toward an international agreement of some kind, my guess is that it will have been made possible, or at least much easier, by individual people clearly and credibly expressing that they’d be on board with implementing such an agreement, and I think experts could be helpful here (by clearly expressing something like “plan A >> plan B, because xrisks”).