Lukas Finnveden comments on Safety researchers should take a public stance

Lukas Finnveden 21 Sep 2025 2:24 UTC
5 points
3
As such, I disagree with the various actions you recommend lab employees to take, and do not intend to take them myself.
It’s not clear that you disagree that much? You say you agree with leo’s statement, which seems to be getting lots of upvotes and “thanks” emojis suggesting that people are going “yes, this is great and what we asked for”.
I’m not sure what other actions there are to disagree with. There’s “advocate internally to ensure that the lab lets its employees speak out publicly, as mentioned above, without any official retaliation” — but I don’t really expect any official retaliation for statements like these so I don’t expect this to be a big fight where it’s costly to take a position.
- Neel Nanda 21 Sep 2025 7:54 UTC
  6 points
  6
  Parent
  To me, Leo’s statement is much weaker than what the post is asking people to say—it’s saying “conditional on us not knowing how to make ASI without killing everyone, it would be nice if we could coordinate on not racing to do it”—as a literal statement this seems obviously reasonable (eg someone who thinks we know how to make ASI safely, or will easily figure it out in time, could also agree with this, and someone who strongly opposes any kind of governmental AGI/ASI ban could agree with it, or even someone who thinks that in reality labs should do nothing but race; though I know Leo’s actual views are stronger than this)
  
  To me this is not advocating for an AGI ban, as an actual practical political request, it’s just saying “in theory, if we could coordinate, it would be nice”. The post is saying things I consider much stronger like:
  
  out publicly[2] against the current AI R&D regime and in favor of an AGI ban
  
  prefer an AGI ban[1] over the current path
  
  I am kinda confused by Leo’s comment being so highly upvoted, if it was genuinely all the authors of this post wanted then I suggest they write a different post, since I found the current one far more combative.
  - habryka 21 Sep 2025 7:58 UTC
    4 points
    −2
    Parent
    Leo is saying:
    I’ve been repeatedly loud and explicit about this but an happy to state again that racing to build superintelligence before we know how to make it not kill everyone (or cause other catastrophic outcomes) seems really bad and I wish we could coordinate to not do that.
    The “before” here IMO pretty clearly (especially in the context of the post) communicates “we do not know currently how to do this, we should currently not be racing, and I support efforts to stop racing at substantial cost”. Maybe I am wrong in that interpretation, if so I do think it’s indeed so weak as to not really mean anything.
    - Neel Nanda 21 Sep 2025 8:20 UTC
      2 points
      0
      Parent
      
      we do not know currently how to do this
      
      Agreed this is implied
      
      we should currently not be racing
      
      I believe Leo believes this, and it’s somewhat implied by the statement, though imo that statement is also consistent with eg “there’s a 50% chance that by the time we make AGI we’ll have figured how to align ASI, therefore it’s fine to continue to that point and then we can stop if need be or continue”
      
      I wish we could coordinate to not do that.
      
      I support efforts to stop racing at substantial cost
      
      To me the latter doesn’t seem the important claim, “should we take significant cost” doesn’t feel like a crux, rather “would this work at all” feels like my crux, and “I wish” reads to me like it’s side stepping that. This was the key bit of what Leo said that I consider much softer than what the post asks for.
      
      I’m also not sure the former implies the latter, but that gets messier—eg if you think that we will figure out how to align ASI after 6-12 months of pause at the right time, it doesn’t seem very costly to push for that. While if you think it will take at least 20 years and might totally fail, maybe it does. I consider the statement to be agnostic on this.
      
      To be clear, I know Leo’s actual beliefs are on the doomy end, but I’m trying to focus on what I think the statement actually says, and what I meant when I called it obviously reasonable.
      
      Other positions I consider consistent with the statement:
      
      It would be ridiculous to pause now, instead we should wait until we have good automated safety researchers under good control schemes, then pause for 6-12 months to solve alignment, then continue
      There’s absolutely no way to coordinate with China, there’s a 90% chance ASI is fine by default, and China getting to ASI first would be catastrophic, so the US should race to beat China and roll the dice
      Pushing for a ban is a terrible idea, because it will lead to a hardware overhang and could easily end too early for us to solve the problem, resulting in us having less time to practice on proto-AGI systems
  - Ishual 21 Sep 2025 10:14 UTC
    3 points
    0
    Parent
    if it was genuinely all the authors of this post wanted then I suggest they write a different post
    Leo’s statement is quite good without being all we wanted. (indeed, of the 3 things we wanted, 1 is about how we think it makes sense for others to relate to safety researcher based on what they say/[don’t say] publicly. and 1 is about trying to shift the lab’s behavior toward it being legibly safe for employees to say various things, which Leo’s comment is not about.) I internally track a pretty crucial difference between what I want to happen in the world (ie that we shift from plan B to plan A somehow) and how I believe people ought to relate to the public stance/[lack thereof] of safety researchers within frontier labs. I think there are maybe stronger stances Leo could have taken, and weaker ones, and I endorse having the way I relate/model/[act towards] Leo depend on which he takes. I think the public stance that would max lead to me maximally relating well to a safety researcher ought to be something like “I think coordinating to stop the race (even if in the form of some ban which I won’t choose the exact details of) would be better than the current race to ever more capable AI. I would support such coordination. I am currently trying to make the situation better in case there is no such coordination, but I don’t think the current situation is sufficiently promising to justify not coordinating. Also there is a real threat of humanity’s extinction if we don’t coordinate.” (or something to that effect)