Again, one of my favorite comments in this post! I also appreciate you organising your thoughts to the best of your ability while acknowledging the unknowns.
“The point of this heavy-handed parable is that as long as the frontier labs are intent on building superintelligence, and as long as we allow them to play cute legal games, then our odds may be very bad indeed.” I obviously agree, painfully so. And I really appreciate the tangible measures you propose (caps on training runs, controls on data centres).
Regarding the “Mutually Assured Destruction” point: about two days ago, Dan Hendrycks released an article called “AI Deterrence is our Best Option”. Some of the arguments you raised (in connection to your national security experience) reminded me of his Mutual Assured AI Malfunction deterrence framework. I’m curious to know your thoughts!
I am definitely not an international cooperation policy expert[1]. In fact, these discussions and the deep thinking that has gone into them are giving me a renewed respect for international policy experts. @Harold also raised very good points about the different behaviours towards “ambiguity” (or what I perceive as ambiguous) in different legal systems, in his case speaking from his experience with the Japanese legal system.
In the future, I will focus on leveraging what I know of international private law (contract law in multinational deals) to rethink legal strategy using these insights.
This has been very helpful!
My legal background is in corporate law, and I have experience working at multinational tech companies and fintech startups (along with some human rights law experience), which is why I focus on predictable strategies that tech companies could adopt to “Goodhart” their compliance with AI laws or a prohibition on developing AGI.
Just on the point of MAIM, I would point out that one of the authors of that paper (Alexandr Wang) has seemingly jumped ship from the side of “stop superintelligence from being built”[1] to the side of “build superintelligence ASAP”, since he now heads up the somewhat unsubtly named “Meta Superintelligence Labs” as Chief AI Officer.
[1]: I mean, as the head of Scale AI (a company that produces AI training data), I’m not sure he was ever on the side of “stop superintelligence from being built”, but he did coauthor the paper apparently.
Also, Dan Hendrycks works at xAI and makes capability benchmarks.
Oh :/. Thank you for bringing this to my attention!
To be clear, I have never been an actual national security person! But once upon a time, my coworkers occasionally needed to speak with non-proliferation people. (It’s a long story.)
I don’t know if that specific deterrence regime is workable or not, but it’s a good direction to think about. One difference from nuclear weapons is that if you’re the only country with nukes, then you can sort of “win” a nuclear war (like happened at the end of WW2). But being the only country with an incomprehensible alien superintelligence is more like being the only country with a highly infectious airborne Ebola strain. Actually using it is Unilaterally Assured Destruction, to coin a term.
But let me make one last attempt to turn this intuition into a policy proposal.
Key decision makers need to realize in their gut that “building an incomprehensible alien superintelligence that’s really good at acting” is one of those things like “engineering highly infectious airborne Ebola” or “allowing ISIS to build megaton fusion weapons.” Unfortunately, the frontier labs are very persuasive right now, but the game isn’t over yet.
If the primary threat involves scaling, then we would need to control data centers containing more than a certain number of GPUs. Large numbers of GPUs would need to be treated like large numbers of gas centrifuges, basically. Or maybe specific future types of GPUs will need tighter controls.
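To make that bright-line idea concrete, here is a minimal, purely illustrative sketch of what such a rule looks like once written down. The function name and both thresholds are placeholders invented for illustration, not figures proposed anywhere in this thread:

```python
# Illustrative sketch only. The thresholds below are placeholder assumptions,
# not proposals from this discussion.

GPU_COUNT_REPORTING_THRESHOLD = 10_000   # hypothetical facility-size trigger
TRAINING_FLOP_CAP = 1e26                 # hypothetical per-run compute cap


def requires_oversight(gpu_count: int, planned_training_flop: float) -> bool:
    """Return True if a facility or training run would fall under the
    hypothetical control regime sketched above."""
    return (
        gpu_count >= GPU_COUNT_REPORTING_THRESHOLD
        or planned_training_flop >= TRAINING_FLOP_CAP
    )


if __name__ == "__main__":
    # Example: a 25,000-GPU cluster planning a 3e26 FLOP run gets flagged.
    print(requires_oversight(gpu_count=25_000, planned_training_flop=3e26))
```

The rule itself is mechanically trivial to state; as with centrifuges, all the difficulty is in verification and enforcement.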
If the primary or secondary threat involves improved algorithms, then we may need to treat that like nuclear non-proliferation, too. I know a physics major who once spoke to the aforementioned national security people about nuclear threats, and he asked, “But what about if you did XYZ?” The experts suddenly got quiet, and then they said, “No comment. Also, we would really appreciate it if you never mentioned that again to anybody.” There are things at the edges of the nuclear control regime that aren’t enforceably secret, but that we still don’t want to see posted all over the Internet. Some of the enforcement around this is apparently handled by unofficially explaining to smart people that they should pretty please Shut The Fuck Up. Or maybe certain algorithms need to be classified the way we classify the most important nuclear secrets.
It’s likely that we will need an international deterrence regime. There are people with deep expertise in this.
One key point is that nobody here actually knows how to build a superintelligence. What we do have is a lot of people (including world-class experts like Geoffrey Hinton) who strongly suspect that we’re close to figuring it out. And we have multiple frontier lab leaders who have stated that they plan to build “superintelligence” this decade. And we have a bunch of people who have noticed that (1) we don’t remotely understand how even current LLMs work, and (2) the AI labs’ plans for controlling a superintelligence are firmly in “underpants gnome” territory. And even their optimistic employees regularly admit they’re playing Russian roulette with the human race. You don’t need deep knowledge of machine learning to suspect that this is the setup for a bad Hollywood movie about hubris.
But key details of how to build superintelligence are still unknown, happily. So it’s hard to make specific policy prescriptions. Szilard and Einstein could correctly see that fission was dangerous, but they couldn’t propose detailed rules for controlling centrifuges.
I do suspect that corporate legal compliance expertise will play a key role here! And thank you for working on it. We’ll need your expertise. But legal compliance can’t be the only tool in our toolkit. If we tried to enforce nuclear nonproliferation the way we try to enforce the GDPR, we’d probably already be dead. You will need buy-in and back-up of the sort that upholds nuclear non-proliferation. And that’s going to require a major attitude change.
(But I will also hit up my lawyer friends and see if they have more concrete advice.)