I actually enjoyed reading this comment a lot, thank you! Particularly well argued!
I’d agree with a lot of this, but:
“I think the actual way that we all survive is because the leadership of the US and China feels in their gut that building poorly-understood alien superintelligences is likely to kill both them and their children. Then they’ll make the message clear: “Here’s a long list of things you shouldn’t do unless you want to wind up on the wrong end of a joint military strike. And if you discover a new way to build an alien superintelligence that we didn’t put on this list? Yeah, that’s forbidden, too.”
As much as I understand the underlying concern, I think this is actually how it fails.
In most conversations I have with activists, I always circle back to this point: policymakers do not have a lot of leeway when Big Tech’s interests are at play. I understand this varies from jurisdiction to jurisdiction. As an example, I know a few people who participated in the working group for the GPAI Code of Practice in Europe. I kept hearing how the drafts were constantly watered down following the feedback of the future signatories (the Big Labs). Personal frustrations of these experts included a feeling of helplessness: they either conceded, or risked delays or the Code not being signed at all. And, mind you: this is a voluntary framework! Can you imagine how it is in the case of absolute, binding regulation?
Yes, getting the support of government leaders is important. But when it comes to the wording of the ban, what’s written down and signed is what ends up on the desk of OpenAI’s (and others’) in-house counsel with instructions to find a way around it. Which is kind of the main point of the post: if AIS orgs don’t prioritise legal strategy when putting such proposals forward, they’ll only make it easier for people like me (working for the other side) to argue their way out of it. I don’t want that to happen.
Thank you for your thoughtful and excellent response!
In most conversations I have with activists, I always circle back to this point: policymakers do not have a lot of leeway when Big Tech’s interests are at play.
If this remains true if and when we approach superintelligence, then there’s an excellent chance we all die.
Yes, getting the support of government leaders is important. But when it comes to the wording of the ban, what’s written down and signed is what ends up on the desk of OpenAI’s (and others’) in-house counsel with instructions to find a way around it.
Again, I fear that this is a future in which we all die. (To be fair, I think there are a lot of futures in which we all die, if we figure out how to build an incomprehensible alien superintelligence.)
Since Yudkowsky and Soares had such fun with parables, let me try one.
FrontierBioCo. You’re a government regulator of the pharma industry, and FrontierBioCo has announced that they’re researching cures for infectious diseases. As part of this, they’re doing gain-of-function research. They’ve announced two targets:
A version of MERS-CoV that increases the existing 30% case fatality rate, while achieving a slightly higher R value than the SARS-CoV-2 Omicron strain in human-to-human transmission, making it one of the most contagious diseases known.
A version of Ebola which can be transmitted via airborne droplets, with extremely high R values.
And when you look closer, you see that they brag about their BSL-4 containment labs, but that their head of biosafety just quit, muttering, “BSL-4 my ass, it isn’t even BSL-2.” And it’s quite clear that FrontierBioCo’s management and legal team have every intention of exploiting regulatory loopholes. And they brag about “moving fast and breaking things”. Including laws and most likely sample vials.
Let’s say you don’t want to die from airborne Ebola or the new MERS-CoV variant codenamed “Armageddon”. What’s your regulatory strategy here?
My regulatory strategy would start by calling in biologists and military advisors. And I’d ask questions like, “What is the minimum level of response that we are certain guarantees containment of FrontierBioCo’s labs?” And when FrontierBioCo’s lawyers later insist that they were technically complying with the law, my response involves disbarment. And trying really hard to find a way to sentence everyone involved to prison. Because I believe that sometimes you need to actually listen to Ellen Ripley if you want to live.
The point of this heavy-handed parable is that as long as the frontier labs are intent on building superintelligence, and as long as we allow them to play cute legal games, then our odds may be very bad indeed.
Which brings me to my next questions...
What does a (mostly) successful technology ban look like? The closest real-world analogy here is nuclear non-proliferation. We haven’t actually stopped the technology from spreading entirely, but we’ve limited the spread to mid-tier nation states, and we’ve successfully avoided anyone using nuclear weapons. There seem to have been a couple of key points:
Mutually Assured Destruction (MAD) actually does seem to work in the medium term, even with regional powers. India and Pakistan haven’t nuked each other yet. So I guess this is a point for the terrifying Cold War planners? (“The whole point of a Doomsday machine is lost if you keep it a secret!”)
Key technology bottlenecks. I’ve worked in national-security-adjacent fields, and one of the scarier claims we heard from the actual national security types was that building basic nuclear weapons mostly isn’t that hard. In particular, many top university engineering departments could apparently do it if they really tried. But there’s one important catch: It’s surprisingly hard for a rogue organization to lay hands on fissile materials without getting noticed by the superpowers.
Note that the fissile material ban is frequently enforced either by crippling economic sanctions or by military strikes on enrichment facilities. This isn’t usually a question settled by who has the most clever lawyers, or by the regulatory fine print.
What do we need to do to prevent someone building an incomprehensible alien superintelligence? Honestly, the scary part is I don’t know. I can see two major routes by which we might get there:
1. Scaling laws. Perhaps building superintelligence will require scaling up to 10x or 1000x our current training runs. In this case, if you control the data centers, you can prevent the training runs. This is actually closely analogous to controlling nuclear weapons.
2. Algorithmic improvements. If we’re unlucky, then perhaps superintelligence works more like “reasoning models”. Reasoning models represented a sharp increase in mathematical and coding skills. And it took about 4 months from OpenAI’s announcement of o1-preview until random university researchers could add reasoning support to an existing model for under $5,000. If building a superintelligence is this easy, then we live in what Bostrom called a “vulnerable world” (PDF). This is the world where you can hypothetically build fusion bombs using pool cleaning chemicals. Again, this is a possible world in which I suspect we all die.
If we live in world (1), then my ideal advice is to immediately and permanently cap the size of training runs. (This may fail, for the reasons you point out. But we should try.) If we live in world (2), then I don’t have any advice beyond “Hug your kids.”
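To make “cap the size of training runs” slightly less hand-wavy, here’s the back-of-the-envelope arithmetic such a cap would be checking. The FLOPs ≈ 6 × parameters × tokens rule of thumb for dense pre-training is standard, but every specific number below (parameter count, token count, and the cap itself) is an illustrative assumption on my part, not a proposal:

```python
# Illustrative sketch only: is a proposed training run over a hypothetical compute cap?
# The 6 * params * tokens estimate is a common rule of thumb for dense transformer
# pre-training; the cap value and the example model are made-up numbers.

HYPOTHETICAL_CAP_FLOP = 1e26  # roughly the order of magnitude of existing reporting thresholds


def training_flops(params: float, tokens: float) -> float:
    """Rough pre-training compute estimate: ~6 FLOPs per parameter per token."""
    return 6 * params * tokens


def exceeds_cap(params: float, tokens: float, cap: float = HYPOTHETICAL_CAP_FLOP) -> bool:
    """True if the estimated training compute is over the (hypothetical) cap."""
    return training_flops(params, tokens) > cap


if __name__ == "__main__":
    # Made-up frontier run: ~1e12 parameters trained on ~2e13 tokens.
    flops = training_flops(params=1e12, tokens=2e13)
    print(f"Estimated training compute: {flops:.1e} FLOP")      # ~1.2e26 FLOP
    print(f"Exceeds hypothetical cap?  {exceeds_cap(1e12, 2e13)}")  # True
```

The only point of the toy example is that the quantity such a cap would regulate can be estimated before a run starts, which is what makes ex-ante enforcement even conceivable.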
Again, one of my favorite comments in this post! I also appreciate you organising your thoughts to the best of your capacity while acknowledging the unknowns.
“The point of this heavy-handed parable is that as long as the frontier labs are intent on building superintelligence, and as long as we allow them to play cute legal games, then our odds may be very bad indeed.” I obviously agree, painfully so. And I really appreciate the tangible measures you propose (caps on training runs, controls of data centres).
Regarding the “Mutually Assured Destruction” point: About two days ago, Dan Hendrycks released an article called “AI Deterrence is our Best Option”. Some of the arguments you raised (in connection to your national security experience) reminded me of his Mutual Assured AI Malfunction (MAIM) deterrence framework. I’m curious to know your thoughts!
I am definitely not an international cooperation policy expert[1]. In fact, these discussions and the deep thinking that has gone into them are giving me a renewed respect for international policy experts. @Harold also raised very good points about the different behaviours towards “ambiguity” (or what I perceive as ambiguous) in different legal systems. In his case, he was speaking from his experience with the Japanese legal system.
In the future, I will focus on leveraging what I know of international private law (contract law in multinational deals) to re-think legal strategy using these insights.
This has been very helpful!
[1]: My legal background is in corporate law, and I have experience working at multinational tech companies and fintech startups (also, some human rights law experience), which is why I focus on predictable strategies that tech companies could adopt to “Goodhart” their compliance with AI laws or a prohibition on developing AGI.
Just on the point of MAIM, I would point out that one of the authors of that paper (Alexandr Wang) has seemingly jumped ship from the side of “stop superintelligence being built” [1] to the side of “build superintelligence ASAP”, since he now heads up the somewhat unsubtly named “Meta Superintelligence Labs” as Chief AI officer.
[1]: I mean, as the head of Scale AI (a company that produces AI training data), I’m not sure he was ever on the side of “stop superintelligence from being built”, but he did coauthor the paper apparently.
Also, Dan Hendrycks works at xAI and makes capability benchmarks.
Oh :/. Thank you for bringing this to my attention!
To be clear, I have never been an actual national security person! But once upon a time, my coworkers occasionally needed to speak with non-proliferation people. (It’s a long story.)
Some of the arguments you raised (in connection to your national security experience) reminded me of his Mutual Assured AI Malfunction (MAIM) deterrence framework. I’m curious to know your thoughts!
I don’t know if that specific deterrence regime is workable or not, but it’s a good direction to think about. One difference from nuclear weapons is that if you’re the only country with nukes, then you can sort of “win” a nuclear war (as happened at the end of WW2). But being the only country with an incomprehensible alien superintelligence is more like being the only country with a highly infectious airborne Ebola strain. Actually using it is Unilaterally Assured Destruction, to coin a term.
But let me make one last attempt to turn this intuition into a policy proposal.
Key decision makers need to realize in their gut that “building an incomprehensible alien superintelligence that’s really good at acting” is one of those things like “engineering highly infectious airborne Ebola” or “allowing ISIS to build megaton fusion weapons.” Unfortunately, the frontier labs are very persuasive right now, but the game isn’t over yet.
If the primary threat involves scaling, then we would need to control data centers containing more than a certain number of GPUs. Large numbers of GPUs would need to be treated like large numbers of gas centrifuges, basically. Or maybe specific future types of GPUs will need tighter controls. (A rough sketch of the chip-count arithmetic follows a couple of paragraphs below.)
If the primary or secondary threat involves improved algorithms, then we may need to treat that like nuclear non-proliferation, too. I know a physics major who once spoke to the aforementioned national security people about nuclear threats, and he asked, “But what about if you did XYZ?” The experts suddenly got quiet, and then they said, “No comment. Also, we would really appreciate it if you never mentioned that again to anybody.” There are things at the edges of the nuclear control regime that aren’t enforceably secret, but that we still don’t want to see posted all over the Internet. Some of the enforcement around this is apparently handled by unofficially explaining to smart people that they should pretty please Shut The Fuck Up. Or maybe certain algorithms need to be classified the way we classify the most important nuclear secrets.
It’s likely that we will need an international deterrence regime. There are people with deep expertise in this.
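On the “certain number of GPUs” point above, the mapping from a chip-count threshold to a compute threshold is just multiplication: chips × per-chip throughput × utilization × time. Here’s a minimal sketch; the throughput and utilization figures are order-of-magnitude assumptions on my part, not vendor-verified numbers:

```python
# Illustrative sketch only: how long would a cluster of a given size need to
# accumulate a given amount of training compute? Per-chip throughput and
# utilization are rough order-of-magnitude assumptions.

SECONDS_PER_DAY = 86_400


def days_to_reach(total_flop: float,
                  num_chips: int,
                  peak_flop_per_s: float = 1e15,  # assumed per-accelerator throughput
                  utilization: float = 0.4) -> float:
    """Days for a cluster to accumulate `total_flop` of training compute."""
    cluster_flop_per_s = num_chips * peak_flop_per_s * utilization
    return total_flop / (cluster_flop_per_s * SECONDS_PER_DAY)


if __name__ == "__main__":
    # Time for clusters of different sizes to reach 1e26 FLOP (a hypothetical cap).
    for n in (1_000, 10_000, 100_000):
        print(f"{n:>7,} chips: ~{days_to_reach(1e26, n):,.0f} days")
```

A chip-count trigger and a compute cap are two handles on the same underlying quantity; the hard part is monitoring and enforcement, not the arithmetic.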
One key point is that nobody here actually knows how to build a superintelligence. What we do have is a lot of people (including world-class experts like Geoffrey Hinton) who strongly suspect that we’re close to figuring it out. And we have multiple frontier lab leaders who have stated that they plan to build “superintelligence” this decade. And we have a bunch of people who have noticed that (1) we don’t remotely understand how even current LLMs work, and (2) the AI labs’ plans for controlling a superintelligence are firmly in “underpants gnome” territory. And even their optimistic employees regularly admit they’re playing Russian roulette with the human race. You don’t need deep knowledge of machine learning to suspect that this is the setup for a bad Hollywood movie about hubris.
But key details of how to build superintelligence are still unknown, happily. So it’s hard to make specific policy prescriptions. Szilard and Einstein could correctly see that fission was dangerous, but they couldn’t propose detailed rules for controlling centrifuges.
I do suspect that corporate legal compliance expertise will play a key role here! And thank you for working on it. We’ll need your expertise. But legal compliance can’t be the only tool in our toolkit. If we tried to enforce nuclear non-proliferation the way we try to enforce the GDPR, we’d probably already be dead. You will need buy-in and backing of the sort that upholds nuclear non-proliferation. And that’s going to require a major attitude change.
(But I will also hit up my lawyer friends and see if they have more concrete advice.)