It seems like I misunderstood your reading of Ray’s claim.
I read Ray as saying “a large fraction of the benefits of advanced AI are only in the biotech sector, and so we could get a large fraction of the benefits by pushing forward on only AI for biotech.”
It sounds like you’re pointing at a somewhat different axis, in response, saying “we won’t get anything close to the benefits of advanced AI agents with only narrow AI systems, because narrow AI systems are just much less helpful.”
(And implicitly, the biotech AIs are either narrow AIs (and therefore not very helpful), or they’re general AIs that are specialized on biotech, in which case you’re not getting the safety benefits you’re imagining getting by focusing only on biotech.)
Ah, I had also misinterpreted Ryan’s response here. “What actually is practical here?” makes sense as a question, and I’m not sure about the answers.
I think one of the MIRI angles here is variants of STEM AI, which might be more general, but whose training set is filtered to be only materials about bio plus some related science (avoiding, as much as possible, anything that’d point towards human psychology, geopolitics, programming, AI hardware, etc). So it both will have less propensity to take over, and will be less good at it relative to its power level at bio.
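To gesture at what I mean a bit more concretely, here’s a toy sketch of the kind of corpus filtering I’m imagining (my own illustration, not an actual MIRI pipeline; the topic categories and the keyword heuristic are made up for the example, and a real version would presumably use a trained classifier):

```python
# Toy sketch of "filter the pretraining corpus down to bio + related science".
# The topics and keywords below are hypothetical stand-ins for a real classifier.

ALLOWED_TOPICS = {"biology", "chemistry", "lab_methods"}
EXCLUDED_TOPICS = {"human_psychology", "geopolitics", "programming", "ai_hardware"}

TOPIC_KEYWORDS = {
    "biology": ["protein", "genome", "enzyme", "cell culture"],
    "chemistry": ["reagent", "synthesis", "titration"],
    "lab_methods": ["assay", "pcr", "chromatography"],
    "human_psychology": ["persuasion", "cognitive bias", "emotion"],
    "geopolitics": ["sanctions", "treaty", "military"],
    "programming": ["compiler", "source code", "python"],
    "ai_hardware": ["gpu cluster", "tpu", "datacenter"],
}

def classify_topics(document: str) -> set:
    """Crude keyword heuristic standing in for a real topic classifier."""
    text = document.lower()
    return {
        topic
        for topic, words in TOPIC_KEYWORDS.items()
        if any(word in text for word in words)
    }

def keep_document(document: str) -> bool:
    topics = classify_topics(document)
    # Drop anything touching an excluded topic; keep only documents that are
    # at least partly about the allowed science topics.
    return not (topics & EXCLUDED_TOPICS) and bool(topics & ALLOWED_TOPICS)

def filter_corpus(documents):
    return [doc for doc in documents if keep_document(doc)]
```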
I wasn’t thinking about this when I wrote the previous comment; I’d have phrased it differently if I were. I agree it’s an open question whether this works. But I feel more optimistic about a controlled-takeoff world that takes a step back from “LLMs are trained on the whole internet.”
Also, noting: I don’t believe in a safe, full handoff to artificial AI alignment researchers (because of gradual disempowerment reasons). But, fwiw, I think I’d feel pretty good about a STEM AI that’s focused on various flavors of math and conceptual reasoning and somehow avoids human psychology, hardware, and geopolitics, which you don’t do a full handoff to, but which is able to assist pretty substantially with larger subproblems that come up.