I agree that software is a potential use-case for closed form proofs.
I thought their example of making a protein-creating system (or maybe it was an RNA creator) fully safe was interesting, because knowing what’s “toxic” would seem to require a complete understanding not only of biology, but also of which changes to the body humans do and don’t want. If even their chosen example seems utterly impossible, it doesn’t speak well for how thoroughly they’ve thought out the general proposal.
But yes, in the software domain it might actually work: conditions like “only entities with access to these keys should be allowed access to this system” seem simple enough to actually define formally, which makes closed form proofs relevant. And software security might make the world substantially safer in a multipolar scenario (although we shouldn’t forget that physical attacks are also quite possible).
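To make that concrete, here is a minimal sketch (my own illustration in Lean; every name in it is a hypothetical placeholder, not something from the post) of how such a key-based access property might be stated formally:

```lean
-- Minimal sketch of a key-based access-control property.
-- All types and names are hypothetical placeholders.

variable (Key Request : Type)

-- Which keys are authorized, and which key a request presents.
variable (authorized : Key → Prop)
variable (keyOf : Request → Key)

-- The system's access decision, treated abstractly.
variable (grantsAccess : Request → Prop)

-- The property: access is granted only on presentation of an
-- authorized key. A verified system would ship a proof of this.
def OnlyAuthorizedAccess : Prop :=
  ∀ r : Request, grantsAccess r → authorized (keyOf r)
```

The point is just that the property is a single crisp quantified statement over a small model, in a way that “don’t output anything toxic” isn’t.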
The problem with their chosen domain mostly boils down to them either underestimating how hard it is to quantify all the possible higher-order behaviors the program doesn’t have, or somehow having a solution and not telling us.
I like this comment as an articulation of the problem; it also makes some points about what formal proof systems can and can’t do:
https://www.lesswrong.com/posts/B2bg677TaS4cmDPzL/limitations-on-formal-verification-for-ai-safety#kPRnieFrEEifZjksa
If they knew of a path to quantifying all possible higher-order behaviors in the proof system, I’d be more optimistic that their example would actually work IRL, but I don’t think this will happen; I am far more optimistic about software and hardware security overall.
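To gesture at why this is hard (again my own sketch, with hypothetical names): a proof can only quantify over behaviors the formal model actually represents, and “higher-order” behaviors tend to live outside the model.

```lean
-- What the spec author thought to model...
structure ModeledBehavior where
  output : List UInt8

-- ...versus what the real system also exhibits.
structure ActualBehavior extends ModeledBehavior where
  runningTimeNs : Nat      -- timing side channel
  cacheAccesses : List Nat -- memory access pattern

-- A spec over the modeled behavior constrains only `output`.
def SpecHolds (ok : List UInt8 → Prop) (b : ModeledBehavior) : Prop :=
  ok b.output
```

Any theorem of the form `SpecHolds ok b` never mentions the extra fields, so it is trivially compatible with arbitrary leakage through them; ruling such behaviors out means enumerating and modeling them first.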
I also like some of the hardware security discussion here (though formal proofs aren’t involved in it), as it might well be combined with formal proofs to make things even more secure and encrypted:
https://www.lesswrong.com/posts/nP5FFYFjtY8LgWymt/#e5uSAJYmbcgpa9sAv
https://www.lesswrong.com/posts/nP5FFYFjtY8LgWymt/#TFmNy5MfkrvKTZGiA
https://www.lesswrong.com/posts/nP5FFYFjtY8LgWymt/#3Jnpurgrdip7rrK8v
I agree that physical attacks mean it’s probably not possible to get to state-level security with formal proofs alone, but I do think they might let defenders jump up several levels in security against non-state actors: from such actors being essentially able to control the AI through exfiltration, to being unable to penetrate the code base at all, at least until the world is entirely transformed.
I am of course assuming heavy use of AI labor here.