It’s not just about “being taken seriously”, although that’s a nice bonus—it’s also about building a shared understanding of what makes programs secure versus insecure. You need a method of touching grass so that researchers have some idea of whether or not they’re making progress on the real issues.
We already can’t make MNIST digit recognizers secure against adversarial attacks. We don’t know how to prevent prompt injection. Convnets are vulnerable to adversarial attacks. RL agents that play Go at superhuman levels are vulnerable to simple strategies that exploit gaps in their cognition.
No, there’s plenty of evidence that we can’t make ML systems robust.
What is lacking is “concrete” evidence that this fragility will result in blood and dead bodies.
None of those things are examples of misalignment except arguably prompt injection, which seems like it’s being solved by OpenAI with ordinary engineering.
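The adversarial-attack claim above is easy to demonstrate concretely. The sketch below is a minimal illustration of the fast-gradient-sign idea on a toy linear classifier (not a real MNIST model — the model, weights, and epsilon here are all made up for illustration): for a linear score the gradient with respect to the input is just the weight vector, so a tiny worst-case L∞ perturbation reliably flips the prediction.

```python
import numpy as np

# Toy linear "classifier": score = w @ x; positive score -> class 1, else class 0.
# Real attacks target trained deep networks, but the mechanism is the same.
rng = np.random.default_rng(0)
w = rng.normal(size=100)

def predict(x):
    return int(w @ x > 0)

# Construct an input the model confidently classifies as class 1.
x = 0.01 * rng.normal(size=100)          # small background noise
x += 0.1 * w / np.linalg.norm(w)         # nudge toward class 1

# FGSM-style perturbation: for a linear model, grad_x(score) = w,
# so stepping against sign(w) is the worst-case per-coordinate attack.
eps = 0.05
x_adv = x - eps * np.sign(w)

print(predict(x), predict(x_adv))        # clean vs. adversarial prediction
```

The perturbation changes each coordinate by at most 0.05, yet it shifts the score by roughly `eps * sum(|w|)`, which dwarfs the clean margin — the same asymmetry that makes image classifiers exploitable by perturbations invisible to humans.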