Thanks! You seem really confident that enough of the alignment problem will be solved in time and that governments will be generally reasonable. I’d love to hear more elaboration on those points; this seems like the biggest disagreement between us.
Based on recent conversations with policymakers, labs, and journalists, I see increasing coordination around societal evaluation and risk mitigation; a (cyber)security mindset is now mainstream.
Also, imminent society-scale harm (e.g., the contextual-integrity harms caused by over-reliance and precision persuasion over roughly the past decade) has proven effective at getting governments to consider risk reasonably.
I definitely agree that policymakers, labs, and journalists seem to be “waking up” to AGI risk recently. However, the wakeup is not binary, and there’s still a lot of additional wakeup that needs to happen before people behave responsibly enough to keep the risk below, say, 10%. And my timelines are short enough that I don’t currently expect that to happen in time.
What about the technical alignment problem crux?
Based on my personal experience in pandemic resilience, additional wakeups can proceed swiftly once a specific society-scale harm materializes.
Specifically, waking up to over-reliance harms and addressing them (especially within security OODA loops) would buy time for good-enough continuous alignment.