On a first reading of this summary, I take it as a small positive update on the quality of near-term AI Safety thinking at DeepMind. It treads familiar ground and doesn’t raise any major red flags for me.
The two things I would most like to know are:
1. Will DeepMind commit to halting frontier AGI research if they cannot provide a robust safety case for systems that are estimated to have a non-trivial chance of significantly harming humanity?
2. Does the safety team have veto power over the development or deployment of systems they deem to be unsafe?
A “yes” to both would be a very positive surprise to me, and would lead me to additionally ask DeepMind to publicly support a global treaty that implements these bare-minimum policies in a way that can be verified and enforced.
A “no” to either would mean this work falls under milling behavior, and will not meaningfully contribute toward keeping humanity safe from DeepMind’s own actions.
I think it's probably possible to greatly improve safety given a moderate safety budget, even without nearly enough buy-in for (1) and (2). (At least not enough buy-in prior to a large incident which threatens to be very costly for the organization.)
Overall, I think high quality thinking about AI safety seems quite useful even if this level of buy-in is unlikely.
(I don't think this report should update us much on whether the buy-in needed for (1)/(2) exists, but the fact that it could be published at all in its current form is still encouraging.)