On a first reading of this summary, I take it as a small positive update on the quality of near-term AI Safety thinking at DeepMind. It treads familiar ground and doesn’t raise any major red flags for me.
The two things I would most like to know are:
1. Will DeepMind commit to halting frontier AGI research if they cannot provide a robust safety case for systems that are estimated to have a non-trivial chance of significantly harming humanity?
2. Does the safety team have veto power over the development or deployment of systems they deem to be unsafe?
A “yes” to both would be a very positive surprise to me, and would lead me to additionally ask DeepMind to publicly support a global treaty that implements these bare-minimum policies in a way that can be verified and enforced.
A “no” to either would mean this work falls under milling behavior, and will not meaningfully contribute toward keeping humanity safe from DeepMind’s own actions.
I think it's probably possible to greatly improve safety given a moderate safety budget, even without nearly enough buy-in for (1) and (2). (At least not enough buy-in prior to a large incident which threatens to be very costly for the organization.)
Overall, I think high quality thinking about AI safety seems quite useful even if this level of buy-in is unlikely.
(I don't think this report should update us much on whether the buy-in needed for (1)/(2) exists, but the fact that it could be published at all in its current form is still encouraging.)