I’m basing my impression here on having read much of Nate’s public writing on AI, and on a conversation over a shared lunch at a conference a few months ago. His central estimate for P(doom) is certainly substantially higher than mine, but as I remember it we have pretty similar views of the underlying dynamics to date, diverge somewhat about the likelihood of catastrophe with very capable systems, and both hope that future evidence favors the less-doom view.
Unfortunately I agree that “shut down” and “no catastrophe” are still missing pieces. I am, however, more optimistic than my model of Nate that the HHH research agenda constitutes any progress towards this goal.
I think labs correctly assess that they’re neither working with nor at non-trivial immediate risk of creating x-risky models, nor yet cautious enough to do so safely. If labs invested in this, I think they could probably avoid accidentally creating an x-risky system without abandoning ML research before seeing warning signs.
I agree that pre-AGI empirical alignment work only gets you so far, and that you probably get very little time for direct empirical work on the deadliest problems (two years if very fortunate, days to seconds if you’re really not). But I’d guess my estimate of “only so far” is substantially further than Nate’s, largely because I put a different credence on a sharp left turn.
I was struck by how similarly we assessed the current situation and the evidence available so far, but that is a big difference, and maybe I shouldn’t describe our views as similar.
I generally agree with Nate’s warning shots post, and with some comments (e.g.), but the “others” I was thinking of as likely to agree to a moratorium were other labs, not governments (cf.), which could buy, say, 6–24 vital months.