I am an AI/AGI alignment researcher. I do not feel very optimistic about the effectiveness of your proposed interventions, mainly because I do not buy your underlying risk model and solution model. Overall I am getting the vibe that you believe AGI will be invented soon, which is a valid assumption for planning specific actions, but then things get weirder in your solution model. To give one specific example of this:
It will largely be the responsibility of safety and governance teams to push labs to not publish papers that differentially-advance-capabilities, maintain strong information security, invest in alignment research, use alignment strategies, and not deploy potentially dangerous models.
There is an underlying assumption in the above reasoning, and in many of your other slowdown proposals, that the AI labs themselves will have significant influence on how their AI innovations will be used by downstream actors. You are assuming that they can prevent downstream actors from creating misaligned AI/AGI by not publishing certain research and by not releasing foundation models with certain capabilities.
This underlying assumption, that the labs or individual ML researchers have significant choke-point power which they could use to lower x-risk, is entirely wrong. To unpack this statement a bit more: current advanced AI, including foundation models, is a dual-use technology that can be configured to do good as well as evil, and it will be deployed both by actors who are very careful about it and by actors who are very careless. We have also seen that if one lab withholds its latest model, another party will quickly open-source an equally good one. Maybe real AGI, if it ever gets invented, will be a technology with an entirely different nature, but I am not going to bet on it.
More generally: I am seeing you make a mistake that I have seen a whole crowd of influencers and community builders make. You are following the crowd, and the crowd focuses too much on the idea that it needs to convince ‘researchers in top AI labs’ and other ‘ML researchers’ at ‘top conferences’ about certain dangers:
The crowd focuses on influencing AI research labs and ML researchers without considering whether these parties have the technical or organisational/political power to control how downstream users will use AI or future AGI. In general, they do not have this power. If you are really worried about an AI lab inventing an AGI soon (personally I am not, but for the sake of the argument), you will need to focus on its management, not on its researchers.
The crowd focuses on influencing ML researchers without considering whether these parties even have the technical skills or attitude needed to be good technical alignment researchers. Often, they do not. (I expand on this topic here. The TL;DR: treating the management of the impact of advances in ML on society as an ML research problem makes about as much sense as our forefathers treating the management of the impact of the steam engine on society as a steam engine engineering problem. For the long version, see the paper linked from the post.)
Overall, when it comes to putting more manpower into outreach, I feel that safety awareness outreach to downstream users, and to those who might regulate downstream users' actions via laws, moral persuasion, or product release decisions, is far more important.