AGI in a vulnerable world

Link post

I’ve been thinking about a class of AI-takeoff scenarios where a very large number of people can build dangerous, unsafe AGI before anyone can build safe AGI. This seems particularly likely if:

  • It is considerably more difficult to build safe AGI than it is to build unsafe AGI.

  • AI progress is software-constrained rather than compute-constrained.

  • Compute available to individuals grows quickly and unsafe AGI turns out to be more of a straightforward extension of existing techniques than safe AGI is.

  • Organizations are bad at keeping software secret for a long time, i.e. it’s hard to get a considerable lead in developing anything.

    • This may be because information security is bad, or because actors are willing to go to extreme measures (e.g. extortion) to get information out of researchers.

Another related scenario is one where safe AGI is built first, but isn’t defensively advantaged enough to protect against harms by unsafe AGI created soon afterward.

The intuition behind this class of scenarios comes from an extrapolation of what machine learning progress looks like now. It seems like large organizations make the majority of progress on the frontier, but smaller teams are close behind and able to reproduce impressive results with dramatically fewer resources. I don’t think the large organizations making AI progress are (currently) well-equipped to keep software secret if motivated and well-resourced actors put effort into acquiring it. There are strong openness norms in the ML community as a whole, which means knowledge spreads quickly. I worry that there are strong incentives for progress to continue to be very open, since decreased openness can hamper an organization’s ability to recruit talent. If compute available to individuals increases a lot, and building unsafe AGI is much easier than building safe AGI, we could suddenly find ourselves in a vulnerable world.

I’m not sure whether this is a meaningfully distinct or underemphasized class of scenarios within the AI risk space. My intuition is that there is more attention on incentive failures among a small number of actors, e.g. via arms races. I’m curious for feedback on whether many-people-can-build-AGI is a class of scenarios we should take seriously and, if so, what society could do to make them less likely, e.g. investing in high-effort info-security and secrecy work. AGI development seems much more likely to go existentially badly if more than a small number of well-resourced actors are able to create AGI.

By Asya Bergal