To contribute to AI safety, consider doing AI research

Among those concerned about risks from advanced AI, I've encountered people who would be interested in a career in AI research, but are worried that doing so would speed up AI capability relative to safety. I think it is a mistake for AI safety proponents to avoid going into the field for this reason (better reasons include being well-positioned to do AI safety work, e.g. at MIRI or FHI). This mistake contributed to my choosing statistics rather than computer science for my PhD, which I have some regrets about, though luckily there is enough overlap between the two fields that I can work on machine learning anyway. I think the value of having more AI experts who are worried about AI safety is far higher than the downside of adding a few drops to the ocean of people trying to advance AI. Here are several reasons for this:

  1. Concerned researchers can inform and influence their colleagues, especially if they are outspoken about their views.

  2. Studying and working on AI brings understanding of the current challenges and breakthroughs in the field, which can usefully inform AI safety work (e.g. wireheading in reinforcement learning agents).

  3. Opportunities to work on AI safety are beginning to spring up within academia and industry, e.g. through FLI grants. In the next few years, it will be possible to do an AI-safety-focused PhD or postdoc in computer science, which would kill two birds with one stone.

To elaborate on #1, one of the prevailing arguments against taking long-term AI safety seriously is that not enough experts in the AI field are worried. Several prominent researchers have commented on the potential risks (Stuart Russell, Bart Selman, Murray Shanahan, Shane Legg, and others), and more are concerned but keep quiet for reputational reasons. An accomplished, strategically outspoken and/or well-connected expert can make a big difference in the attitude distribution in the AI field and the level of familiarity with the actual concerns (which are not about malevolence, sentience, or marching robot armies). Having more informed skeptics who have maybe even read Superintelligence, and fewer uninformed skeptics who think AI safety proponents are afraid of Terminators, would produce much-needed direct and productive discussion on these issues. As the proportion of informed and concerned researchers in the field approaches critical mass, the reputational consequences for speaking up will decrease.

A year after FLI's Puerto Rico conference, the subject of long-term AI safety is no longer taboo among AI researchers, but it remains rather controversial. Addressing AI risk in the long term will require safety work to be a significant part of the field, and close collaboration between those working on the safety and those working on the capability of advanced AI. Stuart Russell makes the apt analogy that "just as nuclear fusion researchers consider the problem of containment of fusion reactions as one of the primary problems of their field, issues of control and safety will become central to AI as the field matures". If more people who are already concerned about AI safety join the field, we can make this happen faster, and help wisdom win the race with capability.

(Cross-posted from my blog. Thanks to Janos Kramar for his help with editing this post.)