Critch on career advice for junior AI-x-risk-concerned researchers

In a recent e-mail thread, Andrew Critch sent me the following “subtle problem with sending junior AI-x-risk-concerned researchers into AI capabilities research”. Here’s the explanation he wrote of his view, shared with his permission:


I’m fairly concerned with the practice of telling people who “really care about AI safety” to go into AI capabilities research, unless they are very junior researchers who are using general AI research as a place to improve their skills until they’re able to contribute to AI safety later. (See Leveraging Academia.)

The reason is not a fear that they will contribute to AI capabilities advancement in some manner that will be marginally detrimental to the future. It’s also not a fear that they’ll fail to change the company’s culture in the ways they’d hope, and end up feeling discouraged. What I’m afraid of is that they’ll feel pressure to start pretending to themselves, or to others, that their work is “relevant to safety”. Then what we end up with are companies and departments filled with people who are “concerned about safety”, creating a false sense of security that something relevant is being done, when all we have are a bunch of simmering concerns and concomitant rationalizations.

This fear of mine requires some context from my background as a researcher. I see this problem with environmentalists who “really care about climate change”, who tell themselves they’re “working on it” by studying the roots of a fairly arbitrary species of tree in a fairly arbitrary ecosystem that won’t generalize to anything likely to help with climate change.

My assessment that their work won’t generalize is mostly not from my own outside view; it comes from asking the researcher about how their work is likely to have an impact, and getting a response that either says nothing more than “I’m not sure, but it seems relevant somehow”, or an argument with a lot of caveats like “X might help with Y, which might help with Z, which might help with climate change, but we really can’t be sure, and it’s not my job to defend the relevance of my work. It’s intrinsically interesting to me, and you never know if something could turn out to be useful that seemed useless at first.”

At the same time, I know other climate scientists who seem to have actually done an explicit or implicit Fermi estimate for the probability that they will personally soon discover a species of bacteria that could safely scrub the Earth’s atmosphere of excess carbon. That’s much better.

I’ve seen the same sort of problem with political scientists who are “really concerned about nuclear war”, who tell themselves they’re “working on it” by trying to produce a minor generalization of an edge case of a voting theorem that, when asked, they don’t think will be used by anyone ever.

At the same time, I know other political scientists who seem to be trying really hard to work backward from a certain geopolitical outcome, and earnestly working out the details of what the world would need to make that outcome happen. That’s much better.

Having said this, I do think it’s fine and good if society wants to sponsor a person to study obscure roots of obscure trees that probably won’t help with climate change, or edge cases of theorems that no one will ever use or even take inspiration from, but I would like everyone to be on the same page that in such cases what we’re sponsoring is intellectual freedom and development, and not climate change prevention or nuclear war prevention. If folks want to study fairly obscure phenomena because it feels like the next thing their mind needs to understand the world better, we shouldn’t pressure them to have to think that the next thing they learn might “stop climate change” or “prevent nuclear war”, or else we fuel the fire of false pretenses about which of the world’s research gaps are being earnestly taken care of.

Unfortunately, the above pattern of “justifying” research by just reflecting on what you care about, rationalizing it, and not checking the rationalization for rationality, appears to me to be extremely prevalent among folks who care about climate change or nuclear war, and this is not something I want to see replicated elsewhere, especially not in the burgeoning fields of AI safety, AI ethics, or AI x-risk reduction. And I’m concerned that if we tell folks to go into AI research just to “be concerned”, we’ll be fueling a false sense of security by filling departments and companies with people who “seem to really care” but aren’t doing correspondingly relevant research work, and creating a research culture where concerns about safety, ethics, or x-risk do not result in actually prioritizing research into safety, ethics, or x-risk.

When you’re giving general-purpose career advice, the meme “do AI yourself, so you’re around to help make it safe” is a really bad meme. It fuels a narrative that says “Being a good person standing next to the development of dangerous tech makes the tech less dangerous.” Just standing nearby doesn’t actually help unless you’re doing technical safety research. Just standing nearby does create a false sense of security through the mere-exposure effect. And the “just stand nearby” attitude drives people to worsen race conditions by creating new competitors in different geographical locations, so they can exercise their Stand Nearby powers to ensure the tech is safe.

Important: the above paragraphs are advice about what advice to give, because of the social pressures and tendencies to rationalize that advice-giving often produces. By contrast, if you’re a person who’s worried about AI, and thinking about a career in AI research, I do not wish to discourage you from going into AI capabilities research. To you, what I want to say is something different....

Step 1: Learn by doing. Leverage Academia. Get into a good grad school for AI research, and focus first on learning things that feel like they will help you personally to understand AI safety better (or AI ethics, or AI x-risk; replace by your area of interest throughout). Don’t worry about whether you’re “contributing” to AI safety too early in your graduate career. Before you’re actually ready to make real contributions to the field, try to avoid rationalizing doing things because “they might help with safety”; instead, do things because “they might help me personally to understand safety better, in ways that might be idiosyncratic to me and my own learning process.”

Remember, what you need to learn to understand safety, and what the field needs to progress, might be pretty different, and you need to have the freedom to fill whatever gaps seem important to you personally. Early in your research career, you need to be in “consume” mode more than “produce” mode, and it’s fine if your way of “consuming” knowledge and skill is to “produce” things that aren’t very externally valuable. So, try to avoid rationalizing the externally-usable safety-value of ideas or tools you produce on your way to understanding how to produce externally-usable safety research later.

The societal value of you producing your earliest research results will be that they help you personally to fill gaps in your mind that matter for your personal understanding of AI safety, and that’s all the justification you need in my books. So, do focus on learning things that you need to understand safety better, but don’t expect those things to be a “contribution” that will matter to others.

Step 2: Once you’ve learned enough that you’re able to start contributing to research in AI safety (or ethics, or x-risk), then start focusing directly on making safety research contributions that others might find insightful. When you’re ready enough to start actually producing advances in your field, that’s when it’s time to start thinking about what the social impact of those advances would be, and start shifting your focus somewhat away from learning (consuming) and somewhat more toward contributing (producing).


(Content from Critch ends here.)