[Link] Nate Soares is answering questions about MIRI at the EA Forum

Nate Soares, MIRI’s new Executive Director, is going to be answering questions tomorrow at the EA Forum (link). You can post your questions there now; he’ll start replying Thursday, 15:00-18:00 US Pacific time.

Quoting Nate:

Last week Monday, I took the reins as executive director of the Machine Intelligence Research Institute. MIRI focuses on studying technical problems of long-term AI safety. I’m happy to chat about what that means, why it’s important, why we think we can make a difference now, what the open technical problems are, how we approach them, and some of my plans for the future.

I’m also happy to answer questions about my personal history and how I got here, or about personal growth and mindhacking (a subject I touch upon frequently in my blog, Minding Our Way), or about whatever else piques your curiosity.

Nate is a regular poster on LessWrong under the name So8res—you can find stuff he’s written in the past here.

Update: Question-answering is live!

Update #2: Looks like Nate’s wrapping up now. Feel free to discuss the questions and answers, here or at the EA Forum.

Update #3: Here are some interesting snippets from the AMA:

Alex Altair: What are some of the most neglected sub-tasks of reducing existential risk? That is, what is no one working on which someone really, really should be?

Nate Soares: Policy work / international coordination. Figuring out how to build an aligned AI is only part of the problem. You also need to ensure that an aligned AI is built, and that’s a lot harder to do during an international arms race. (A race to the finish would be pretty bad, I think.)

I’d like to see a lot more people figuring out how to ensure global stability & coordination as we enter a time period that may be fairly dangerous.

Diego Caleiro: 1) Which are the implicit assumptions, within MIRI’s research agenda, of things that “currently we have absolutely no idea of how to do that, but we are taking this assumption for the time being, and hoping that in the future either a more practical version of this idea will be feasible, or that this version will be a guiding star for practical implementations”? [...]

2) How do these assumptions diverge from how FLI, FHI, or non-MIRI people publishing on the AGI 2014 book conceive of AGI research?

3) Optional: Justify the differences in 2 and why MIRI is taking the path it is taking.

Nate Soares: 1) The things we have no idea how to do aren’t the implicit assumptions in the technical agenda, they’re the explicit subject headings: decision theory, logical uncertainty, Vingean reflection, corrigibility, etc. :-)

We’ve tried to make it very clear in various papers that we’re dealing with very limited toy models that capture only a small part of the problem (see, e.g., basically all of section 6 in the corrigibility paper).

Right now, we basically have a bunch of big gaps in our knowledge, and we’re trying to make mathematical models that capture at least part of the actual problem—simplifying assumptions are the norm, not the exception. All I can easily say is that common simplifying assumptions include: you have lots of computing power, there is lots of time between actions, you know the action set, you’re trying to maximize a given utility function, etc. Assumptions tend to be listed in the paper where the model is described.

2) The FLI folks aren’t doing any research; rather, they’re administering a grant program. Most FHI folks are focused more on high-level strategic questions (What might the path to AI look like? What methods might be used to mitigate xrisk? etc.) rather than object-level AI alignment research. And remember that they look at a bunch of other X-risks as well, and that they’re also thinking about policy interventions and so on. Thus, the comparison can’t easily be made. (Eric Drexler’s been doing some thinking about the object-level FAI questions recently, but I’ll let his latest tech report fill you in on the details there. Stuart Armstrong is doing AI alignment work in the same vein as ours. Owain Evans might also be doing object-level AI alignment work, but he’s new there, and I haven’t spoken to him recently enough to know.)

Insofar as FHI folks would say we’re making assumptions, I doubt they’d be pointing to assumptions like “UDT knows the policy set” or “assume we have lots of computing power” (which are obviously simplifying assumptions on toy models), but rather assumptions like “doing research on logical uncertainty now will actually improve our odds of having a working theory of logical uncertainty before it’s needed.”

3) I think most of the FHI folks & FLI folks would agree that it’s important to have someone hacking away at the technical problems, but just to make the arguments more explicit, I think that there are a number of problems that it’s hard to even see unless you have your “try to solve FAI” goggles on. [...]

We’re still in the preformal stage, and if we can get this theory to the formal stage, I expect we may be able to get a lot more eyes on the problem, because the ever-crawling feelers of academia seem to be much better at exploring formalized problems than they are at formalizing preformal problems.

Then of course there’s the heuristic of “it’s fine to shout ‘model uncertainty!’ and hover on the sidelines, but it wasn’t the armchair philosophers who did away with the epicycles, it was Kepler, who was up to his elbows in epicycle data.” One of the big ways that you identify the things that need working on is by trying to solve the problem yourself. By asking how to actually build an aligned superintelligence, MIRI has generated a whole host of open technical problems, and I predict that that host will be a very valuable asset now that more and more people are turning their gaze towards AI alignment.

Buck Shlegeris: What’s your response to Peter Hurford’s arguments in his article Why I’m Skeptical Of Unproven Causes...?

Nate Soares: (1) One of Peter’s first (implicit) points is that AI alignment is a speculative cause. I tend to disagree.

Imagine it’s 1942. The Manhattan Project is well under way, Leo Szilard has shown that it’s possible to get a neutron chain reaction, and physicists are hard at work figuring out how to make an atom bomb. You suggest that this might be a fine time to start working on nuclear containment, so that, once humans are done bombing the everloving breath out of each other, they can harness nuclear energy for fun and profit. In this scenario, would nuclear containment be a “speculative cause”?

There are currently thousands of person-hours and billions of dollars going towards increasing AI capabilities every year. To call AI alignment a “speculative cause” in an environment such as this one seems fairly silly to me. In what sense is it speculative to work on improving the safety of the tools that other people are currently building as fast as they can? Now, I suppose you could argue that either (a) AI will never work or (b) it will be safe by default, but both those arguments seem pretty flimsy to me.

You might argue that it’s a bit weird for people to claim that the most effective place to put charitable dollars is towards some field of scientific study. Aren’t charitable dollars supposed to go to starving children? Isn’t the NSF supposed to handle scientific funding? And I’d like to agree, but society has kinda been dropping the ball on this one.

If we had strong reason to believe that humans could build strangelets, and society were pouring billions of dollars and thousands of human-years into making strangelets, and almost no money or effort was going towards strangelet containment, and it looked like humanity was likely to create a strangelet sometime in the next hundred years, then yeah, I’d say that “strangelet safety” would be an extremely worthy cause.

How worthy? Hard to say. I agree with Peter that it’s hard to figure out how to trade off “safety of potentially-very-highly-impactful technology that is currently under furious development” against “children are dying of malaria”, but the only way I know how to trade those things off is to do my best to run the numbers, and my back-of-the-envelope calculations currently say that AI alignment is further behind than the globe is poor.
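[Editor’s note: the kind of back-of-the-envelope comparison Nate alludes to can be sketched as a crude expected-value ratio. Every number below is an invented placeholder for illustration only, not Nate’s, MIRI’s, or anyone else’s actual estimate.]

```python
# A deliberately toy expected-value comparison: probability-weighted
# benefit per dollar spent. All inputs are hypothetical placeholders.

def ev_per_dollar(p_success: float, benefit: float, cost: float) -> float:
    """Return the probability-weighted benefit per dollar spent."""
    return p_success * benefit / cost

# Hypothetical well-measured intervention: ~1 unit of benefit per $3,500,
# with near-certain delivery.
malaria = ev_per_dollar(p_success=1.0, benefit=1.0, cost=3500.0)

# Hypothetical long-shot intervention: tiny odds of success, enormous
# stakes, modest spending.
alignment = ev_per_dollar(p_success=1e-4, benefit=1e10, cost=1e7)

# Under these made-up numbers the long shot dominates; under other
# equally defensible numbers it would not. The point is only that the
# comparison is arithmetic, not that either side of it is settled.
```

The fragility of such estimates to the placeholder inputs is exactly why Nate says more sophisticated assessment methods are needed.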

Now that the EA movement is starting to look more seriously into high-impact interventions on the frontiers of science & mathematics, we’re going to need to come up with more sophisticated ways to assess the impacts and tradeoffs. I agree it’s hard, but I don’t think throwing out everything that doesn’t visibly pay off in the extremely short term is the answer.

(2) Alternatively, you could argue that MIRI’s approach is unlikely to work. That’s one of Peter’s explicit arguments: it’s very hard to find interventions that reliably affect the future far in advance, especially when there aren’t hard objective metrics. I have three disagreements with Peter on this point.

First, I think he picks the wrong reference class: yes, humans have a really hard time generating big social shifts on purpose. But that doesn’t necessarily mean humans have a really hard time generating math—in fact, humans have a surprisingly good track record when it comes to generating math!

Humans actually seem to be pretty good at putting theoretical foundations underneath various fields when they try, and various people have demonstrably succeeded at this task (Church & Turing did this for computing, Shannon did this for information theory, Kolmogorov did a fair bit of this for probability theory, etc.). This suggests to me that humans are much better at producing technical progress in an unexplored field than they are at generating social outcomes in a complex economic environment. (I’d be interested in any attempt to quantitatively evaluate this claim.)

Second, I agree in general that any one individual team isn’t all that likely to solve the AI alignment problem on their own. But the correct response to that isn’t “stop funding AI alignment teams”—it’s “fund more AI alignment teams”! If you’re trying to ensure that nuclear power can be harnessed for the betterment of humankind, and you assign low odds to any particular research group solving the containment problem, then the answer isn’t “don’t fund any containment groups at all,” the answer is “you’d better fund a few different containment groups, then!”

Third, I object to the whole “there’s no feedback” claim. Did Kolmogorov have tight feedback when he was developing an early formalization of probability theory? It seems to me like the answer is “yes”—figuring out what was & wasn’t a mathematical model of the properties he was trying to capture served as a very tight feedback loop (mathematical theorems tend to be unambiguous), and indeed, it was sufficiently good feedback that Kolmogorov was successful in putting formal foundations underneath probability theory.

Interstice: What is your AI arrival timeline?

Nate Soares: Eventually. Predicting the future is hard. My 90% confidence interval conditioned on no global catastrophes is maybe 5 to 80 years. That is to say, I don’t know.

Tarn Somervell Fletcher: What are MIRI’s plans for publication over the next few years, whether peer-reviewed or arXiv-style publications?

More specifically, what are the a) long-term intentions and b) short-term actual plans for the publication of workshop results, and what kind of priority does that have?

Nate Soares: Great question! The short version is, writing more & publishing more (and generally engaging with the academic mainstream more) are very high on my priority list.

Mainstream publications have historically been fairly difficult for us, as until last year, AI alignment research was seen as fairly kooky. (We’ve had a number of papers rejected from various journals due to the “weird AI motivation.”) Going forward, it looks like that will be less of an issue.

That said, writing capability is a huge bottleneck right now. Our researchers are currently trying to (a) run workshops, (b) engage with & evaluate promising potential researchers, (c) attend conferences, (d) produce new research, (e) write it up, and (f) get it published. That’s a lot of things for a three-person research team to juggle! Priority number 1 is to grow the research team (because otherwise nothing will ever be unblocked), and we’re aiming to hire a few new researchers before the year is through. After that, increasing our writing output is likely the next highest priority.

Expect our writing output this year to be similar to last year’s (i.e., a small handful of peer-reviewed papers and a larger handful of technical reports that might make it onto the arXiv), and then hopefully we’ll have more & higher-quality publications starting in 2016 (the publishing pipeline isn’t particularly fast).

Tor Barstad: Among recruiting new talent and having funding for new positions, what is the greatest bottleneck?

Nate Soares: Right now we’re talent-constrained, but we’re also fairly well-positioned to solve that problem over the next six months. Jessica Taylor is joining us in August. We have another researcher or two pretty far along in the pipeline, we’re running four or five more research workshops this summer, and CFAR is running a summer fellows program in July. It’s quite plausible that we’ll hire a handful of new researchers before the end of 2015, in which case our runway would start looking pretty short, and it’s pretty likely that we’ll be funding-constrained again by the end of the year.

Diego Caleiro: I see a trend in the way new EAs concerned about the far future think about where to donate money that seems dangerous; it goes:

I am an EA and care about impactfulness and neglectedness → Existential risk dominates my considerations → AI is the most important risk → Donate to MIRI.

The last step frequently involves very little thought; it borders on a cached thought.

Nate Soares: Huh, that hasn’t been my experience. We have a number of potential donors who ring us up and ask who in AI alignment needs money the most at the moment. (In fact, last year, we directed a number of donors to FHI, who had much more of a funding gap than MIRI did at that time.)

Joshua Fox:

1. What are your plans for taking MIRI to the next level? What is the next level?

2. Now that MIRI is focused on math research (a good move) and not on outreach, there is less of a role for volunteers and supporters. With the donation from Elon Musk, some of which will presumably get to MIRI, the marginal value of small donations has gone down. How do you plan to keep your supporters engaged and donating? (The alternative, which is perhaps feasible, could be for MIRI to be an independent research institution, without a lot of public engagement, funded by a few big donors.)

Nate Soares:

1. (a) grow the research team, (b) engage more with mainstream academia. I’d also like to spend some time experimenting to figure out how to structure the research team so as to make it more effective (we have a lot of flexibility here that mainstream academic institutes don’t have). Once we have the first team growing steadily and running smoothly, it’s not entirely clear whether the next step will be (c.1) grow it faster or (c.2) spin up a second team inside MIRI taking a different approach to AI alignment. I’ll punt that question to future-Nate.

2. So first of all, I’m not convinced that there’s less of a role for supporters. If we had just ten people earning-to-give at the (amazing!) level of Ethan Dickinson, Jesse Liptrap, Mike Blume, or Alexei Andreev (note: Alexei recently stopped earning-to-give in order to found a startup), that would bring in as much money per year as the Thiel Foundation. (I think people often vastly overestimate how many people are earning-to-give to MIRI, and underestimate how useful it is: the small donors taken together make a pretty big difference!)

Furthermore, if we successfully execute on (a) above, then we’re going to be burning through money quite a bit faster than before. An FLI grant (if we get one) will certainly help, but I expect it’s going to be a little while before MIRI can support itself on large donations & grants alone.