CFAR’s new focus, and AI Safety

A bit about our last few months:

  • We’ve been working on getting a simple, clear mission and an organization that actually works. We think of our goal as analogous to the transition that the old Singularity Institute underwent under Lukeprog (during which chaos was replaced by a simple, intelligible structure that made it easier to turn effort into forward motion).

  • As part of that, we’ll need to find a way to be intelligible.

  • This is the first of several blog posts aimed at making our new form visible from the outside. (If you’re in the Bay Area, you can also come meet us at tonight’s open house.) (We’ll be talking more about the causes of this mission change, the extent to which it is in fact a change, etc., in an upcoming post.)

Here’s a short explanation of our new mission:
  • We care a lot about AI Safety efforts in particular, and about otherwise increasing the odds that humanity reaches the stars.

  • Also, we[1] believe such efforts are bottlenecked more by our collective epistemology than by the number of people who verbally endorse or act on “AI Safety”, or any other “spreadable viewpoint” disconnected from its derivation.

  • Our aim is therefore to find ways of improving both individual thinking skill, and the modes of thinking and social fabric that allow people to think together. And to do this among the relatively small sets of people tackling existential risk.

To elaborate a little:

Existential wins and AI safety

By an “existential win”, we mean humanity creates a stable, positive future. We care a heck of a lot about this one.
Our working model here accords roughly with the model in Nick Bostrom’s book Superintelligence. In particular, we believe that if general artificial intelligence is at some point invented, it will be an enormously big deal.
(Lately, AI Safety is being discussed by everyone from The Economist to Newsweek to Obama to an open letter signed by eight thousand people. But we’ve been thinking about this, and backchaining partly from it, since before that.)

Who we’re focusing on, why

Our preliminary investigations agree with The Onion’s; despite some looking, we have found no ultra-competent group of people behind the scenes who have things fully covered.
What we have found are:
  • AI and machine learning graduate students, researchers, project managers, etc. who care; who can think; and who are interested in thinking better;

  • Students and others affiliated with the “Effective Altruism” movement, who are looking to direct their careers in ways that can do the most good;

  • Rationality geeks, who are interested in seriously working to understand how the heck thinking works when it works, and how to make it work even in domains as confusing as AI safety.

These folks, we suspect, are the ones who can give humanity the most boost in its survival odds per dollar of CFAR’s present efforts (which is a statement partly about us, but so it goes). We’ve been focusing on them.
(For everyone’s sake. Would you rather: (a) have bad rationality skills yourself, or (b) be killed by a scientist or policy-maker who also had bad rationality skills?)

Brier-boosting, not Signal-boosting

Everyone thinks they’re right. We do, too. So we have some temptation to take our own favorite current models of AI Safety strategy and to try to get everyone else to shut up about their models and believe ours instead.
This understandably popular activity is often called “signal boosting”, “raising awareness”, or doing “outreach”.
At CFAR, though, we force ourselves not to do “signal boosting” in this way. Our strategy is to spread general-purpose thinking skills, not our current opinions. It is important that we get the truth-seeking skills themselves to snowball across relevant players, because ultimately, creating a safe AI (or otherwise securing an existential win) is a research problem. Nobody, today, has copyable opinions that will get us there.
We like to call this “Brier boosting”, because a “Brier score” is a measure of predictive accuracy.
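(For readers who haven’t met the term: the Brier score of a set of probability forecasts is the mean squared difference between the probabilities assigned and the outcomes that actually occurred, so lower is better. The sketch below, with a brier_score helper of our own naming rather than anything from the original post, is just meant to illustrate that definition.)

```python
# Minimal sketch of the Brier score: the mean squared difference between
# forecast probabilities and 0/1 outcomes (1 = the event happened).
# Lower is better; a perfect forecaster scores 0, and always saying "50%" scores 0.25.

def brier_score(forecasts, outcomes):
    """Mean squared error between probability forecasts and binary outcomes."""
    return sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / len(forecasts)

# Example: predictions of 90%, 20%, and 60%; the first two events happened, the third didn't.
print(brier_score([0.9, 0.2, 0.6], [1, 1, 0]))  # ~0.337
```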
Lever and World
(song by CFAR alum Ray Arnold)
[1] By “We believe X”, we do not mean to assert that every CFAR staff member individually believes X. (Similarly for “We care about Y”.) We mean rather that CFAR as an organization is planning/acting as though X is true. (Much as if CFAR promises you a rationality T-shirt, that isn’t an individual promise from each of the individuals at CFAR; it is rather a promise from the organization as such.)
If we’re going to build an art of rationality, we’ll need to figure out how to create an organization where people can individually believe whatever the heck they end up actually believing as they chase the evidence, while also having the organization qua organization be predictable/intelligible.
ETA:
You may also want to check out two documents we posted in the days since this post: