Imposing FAI

All the posts on FAI theory of late have given me cause to think. There’s something in the conversations about it that has always bugged me, something I haven’t found the words for before now.

It is something like this:

Say that you manage to construct an algorithm for FAI...

Say that you can show that it isn’t going to be a dangerous mistake...

And say you do all of this, and popularize it, before AGI is created (or at least, before an AGI goes *FOOM*)...

...

How in the name of Sagan are you actually going to ENFORCE the idea that all AGIs are FAIs?

I mean, if it required some rare material (as nuclear weapons do) or large laboratories (as biological WMDs do) or some other resource that you could at least make artificially scarce, you could set up a body that ensures that any AGI created is an FAI.

But if all it takes is the right algorithms, the right code, and enough computing power… then even with a theory of FAI in hand, how would you keep someone from making a UFAI anyway? Between people experimenting with the principles (once they’re known), people making mistakes, and the prospect of actively malicious *humans*… unless you somehow come up with an internal mechanism that makes FAI better and stronger than any UFAI could be, and the solution turns out to be one that any idiot could see is better… it seems like UFAI is going to exist at some point no matter what.

At that point, it seems like the question becomes not “How do we make FAI?” (although that might be a secondary question) but rather “How do we prevent the creation of, eliminate, or reduce the potential damage from UFAI?” Now, building an FAI might be one thing you do toward that goal, but if UFAI is a highly likely consequence of AGI even *with* an FAI theory, shouldn’t the focus be on how to contain a UFAI event?