Imposing FAI

All the recent posts on FAI theory have given me cause to think. Something in the conversations about it has always bugged me, but I haven’t found the words for it until now.

It is something like this:

Say that you manage to construct an algorithm for FAI...

Say that you can show that it isn’t going to be a dangerous mistake...

And say you do all of this, and popularize it, before AGI is created (or at least, before an AGI goes *FOOM*)...

...

How in the name of Sagan are you actually going to ENFORCE the idea that all AGIs are FAIs?

I mean, if it required some rare material (as nuclear weapons do), or large laboratories (as biological WMDs do), or some other resource you could at least make artificially scarce, you could set up a body to ensure that any AGI created is an FAI.

But if all it takes is the right algorithms, the right code, and enough computing power… even with an FAI theory in hand, how would you keep someone from building a UFAI anyway? Between people experimenting with the principles (once known), people making mistakes, and the prospect of actively malicious *humans*, it seems that unless FAI turns out to have some internal mechanism that makes it better and stronger than any UFAI could be, and the solution is so obviously superior that any idiot could see it, UFAI is going to exist at some point no matter what.

At that point, the question seems to become not “How do we make FAI?” (though that might be a secondary question) but rather “How do we prevent the creation of a UFAI, eliminate it, or at least reduce the damage it could do?” FAI might be one thing you do toward that goal, but if UFAI is a highly likely consequence of AGI even *with* an FAI theory, shouldn’t the focus be on how to contain a UFAI event?