Gatekeeper variation

A new variation on the Gatekeeper idea, thanks to Zvi Schreiber.

We block all but one IO channel for our AGI.

We create a formally verified proof-verification system as a gatekeeper. We pass the Gatekeeper a decision problem with a formal proof that it is in the complexity class UP (“Unambiguous Non-deterministic Polynomial-time”). The Gatekeeper verifies the proof and passes the decision problem to the AGI. If the AGI provides an answer, the Gatekeeper validates that this is the unique valid answer (which it can do in polynomial time) and, if so, passes the answer to us.
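To make the flow concrete, here is a minimal sketch in Python. The structure and all the names (`check_up_proof`, `ask_agi`, `verify_answer`) are illustrative assumptions standing in for a verified proof checker, the single IO channel, and the problem's polynomial-time verifier; this is not part of the original proposal, just one way it could be wired together.

```python
def gatekeeper(problem, up_proof, check_up_proof, ask_agi, verify_answer):
    """Release an answer only if `problem` provably lies in UP and the
    AGI's answer passes the polynomial-time check."""
    # 1. Formally verify the proof that the decision problem is in UP.
    #    (Assumed to be done by a formally verified proof checker.)
    if not check_up_proof(problem, up_proof):
        return None

    # 2. Pass the problem to the AGI over the single IO channel.
    result = ask_agi(problem)
    if result is None:          # the AGI declined to answer
        return None
    answer, certificate = result

    # 3. Check the answer in polynomial time. Because the problem is in
    #    UP, an accepting certificate is unique, so the AGI cannot choose
    #    among several "correct" answers to encode extra information.
    if verify_answer(problem, answer, certificate):
        return answer
    return None
```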

The advantage of this approach over one-bit and other Oracles is that the AI can do some useful problem solving with zero discretion as to the data it communicates. It can either give the unique answer or no answer at all. Such a gatekeeper is feasible with today’s technology.

To keep the AGI from communicating by choosing the timing of its answer, the Gatekeeper always communicates the answer (or the fact that there is not yet one from the AGI) after exactly 1 hour.
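A hedged sketch of that fixed-schedule release, assuming the AGI runs in its own thread and puts an (answer, certificate) pair on a queue; the interval matches the one hour from the post, and everything else is illustrative:

```python
import queue
import threading
import time

RELEASE_INTERVAL = 60 * 60  # exactly one hour, as in the post

def timed_release(run_agi, problem, verify):
    """Always respond after exactly RELEASE_INTERVAL seconds, whether or
    not a verified answer exists, so answer timing carries no information."""
    answers = queue.Queue(maxsize=1)
    # run_agi(problem, answers) is expected to put an (answer, certificate)
    # pair on the queue if and when it has one.
    threading.Thread(target=run_agi, args=(problem, answers), daemon=True).start()

    deadline = time.monotonic() + RELEASE_INTERVAL
    try:
        result = answers.get(timeout=RELEASE_INTERVAL)
    except queue.Empty:
        result = None

    # Sleep out the remainder of the interval so an early answer is
    # released at the same moment as a late or missing one.
    time.sleep(max(0.0, deadline - time.monotonic()))

    if result is not None and verify(problem, *result):
        return result[0]
    return "no answer yet"
```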

(The idea of restricting an Oracle AI to choosing from a predetermined list of verifiable answers has been mentioned, but I don’t remember anyone adding this formalism.)

The usual caveats on Oracles apply. Like other Oracles, this is a stopgap rather than a solution to the FAI problem.

  1. The laws of physics still give the AGI a causal link to its environment, even if it is in a Faraday cage.

  2. Even a one-bit Oracle AI can exert optimization power when asked multiple questions.

  3. Answers aren’t enough to ensure safety: We may be asking the wrong questions or not understanding the implications of the answers.

Objections specific to this idea:
  1. Only certain problems can be fed to the AGI, and perhaps not the relevant ones. However, UP includes integer factorization and other interesting problems, suggesting that this could be a useful tool (see the factoring sketch after this list).

  2. This idea does not say that the AGI has figured out how to decide all of UP in polynomial time, just that it has figured out fast solutions for a larger subset of UP than humans have. So the additional optimization power exerted by the AGI may not be very large.
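As a concrete instance of the verification step for the factoring example in objection 1: the prime factorization of a number is unique, so the Gatekeeper only has to confirm that every reported factor is prime and that the product is correct. A minimal sketch, using sympy's `isprime` as a stand-in for a deterministic polynomial-time primality test; the canonical sorted ordering is an extra assumption added so the AGI cannot encode information in how the same answer is written down.

```python
from math import prod
from sympy import isprime  # stand-in for a deterministic poly-time primality test

def verify_factorization(n, factors):
    """Accept the AGI's answer only if `factors` is exactly the (unique)
    prime factorization of n, given as a sorted list with multiplicity."""
    return (
        all(isprime(p) for p in factors)
        and prod(factors) == n
        and factors == sorted(factors)  # one canonical representation only
    )

# Example: the unique acceptable answer for 15 is [3, 5]; anything else is rejected.
assert verify_factorization(15, [3, 5])
assert not verify_factorization(15, [1, 15])
```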