Vizier AIs

This seems like a fairly obvious solution to FAI, a subject which has been pondered by many people much more intelligent and learned than I, so I assume there’s a crippling flaw with it—just one that’s eluded me. But:

Couldn’t an AGI be programmed such that its only desire was to give true answers to the questions asked of it? If the genie desired to argue its way out of the box, it surely could, but it doesn’t want to. It just wants to answer questions, like:

  • “Here are all of our scientific experiments, and here’s all our literature on measurement error, academic fraud, and the like. What’s the most parsimonious explanation for the data?”

  • “Does P=NP?”

  • “If I gave you the following utility function, what would you do?”

  • “What are the most persuasive factually accurate arguments that you can imagine for and against doing x?”

  • “What distribution and level of income do you expect over the next n years under the following tax codes?”

  • “What’s the shortest DNA sequence of a bug that will be fatal to most everyone in that racial group I hate but spare most everyone in that racial group I like?”

Obviously, as a super-powerful tool, such a thing could be used for great evil (as the last example shows). But this is just a problem with developing more powerful tools in general, and it doesn’t seem inconceivable that we could develop institutions and safeguards (assuming the collective will to do so in the first place) that would, to a passable degree, prevent just anyone from asking just anything, without placing the tool solely in the hands of a privately interested clique. For instance, if we’re really paranoid, the public and academics could veto questions before they’re asked, and a sequestered jury of volunteers drawn from the terminally ill could then be given a 50% chance of the computer telling them “sorry, I can’t tell you anything” and a 50% chance of being told the answer; they could then decide whether the answer is understandable and worthy of release upon their deaths, so that outsiders could not tell chosen nonrelease from chance nonrelease. (Looser safeguards could exist for questions where we can imagine all the possible answers and judge none of them to be dangerous.) We would have to phrase questions precisely to get useful answers, but this seems like something we’d have to solve in creating AIs that weren’t viziers anyway.
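To make the indistinguishability point concrete, here is a toy sketch of the release protocol; the function names (`ask_vizier`, `jury_decides_safe`) and the 70% release rate are made-up placeholders, not part of any real design:

```python
# Toy sketch of the randomized-release safeguard described above.
# ask_vizier and jury_decides_safe are hypothetical stand-ins.
import random

def ask_vizier(question: str) -> str:
    # Stand-in for the oracle's true answer to the question.
    return f"<answer to: {question}>"

def jury_decides_safe(answer: str) -> bool:
    # Stand-in for the sequestered jury's judgment; assumed here to
    # release roughly 70% of the answers it actually sees.
    return random.random() < 0.7

def safeguarded_query(question: str) -> str:
    """Outsiders only ever observe 'RELEASED: ...' or 'NOT RELEASED'.

    Both the 50% coin flip and the jury's veto map to 'NOT RELEASED',
    so an outsider cannot tell whether a withheld answer was suppressed
    by chance or by choice."""
    if random.random() < 0.5:          # 50%: the computer says nothing
        return "NOT RELEASED"
    answer = ask_vizier(question)      # 50%: the jury is shown the answer
    if jury_decides_safe(answer):
        return f"RELEASED: {answer}"   # published upon the jurors' deaths
    return "NOT RELEASED"              # chosen nonrelease, indistinguishable

if __name__ == "__main__":
    print(safeguarded_query("Does P=NP?"))
```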

An active friendly AI would be able to help us out more efficiently than we could help ourselves with a mere vizier, but the vizier seems to have a much lower downside risk than releasing an AI which we *think* is friendly.

Edit I: to be more precise, each question creates a demon with access to the AI’s computational capacity; demons can’t communicate with each other, and their only goal is to give the true answer (or a probability distribution over answers, or whatever) to the question asked, given the information available as of its asking and within the timeframe requested. Then they disappear into the ether. A demon can’t do anything but read its question and respond to it textually, and there’s no supervisory utility function that would manipulate one answer to get a better answer on another.
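For concreteness, here is a minimal sketch of the isolation properties I have in mind; the `Demon` class and `ask` function are hypothetical illustrations of the structure, not a proposal for how one would actually build such a thing:

```python
# Minimal sketch of the 'one demon per question' model from Edit I.
# All names here are hypothetical illustrations.
from dataclasses import dataclass

@dataclass(frozen=True)
class Demon:
    """One demon per question: it sees only a frozen snapshot of the
    available information and a deadline, has no memory of other
    questions, and exists only to return a textual answer."""
    question: str
    knowledge_snapshot: dict   # information available as of the asking
    deadline_seconds: float    # timeframe requested

    def answer(self) -> str:
        # Stand-in for the actual reasoning; the only goal is the true
        # answer (or a distribution over answers) to self.question.
        return f"<best answer to {self.question!r} within {self.deadline_seconds}s>"

def ask(question: str, knowledge_snapshot: dict, deadline_seconds: float) -> str:
    # A fresh demon is created for each question and discarded afterwards;
    # there is no supervisory objective spanning questions, so nothing can
    # trade one answer off against another.
    demon = Demon(question, knowledge_snapshot, deadline_seconds)
    reply = demon.answer()
    del demon                  # 'disappears into the ether'
    return reply

if __name__ == "__main__":
    snapshot = {"experiments": [], "literature": []}
    print(ask("Does P=NP?", snapshot, deadline_seconds=3600.0))
```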

Edit II: Vladimir kindly notes that Eliezer has already addressed this in a frontpage article from the days of yore. Regardless of whether I agree with the arguments there, I feel kind of rude for bringing something up, in ignorance, in an independent thread. I tried to delete this post, but nothing happened, so I feel both rude and silly.