Asking an AI to make itself friendly

Edit: I think I have phrased this really poorly and that this has been misinterpreted. See my comment below for clarification.

A lot of thought has been put into how one would need to define an AI's goals so that it won't find any "loopholes" and act in unintended ways.

Assuming one already had an AI capable of understanding human psychology, which seems necessary for defining the AI's goals anyway, wouldn't it be reasonable to assume that this AI understands what humans want?

If that is the case, would the following approach work to make the AI friendly?

-give it the temporary goal to always answer questions truthfully as far as possible, while admitting uncertainty

-also give it the goal to not alter reality in any way besides answering questions.

-ask it what it thinks would be the optimal definition of the goal of a friendly AI, from the point of view of humanity, accounting for things that humans are too stupid to see coming.

-have a discussion between it and a group of ethicists/philosophers wherein both parties are encouraged to point out any flaws in the definition.

-have this go on for as long as it takes until everyone (especially the AI, seeing as it is smarter than anyone else) is certain that there is no flaw in the definition and that it accounts for all kinds of ethical contingencies that might arise after the singularity.

-implement the result as the new goal of the AI.
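The steps above amount to an iterative refinement loop, which can be made concrete with a rough sketch. Everything here is hypothetical and purely illustrative: `OracleAI` and `EthicsPanel` are stand-in names for the question-answering AI and the group of ethicists/philosophers, not real systems, and the stub methods just show where the proposal, debate, and convergence check would plug in.

```python
class OracleAI:
    """Stand-in for an AI restricted to truthfully answering questions."""

    def propose_goal(self, objections):
        # Refine the friendly-AI goal definition to address all known objections.
        return f"friendly-AI goal (revision addressing {len(objections)} objections)"

    def find_flaws(self, goal):
        # The AI itself searches the current definition for loopholes.
        return []  # stub: no flaws found


class EthicsPanel:
    """Stand-in for the group of ethicists/philosophers."""

    def find_flaws(self, goal):
        return []  # stub: the panel raises no objections


def refine_goal(ai, panel, max_rounds=100):
    """Loop over propose -> debate -> check until nobody can find a flaw."""
    objections = []
    for _ in range(max_rounds):
        goal = ai.propose_goal(objections)                  # ask for a definition
        objections = ai.find_flaws(goal) + panel.find_flaws(goal)  # joint critique
        if not objections:                                  # everyone is certain
            return goal                                     # adopt as the new goal
    raise RuntimeError("no flawless definition found within the round limit")
```

One thing the sketch makes visible is a hidden assumption in the procedure: the loop only terminates if the flaw-finding step eventually comes up empty, and the whole scheme trusts the AI's own `find_flaws` to be honest rather than strategically silent.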

What do you think of this ap­proach?