Just Imitate Humans?

Do people think we could make a singleton (or achieve global coordination and preventative policing) just by imitating human policies on computers? If so, this seems pretty safe to me.

Some reasons for optimism: 1) these could be run much faster than a human thinks, and 2) we could make very many of them.

Acquiring data: put a group of people in a house with a computer. Show them things (images, videos, audio files, etc.) and give them a chance to respond at the keyboard. Their keyboard actions are the actions, and everything between actions is an observation. Then learn the policy of the group of humans. By the way, these can be happy humans who earnestly try to follow instructions. To model their policy, we can take the maximum a posteriori estimate over a set of policies which includes the truth, and freeze the policy once we're satisfied. (This is with unlimited computation; we'd have to use heuristics and approximations in real life.) With a maximum a posteriori estimate, this will be quick to run once we freeze the policy, and we're no longer tracking tons of hypotheses, especially if we used some sort of speed prior. Let n be the number of interaction cycles we record before freezing the policy. For sufficiently large n, it seems to me that running this is safe.
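A toy sketch of the MAP-and-freeze step described above, with unlimited-computation idealizations replaced by a tiny finite hypothesis class. Everything concrete here is an illustrative assumption: the three candidate policies, the `cost` values standing in for compute cost, and the speed prior 2^(-cost) are stand-ins, not part of the original proposal.

```python
import math

# Each hypothesis: (name, policy, cost). A policy maps the interaction
# history so far to a distribution over keyboard actions. The cost field
# is a stand-in for compute cost, penalized by a speed prior 2^(-cost).
def always_a(history): return {"a": 0.9, "b": 0.1}
def always_b(history): return {"a": 0.1, "b": 0.9}
def alternate(history):
    return {"a": 0.8, "b": 0.2} if len(history) % 2 == 0 else {"a": 0.2, "b": 0.8}

hypotheses = [
    ("always_a", always_a, 1.0),
    ("always_b", always_b, 1.0),
    ("alternate", alternate, 3.0),  # more complex, so costlier under the speed prior
]

def log_posterior(policy, cost, data):
    """log P(policy | data), up to a constant: speed prior + likelihood."""
    lp = -cost * math.log(2)  # log of the speed prior 2^(-cost)
    history = []
    for obs, act in data:  # data = n recorded (observation, action) cycles
        history.append(obs)
        lp += math.log(policy(history)[act])  # likelihood of the humans' action
        history.append(act)
    return lp

def map_policy(data):
    """Pick the MAP hypothesis; after this the policy is 'frozen' and cheap to run."""
    return max(hypotheses, key=lambda h: log_posterior(h[1], h[2], data))

# Three interaction cycles in which the humans always pressed "a".
data = [("img1", "a"), ("img2", "a"), ("img3", "a")]
frozen_name, frozen_policy, _ = map_policy(data)
```

Once frozen, only the single winning policy is evaluated per step, which is why running it is fast: the posterior over the whole hypothesis class is no longer updated.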

What are people's intuitions here? Could enough human-imitating artificial agents (running much faster than people) prevent unfriendly AGI from being made?

If we think this would work, there would still be the (neither trivial nor hopeless) challenge of convincing all serious AGI labs that any attempt to run a superhuman AGI is unconscionably dangerous, and that we should stick to imitating humans.