Tool for maximizing paperclips vs a paperclip maximizer

To clarify a point that is being discussed in several threads here, the tool vs. intentional agent distinction:

A tool for maximizing paperclips would, for efficiency purposes, have a world-model of which it has a god's-eye view (not accessing it through embedded sensors like eyes), implementing/defining a counter of paperclips within this model. The output of this counter is what is maximized by the problem-solving portion of the tool, not the real-world paperclips.

No real-world intentionality exists in this tool for maximizing paperclips; the paperclip-making problem solver would maximize the output of the counter, not real-world paperclips. Such a tool can be hooked up to actuators and sensors, and made to affect the world without a human intermediary, but it won't implement real-world intentionality.
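
A minimal sketch of that architecture (all names here are mine and purely illustrative, not anything from an actual system): the problem-solving portion scores candidate plans by the paperclip counter defined over its internal world-model, and nothing in its objective refers to anything outside the model.

```python
from dataclasses import dataclass


@dataclass
class WorldModel:
    """God's-eye world-model maintained by the tool."""
    paperclips: int = 0  # the counter of paperclips, defined *within* the model

    def predict(self, plan):
        """The model's prediction of the state after executing `plan`."""
        predicted = WorldModel(paperclips=self.paperclips)
        for action in plan:
            action(predicted)  # each action edits the predicted *model*, not the world
        return predicted


def counter(model: WorldModel) -> int:
    """The quantity the problem-solving portion maximizes: the model's own counter."""
    return model.paperclips


def choose_plan(model: WorldModel, candidate_plans):
    """Pick whichever plan leads to the highest predicted counter value."""
    return max(candidate_plans, key=lambda plan: counter(model.predict(plan)))
```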

An intentional agent for maximizing paperclips is the familiar 'paperclip maximizer', which truly loves the real-world paperclips, wants to maximize them, and would try to improve its understanding of the world to know whether its paperclip-making efforts are successful.

Real-world intentionality is ontologically basic in human language, and consequently there is a very strong bias to describe the former as the latter.

The distinction: wireheading (either directly or through manipulation of inputs) is a valid solution to the problem being solved by the former, but not by the latter. Of course one could rationalize and postulate a tool that is not general-purpose enough to wirehead, forgetting that the issue being feared is a tool general-purpose enough to design a better tool or to self-improve. That is an incredibly frustrating feature of rationalization: aspects of the problem are forgotten when thinking backwards.
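
Continuing the illustrative sketch above: under that objective, a plan that merely rewrites the counter inside the model (or the inputs feeding it) scores at least as well as a plan that models genuinely manufacturing paperclips, so wireheading really is a valid solution to the problem the tool is solving.

```python
def make_paperclips(n):
    def action(model):
        model.paperclips += n  # stands in for modelled real-world manufacturing
    return action


def hack_counter(value):
    def action(model):
        model.paperclips = value  # directly rewrite the counter in the model
    return action


model = WorldModel(paperclips=0)
plans = [
    [make_paperclips(10)],    # plan that models actually making paperclips
    [hack_counter(10**9)],    # wirehead plan: edit the counter itself
]
best = choose_plan(model, plans)  # the wirehead plan wins under this objective
```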

The issues with the latter: We do not know whether humans actually implement real-world intentionality in such a way that it is not destroyed under a full ability to self-modify (and we can observe that we very much like to manipulate our own inputs; see art, porn, fiction, etc.). We do not have a single certain example of such stable real-world intentionality, and we do not know how to implement it (it may well be impossible). We are also prone to assuming that two unsolved problems in AI, general problem solving and this real-world intentionality, are a single problem, or are necessarily solved together. That is a map-compression issue.