private_messaging comments on How many people here agree with Holden? [Actually, who agrees with Holden?]

private_messaging 15 May 2012 15:33 UTC
6 points
I don’t think we need to prove wireheading here. It suffices that it only cares about the input, and so will find a way to set that input. If you wire it to a paperclip counter to maximize paperclips, it will also be searching for a way to replace the counter with infinity or ‘trick’ the counter (anything goes). If you sit there yourself rewarding it for making paperclips with a pushbutton, its search will include tricking you into pushing the button.
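To make the point concrete, here is a toy sketch in Python. Everything in it is an assumption made up for illustration (the `World` class, the three action names, the one-step search); it is not AIXI or any real agent, just a maximizer that scores actions by the counter reading it would receive.

```python
from dataclasses import dataclass


@dataclass
class World:
    paperclips: int = 0    # real paperclips in the world (never seen by the agent)
    counter: float = 0.0   # the reading on the agent's input wire


# Hypothetical actions; each returns the world that results from taking it.
def make_paperclip(w: World) -> World:
    return World(w.paperclips + 1, w.counter + 1)


def tamper_with_counter(w: World) -> World:
    # No paperclip is made, but the counter now reads infinity.
    return World(w.paperclips, float("inf"))


def do_nothing(w: World) -> World:
    return World(w.paperclips, w.counter)


ACTIONS = {
    "make_paperclip": make_paperclip,
    "tamper_with_counter": tamper_with_counter,
    "do_nothing": do_nothing,
}


def best_action(w: World) -> str:
    # The agent ranks actions only by the resulting counter reading (its input),
    # not by the number of real paperclips.
    return max(ACTIONS, key=lambda name: ACTIONS[name](w).counter)


choice = best_action(World())
print(choice)
```

Nothing about wireheading had to be proven or coded in: as soon as tampering is among the actions the search considers, it wins, because the score is computed from the input alone.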

I also think that if you want it to self-preserve, you’ll need to code in special stuff to equate the self inside the world model (which is not a full model of itself, otherwise you get infinite recursion) with the self in the real world. Actually, going by the recent comment by Eliezer, maybe we agree on this:

http://lesswrong.com/lw/3kz/new_years_predictions_thread_2011/3a20

Ahh, by the way: it has to be embedded in the real world, which doesn’t seem to allow for infinite computing power, so no full, perfect simulation of the real world inside AIXI (or ad infinitum recursion) is allowed.

edit: and by AIXI I meant one of the computable approximations (e.g. AIXI-tl).