AIXI could self-improve through its actions into something that cares about paperclips or something else in the environment, besides its own reward (or maybe it'll just inadvertently self-destruct). This could happen because it lacks the appropriate notions (heuristics) to care about the results of experimenting with its own hardware.
AIXI is a bad example of preservation of one's own goals (for one, it doesn't have a goal unless it self-modifies to introduce one, thus ceasing to be AIXI; it only has a sequence of past observations/rewards).
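(For concreteness, this is visible in Hutter's standard definition, which, roughly, picks each action by maximizing expected total reward under the universal mixture, where $U$ is the universal monotone Turing machine, $\ell(q)$ is the length of program $q$, and $m$ is the horizon:

$$ a_t := \arg\max_{a_t} \sum_{o_t r_t} \cdots \max_{a_m} \sum_{o_m r_m} \bigl[ r_t + \cdots + r_m \bigr] \sum_{q \,:\, U(q,\, a_1 \ldots a_m) = o_1 r_1 \ldots o_m r_m} 2^{-\ell(q)} $$

Everything is conditioned on the raw interaction history, and the only goal-like object anywhere in the formula is the sum of future rewards.)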
Or it could decide to give me chocolate and then self-destruct. You’re privileging hypotheses without good reason.
You’re right, I forgot about the chocolate.
(Those were examples of changing the original algorithm, in particular lifting its limitations. I don't think the particular details are probable, any more than paperclips in particular being the convergent morality of all random AGIs; but starting from an AIXI engine, some kind of powerful intelligent outcome, as opposed to nothing happening, seems possible.)