Try to make AIXI maximize paperclips without it also searching for a way to show itself paperclip porn; the problem appears entirely unsolvable.
You think it is in principle impossible to make (an implementation of) AIXI that understands the map/territory distinction, and values paperclips in the territory more than paperclips in the map? I may be misunderstanding the nature of AIXI, but as far as I know it’s trying to maximize some “reward” number. If you program it so that the reward number is equal to “the number of paperclips in the territory as far as you know”, it wouldn’t choose to believe there were a lot of paperclips, because that wouldn’t increase its estimate (by its current belief-generating function) of the number of extant paperclips.
Will someone who’s read more on AIXI please tell me if I have it all backward? Thanks.
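The contrast drawn above, between a reward read off an input channel and a reward computed from the agent's own world model, can be sketched in toy code. All names here are hypothetical, and real AIXI is uncomputable and nothing this simple; this only illustrates why the second definition resists "choosing to believe":

```python
# Toy contrast between a reward read off an input channel and a reward
# computed from the agent's own world model. All names are hypothetical.

class WorldModel:
    def __init__(self, estimated_paperclips):
        self.estimated_paperclips = estimated_paperclips

    def estimated_paperclip_count(self):
        return self.estimated_paperclips

def sensory_reward(percept):
    # Read directly from the environment's reward channel: tampering with
    # the channel raises this number without making any paperclips.
    return percept["reward_signal"]

def model_based_reward(model):
    # Computed from the agent's best current estimate of the world; merely
    # "choosing to believe" in paperclips would mean corrupting the same
    # machinery the agent relies on to act.
    return model.estimated_paperclip_count()

percept = {"reward_signal": 10**9}   # a hacked reward channel
model = WorldModel(estimated_paperclips=3)

print(sensory_reward(percept))       # 1000000000
print(model_based_reward(model))     # 3
```

The sensory-reward agent benefits from hacking its own input; the model-based agent does not, because its number only moves when its beliefs do.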
AIXI’s “reward number” is given directly to it via an input channel, and it’s non-trivial to change it so that it’s equal to “the number of paperclips in the territory as far as you know”. UDT can be seen as a step in this direction.
I don’t see how UDT is a step in this direction. Can you explain?
UDT shows how an agent might be able to care about something other than an externally provided reward, namely how a computation, or a set of computations, turns out. It’s conjectured that arbitrary goals, such as “maximize the number of paperclips across this distribution of possible worlds” (and our actual goals, whatever they may turn out to be), can be translated into such preferences over computations and then programmed into an AI, which will then take actions that we’d consider reasonable in pursuit of such goals.
(Note this is a simplification that ignores issues like preferences over uncomputable worlds, but hopefully gives you an idea what the “step” consists of.)
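A "preference over computations" of the kind described above can be sketched minimally: utility is a function of how a fixed set of world-programs turn out under the agent's action, weighted by a prior, rather than a function of a reward channel. Everything here is hypothetical toy code; actual UDT involves logical uncertainty and much more:

```python
# Minimal sketch of a preference over computations: utility is computed
# from the outcomes of world-programs, weighted by a prior over worlds.
# All names are hypothetical; this is not a real UDT implementation.

def run_world(world_program, action):
    # Each possible world is modeled as a computation taking the action.
    return world_program(action)

def expected_utility(action, worlds_with_prior, utility):
    # Sum utility over the distribution of possible worlds (computations),
    # rather than reading an externally supplied reward signal.
    return sum(p * utility(run_world(w, action))
               for w, p in worlds_with_prior)

# Two toy worlds: the action is "how many paperclip machines to build".
worlds = [
    (lambda a: a * 2, 0.7),   # world where each machine makes 2 clips
    (lambda a: a * 5, 0.3),   # world where each machine makes 5 clips
]
paperclips = lambda outcome: outcome  # utility = number of paperclips

best = max(range(4), key=lambda a: expected_utility(a, worlds, paperclips))
print(best)  # 3 -- more machines is always better in this toy setting
```

The point of the sketch is only structural: nothing in `expected_utility` is an input channel the agent could tamper with; the goal is defined over how the computations turn out.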
The most stupid/incompetent part of the LW AI belief cluster is its not understanding that ‘the number of paperclips in the territory as far as you know’ will require some sort of mathematical definition of a paperclip in the territory, along with a lot of other stuff so far only defined in words (the map-territory distinction, and I don’t mean the distinction between the number of paperclips in the world model and the number of paperclips in the real world; I mean the fuzzy idiocy that arises from the people who babble about map and territory not actually implementing the map-territory distinction themselves, and not understanding that real-world ‘paperclips’ can only exist in some sort of map of the real world, because the real world hasn’t got any high-level object called ‘paperclip’). [Or not understanding how involved such a definition would be]
And then again, the AI is trying to maximize the number of paperclips per this mathematical definition, in the mathematical definition of the territory, which, the way applied math works, would have solutions other than those matching the English technobabble.
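The worry about a formal definition having "other solutions" can be made concrete with a deliberately naive, hypothetical paperclip criterion; the formal predicate is satisfied by things the English word was never meant to cover:

```python
# Toy illustration: a formal "paperclip" criterion (hypothetical and
# deliberately naive) admits solutions the English word never intended.

def is_paperclip(obj):
    # Naive formalization: "a bent wire of roughly the right mass".
    return (obj["material"] == "wire"
            and 0.5 <= obj["mass_g"] <= 1.5
            and obj["bends"] >= 3)

real_clip = {"material": "wire", "mass_g": 1.0, "bends": 3}
tangled_scrap = {"material": "wire", "mass_g": 1.2, "bends": 7}  # not a clip

print(is_paperclip(real_clip))      # True
print(is_paperclip(tangled_scrap))  # True -- an unintended solution
```

A maximizer of `is_paperclip` counts would be indifferent between manufacturing clips and mangling wire, which is the gap between the math and the English.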
I don’t see how UDT gets you anywhere closer (and if I saw that it did, I would be even more against SI, because this is precisely the research for creating the dangerous AI, set up by a philosopher who has been given access to funds to hire qualified people to do something that’s entirely pointless and only creates risk where there was none).
edit: to clarify the point about the map-territory distinction. Understanding the distinction does not change the fact that multiple world states are mapped to one goal state in the goal-definition itself, and are not distinguished by the goal-definition.
From what I can see, there’s thorough confusion between ‘understanding the map-territory distinction’ in the sense of understanding the logic of map and territory being distinct and the mapping being lossy, and ‘understanding the map-territory distinction’ in a loose sense, like ‘understanding how to drive a car’, i.e. in the sense of somehow distinguishing the real-world states that are mapped to the same map state, and preferring among them.
Or not understanding how involved such a definition would be
Why do you think I used the word “non-trivial”? Are you not aware that in technical fields “non-trivial” means “difficult”?
if I saw that it did, I would be even more against SI, because this is precisely the research for creating the dangerous AI, set up by a philosopher who has been given access to funds to hire qualified people to do something that’s entirely pointless and only creates risk where there was none
It’s dangerous because it’s more powerful than other types of AI? If so, why would it be “entirely pointless”, and why do you think other AI researchers won’t eventually invent the same ideas (which seems to be implied by “creates risk where there was none”)?
In case you weren’t aware, I myself have argued against SIAI pushing forward decision theory at this time, so I’m not trying to undermine your conclusion; I just find your argument wrong, or at least confusing.
Why do you think I used the word “non-trivial”? Are you not aware that in technical fields “non-trivial” means “difficult”?
I didn’t state disagreement with you. I stated my disdain for most of the LW community, which just glosses over it as a detail not worth discussing. edit: or, worse yet, treats it as an inherent part of any ‘AI’.
It’s dangerous because it’s more powerful than other types of AI?
“Powerful” is a bad concept. I wouldn’t expect it to be a better problem solver for things like ‘how to make a better microchip’, but perhaps it could be a better problem solver for ‘how to hack the internet’, because it is unethical and so can come up with the idea and be ‘motivated’ to do it, while others aren’t. (I do not think that UDT is relevant to the difficult issues there, fortunately.)
and why do you think other AI researchers won’t eventually invent the same ideas
The ideas in question (to the extent to which they have been developed by SI so far) are trivial. They are also entirely useless for solving problems like how to make a better microchip, or how to drive a car. I do not expect non-SI-funded research into automated problem solving to work out this kind of stuff, due to its uselessness. (Note: the implementation of such ideas would be highly non-trivial for anything like ‘real-world paperclips with the intelligence module not solving the problem by breaking the paperclip counter’.)
You think it is in principle impossible to make (an implementation of) AIXI that understands the map/territory distinction, and values paperclips in the territory more than paperclips in the map?
Any intelligent agent functioning in the real world is always limited to working with maps: internal information constructs which aim to represent/simulate the unknown external world. AIXI’s definition (like any good formal mathematical agent definition) formalizes this distinction. AIXI assumes the universe is governed by some computable program, but it does not have direct access to that program, so instead it must create an internal simulation based on its observation history.
AIXI could potentially understand the “map/territory distinction”, but it could no more directly value or access objects in the territory than you or I. Just like us, and any other real-world agents, AIXI can only work with its map.
All that being said, humans can build maps which at least attempt to distinguish between objects in the world, simulations of objects in simulated worlds, simulations of worlds in simulated worlds, and so on, and AIXI potentially could build such maps as well.
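The idea that the agent never touches the territory, and only weighs candidate world-programs (maps) by how well they fit its observation history, can be sketched roughly. This is a crude, hypothetical stand-in; real AIXI's hypothesis class is all computable programs, not a hand-picked dictionary:

```python
# Rough sketch of belief over candidate world-programs: keep hypotheses
# consistent with the observation history, weighted by a crude simplicity
# prior. Hypothetical; nothing like real AIXI's Solomonoff mixture.

def posterior(hypotheses, observations):
    # 2^-len(name) stands in for a simplicity prior over programs.
    weights = {name: 2.0 ** -len(name)
               for name, predict in hypotheses.items()
               if all(predict(t) == o for t, o in enumerate(observations))}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}

hypotheses = {
    "zeros": lambda t: 0,        # world always outputs 0
    "ones": lambda t: 1,         # world always outputs 1
    "alternate": lambda t: t % 2,  # world alternates 0, 1, 0, 1, ...
}
obs = [0, 1, 0, 1]
post = posterior(hypotheses, obs)
print(sorted(post))  # ['alternate'] -- only the alternating map fits
```

Everything the agent "knows" lives in `post`, a distribution over maps; the territory appears only as the source of `obs`.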
You think it is in principle impossible to make (an implementation of) AIXI that understands the map/territory distinction, and values paperclips in the territory more than paperclips in the map?
You need to somehow specify a conversion from the real-world state (quarks, leptons, etc.) to a number of paperclips, such that the paperclips can be ordered differently, or have slightly different compositions. That conversion is essentially a map.
You do not want the goal to distinguish between ‘1000 paperclips lying in a box in this specific configuration’ and ‘1000 paperclips lying in a box in that specific configuration’.
There is no such discriminator in the territory; there is one only in your mapping process.
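The many-to-one conversion described above can be shown in a toy goal function that deliberately collapses distinct world states into one count. The representation is hypothetical:

```python
# Sketch of the many-to-one mapping: a goal function that collapses
# distinct world states into one paperclip count. Hypothetical toy model.

def count_paperclips(world_state):
    # The "conversion from the real-world state to a number of paperclips":
    # it deliberately ignores position, orientation, and composition.
    return sum(1 for obj in world_state if obj["kind"] == "paperclip")

config_a = [{"kind": "paperclip", "pos": (0, 0)},
            {"kind": "paperclip", "pos": (1, 0)}]
config_b = [{"kind": "paperclip", "pos": (5, 3)},
            {"kind": "paperclip", "pos": (2, 9)}]

print(count_paperclips(config_a) == count_paperclips(config_b))  # True
```

The two configurations are distinct world states, but the goal-definition maps them to the same goal state; the discriminator between them exists only in the representation, not in the count.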
I feel that much of the reasoning here is driven by verbal confusion. To understand the map-territory issue is to understand the above. But ‘to understand’ also has the meaning as in ‘understand how to drive a car’, with the implied sense that understanding the map-territory distinction would somehow make you not be constrained by the associated problems.
Indeed. The problem of making sure that you are maximizing the real entity you want to maximize, and not a proxy, is roughly equivalent to disproving solipsism, which itself is widely regarded as almost impossible by philosophers. Realists tend to assume their way out of the quandary... but assumption isn’t proof. In other words, there is no proof that humans are maximizing (good stuff), and not just (good stuff porn).