In other words, the more negative utilons the agent threatens to inflict, the lower my probability that he can actually do it becomes. Thus, the expected value should converge to zero.
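To make that convergence claim concrete, here is a minimal sketch with made-up numbers (the prior and the per-unit discount are assumptions, not estimates of anything): if each additional unit of threatened harm cuts my credence by some constant factor, the geometric decay in credence outruns the linear growth of the threat.

```python
# Illustrative only: the prior and the per-unit discount are assumed numbers,
# chosen to show the shape of the argument rather than to model the scenario.
prior = 1e-6      # credence that the agent can inflict one unit of harm
discount = 0.5    # extra factor paid for each additional unit of harm claimed

for n in [1, 10, 100, 1000]:
    credence = prior * discount ** n
    expected_harm = n * credence  # magnitude of the negative expected value
    print(n, expected_harm)

# n * discount**n -> 0 as n grows: the decay in credence outpaces the linear
# growth in the claimed harm, so the expected value converges to zero.
```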
The basic idea is that “a compactly specified wager can grow in size much faster than it grows in complexity.”
We might be talking about different things here, I’m not sure. In his original post, Eliezer seems to be concerned with agents who evaluate beliefs solely (or perhaps just primarily) in terms of their algorithmic complexity. Such agents would not be very smart, though; at least, not instrumentally. There are lots of simple beliefs that are nonetheless wrong, e.g. “the Earth is flat”, or “atoms are perfectly circular, just like planetary orbits”.
I thought we were concerned with agents who evaluate beliefs by looking not only at their complexity, but also at the available evidence and at their projected effects on the world if the beliefs in question were true. Such agents would be closer to our human scientists than to pure philosophers.
The problem with Pascal’s Mugging, then, is not only that the initial probability of the agent’s claim being true is low, but that it gets even lower with each additional person he claims to be able to punch in the face. Even if we grant that we’re living in the Matrix (just for example), with each additional punch-victim, we must grant that:
The Matrix hardware can support this victim, in addition to all the others it’s already supporting,
The agent has the power and skill to instantiate this victim correctly,
The agent has the power and skill to punch him in the face,
etc.
Merely believing that we live in the Matrix is not enough; we must also believe that the agent has the power to do what he claims to be able to do. With each victim he claims to be able to affect, his burden of proof grows larger and larger, and the expected disutility of his threat shrinks further toward zero.
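As a rough sketch of that burden-of-proof argument (the individual probabilities below are purely illustrative assumptions): each claimed victim multiplies the credence by the conjunction of the conditions listed above, so the product shrinks geometrically even while the claimed harm only grows linearly.

```python
# Assumed, illustrative probabilities for the per-victim conditions above.
P_MATRIX = 1e-6        # we are living in the Matrix at all
P_HARDWARE = 0.9       # the hardware can support one more victim
P_INSTANTIATE = 0.9    # the agent can instantiate that victim correctly
P_PUNCH = 0.9          # the agent can actually deliver the punch

def expected_disutility(n_victims):
    """Claimed harm times the credence that the agent can deliver it."""
    per_victim = P_HARDWARE * P_INSTANTIATE * P_PUNCH
    credence = P_MATRIX * per_victim ** n_victims
    return n_victims * credence

for n in [1, 10, 100, 1000]:
    print(n, expected_disutility(n))

# Even with generous per-condition numbers, the conjunction decays
# geometrically, so the expected disutility of the threat shrinks toward zero.
```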