Hm, I’m not sure this problem comes up.
Say I’ve built a room-tidying robot, and I want to measure its optimisation power. The room can be in two states: tidy or untidy. A natural choice of default distribution p is my beliefs about how tidy the room will be if I don’t put the robot in it. Let’s assume I’m pretty knowledgeable and I’m extremely confident that in that case the room will be untidy: p(untidy)=2047/2048 and p(tidy)=1/2048 (we do have to avoid probabilities of 0, but that’s standard in a Bayesian context). But really I do put the robot in and it gets the room tidy, for an optimisation power of −log₂(1/2048) = 11 bits.
That 11 bits doesn’t come from any uncertainty on my part about the optimisation process, although it does depend on my uncertainty about what would happen in the counterfactual world where I don’t put the robot in the room. But becoming more confident that the room would be untidy in that world makes me see the robot as more of an optimiser.
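As a quick check of the arithmetic, here is a minimal Python sketch of the two-state example. It assumes optimisation power is just the negative base-2 log of the default probability of the outcome that actually happened, which is all the room example needs. The function name `optimisation_power` is made up for illustration.

```python
import math

def optimisation_power(p_default: dict, outcome: str) -> float:
    """Bits of optimisation: -log2 of the default probability of the actual outcome."""
    return -math.log2(p_default[outcome])

# Default distribution: my beliefs about the room with no robot in it.
p = {"untidy": 2047 / 2048, "tidy": 1 / 2048}
bits = optimisation_power(p, "tidy")  # the robot actually tidies the room
print(bits)

# Becoming more confident that the room would stay untidy without the robot
# raises the measured optimisation power for the same actual outcome.
for k in (4, 11, 20):
    p_k = {"untidy": 1 - 2**-k, "tidy": 2**-k}
    print(k, optimisation_power(p_k, "tidy"))
```

Note that nothing in the calculation refers to my uncertainty about the robot itself; only the counterfactual distribution enters.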
Unlike in information theory, these bits aren’t measuring a resolution of uncertainty, but a difference between the world and a counterfactual.
I don’t see the difference between “resolution of uncertainty” and “difference between the world and a counterfactual.” To my mind, resolution of uncertainty is reducing the space of counterfactuals, e.g., if I’m not sure whether you’ll say yes or no, then you saying “yes” reduces my uncertainty by one bit, because there were two counterfactuals.
I think what Garrett is gesturing at here is more like “There is just one way the world goes, the robot cleans the room or it doesn’t. If I had all the information about the world, I would see the robot does clean the room, i.e., I would have no uncertainty about this, and therefore there is no relevant counterfactual. It’s not as if the robot could have not cleaned the room, I know it doesn’t. In other words, as I gain information about the world, the distance between counterfactual worlds and actual worlds grows smaller, and then so does… the optimization power? That’s weird.”
Like, we want to talk about optimization power here as “moving the world more into your preference ordering, relative to some baseline” but the baseline is made out of counterfactuals, and those live in the mind. So we end up saying something in the vicinity of optimization power being a function of maps, which seems weird to me.
The above formulas rely on comparing the actual world to a fixed counterfactual baseline. Gaining more information about the actual world might make the distance between the counterfactual baseline and the actual world grow smaller, but it also might make it grow bigger, so it’s not the case that the optimisation power goes to zero as my uncertainty about the world decreases. You can play with the formulas and see.
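One way to play with this, as a rough sketch: hold the counterfactual baseline p fixed and vary my belief q about what actually happens with the robot in the room. The expected-bits formula used here (the sum over outcomes of q(x) times −log₂ p(x)) is an assumption chosen for illustration and is not necessarily one of the formulas referred to above; `expected_bits` and the variable names are hypothetical.

```python
import math

# Fixed counterfactual baseline: what I expect with no robot in the room.
p = {"untidy": 2047 / 2048, "tidy": 1 / 2048}

def expected_bits(q: dict, p: dict) -> float:
    """Expected -log2 p(x) when the actual outcome x is drawn from q."""
    return sum(q[x] * -math.log2(p[x]) for x in q if q[x] > 0)

# q is my belief about the actual world; certainty increases down each list.
for q_tidy in (0.5, 0.9, 0.999, 1.0):
    q = {"tidy": q_tidy, "untidy": 1 - q_tidy}
    print(f"sure of tidy   {q_tidy}: {expected_bits(q, p):.4f} bits")

for q_untidy in (0.5, 0.9, 0.999, 1.0):
    q = {"untidy": q_untidy, "tidy": 1 - q_untidy}
    print(f"sure of untidy {q_untidy}: {expected_bits(q, p):.4f} bits")
```

Under these assumptions, growing certainty that the room ends up tidy pushes the figure up toward 11 bits, while growing certainty that it ends up untidy pushes it down toward zero. Learning more about the actual world can move the number in either direction.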
But maybe your objection is not so much that the formulas actually spit out zero, but that if I become very confident about what the world is like, it stops being coherent to imagine it being different? This would be a general argument against using counterfactuals to define anything. I’m not convinced of it, but if you like you can purge all talk of imagining the world being different, and just say that measuring optimisation power requires a controlled experiment: set up the messy room, record what happens when you put the robot in it, set the room up the same, and record what happens with no robot.
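The controlled-experiment version can be sketched in a few lines. Everything here is hypothetical: the run counts are invented for illustration, and add-one smoothing is just one assumed way of keeping the estimated baseline away from probabilities of 0.

```python
import math
from collections import Counter

STATES = ("tidy", "untidy")

def estimate_baseline(control_runs: list) -> dict:
    """Estimate the default distribution from no-robot runs, with add-one smoothing."""
    counts = Counter(control_runs)
    total = len(control_runs) + len(STATES)
    return {s: (counts[s] + 1) / total for s in STATES}

# Invented data: 2046 no-robot runs, all ending untidy; one run with the robot.
control_runs = ["untidy"] * 2046
robot_outcome = "tidy"

p = estimate_baseline(control_runs)
bits = -math.log2(p[robot_outcome])
print(p, bits)
```

Here the baseline comes from recorded control runs, not from imagining the world being different, and the number of bits depends on how many control runs were collected.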