(Thanks for the post btw! My comments on this comment below.)
This PrudentBot def does feel in-spirit to me. I also agree with your analysis that it doesn’t fall for the pitfall of “I assumed in my hypothetical that you thought I would unconditionally cooperate with you, but then no fair you defected on me in my hypothetical! I defect!”
For instance this happens if you try, as I was inclined to try, . This quickly becomes “false” i.e. “defect” if you have a lying around.
With your definition, we have , and assuming we have a around I think that simplifies to . (I’m not sure I have that right!)
That’s… better behaved! It will run into the same Lobian difficulties as usual if we directly translate this as a provability bot. So imo we still need some better answer as to how we should interpret these as programs that more faithfully represents what we want in terms of their reasoning.
Also fwiw, I run with the model that the simulated opponent has access to both a description of your full behaviour and a shortcut proof that you cooperate with just them. (In the cooperation branch hypothetical.) So when your simulations have the cooperation proof in scope, you’re just saying that the behaviour is equivalent to CooperateBot in this very matchup. And it happens that this is often a sufficient condition for cooperation. (And if your opponent runs you on a different bot like DefectBot, the cooperation proof doesn’t apply there.)
So there’s some question here like “well what’s the point of simulating under the assumption of the narrow cooperation proof if we’re keeping the full description as a fallback?”
Part of what your post points at is an answer which is that, well, it works out more cleanly if you do! Specifically that reasoning is easier if, for the purposes of deciding to cooperate, you assume the decision is made and check that it’s stable/good.
As for why that is the case, I have a more philosophical take that it’s about encoding choice, and the better mechanics we see are downstream of encoding this better. I’ll go into more detail in this collapsed section (you may well wish to gather your own thoughts first).
on encoding choice
We can think of the bot as containing some mechanisms that lead to certain actions, with conditions for firing.
The FairBot variant described in this post is a simple instance, with a default action of defection and a single mechanism that leads to coooperation when it fires.
I have some idea that the choice-y mechanisms are of the form “if I activate, will the outcome satisfy some property? if so I activate”.
So for a given mechanism we can have a widget with a goal target . We want a way of saying “if were to fire this would lead to ” as our condition for . And there’s a question of how to encode this in a program or in our modal logic setting.
Straight up trying won’t work, since and you won’t be able to form a statement like this. This was omitting the part where is using some model of consequences to make its choice.
So next up is , in which fires if our modelling shows that if fires this will lead to . You can in fact show that here!
The other option is ; this doesn’t assume ” fires” but rather “someone has a model that fires”, and is actually a bit more brittle. The reason this form can be preferable in the modal logic bots is that you are sidestepping the complexity of showing that you will need to check your opponent can tell that has fired. But I think this may be a bit of a hack relative to the version.
My reasoning here suggests for that if you have multiple conditions for action, you may factor out // that are each self-referential, and build your high level agent as trying each in sequence. I haven’t fully cached out yet what this looks like, and whether it e.g. gives you a meaningfully different PrudentBot.
I note that this picture doesn’t provide a convenient answer for the problem where in provability land your action “defect” may cause someone to need to prove . But as I guess I keep mentioning I view this as a bug in the provability encoding.
(Thanks for the post btw! My comments on this comment below.)
This PrudentBot def does feel in-spirit to me. I also agree with your analysis that it doesn’t fall for the pitfall of “I assumed in my hypothetical that you thought I would unconditionally cooperate with you, but then no fair you defected on me in my hypothetical! I defect!”
For instance this happens if you try, as I was inclined to try, . This quickly becomes “false” i.e. “defect” if you have a lying around.
With your definition, we have , and assuming we have a around I think that simplifies to . (I’m not sure I have that right!)
That’s… better behaved! It will run into the same Lobian difficulties as usual if we directly translate this as a provability bot. So imo we still need some better answer as to how we should interpret these as programs that more faithfully represents what we want in terms of their reasoning.
Also fwiw, I run with the model that the simulated opponent has access to both a description of your full behaviour and a shortcut proof that you cooperate with just them. (In the cooperation branch hypothetical.) So when your simulations have the cooperation proof in scope, you’re just saying that the behaviour is equivalent to CooperateBot in this very matchup. And it happens that this is often a sufficient condition for cooperation. (And if your opponent runs you on a different bot like DefectBot, the cooperation proof doesn’t apply there.)
So there’s some question here like “well what’s the point of simulating under the assumption of the narrow cooperation proof if we’re keeping the full description as a fallback?”
Part of what your post points at is an answer which is that, well, it works out more cleanly if you do! Specifically that reasoning is easier if, for the purposes of deciding to cooperate, you assume the decision is made and check that it’s stable/good.
As for why that is the case, I have a more philosophical take that it’s about encoding choice, and the better mechanics we see are downstream of encoding this better. I’ll go into more detail in this collapsed section (you may well wish to gather your own thoughts first).
on encoding choice
We can think of the bot as containing some mechanisms that lead to certain actions, with conditions for firing.
The FairBot variant described in this post is a simple instance, with a default action of defection and a single mechanism that leads to coooperation when it fires.
I have some idea that the choice-y mechanisms are of the form “if I activate, will the outcome satisfy some property? if so I activate”.
So for a given mechanism we can have a widget with a goal target . We want a way of saying “if were to fire this would lead to ” as our condition for . And there’s a question of how to encode this in a program or in our modal logic setting.
Straight up trying won’t work, since and you won’t be able to form a statement like this. This was omitting the part where is using some model of consequences to make its choice.
So next up is , in which fires if our modelling shows that if fires this will lead to . You can in fact show that here!
The other option is ; this doesn’t assume ” fires” but rather “someone has a model that fires”, and is actually a bit more brittle. The reason this form can be preferable in the modal logic bots is that you are sidestepping the complexity of showing that you will need to check your opponent can tell that has fired. But I think this may be a bit of a hack relative to the version.
My reasoning here suggests for that if you have multiple conditions for action, you may factor out // that are each self-referential, and build your high level agent as trying each in sequence. I haven’t fully cached out yet what this looks like, and whether it e.g. gives you a meaningfully different PrudentBot.
I note that this picture doesn’t provide a convenient answer for the problem where in provability land your action “defect” may cause someone to need to prove . But as I guess I keep mentioning I view this as a bug in the provability encoding.