I previously wrote a post about reconciling free will with determinism. The metaphysics implicit in Pearlian causality is free will (in Drescher’s words: “Pearl’s formalism models free will rather than mechanical choice”). The challenge is reconciling this metaphysics with the belief that one is physically embodied. That is what the post attempts to do; these perspectives aren’t inherently irreconcilable, we just have to be really careful about e.g. distinguishing “my action” from “the action of the computer embodying me” in the Bayes net, and distinguishing the interventions on each.
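That distinction between the two variables, and between conditioning and intervening, can be sketched with a toy causal model (the variable names here are mine, purely for illustration):

```python
import random

# Toy causal chain: Code -> Action -> Utility.
# "The action of the computer embodying me" is whatever Code outputs;
# intervening with Pearl's do-operator overrides that mechanism, which
# is one way to model "my action" as freely chosen.

def sample(do_action=None):
    code = random.choice(["take_5", "take_10"])        # the machine's programming
    action = code if do_action is None else do_action  # do() severs Code -> Action
    utility = 10 if action == "take_10" else 5
    return code, action, utility
```

Observationally, Action always equals Code; under do(Action = "take_10"), Code is left untouched, so the two can disagree. That is exactly why the two variables, and the interventions on them, need to be kept distinct in the net.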
I wrote another post about two alternatives to logical counterfactuals: one says counterfactuals don’t exist; the other says that your choice of policy should affect your anticipation of your own source code. (I notice you already commented on this post; just noting it for completeness.)
And a third post, similar to the first, reconciling free will with determinism using linear logic.
I’m interested in what you think of these posts and what feels unclear/unresolved; I might write a new explanation of the theoretical perspective, or improve/extend/modify it in response.
You’ve linked me to three different posts, so I’ll address them in separate comments.
Two Alternatives to Logical Counterfactuals
I actually really liked this post—enough that I changed my original upvote to a strong upvote. I also disagree with the notion that logical counterfactuals make sense when taken literally, so I really appreciated you making this point persuasively. I agreed with your criticisms of the material conditional approach, and I think policy-dependent source code is potentially promising. This naturally leads to the question of how to justify the approach, which raises questions like, “What exactly is a counterfactual?” and “Why exactly do we want such a notion?” I believe that following this path leads to the discovery that counterfactuals are circular.
I’m more open to saying that I adopt Counterfactual Non-Realism than I was when I originally commented, although I don’t see theories based on material conditionals as the only approach within this category. I’m also more enthusiastic about thinking in terms of policies rather than actions, mainly because of the lesson I drew from the Counterfactual Prisoner’s Dilemma. I don’t really know why I didn’t make this connection when I originally commented, since I had written that post only a few months prior, but I appear to have missed it.
I still feel that introducing the term “free will” is too loaded to be helpful here, regardless of whether you are or aren’t using it in a non-standard fashion. Like I’d encourage you to structure your posts to try to separate:
a) This is how we handle counterfactuals
b) These are the implications for the free will debate
A large part of this is because I suspect many people on Less Wrong are simply allergic to this term.
Thoughts on Modeling Naturalized Logic Decision Theory Problems in Linear Logic
I hadn’t heard of linear logic before—it seems like a cool formalisation—although I tend to believe that formalisations are overrated: unless they are used very carefully, they can obscure more than they reveal.
I believe that spurious counterfactuals are only an issue with the 5 and 10 problem because of an attempt to hack logical-if to substitute for counterfactual-if in such a way that we can reuse proof-based systems. It’s extremely cool that we can do as much as we can working in that fashion, but there’s no reason why we should be surprised that it runs into limits.
So I don’t see inventing alternative formalisations that avoid the 5 and 10 problem as particularly hard, as the bug is really quite specific to systems that rely on this kind of hack. I’d expect that almost any other system in design space will avoid it. So if, as I claim, attempts at formalisation avoid this issue by default, then the fact that any one formalisation avoids the problem shouldn’t give us much confidence that it is a good system for representing counterfactuals in general.
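To make the hack concrete: if the agent’s model contains only the one world where it takes the $5, then the material conditional “if I take $10 then U = x” is vacuously true for every x, and that vacuous truth is the spurious counterfactual. A minimal sketch, with a possible-worlds representation invented for illustration:

```python
# The agent's beliefs: a single deterministic world where it takes the $5.
worlds = [{"action": 5, "utility": 5}]

def material_if(antecedent, consequent):
    """Material conditional, evaluated over all believed worlds."""
    return all(consequent(w) for w in worlds if antecedent(w))

takes_10 = lambda w: w["action"] == 10

# No world satisfies the antecedent, so ANY consequent follows vacuously:
assert material_if(takes_10, lambda w: w["utility"] == 0)   # "take $10 -> get 0"
assert material_if(takes_10, lambda w: w["utility"] == 10)  # "take $10 -> get 10"
# A proof search that happens to find the first conditional concludes that
# taking the $5 is optimal, and that conclusion is self-confirming.
```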
Instead, I think it’s much more persuasive to ground any proposed system in philosophical arguments (as your first post did), rather than mostly just posting a system and observing that it has a few nice properties. Your approach in this article is certainly a valuable thing to do, but I don’t see it as getting all the way to the heart of the issue.
One way is by asserting that the logic is about the territory, while the proof system is about the map; so, counterfactuals are represented in the map, even though the map itself asserts that there is only a singular territory.
Interestingly enough, this mirrors my position in Why 1-boxing doesn’t imply backwards causation where I distinguish between Raw Reality (the territory) and Augmented Reality (the territory augmented by counterfactuals). I guess I put more emphasis on delving into the philosophical reasons for such a view and I think that’s what this post is a bit short on.
Thanks for reading all the posts!
I’m not sure where you got the idea that this was to solve the spurious counterfactuals problem; that was in the appendix because I anticipated that a MIRI-adjacent person would want to know how it handles that problem.
The core problem it’s solving is that it’s a well-defined mathematical framework in which (a) there are, in some sense, choices, and (b) it is believed that these choices correspond to the results of a particular Turing machine. It goes back to the free will vs determinism paradox, and shows that there’s a formalism that has some properties of “free will” and some properties of “determinism”.
A way that EDT fails to solve 5 and 10 is that it could believe with 100% certainty that it takes $5, so its expected value for $10 is undefined: it would be conditioning on a probability-zero event. (I wrote previously about a modification of EDT to avoid this problem.)
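As a sketch of that failure mode (using a toy joint-distribution representation of my own; the modified EDT mentioned above is not implemented here): EDT evaluates E[U | A = a], which divides by P(A = a), so certainty of taking the $5 leaves the value of the $10 undefined.

```python
def edt_value(joint, action):
    """E[U | A = action] for a joint distribution {(action, utility): probability}."""
    p_action = sum(p for (a, _), p in joint.items() if a == action)
    if p_action == 0:
        return None  # conditioning on a probability-zero event: undefined
    return sum(u * p for (a, u), p in joint.items() if a == action) / p_action

# The agent believes with certainty that it takes the $5:
joint = {(5, 5): 1.0, (10, 10): 0.0}

assert edt_value(joint, 5) == 5       # well-defined
assert edt_value(joint, 10) is None   # 0/0: no guidance about taking the $10
```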
CDT solves it by constructing physically impossible counterfactuals, which has other problems: e.g. suppose there’s a Laplace’s demon that searches for violations of physics and destroys the universe if physics is violated. This theoretically shouldn’t make a difference, but it messes up the CDT counterfactuals.
It does look like your post overall agrees with the view I presented. I would tend to call augmented reality “metaphysics” in that it is a piece of ontology that goes beyond physics. I wrote about metaphysical free will a while ago and didn’t post it on LW because I anticipated people would be allergic to the non-physicalist philosophical language.
I’m not sure where you got the idea that this was to solve the spurious counterfactuals problem, that was in the appendix because I anticipated that a MIRI-adjacent person would want to know how it solves that problem.
Thanks for that clarification.
A way that EDT fails to solve 5 and 10 is that it could believe with 100% certainty that it takes $5 so its expected value for $10 is undefined
I suppose that demonstrates that the 5 and 10 problem is a broader problem than I realised. I still think that it’s only a hard problem within particular systems that have a vulnerability to it.
It does look like your post overall agrees with the view I presented. I would tend to call augmented reality “metaphysics” in that it is a piece of ontology that goes beyond physics
Yeah, we have significant agreement, but I’m more conservative in my interpretations. I guess this is a result of me being, at least in my opinion, more skeptical of language. I’m very conscious of arguments where someone says, “X could be described by phrase Y”, and then later relies on connotations of Y that weren’t established.
For example, you write, “From the AI’s perspective, it has a choice among multiple actions, hence in a sense ‘believing in metaphysical free will’”. I would suggest it would be more accurate to write, “The AI models the situation as though it had free will”, which leaves open the possibility that this is just a pragmatic model, rather than the AI necessarily endorsing itself as possessing free will.
Another way of framing this: there’s an additional step between observing that an agent acts, or models a situation, as if it believes in free will, and concluding that it actually believes in free will. For example, I might round all the numbers in a calculation to integers to make it easier for myself, but that doesn’t mean I believe the values are integers.
Comments on A critical agential account of free will, causation, and physics
Consider the statement: “I will take action A”. An agent believing this statement may falsify it by taking any action B not equal to A. Therefore, this statement does not hold as a law. It may be falsified at will.
We can imagine a situation where there is a box containing an apple or a pear. Suppose we believe that it contains a pear, but it actually contains an apple. If we look in the box (and we have good reason to believe looking doesn’t change the contents), then we’ll falsify our pear hypothesis. Similarly, if we’re told by an oracle that if we looked we would see an apple, then there’d be no need for us to actually look; we’d have heard enough to falsify our pear hypothesis.
However, the situation you’ve identified isn’t the same. Here you aren’t just deciding whether to make an observation, but what the value of that observation would be. So the fact that if you took action B you’d observe that the action you took was B says nothing about the case where you don’t take action B, whereas knowing that if you looked in the box you’d see an apple provides you information even if you don’t look. It simply isn’t relevant unless you actually take B.
Interestingly, falsificationism takes agency (in terms of observations, computation, and action) as more basic than physics. For a thing to be falsifiable, it must be able to be falsified by some agent, seeing some observation. And the word able implies freedom.
I think it’s reasonable to suggest starting from falsification as our most basic assumption. Where you lose me is when you claim that this implies agency. My position is as follows:
It seems like agents in a deterministic universe can falsify theories in at least some sense. For example, they take two different weights, drop them, and see that they land at the same time, falsifying the claim that heavier objects fall faster.
On the other hand, something like agency or counterfactuals seems necessary for talking about falsifiability in the abstract, as this involves saying that we could falsify a theory if we ran an experiment that we didn’t actually run.
In the second case, I would suggest that what we need is counterfactuals, not agency. That is, we need to be able to say things like, “If I ran this experiment and obtained this result, then theory X would be falsified”, not “I could have run this experiment, and if I did and obtained this result, then theory X would be falsified”.
In other words, I think there is something behind the intuition which I’m guessing led you to these views, but I’m in favour of developing it in a different direction than you did.
I didn’t read past this point, not because I thought it was uninteresting, but because it already took me a while to figure out how to articulate my objections to the article up to this point and I still have to look at one of your posts. But let me know if there’s anything further down more directly related to whether counterfactuals are circular.
It seems like agents in a deterministic universe can falsify theories in at least some sense. For example, they take two different weights, drop them, and see that they land at the same time, falsifying the claim that heavier objects fall faster.
The main problem is that it isn’t meaningful for their theories to make counterfactual predictions about a single situation; they can create multiple situations (across time and space), assume symmetry, and get falsification that way, but it requires extra assumptions. Basically, you can’t say different theories really disagree unless there’s some possible world / counterfactual / whatever in which they disagree; finding a “crux” experiment between two theories (e.g. if one theory says all swans are white and another says there are black swans in a specific lake, the cruxy experiment looks in that lake) involves making choices that optimize disagreement.
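The crux-finding step can be sketched as a search for an experiment on which the two theories’ predictions differ (the swan example, with invented lake names):

```python
# Each theory maps an experiment (which lake to inspect) to a predicted observation.
all_white = lambda lake: "white swans only"
black_in_lake_x = lambda lake: "black swans" if lake == "lake_x" else "white swans only"

def find_crux(theory_a, theory_b, experiments):
    """Return an experiment where the theories' predictions disagree, if any."""
    for e in experiments:
        if theory_a(e) != theory_b(e):
            return e
    return None  # the theories agree on every available observation

lakes = ["lake_w", "lake_x", "lake_y"]
assert find_crux(all_white, black_in_lake_x, lakes) == "lake_x"
# Picking which lake to look in is the choice that "optimizes disagreement".
```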
In the second case, I would suggest that what we need is counterfactuals, not agency. That is, we need to be able to say things like, “If I ran this experiment and obtained this result, then theory X would be falsified”, not “I could have run this experiment, and if I did and obtained this result, then theory X would be falsified”.
Those seem pretty much equivalent? Maybe by agency you mean utility function optimization, which I didn’t mean to imply was required.
The part I thought was relevant was the part where you can believe yourself to have multiple options and yet be implemented by a specific computer.
Basically you can’t say different theories really disagree unless there’s some possible world / counterfactual / whatever in which they disagree;
Agreed; this is yet another argument for considering counterfactuals to be so fundamental that they don’t make sense outside of themselves. I just don’t see this as incompatible with determinism, because I’m grounding this in counterfactuals rather than agency.
Those seem pretty much equivalent? Maybe by agency you mean utility function optimization, which I didn’t mean to imply was required.
I don’t mean utility function optimization, so let me clarify what I see as the distinction. I see my version as compatible with the determinist claim that you couldn’t have run the experiment because the path of the universe was determined from the start: I’m referring to a purely hypothetical running, with no reference to whether you could or couldn’t have actually run it.
Hopefully, my comments here have made it clear where we diverge, and this provides a target if you want to make a submission. (That said, the contest is about the potential circular dependency of counterfactuals, not just my views, so it’s perfectly valid for people to focus on other arguments for this hypothesis rather than my specific ones.)