I recently wrote a piece defending free will from Sam Harris’s critique. The core argument was compatibilist: that free will and determinism are compatible, and that free will is best understood as a deliberative algorithm, an agent weighing options and selecting among them based on its own reasons and values. The causal chain traces back through factors you didn’t choose, but the algorithm is still yours. I also argued that because no one has ultimate authorship, retributive punishment and hatred don’t make sense—accountability should be forward-looking.
I’ve thought about it some more, and I think some of it holds up. But I also let myself off easy in a few places and didn’t give certain counterarguments sufficient weight, so here’s a response to myself.
It’s Not All About How People Use the Word
Your arguments about how people use the term “free will” are only semi-convincing. You leaned hard on common usage: people feel like they make choices, they report evaluating options, so free will is real in the sense that matters. Fine. That’s a valid argument (I mean, I did write it), but it proves less than you made it sound.
People can be wrong about what they’re experiencing. Someone who genuinely believes she’s psychic isn’t made psychic by the sincerity of the belief. We don’t say “well, that’s how she uses the word, so she really is seeing the future.” We say she’s having some experience (fooling herself into thinking she can predict the future) and mislabeling it (as really being psychic).
My point isn’t that free willers are making a falsifiable empirical claim the way the psychic is. It’s that sincerely experiencing something doesn’t settle what that something is. Yes, people have a feeling of control when they deliberate. Yes, they describe it using the language of free will. But maybe they’re doing exactly what the psychic is doing—having a real experience and attaching the wrong label to it. The feeling of choosing is real. Whether that feeling is free will, or just something that gets mistaken for it, is a question you skip when you focus on language usage.
You’re Not Taking Determinism Seriously
If people really grokked what it means to have a determined future, I don’t think they would say they have free will. Here’s an experiment: Take some young college freshman. He’s wide-eyed, idealistic, convinced his life could go in a thousand directions. Maybe he’ll start a company. Maybe he’ll move to Japan. Maybe he’ll drop out and write a novel. The world feels open to him, and that openness feels like freedom.
Now sit him down on his college dorm twin mattress and show him photographs from his future. Here’s your graduation—you stuck with accounting. Here’s your first job at Intuit. Here’s your wife, Sarah. Here are your two kids, Ethan and Ethel. Here’s your house in Plano, Texas. You’ll be mostly happy, by the way. It’s a good life.
Ask him how much free will he’s feeling.[1]
The feeling of open possibilities that excited him five minutes ago will have been replaced by something closer to watching a movie he happens to star in. He’s going through the motions of a life that was always going to happen. But his future is no more determined than it was five minutes ago. The only change is his level of ignorance about it.
Now, you might reply with my own earlier point: a real experience can get the wrong label. Just because the feeling of free will requires ignorance doesn’t mean that free will itself dissolves without it.
Fair enough, but if the feeling of freedom is completely independent of whether you’re actually free, then the compatibilist has to admit that the entire phenomenology of choice—the thing that makes free will matter to people—is essentially a byproduct of ignorance. You can keep calling the underlying causal process “free will” if you like, but you’ve just acknowledged that a lot of what people actually value about the experience is an illusion. That’s a strange place for a defense of free will to land.
So if you’re going to rely on people’s intuitions, you’ve got to admit that you’re relying on their intuitions while in a state of ignorance. The feeling of free will tracks with how much you know about your future actions, not with how free you are. Give someone total knowledge of their future and the feeling disappears, even though the causal structure of the universe hasn’t changed at all. I’m not saying the whole algorithm collapses without ignorance, but certainly the feeling of it does.
You Haven’t Closed the Door on Harsh Punishment
In your original piece, you treated the case against harsh punishment as largely settled—if no one has ultimate authorship, retributive justifications simply don’t hold. But this is not so. Even if we grant your premises, some moral frameworks can still justify harsh punishment without relying on retributivism at all.
For example, a utilitarian might argue that when someone murders a publicly beloved figure, the community’s need to see a severe response is itself a real psychological good. The justification here is forward-looking—perhaps people heal better and can move on when they feel justice has been done. The punishment looks retributive, but the reasoning isn’t.
Of course, satisfying this desire could still be net bad for society. Historically speaking, societies that have fed the public’s appetite for punishment through public executions and the like have generally seemed like worse places to live.[2] So the argument might still ultimately fail on utilitarian grounds. But notice that now you’re having an empirical disagreement, so it’s clearly not a settled matter.
My point is only that the absence of ultimate authorship doesn’t close the door on harsh punishment under every moral framework. It eliminates retributive justifications, but that’s just one argument against harsh punishment, not the last word.
Where Is The Love?
In your previous piece, you spent a lot of time on hatred. If determinism is true, you argued, we don’t have to hate people for their worst acts. We can still hold them accountable, still protect society, still call behavior “wrong”, but there is no sense in hating anyone. You thought you were giving something up gracefully while keeping what mattered. It was a very tidy arrangement. But where I come from, coins have two sides.
If a murderer doesn’t ultimately deserve your hatred because his cruelty traces back through an unbroken chain of genes, upbringing, and circumstance he never chose, then, by the same logic, a saint doesn’t ultimately deserve your love either. Not the deep kind. Not the kind where you look at someone and think, “What a genuinely good person they are, all the way down. They deserve good things.” They might be that good of a person, but how can they deserve love for it any more than an evil person deserves hatred?
You admire Gandhi. I admire Gandhi. He is among the most admired people who have ever lived. But by the logic you stated, Gandhi’s compassion and courage were products of a particular genetic hand and a particular environment, none of which he selected. Gandhi didn’t really do anything all that special, once you correct for his genes and environment. Anyone would have done it, even you. Maybe you deserve a statue in your honor. The plaque could read: “Had he had Gandhi’s genes and been placed in Gandhi’s situation, he, too, would have done great things.”
Perhaps you could still salvage a universal love, like a love for all conscious things or something. I’ll grant that, but it applies to serial killers as much as to human rights leaders.
You said in your original piece that you can still prefer kind people, that you’re allowed to like being around someone who makes you laugh and treats you well. But if you call that “love” notice how thin it’s gotten. It’s the love of someone who is useful and pleasant to you, which is roughly the same way you love a good sofa. It’s not the love where you think another person deserves to flourish because of who they are. They’re a very nice sofa, but they didn’t upholster themselves.
This is, I think, considerably more unsettling than the punishment side, which is where everyone focuses. And understandably so—blame is where the stakes feel highest, where people go to prison or get forgiven. But we spend far more of our lives loving people than hating them, admiring people than condemning them. If determinism hollows out hatred, it hollows out love by exactly the same logic. You don’t get to keep one and discard the other, however much you might like to.
But that’s precisely what you tried to do. You softened blame while leaving love and admiration quietly in place. The honest move is to admit that the same wrecking ball swings both ways. Simply removing everything that constitutes blame causes collateral damage elsewhere.
What Are You Doing with Your Moral Agency?
I think there’s a tension within your claims that you haven’t addressed. On one hand, you say that people don’t deserve retributive punishment because they are not the ultimate authors of their actions. You say that punishment should be forward-looking, such as keeping dangerous people away from society. On the other hand, you say that people are moral agents.
So, what does moral agency actually do?
Consider a case where there is no forward-looking justification for punishment. Imagine a woman decides she doesn’t want her children anymore and kills them. She then gets irreversibly sterilized. She will never have children again, so her chance of reoffending is zero. No one else knows about the crime except a single police officer, so there is no deterrence value in publicly prosecuting her. She has no criminal history, no mental illness, no other risk factors. She needs no treatment, no rehabilitation, no monitoring.
On a purely forward-looking account, there’s nothing to do here. And if you try to wriggle in some forward-looking justification, we’ll just change the scenario so that it doesn’t apply. The point is, the forward-looking framework says let her go. And that answer feels monstrous.
Why? Because when the woman killed her children, she didn’t just break some procedural rule. She killed people who had moral standing. Those children were moral patients with welfare, futures, and the capacity to suffer, and her deliberative algorithm weighed all of that and chose to destroy them anyway. The moral reality of what she did doesn’t reduce to a forward-looking management problem.
This raises practical questions you haven’t answered. Is the woman’s calculation—weighing her children’s lives against her freedom and deciding her freedom was worth more—a moral claim that society must rebut? Should society impose a cost that outweighs the gain she sought? Should it tell her that she cannot trade another’s life for her convenience and come out ahead?
There’s a vast space between “you deserve to suffer for what you are” and “you’re just a system we need to manage.” You haven’t explored that middle space.
And the forward-looking framework fails for another reason. If punishment is justified entirely by the probability of future harm, then the only thing that matters is how well something predicts future harm, not whether the predictor itself is a crime. It might be the case that having committed a crime correlates more strongly with future harm than anything else, thus justifying some form of response to criminal activity. But this is an empirical claim, and we had better be prepared for it to go the other way. If it turns out that having a bad childhood or a face tattoo predicts future offending just as well as having committed a burglary, then a purely forward-looking system should treat them equally. Are you prepared to incarcerate people for bad childhoods?
You talked a lot about how having free will means you’re responsive to reasons, so you could say that moral agents are the kind of entities on which reasoning-based interventions work. You can explain to a person why they were wrong. You can use reason to deter them. You can’t do that with a bear. So perhaps moral agency tells you which tools are available, not whether someone deserves anything.
But this reduces moral responsibility to something almost entirely instrumental. It just tells us what type of intervention would be effective on a given system. Calling it “moral responsibility” feels empty. It would be more honest to call it “reasons-based deterrence susceptibility”.
And notice what happens: we end up treating the woman who killed her children the same way we would treat a bear that mauled a hiker. We don’t blame the bear. We don’t sentence it for its crimes. We create a management plan for it. We put a dangerous bear down or relocate it for public safety. If we incarcerate a dangerous person only for public safety, then we’re treating them the same. In both cases the justification is forward-looking harm reduction. If there’s no difference in the response to a bear, which lacks moral agency, and a human, who has it, then moral agency seems to be little more than a farce.
When people say someone is “responsible” for a terrible act, they mean something beyond “this person’s cognitive architecture is amenable to deterrence.” They mean something backward-looking—this person should have done otherwise, and there’s some appropriateness to a negative response directed at them for what they did.
But you’re trying to preserve the language of moral agency and responsibility while draining it of the backward-looking content that normally gives it force. Without some backward-looking element, I’m not sure these terms have much meaning. When I blame you I’m not just trying to strategically intervene on your behavior. I’m making a claim about you as the agent you are. I’m saying: your algorithm had the capacity to weigh moral considerations, and it didn’t weigh them properly, and that failure is attributable to you in the proximate sense that your essay spent thousands of words defending as a real and meaningful thing.
Guilt is the internal response of a moral agent recognizing their own failure. Indignation is the response of someone wronged by a being capable of having done otherwise in the Could₁[3] sense. Forgiveness is real because there’s a genuine moral debt to release. These responses are partly backward-looking and irreducibly so. Without that backward-looking element, the entire fabric of the moral community dissolves into mutual behavioral management.
Is there a way to rescue yourself from this?
You’ve done a lot to establish moral agency. You’ve talked about the deliberative algorithm, about reflecting and revising oneself, about becoming an agent. You claimed free will grounds praise and blame. Let’s say we accept all that. Then when you act wrongly despite having the capacity to process moral reasons, something is true of you as an agent that is not true of the bear. There remains some form of desert—call it “agent-desert”—that falls directly out of proximate authorship. It says: you, the integrated deliberative system, produced this action through your own evaluative process, and that action was wrong, and that wrongness is attributable to you. Not to the Big Bang, not to your genes, but to the agent you’ve become.
Could this work? Is agent-desert what gives moral emotions meaning? Can it ground the whole ecosystem of responses, like guilt, indignation, and forgiveness? Not because the universe demands someone suffer, but because these are the correct responses between beings in moral relationships with each other.
So, I ask you again: what does moral agency actually do? Can it change a prison sentence? If it doesn’t ground some form of backward-looking appraisal—if it just tells us which management tools to use—then it’s not doing the work you claimed it does.
You need to admit that this is a backward-looking notion. Agent-desert says that what you did matters, that your relationship to your past action is morally significant, that the right response to a wrong isn’t just a management strategy but a recognition of what occurred between moral agents. The moment you accept that, you’ve conceded that purely forward-looking punishment was never adequate. A purely forward-looking framing is inadequate not because it’s too lenient, but because it’s the wrong kind of response to a moral agent.
You wanted moral agency to matter. You argued for it extensively. But then you left it with nothing to do. Agent-desert at least gives it something to do. Whether this is right or sufficient is still up in the air. But it shows that the purely forward-looking framework was never going to get you there.
Where Does This Leave Me?
I still think the core compatibilist insight is right—that deliberating and being responsive to reasons are fundamental to free will. I think we need to preserve a meaningful distinction between a person who acts and a boulder that rolls. Those parts remain.
But I think I kept the parts of the moral landscape I liked and tried to surgically remove the parts I didn’t, as if the wrecking ball could be aimed. And I left moral agency standing at the center of my framework with no job to do, which is worse than not having it at all. An ornamental load-bearing wall.
I’m not sure agent-desert is the fix. Does it collapse back into the retributive cruelty I was trying to escape? I don’t know yet.
[1] And if you have to blank his mind MIB-style afterwards so he can’t make new decisions based on what he saw, fine, whatever. You get the idea. ↩︎
[2] Though of course comparisons across time are difficult to make, so we shouldn’t put too much weight on it. ↩︎
[3] Could₁ was defined in the previous essay as “could have done otherwise if my reasons or circumstances were different”. ↩︎
Bridging between physical computers and abstract programs seems helpful as a prerequisite to thinking about free will. (Similarly, between real physical systems in some regime such as classical mechanics, and corresponding abstract physical systems.) A physical thing determines its abstract model, letting it be observed, but also an abstract model can serve as a design for a physical thing that follows its structure. When an abstract model has its own internal dynamics, the corresponding physical thing would need to remain in step with that abstract dynamics.
The correspondence goes both ways: the abstract dynamics, as a design, must determine the physical dynamics, and also the observations of how the physical dynamics proceeds must be useful as observations of how the abstract dynamics proceeds. A physical calculator computes a certain answer because of what the abstract fact of arithmetic determines that answer to be, but also observing what the physical calculator computes lets us learn what the abstract fact of arithmetic says.
A physical human is confusing, because there are too many factors involved. But an abstract human that exists on their own, or better yet an abstract decision making deliberation, can be considered in isolation, as a process that proceeds exclusively according to its own abstract dynamics. The physical world then has no choice but to remain in step with that dynamics, or else the abstract dynamics is not a faithful model of what takes place in the physical world. Decisions taken by the abstract deliberation can’t let it escape the correspondence with the physical world. But also it’s free to reach any results that it does, and the physical world would need to comply with them, the same as a physical computer would need to comply with what the abstract program running on it is doing.
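To make the computer analogy concrete, here’s a minimal sketch in Python (my own toy example, with invented options and value weights, not anything from the essay). The outcome is fully determined by the inputs, yet it’s the deliberation’s own weighing that produces it, and any physical machine that faithfully runs it must land on whatever result this abstract dynamics dictates:

```python
# A toy "abstract deliberation": a pure function whose result is fixed
# by its inputs. A faithful physical implementation has no choice but
# to track what the abstract process computes.

def deliberate(options, values):
    """Score each option against the agent's values and pick the best."""
    def score(option):
        return sum(weight for concern, weight in values.items()
                   if concern in option["promotes"])
    return max(options, key=score)

# Invented example inputs.
options = [
    {"name": "start a company", "promotes": {"autonomy", "wealth"}},
    {"name": "move to Japan", "promotes": {"novelty"}},
    {"name": "write a novel", "promotes": {"autonomy", "novelty"}},
]
values = {"autonomy": 2.0, "novelty": 1.5, "wealth": 1.0}

print(deliberate(options, values)["name"])  # -> "write a novel"
```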
I like this way of thinking about it.
This is a fascinating topic but I’m going to try to keep this brief and get back to alignment, which is equally fascinating and much more urgent.
My standard statement is that “free will” is just a very bad term. Why would I want my will to be uncaused? I want my will to pursue the things I want, including letting me change the things I want if I want. And that’s how it works. Tossing “free” into the term makes it sound bad to not have it, but I think that’s just a confusion.
The term “self-determination” seems to better capture the type of free will worth wanting (Dennett’s work on this is my favorite; self-determination is my proposed term, I think).
Then, a nitpick on the thought experiment that may or may not matter for how you feel about it: you may have a determined future, but you can’t know it in advance.
If you somehow told someone their future, it would change it. Two elements of how our world works prevent this. First, if you somehow had an accurate prophecy, you’d need some strange new laws or properties of the universe to prevent that person from saying “oh, I’d rather have a somewhat different life than you describe, I’ll use that information about my default path to steer differently”. The other is chaotic dynamics, of the kind the three-body problem illustrates: interacting elements make accurate prediction from physics exponentially difficult, so that even with just a few interacting pieces the computational cost ramps up very rapidly, until a computer the size of the universe couldn’t predict the state of a brain a year out (that’s a very rough guess, but the point is it’s extreme).
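A toy illustration of that blow-up (using the logistic map rather than actual three-body dynamics, since it shows the same exponential divergence in a few lines; the perturbation size and step counts are just illustrative):

```python
# Two trajectories of the chaotic logistic map, started a trillionth apart.
# The gap roughly doubles each step, so prediction horizons stay short
# unless your knowledge of the initial state is absurdly precise.

def logistic_trajectory(x0, r=4.0, steps=50):
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1 - xs[-1]))
    return xs

a = logistic_trajectory(0.4)
b = logistic_trajectory(0.4 + 1e-12)  # perturb the 12th decimal place

for t in (0, 10, 20, 30, 40, 50):
    print(f"step {t:2d}: |difference| = {abs(a[t] - b[t]):.3e}")
# By around step 40 the two runs are completely uncorrelated.
```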
I agree that free will probably just isn’t the right term. This whole thing started as a response to Sam Harris, and that’s the term he used. I imagine switching to “self-determination” might resolve half of the dispute, so I like that. I think Gary Watson used that term as well, so you’re in good company.
Yeah, I also agree that telling someone their future would change it. That’s why I had that footnote about blanking their mind Men-in-Black style after you tell them. You tell them their future, tell them you’re going to blank their mind in five minutes, then ask how much free will they’re feeling. This isn’t the strongest response, and I think that’s because this is a real weakness of what I said. Both thought experiments (telling the boy his future and the woman who kills her children yet somehow has “nothing wrong with her”) are so outlandish that I think it’s fair to just say, “I reject the experiment”. Definitely a weakness for sure. Thanks for reading it and commenting.
It doesn’t have to deter her. It’s about following through on your threat, to show others that you’re the sort of person/government/society that follows through on threats.
Yeah, I get that point. That is why I started the sentence with “no one else knows”, so that society hasn’t gained any knowledge of the situation. Thus letting her go doesn’t show others anything. But maybe the thought experiment is so convoluted that it doesn’t really work.
If you do this, it is almost certain that these photographs won’t be from his future, unless you are a superintelligence and very carefully chose both that person and those photographs, knowing that his future, conditional on his seeing those photographs, is exactly the depicted future.
Those photographs can only be correct to the extent that he allows them to be correct. Not every such system has a fixed point, and for some selections of person this may be simply impossible.
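A toy way to see the fixed-point issue (an invented two-option model, nothing from the post): a prediction is only correct if the predictee’s response to being shown it reproduces it, and for some predictees no such prediction exists.

```python
# A prediction is self-fulfilling only if behavior(prediction) == prediction.
choices = ["accounting", "novel"]

def contrarian(prediction):
    # Always does the opposite of what he's told: no fixed point.
    return "novel" if prediction == "accounting" else "accounting"

def resigned(prediction):
    # Shrugs and goes along with it: every prediction is a fixed point.
    return prediction

for behave in (contrarian, resigned):
    fixed = [p for p in choices if behave(p) == p]
    print(behave.__name__, "fixed points:", fixed)
# contrarian fixed points: []  -> no photographs could be correct
# resigned fixed points: ['accounting', 'novel']
```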
But is this even possible for any predictee at all? After all, his future is also your future, and you would need to be able to predict your own actions and their consequences in perfect detail, as well as those of the rest of the universe.
Can a system have sufficient computing power to do this, while still being embedded in the universe that is being predicted in perfect detail? This seems a pretty big assumption, and it’s not just an assumption about a specific universe (which we can just posit as a hypothetical) but a mathematical assumption about the degree to which complex computational systems (e.g. able to prove theorems of arithmetic) are capable of perfectly predicting their own future behaviour.
To me, it seems an unlikely proposition.