JesseClifton

Karma: 828

JesseClifton 26 Apr 2026 16:47 UTC
LW: 4 AF: 3
2
AF
in reply to: paulfchristiano’s comment on: Lukas Finnveden’s Shortform
To me, “managing the news” is just a description of how EDT works in general (i.e., EDT is definitionally about picking the action that gives us the best news). And I think EDT is problematic for that fundamental reason. I just think that Lukas’ case makes the silliness of news-management particularly vivid. (Other cases which arguably do so are XOR blackmail and this case.)

(I do think that Lukas’ case gets some of its counterintuitiveness from the fact that SIA has us weight worlds in proportion to how many copies of us they contain. But, again, that is just a counterintuitive property of SIA in general, which I think we ought to evaluate independently of how it interacts with decision theory.)

JesseClifton 25 Apr 2026 20:46 UTC
LW: 5 AF: 3
3
AF
in reply to: Lukas Finnveden’s comment on: Lukas Finnveden’s Shortform
1. I don’t think that the updateful EDT behaviour in e.g. the calculator example is obviously problematic. Certainly not clearly worse than the alternative of just optimizing relative to the prior (cf. Anthony’s post).
2. I do think that the buy-and-copy behaviour from your example is bad, but it is bad because of how EDT manages the news, not because of the combination of EDT and anthropic updating per se. A counterfactual theory like FDT or TDT doesn’t manage the news and so doesn’t use the buy-and-copy strategy, AIUI. (Maybe similar cases could be constructed for counterfactual theories, though?)
3. To me, 1 and 2 suggest that we should consider a counterfactual theory (without going updateless relative to the prior), not just EDT + updateless relative to the prior.
4. In any case, I’m sympathetic to the ideal reflection principle, i.e., we should optimize our subjective expectation of the ideal agent’s expected utilities. So if you think the ideal agent is updateless relative to the prior, then you should make decisions based on your expectations of your expectations relative to this prior if you could compute it. (This includes the expected value of policies for handling reasoning/logical learning.) Of course it’s very unclear how to form such beliefs, but that doesn’t seem like a problem specific to updatelessness (i.e., it’s also unclear how to form beliefs about an updateful ideal agent).

JesseClifton 14 Mar 2026 16:30 UTC
LW: 3 AF: 3
0
AF
on: Operationalizing FDT
Thanks for this!

What is the “hydrogen maximization problem”?

Why do you think that having to be empirically updateless is unfortunate?

JesseClifton 15 Feb 2026 22:46 UTC
6 points
4
in reply to: Buck’s comment on: Eli’s shortform feed
They will only self-modify to cooperate with twins whose action is causally downstream of their commitment, right? So a CDT agent will not self-modify to a policy that does acausal trade with twins outside the lightcone for example.

JesseClifton 26 May 2025 20:20 UTC
3 points
2
in reply to: Wei Dai’s comment on: The stakes of AI moral status
I had in mind
- 1 and 2 on your list;
- Like 1 but more general, it seems plausible that value-by-my-lights-on-reflection is highly sensitive to the combination of values, decision theory, epistemology, etc which civilization ends up with, such that I should be clueless about the value of little marginal nudges to philosophical competence;
- Ryan’s reply to your comment.

JesseClifton 26 May 2025 14:04 UTC
1 point
2
in reply to: Wei Dai’s comment on: The stakes of AI moral status
Personally, I worry about AIs being philosophically incompetent, and think it’d be cool to work on, except that I have no idea whether marginal progress on this would be good or bad. (Probably that’s not the reason for most people’s lack of interest, though.)

JesseClifton 10 Feb 2025 19:27 UTC
3 points
2
in reply to: Lucius Bushnaq’s comment on: Notes on Occam via Solomonoff vs. hierarchical Bayes
Thanks!
where you just take the uniform prior over all programs of length T, then let T to infinity
Sure, but because of language-dependence I’m not sure why we would want to apply the principle of indifference at this level. (Note that the quote says “if we can find a privileged parameterization to which we can apply the principle”.) I tend to think you should apply the POI at the “explanatorily basic” level (see here, here), which might be the properties of fundamental objects in the ontology (e.g., position and momentum in Newtonian mechanics, maybe?). Otherwise I think you run into unsatisfying-to-me arbitrariness.
In what sense does the universe have structure, such that a-priori bit strings of observations about the universe ought to be treated by us as more than members of the set of all possible bit strings?
Right, I think this is the kind of question you’re not going to be able to answer without thinking about the kinds of ontological considerations pointed to here.
afaik that’s false
Thanks, will change.

JesseClifton 9 Feb 2025 21:23 UTC
LW: 4 AF: 3
4
AF
in reply to: Wei Dai’s comment on: In Defense of Open-Minded UDT
FWIW, in our original formulation of open-minded updatelessness, the idea was about revising the prior via either

-becoming aware of new possibilities (thereby changing the support of the prior);
-”philosophy”, i.e., reflecting on principles for prior-setting. (Which would allow for the use of an “objective” prior, if we ended up thinking there was such a thing.)

(Cool post, Abraham, thanks!)

JesseClifton 8 Feb 2025 19:50 UTC
LW: 7 AF: 3
2
AF
on: Updatelessness doesn’t solve most problems
Open-Minded Updatelessness doesn’t push back on this fundamental trade-off. Instead, given the trade-off persists, it explores which kinds and shapes of partial commitments seem more robustly net-positive from the perspective of our current game-theoretic knowledge (that is, our current prior)

I would distinguish between two cases:
1. We’re aware of specific reasons why continuing to follow an OMU policy could be harmful (e.g., specific reasons other agents might punish us for doing so). In such cases, if those hypotheses have high-enough weight, OMU as a normative criterion can itself recommend not continuing to follow the OMU policy. So, in this sense I agree with the quote, but this doesn’t undermine OMU as a normative criterion.
2. We’re not aware of any specific reasons why continuing to follow the OMU policy could be harmful. In that case, it seems arbitrary to form beliefs according to which it’s net-negative to continue following OMU, and so it seems reasonable to continue following OMU (I’m not sure what else to do).
Sorry for only now commenting...

JesseClifton 16 Nov 2024 0:32 UTC
4 points
0
on: Why I’m not a Bayesian
This paper discusses two semantics for Bayesian inference in the case where the hypotheses under consideration are known to be false.
- Verisimilitude: p(h) = the probability that that h is closest to the truth [according to some measure of closeness-to-truth] among hypotheses under consideration
- Counterfactual: p(h) = the probability of h given the (false) supposition that one of the hypotheses under consideration is true
In any case, it’s unclear what motivates making decisions by maximizing expected value against such probabilities, which seems like a problem for boundedly rational decision-making.

JesseClifton 6 Nov 2024 15:39 UTC
6 points
4
in reply to: sunwillrise’s comment on: Winning isn’t enough
mildly disapprove of words like “a widely-used strategy”
The text says “A widely-used strategy for arguing for norms of rationality involves avoiding dominated strategies”, which is true* and something we thought would be familiar to everyone who is interested in these topics. For example, see the discussion of Dutch book arguments in the SEP entry on Bayesianism and all of the LessWrong discussion on money pump/dominance/sure loss arguments (e.g., see all of the references in and comments on this post). But fair enough, it would have been better to include citations.
“we often encounter claims”
We did include (potential) examples in this case. Also, similarly to the above, I would think that encountering claims like “we ought to use some heuristic because it has worked well in the past” is commonplace among readers so didn’t see the need to provide extensive evidence.
*Granted, we are using “dominated strategy” in the wide sense of “strategy that you are certain is worse than something else”, which glosses over technical points like the distinction between dominated strategy and sure loss.

JesseClifton 21 Nov 2023 22:50 UTC
2 points
0
in reply to: Garrett Baker’s comment on: D0TheMath’s Shortform
What principles? It doesn’t seem like there’s anything more at work here than “Humans sometimes become more confident that other humans will follow through on their commitments if they, e.g., repeatedly say they’ll follow through”. I don’t see what that has to do with FDT, more than any other decision theory.
If the idea is that Mao’s forming the intention is supposed to have logically-caused his adversaries to update on his intention, that just seems wrong (see this section of the mentioned post).
(Separately I’m not sure what this has to do with not giving into threats in particular, as opposed to preemptive commitment in general. Why were Mao’s adversaries not able to coerce him by committing to nuclear threats, using the same principles? See this section of the mentioned post.)

JesseClifton 21 Nov 2023 21:45 UTC
5 points
0
in reply to: Garrett Baker’s comment on: D0TheMath’s Shortform
I don’t think FDT has anything to do with purely causal interactions. Insofar as threats were actually deterred here this can be understood in standard causal game theory terms. (I.e., you claim in a convincing manner that you won’t give in → People assign high probability to you being serious → Standard EV calculation says not to commit to threat against you.) Also see this post.

JesseClifton 18 Nov 2023 16:19 UTC
2 points
0
on: SIA > SSA, part 4: In defense of the presumptuous philosopher
Awesome sequence!
I wish that discussions of anthropics were clearer about metaphysical commitments around personal identity and possibility. I appreciated your discussions of this, e.g., in Section XV. I agree with what you, though, that it is quite unclear what justifies the picture “I am sampled from the set of all possible people-in-my-epistemic situation (weighted by probability of existence)”. I take it the view of personal identity at work here is something like “‘I’ am just a sequence of experiences S”, and so I know I am one of the sequences of experiences consistent with my current epistemic situation E. But the straightforward Bayesian way of thinking about this would seem to be: “I am sampled from all of the sequences of experiences S consistent with E, in the actual world”.
(Compare with: I draw a ball from an urn, which either contains (A) 10 balls or (B) 100 balls, 50% chance each. I don’t say “I am indifferent between the 110 possible balls I could’ve drawn, and therefore it’s 10:1 that this ball came from (B).” I say that with 50%, ball came from (A) and with 50% the ball came from (B). Of course, there may be some principled difference between this and how you want to think about anthropics, but I don’t see what it is yet.)
This is just minimum reference class SSA, which you reject because of its verdict in God’s Coin Toss with Equal Numbers. I agree that this result is counterintuitive. But I think it becomes much more acceptable if (1) we get clear about the notion of personal identity at work and (2) we try to stick with standard Bayesianism. mrcSSA also avoids many of the apparent problems you list for SSA. Overall I think mrcSSA’s answer to God’s Coin Toss with Equal Numbers is a good candidate for a “good bullet” =).
(Cf. Builes (2020), part 2, who argues that if you have a deflationary view of personal identity, you should use (something that looks extensionally equivalent to) mrcSSA.)

JesseClifton 4 Oct 2023 11:18 UTC
3 points
3
in reply to: Daniel Kokotajlo’s comment on: Open-minded updatelessness

But it’s true that if you had been aware from the beginning that you were going to be threatened, you would have wanted to give in.

To clarify, I didn’t mean that if you were sure your counterpart would Dare from the beginning, you would’ve wanted to Swerve. I meant that if you were aware of the possibility of Crazy types from the beginning, you would’ve wanted to Swerve. (In this example.)

I can’t tell if you think that (1) being willing to Swerve in the case that you’re fully aware from the outset (because you might have a sufficiently high prior on Crazy agents) is a problem. Or if you think (2) this somehow only becomes a problem in the open-minded setting (even though the EA-OMU agent is acting according to the exact same prior as they would’ve if they started out fully aware, once their awareness grows).

(The comment about regular ol exploitability suggests (1)? But does that mean you think agents shouldn’t ever Swerve, even given arbitrarily high prior mass on Crazy types?)

What if anything does this buy us?

In the example in this post, the ex ante utility-maximizing action for a fully aware agent is to Swerve. The agent starts out not fully aware, and so doesn’t Swerve unless they are open-minded. So it buys us being able to take actions that are ex ante optimal for our fully aware selves when we otherwise wouldn’t have due to unawareness. And being ex ante optimal from the fully aware perspective seems preferable to me than being, e.g., ex ante optimal from the less-aware perspective.

More generally, we are worried that agents will make commitments based on “dumb” priors (because they think it’s dangerous to think more and make their prior less dumb). And EA-OMU says: No, you can think more (in the sense of becoming aware of more possibilities), because the right notion of ex ante optimality is ex ante optimality with respect to your fully-aware prior. That’s what it buys us.

And revising priors based on awareness growth differs from updating on empirical evidence because it only gives other agents incentives to make you aware of things you would’ve wanted to be aware of ex ante.

they need to gradually build up more hypotheses and more coherent priors over time

I’m not sure I understand—isn’t this exactly what open-mindedness is trying to (partially) address? I.e., how to be updateless when you need to build up hypotheses (and, as mentioned briefly, better principles for specifying priors).

JesseClifton 29 Sep 2023 18:37 UTC
2 points
1
in reply to: Daniel Kokotajlo’s comment on: Open-minded updatelessness
If I understand correctly, you’re making the point that we discuss in the section on exploitability. It’s not clear to me yet why this kind of exploitability is objectionable. After all, had the agent in your example been aware of the possibility of crazy agents from the start, they would have wanted to swerve, and non-crazy agents would want to take advantage of this. So I don’t see how the situation is any worse than if the agents were making decisions under complete awareness.

JesseClifton 29 Sep 2023 17:15 UTC
4 points
0
in reply to: Daniel Kokotajlo’s comment on: Open-minded updatelessness
Can you clarify what “the problem” is and why it “recurs”?

My guess is that you are saying: Although OM updatelessness may work for propositions about empirical facts, it’s not clear that it works for logical propositions. For example, suppose I find myself in a logical Counterfactual Mugging regarding the truth value of a proposition P. Suppose I simultaneously become aware of P and learn a proof of P. OM updatelessness would want to say: “Instead of accounting for the fact that you learned that P is true in your decision, figure out what credence you would have assigned to P had you been aware of it at the outset, and do what you would have committed to do under that prior”. But, we don’t know how to assign logical priors.

Is that the idea? If so, I agree that this is a problem. But it seems like a problem for decision theories that rely on logical priors in general, not OM updatelessness in particular. Maybe you are skeptical that any such theory could work, though.

JesseClifton 11 Jul 2023 11:38 UTC
2 points
1
in reply to: Dagon’s comment on: Open-minded updatelessness
The model is fully specified (again, sorry if this isn’t clear from the post). And in the model we can make perfectly precise the idea of an agent re-assessing their commitments from the perspective of a more-aware prior. Such an agent would disagree that they have lost value by revising their policy. Again, I’m not sure exactly where you are disagreeing with this. (You say something about giving too much weight to a crazy opponent — I’m not sure what “too much” means here.)

Re: conservation of expected evidence, the EA-OMU agent doesn’t expect to increase their chances of facing a crazy opponent. Indeed, they aren’t even aware of the possibility of crazy opponents at the beginning of the game, so I’m not sure what that would mean. (They may be aware that their awareness might grow in the future, but this doesn’t mean they expect their assessments of the expected value of different policies to change.) Maybe you misunderstand what we mean by “unawareness”?

JesseClifton 10 Jul 2023 21:18 UTC
6 points
5
in reply to: Dagon’s comment on: Open-minded updatelessness

For this to be wrong, the opponent must be (with some probability) irrational—that’s a HUGE change in the setup

For one thing, we’re calling such agents “Crazy” in our example, but they need not be irrational. They might have weird preferences such that Dare is a dominant strategy. And as we say in a footnote, we might more realistically imagine more complex bargaining games, with agents who have (rationally) made commitments on the basis of as-yet unconceived of fairness principles, for example. An analogous discussion would apply to them.

But in any case, it seems like the theory should handle the possibility of irrational agents, too.

You can’t just say “Alice has wrong probability distributions, but she’s about to learn otherwise, so she should use that future information”. You COULD say “Alice knows her model is imperfect, so she should be somewhat conservative, but really that collapses to a different-but-still-specific probability distribution.

Here’s what I think you are saying: In addition to giving prior mass to the hypothesis that her counterpart is Normal, Alice can give prior mass to a catchall that says “the specific hypotheses I’ve thought of are all wrong”. Depending on the utilities she assigns to different policies given that the catchall is true, then she might not commit to Dare after all.

I agree that Alice can and should include a catchall in her reasoning, and that this could reduce the risk of bad commitments. But that doesn’t quite address the problem we are interested in here. There is still a question of what Alice should do once she becomes aware of the specific hypothesis that the predictor is Crazy. She could continue to evaluate her commitments from the perspective of her less-aware self, or she could do the ex-ante open-minded thing and evaluate commitments from the priors she should have had, had she been aware of the things she’s aware of now. These two approaches come apart in some cases, and we think that the latter is better.

You don’t need to bring updates into it, and certainly don’t need to consider future updates. https://www.lesswrong.com/tag/conservation-of-expected-evidence means you can only expect any future update to match your priors.

I don’t see why EA-OMU agents should violate conservation of expected evidence (well, the version of the principle that is defined for the dynamic awareness setting).

JesseClifton 10 Jul 2023 19:22 UTC
3 points
5
in reply to: Dagon’s comment on: Open-minded updatelessness
Thanks Dagon:

Any mechanism to revoke or change a commitment is directly giving up value IN THE COMMON FORMULATION of the problem

Can you say more about what you mean by “giving up value”?

Our contention is that the ex-ante open-minded agent is not giving up (expected) value, in the relevant sense, when they “revoke their commitment” upon becoming aware of certain possible counterpart types. That is, they are choosing the course of action that would have been optimal according to the priors that they believe they should have set at the outset of the decision problem, had they been aware of everything they are aware of now. This captures an attractive form of deference — at the time it goes updateless / chooses its commitments, such an agent recognizes its lack of full awareness and defers to a version of itself that is aware of more considerations relevant to the decision problem.

As we say, the agent does make themselves exploitable in this way (and so “gives up value” to exploiters, with some probability). But they are still optimizing the right notion of expected value, in our opinion.

So I’d be interested to know what, more specifically, your disagreement with this perspective is. E.g., we briefly discuss a couple of alternatives (close-mindedness and awareness growth-unexploitable open-mindedness). If you think one of those is preferable I’d be keen to know why!

This model doesn’t seem to really specify the full ruleset that it’s optimizing for

Sorry that this isn’t clear from the post. I’m not sure which parts were unclear, but in brief: It’s a sequential game of Chicken in which the “predictor” moves first; the predictor can fully simulate the “agent’s” policy; there are two possible types of predictor (Normal, who best-responds to their prediction, and Crazy, who Dares no matter what); and the agent starts off unaware of the possibility of Crazy predictors, and only becomes aware of the possibility of Crazy types when they see the predictor Dare.

If a lack of clarity here is still causing confusion, maybe I can try to clarify further.

I also suspect you’re conflating updates of knowledge with strength and trustworthiness of commitment. It’s absolutely possible (and likely, in some formulations about timing and consistency) that a player can rationally make a commitment, and then later regret it, WITHOUT preferring at the time of commitment not to commit.

I’m not sure I understand your first sentence. I agree with the second sentence.