Do logical decision theories actually give meaningfully better recommendations on the real-world problems, particularly voting, that are frequently referenced?
One of the main reasons given for preferring logical decision theories (LDT), and particularly functional decision theory (FDT), is that such agents do better on real-world problems. Indeed, the article here on logical decision theory opens by discussing voting. I recently posted a discussion of a hypothetical where FDT agents perform worse, but here I want to apply the comparison to the real-world case of voting, which is often given as a point in FDT's favor (see here for Eliezer Yudkowsky's discussion of voting under decision theories, where he argues that logical decision theory does better). In particular, I think that for most people this discussion gets wrong what causal decision theory (CDT) would actually recommend.
To begin (note: I spend a few paragraphs on how to model voting decisions and their utility under CDT, and only later turn to practical agent-to-agent comparisons), let us imagine the expected utility for a CDT agent of voting in some election. Say there are two candidates; like Yudkowsky, I will use the Simpsons' Kang and Kodos. If Kang wins, we get some expected outcome (O1); if Kodos wins, some expected outcome (O2). Say our agent is a Kang supporter and evaluates O1 positively, such that O1 > O2.[1]
Our agent is evaluating the value of voting for Kang (A1) or not voting (A0).
In the simplest case, with no externalities, an EDT agent would say: "we should vote if the evidence indicates that voting makes Kang more likely to win" (i.e., if P(O1|A1) > P(O1|A0)). A CDT agent would say: "we should vote if there is a positive probability that our vote will cause Kang to win" (in this simple model, where nothing confounds our vote and the outcome, the two criteria coincide: if P(O1|A1) > P(O1|A0), we should vote).
If we were simplistic agents, in both cases we should vote, as in either case the value is something greater than zero. But of course, realistically we are not such simple agents, and voting carries some cost. Taking one more step of complexity and stopping there is where I think Yudkowsky (and others) go wrong. They correctly note that in most real-world scenarios the probabilistic effect of a single vote is de minimis, and that humans have some cost associated with voting.
For the CDT agent, there is some probability of their vote being pivotal (we can say P(pivotal) = P(O1|A1) - P(O1|A0)). They also have some cost of voting (say E). So really, they should vote if P(pivotal) * (O1 - O2) > E. That is to say, they should vote if the probability of their vote being pivotal, times the difference in value between the two outcomes, is greater than the cost.
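As a minimal sketch of this decision rule (all numbers are invented for illustration, not drawn from any real election):

```python
# A minimal sketch of the simple CDT voting rule above. All numbers are
# illustrative placeholders, not empirical estimates.

def cdt_should_vote(p_pivotal: float, o1: float, o2: float, cost: float) -> bool:
    """Vote iff the pivotal probability times the outcome gap exceeds the cost."""
    return p_pivotal * (o1 - o2) > cost

# A vanishingly small pivotal probability swamps even a large outcome gap.
print(cdt_should_vote(p_pivotal=1e-8, o1=1_000_000.0, o2=0.0, cost=5.0))  # False
```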
This leads to some sensible recommendations (i.e., you should be more likely to vote the less costly voting is, the more impactful the outcome of the election, and the more likely it is your vote will be pivotal). If I am a policymaker and want to increase voting, I should use the policy levers at my disposal to reduce E, and political campaigners should emphasize the impact of the election and the odds of voters affecting results in order to increase turnout. This is what we observe in the real world.
Where Yudkowsky says CDT gets it wrong, however, is that, as mentioned, P(pivotal) is vanishingly small. While I framed P(pivotal) as the difference in odds between voting and not voting, for a CDT agent it can also be reduced to the odds of the candidates tying but for your vote. Obviously, it is incredibly rare that major elections come down to single votes. EDT agents don't have to make this reduction, so they fare slightly better under uncertainty, but they would still value the difference in odds as very small. Yudkowsky says this misses the mark. But what if we take our model one step further? Our agent is a person, and people place real value on things other than strict outcomes.
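To see why P(pivotal) is so small, here is a toy binomial model (my own simplification, not Yudkowsky's): with n other voters who each vote for Kang with probability p, your vote breaks a tie only if the others split exactly evenly.

```python
import math

# Toy binomial model of P(pivotal): your vote breaks a tie only if the n
# other voters split exactly evenly. Computed in log space to avoid overflow.

def p_tie(n_other: int, p: float) -> float:
    if n_other % 2 != 0:
        return 0.0  # an odd number of other voters can never tie exactly
    k = n_other // 2
    log_binom = math.lgamma(n_other + 1) - 2 * math.lgamma(k + 1)
    log_prob = log_binom + k * math.log(p) + k * math.log(1 - p)
    return math.exp(log_prob)

print(p_tie(1_000_000, 0.50))  # ~8e-4 even in a dead-even race
print(p_tie(1_000_000, 0.51))  # ~1e-90: effectively zero once one side leads
```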
When someone says "it is your civic duty to vote," they are appealing to a real value we can include in our utility functions: people value being members of civic society and participating in it. In addition, there are social benefits to voting in the form of signaling; people proudly display 'I voted' stickers all the time. Nor are these values independent of the election itself: the more contested an election and the more meaningful its outcomes, the more valuable the signaling.
We can say P(pivotal) is a function of how contested an election is and of the general voting population (the number of people and their associated behaviors). Similarly, we may say the value of signaling in an election is a function of social values and of how contested the election is.
So we can say a CDT agent under real-world conditions should vote if P(pivotal) * (O1 - O2) + personal utility (e.g., the personal value of being a civically engaged person) + social utility (e.g., the benefits of signaling that you are civically engaged) > E.
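Extending the earlier sketch with these two terms (again, all utilities are invented for illustration):

```python
# The extended CDT model: the pivotal term plus personal and social utility.

def cdt_voting_ev(p_pivotal: float, o1: float, o2: float,
                  personal_utility: float, social_utility: float,
                  cost: float) -> float:
    """Expected value of voting under the extended model described above."""
    return p_pivotal * (o1 - o2) + personal_utility + social_utility - cost

# With a negligible pivotal term, the decision hinges on whether
# personal + social utility outweighs the cost of voting.
ev = cdt_voting_ev(1e-8, 1_000_000.0, 0.0,
                   personal_utility=4.0, social_utility=3.0, cost=5.0)
print(ev, ev > 0)  # ~2.01, True
```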
This again leads to additional sensible recommendations. If I think well of myself as a civically minded person, value civic contributions, and have social connections for whom signaling my voting behavior will provide social benefits, all of that should increase my odds of voting. Similarly, for policymakers and political activists, increasing the civic-mindedness and social value placed on voting, publicly being seen to promote and reward those who vote, and so on, are recommended ways to increase agents' decisions to vote.
Bringing things together: where does FDT differ, and what is the utility of these frameworks in practice?
As mentioned above, if I am a CDT agent deciding whether to vote, I have to answer the question: "Is my value from voting, which includes my personal values around voting, my expectation of the likelihood that my vote is pivotal, and my expectations about the different outcomes of the election, greater than my anticipated cost of voting?" One's expectation of being pivotal can be formed by examining polls and voting models and making empirically based estimates of the outcome (these likely also affect one's expectation of the value of signaling). This can be simply understood as trying to estimate the likely causal outcome of my vote. The empirical side is readily and concretely definable, and while personal values and signaling effects must be individually parsed, they are usually fairly straightforward for most people.
So where does LDT/FDT differ? As Yudkowsky has put it, instead of asking what your decision's outcome will be, "You ask what would happen if people like you voted." This gives an obvious recommendation absent under CDT: the more people there are who are similar to you, the more you should vote. This recommendation does not match intuition as neatly (at least not my intuition) and, in fact, implicitly seems to run counter to Yudkowsky's earlier statement that under LDT it would still be irrational to vote "if you don't expect any of the elections to be close." Under LDT, if a lot of people like you (which might be heuristically judged by people voting for the same candidate as you) are voting, that would seem to provide more evidence that you should vote, and even more so the more dominant your side of the election is. But this hinges on a fairly evident open question: who constitutes "people like you"? According to Yudkowsky, this is "just an empirical question," but is it really? His 2016 post on LessWrong gives multiple different ways you might think about who constitutes someone similar to you. There doesn't seem to be any unified way for an agent to actually make that estimate. Should we use polls or previous results, or just personal estimates, under our theory of mind, of how many people might think similarly to us? Those questions seem unanswered, if answerable at all, and depending on how you resolve them FDT can lead to contradictory, unintuitive voting patterns.
Case 1: Voting for an Underdog
So, I am an FDT agent. I want to estimate whether I should vote or not. I look at past polling data: 40% of Kang supporters voted in the past, and 50% of Kodos supporters voted. I expect the odds of Kodos winning are, say, 80% at present, and the odds of my vote being pivotal are 0.000001%. I know a handful of people who think of themselves as LDT agents; most of them have told me they decided not to vote. How can I calculate my EV from this? I don't think there is any clear way to quantify it, but let's consider a few possibilities qualitatively. Should I:
a. Say that, since I know other LDT agents decided not to vote, we are presumably similar, and conclude that LDT recommends not voting in this election?
b. Say that few people are like me, most people won't reason the way I do, so the odds of the outcome being different are de minimis, and not vote on those grounds?
c. See that most people with values similar to mine are not voting while some are, identify more with those who are voting, and therefore estimate that the people most similar to me are already voting, so that if people similar to me voted the outcome would likely be the same? On that basis, should I decide that being an agent who votes has little value and decide not to vote?
d. Say that a lot of Kang supporters are not voting, that to some extent we are similar (our voting behavior evidently has some correlation), and that if they voted we would have a good chance of winning, so I should imagine the counterfactual where Kang had arbitrarily higher turnout and vote on that basis?
I don’t think there is any clear way of judging these scenarios.
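To make the ambiguity concrete, here is a minimal sketch where the only thing that changes is the assumed reference class of "people like you." The bloc-pivotal probabilities are pure invention, since FDT gives no recipe for producing them; that is the point.

```python
# The same election, evaluated as if your decision logically determines the
# votes of everyone "like you." Each reference class implies a different
# probability that the whole bloc swings the election; these probabilities
# are invented for illustration, as FDT gives no principled way to pick them.

def fdt_like_ev(p_bloc_pivotal: float, o1: float, o2: float, cost: float) -> float:
    """EV of voting if your choice is mirrored by your whole reference class."""
    return p_bloc_pivotal * (o1 - o2) - cost

for reference_class, p_bloc in [
    ("just me (option b)", 1e-8),
    ("my LDT acquaintances (option a)", 1e-6),
    ("all non-voting Kang supporters (option d)", 0.15),
]:
    ev = fdt_like_ev(p_bloc, o1=1_000_000.0, o2=0.0, cost=5.0)
    print(f"{reference_class}: EV = {ev:,.2f}, vote = {ev > 0}")
```

The same agent, with the same information, gets opposite recommendations depending on an unspecified modeling choice.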
CDT gives a pretty clear recommendation. My expected value of voting for Kang is 0.00000001 * (O1 - O2) + personal utility + social utility - E (cost of voting). If I evaluate that positively, under my personal internal values of personal utility and social utility and whatever the costs are, I should vote. If I do not, I shouldn't.
Case 2: Protest Voting
Let's say Kang ran against Lisa in the primaries. As polls increasingly showed her turnout to be low, she dropped out of contention. However, by some definitions of "people similar to me," if people similar to me were to turn out for her she would still have had some probability of winning, and she would be so preferable to Kang and Kodos that I choose to vote for her anyway.[2] This doesn't seem very sensible. A CDT agent would value voting for her (despite knowing that in this specific case she has no chance of winning) if they expected the signaling benefit of their vote to be great enough to justify 'throwing away' their vote on a candidate they know has already lost. There are cases where this might be justified: if one expects the odds of one's vote being pivotal to be incredibly low, and the difference between O1 and O2 to be fairly negligible, then a protest vote can be considered somewhat rational under some signaling estimates. That calculation seems to me far more pragmatic and meaningful than the FDT reasoning, which implicitly requires treating something you know to be false as if it could be true in order to decide whether to vote.
In this case, for a CDT agent, the expected value of voting can simply be evaluated as personal utility + social utility - E.
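As a sketch (illustrative numbers again):

```python
# Protest vote for a candidate known to have already lost: the pivotal term
# is zero, so the vote is rational iff its signaling value covers its cost.

def protest_vote_ev(personal_utility: float, social_utility: float,
                    cost: float) -> float:
    return personal_utility + social_utility - cost

print(protest_vote_ev(personal_utility=2.0, social_utility=4.5, cost=5.0) > 0)  # True
```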
CDT seems to give clear recommendations that can be readily evaluated and that do at least a serviceable job of modeling real-world behavior. FDT gives unclear recommendations that, to the extent they can be evaluated at all, are less helpful. On that basis, it seems to me that CDT actually wins out as a framework for deciding whether it is rational to vote.