Thanks, I hadn’t seen this.
I agree Truman thought Hiroshima was mostly a military base. IIRC you can see him make basic factual errors to that effect in an early draft of a speech.
IIUC, evolution is supposed to accelerate greatly during population growth.
I was doing do-nothing meditation maybe a month ago, and managed to switch to a frame (for a few hours) where I felt planning as predicting my actions, and acting as perceiving my actions. IIRC, I exited when my brother-in-law asked me a programming question, ’cause maintaining that state took too much brainpower.
I think a lot of human action is simply “given good things happen, what will I do right now?”, which obviously leads to many kinds of problems. (Most obviously:)
It’d be weird for him to take sole credit; he only established full presidential control of nuclear weapons afterward. He didn’t even know about the second bomb until after it dropped.
Truman only made the call for the first bomb; the second was dropped by the military without his input, as if they were conducting a normal firebombing or something. Afterward, he cancelled the planned bombings of Kokura and Niigata, establishing presidential control of nuclear weapons.
We try to make models obedient; it’s an explicit target. If we find that a natural framing, it makes sense AI does too. And it makes sense that that work can be undone.
At least the final chapter has the name wrong.
This is not fixed.
“Everything in my life is perfect right now.”
I couldn’t think about this before, ’cause it was obviously false in 100% of cases. I’ve gained greater understanding now.
“Perfect” is a 3-place word. It asks whether a given state of the world is the best of a given set of states, according to some values.
Is perfect(my life right now, ???, my values) true? If we take the minimal set as default, we get perfect(my life right now, my life right now, my values), which is obviously true. This isn’t totally unreasonable; there’s only one multiverse in the world, and there’s only one set of things in it I identify with. It’s very intuitive to just stop there.
The sentence on its own doesn’t feel particularly false or true. But it doesn’t feel inconceivable anymore either.
I feel like I’ve come to the insight backward. I’ll keep meditating, haha.
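Roughly what I mean by the 3-place structure, as a toy sketch (the function and names here are made up for illustration, nothing rigorous):

```python
# Toy sketch of "perfect" as a 3-place predicate: is this state at least as good as
# everything in some comparison set, according to some value function?

def perfect(state, comparison_set, values):
    return all(values(state) >= values(other) for other in comparison_set)

# With the minimal comparison set (just the state itself), the claim is trivially true,
# whatever the value function is:
my_life = "my life right now"
print(perfect(my_life, {my_life}, values=len))  # True
```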
It’s the “decided” part that’s the problem: beliefs are not supposed to involve any “deciding”.
I can pretty easily shift my perspective such that learning what I’m going to do feels like realizing that my action is overdetermined, rather than like “deciding”, for almost every action (and better meditators can get every action to feel this way). What I do to achieve this is: manually redefine my identity to exclude most of my decision-making process.
Similarly, many people include part of their world-model in their identity, such that learning about the world can feel like deciding something. The world-model’s doing very similar computation to the planner and whatnot; it seems reasonable for some people to include it.
There’s priors, there’s evidence, and if it feels like there’s a degree of freedom in what to do with those, then something has probably gone wrong.
Can just as easily say “There’s beliefs, there’s values, and if it feels like there’s a degree of freedom in what to do with those, then something has probably gone wrong.”. There’s only one optimal decision, given a set of beliefs and values. (Of course we’re bounded, but that applies just as well to what we do with beliefs and evidence.)
I always “feel behind”.
I think this is caused by mistaking a 3-place word for a 2-place word. “Behind” takes something like arguments ‘current state’, ‘value function’, ‘schedule distribution’. I think you’ve misplaced the schedule distribution that’s supposed to go here, and are using some silly replacement, because you forgot it was an argument that mattered.
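As a toy sketch of that argument structure (made-up names and numbers, just to show where the schedule distribution slots in):

```python
# "Behind" as a 3-place word: it needs a schedule distribution as an argument,
# not just a current state and a value function.
import random

def behind(current_state, value_function, schedule_distribution, n_samples=10_000):
    # Fraction of plausible schedules under which you'd be further along than you are now.
    now = value_function(current_state)
    draws = [value_function(schedule_distribution()) for _ in range(n_samples)]
    return sum(d > now for d in draws) / n_samples

# E.g. "behind" relative to a fuzzy expectation of where you'd be by now:
print(behind(5.0, value_function=lambda x: x,
             schedule_distribution=lambda: random.gauss(6.0, 2.0)))
# ~0.69: behind most, but not all, of the schedules you might reasonably have expected.
```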
I changed my mind; at least in the case of my sharing information with you, if you were perfectly trustworthy you’d totally just defer to my beliefs so as to not make me worse off as a result. But, as you said, plausibly even in this easy case being perfect is way too hobbling for humans ’cause of infohazards.
Oliver said “The promise that Mikhail asked me to make was, as far as I understood it, to ‘not use any of the information in the conversation in any kind of adversarial way towards the people who the information is about’.”.
Oliver understood you to be asking him not to use the information to hurt anyone involved, which is way more restrictive, and in fact impossible for a human to do perfectly.
Unless he meant something more specific by “any kind of adversarial way”, in which case the promise wouldn’t get you what you want.
If you meant the reasonable thing, and said it clearly, I agree Oliver’s misunderstanding is surprising and probably symptomatic of not reading planecrash.
I agree that promise is overly restrictive.
‘Don’t make my helping you have been a bad idea for me’ is a more reasonable version, but I assume you’re already doing that in your expectation, and it makes sense for different people to take the other’s expectation into account to different degrees for this purpose.
I agree none of this is relevant to anything, I was just looking for intrinsically interesting thoughts about optimal chess.
I thought at least CDT could be approximated pretty well with a bounded variant; causal reasoning is a normal thing to do. FDT is harder, but some humans seem to find it a useful perspective, so presumably you can have algorithms meaningfully closer or further, and that is a useful proxy for something.
Actually never mind, I have no experience with the formalisms.
I guess “choose the move that maximises your expected value” is technically compatible with FDT, you’re right.
It seems like the obvious way to describe what CDT does, and a really unnatural way to describe what FDT does, so I got confused.
Your description of EVGOO is incorrect; you describe a Causal Decision Theory algorithm, but (assuming the opponent also knows your strategy ’cause otherwise you’re cheating) what you want is LDT.
(Assuming they only see each other’s policy for that game, so an agent acting as e.g. CDT is indistinguishable from real CDT, then LDT is optimal even against such fantastic pathological opponents as “Minimax if my opponent looks like it’s following the algorithm that you the reader are hoping is optimal, otherwise resign” (or, if they can see each other’s policy for the whole universe of agents you’re testing, then LDT at least gets the maximum aggregate score).)
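Concretely, the CDT-flavored reading of “choose the move that maximises your expected value” is something like this toy sketch (hypothetical names, not a real chess API); the LDT version would additionally account for how its own policy appears inside the opponent’s model, which this doesn’t:

```python
# Toy sketch of expected-value-greedy move choice, read the CDT way: the opponent's
# reply distribution is treated as fixed, not as a function of our own policy.
# (`legal_moves`, `opponent_reply_dist`, and `value` are hypothetical stand-ins.)

def ev_greedy_move(state, legal_moves, opponent_reply_dist, value):
    def expected_value(move):
        return sum(prob * value(state, move, reply)
                   for reply, prob in opponent_reply_dist(state, move).items())
    return max(legal_moves(state), key=expected_value)
```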
There are two ways this sort of “trade” can’t be made:
One site is already maximally specialized. For instance, if Zion is already fully specialized in growing apples, then there are no further banana or coconut groves to replace with apple trees.
The two sites trade off in exactly the same ratios. For instance, Xenia and Zion both trade off apples:bananas at a ratio of 1:0.5, so we can’t achieve a pareto gain with a little more specialization in those two fruits between those two sites.
If Zion’s fully specialized in growing apples, it can still replace apples with other things.
Note that “multiple goals” might really mean “multiple sub-goals”—e.g. Fruit Co might ultimately want to maximize profit, but producing more apples is a subgoal, producing more bananas is another subgoal, etc.
In that case I think utility is basically linear in each fruit, which makes it a bad example ’cause you don’t need comparative advantage for that. (At least for me, an example of using a concept doesn’t help much with remembering it unless the concept is genuinely helpful to me in that example.)
IIUC it’s only useful for subgoals when utility’s sublinear in them.
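To illustrate with made-up numbers (mine, not the post’s): give the two sites different apple:banana tradeoff ratios and a sublinear goal, and specializing along comparative advantage is a Pareto gain; make the goal linear and each site just grows its single best fruit, with no comparative-advantage reasoning needed.

```python
# Toy illustration (made-up numbers): comparative advantage only does work here
# because the goal is sublinear (a balanced basket), per the point above.

# Each site splits one unit of land between apples and bananas.
# Per unit of land: Xenia yields 1 apple or 2 bananas; Zion yields 1 apple or 0.5 bananas.
# So Zion has the comparative advantage in apples, Xenia in bananas.

def produce(apple_land_xenia, apple_land_zion):
    apples  = 1.0 * apple_land_xenia + 1.0 * apple_land_zion
    bananas = 2.0 * (1 - apple_land_xenia) + 0.5 * (1 - apple_land_zion)
    return apples, bananas

balanced = lambda a, b: min(a, b)  # sublinear goal: a balanced basket
linear   = lambda a, b: a + b      # linear goal: total fruit, any kind

# No specialization: both sites split land 50/50.
print(produce(0.5, 0.5), balanced(*produce(0.5, 0.5)))    # (1.0, 1.25) -> 1.0
# Specialize along comparative advantage: Zion all apples, Xenia mostly bananas.
print(produce(0.25, 1.0), balanced(*produce(0.25, 1.0)))  # (1.25, 1.5) -> 1.25, a Pareto gain
# Under the linear goal, the best plan ignores the cross-site comparison entirely:
# each site just grows its own highest-yield fruit (Xenia bananas, Zion apples).
print(linear(*produce(0.0, 1.0)))                         # 3.0
```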
“the Palestinians get control of Palestine, or the Israelis maintain control of Israel”
I think in these cases opposing ASIs work together to maintain the existence of the disputed land and/or people, and use RNG to decide who gets control.
Of course zero-sum conflicts do exist, but IIUC only in cases where goals are exactly opposed (at least between just two ASIs).
A median earthling on another world would make lots and lots of errors in imagining Earth, and I think it would make sense to be biased toward errors that make things more legible.
Like how, in planecrash, it’s said that smarter-than-average dath ilani doing the exercise still have their medianworld share dath ilan’s high-end of intelligence, since “what kinds of innovations might superhuman geniuses make?” is not at all the point of the exercise. (Of course Eliezer ignored that rule...)
My previous statements are technically correct, and IMO mostly make a correct point in context (that Truman had not realized, at the time, the immediate consequences of his decision), but are somewhat misleading. Thanks.
The process was still stupid, and not what Truman would have preferred. Truman was surprised and disturbed by the second bomb being dropped so quickly. But it seems like it wouldn’t have been too hard for him to anticipate and prevent this outcome, if he had been paying more attention (the same way he thought Hiroshima was a military base due to his own deficit of curiosity); I hadn’t realized that before, thanks.