If you can get that or 2050 equally well from yelling “Biological Anchoring”, why not admit that the intuition comes first and then you hunt around for parameters you like? This doesn’t sound like good methodology to me.
Eliezer Yudkowsky
Is your take “Use these different parameters and you get AGI in 2028 with the current methods”?
I think OpenPhil was guided by Cotra’s estimate and promoted that estimate. If they’d labeled it: “Epistemic status: Obviously wrong but maybe somebody builds on it someday” then it would have had a different impact and probably not one I found objectionable.
Separately, I can’t imagine how you could build something not-BS on that foundation and if people are using it to advocate for short timelines then I probably regard that argument as BS and invalid as well.
Will MacAskill could serve as an exemplar. More broadly I’m thinking of people who might have called themselves ‘longtermists’ or who hybridized Bostrom with Peter Singer.
I again don’t consider this a helpful thing to say on a sinking ship when somebody is trying to organize passengers getting to the lifeboats.
Especially if your definition of “AI takeover” is such as to include lots of good possibilities as well as bad ones; maybe the iceberg rockets your ship to the destination sooner and provides all the passengers with free iced drinks, who can say?
You can do better by saying “I don’t know” than by saying a bunch of wrong stuff. My long reply to Cotra was, “You don’t know, I don’t know, your premises are clearly false, and if you insist on my being Bayesian and providing a direction of predictable error when I claim predictable error, then fine: your timelines are too long.”
People ask me questions. I answer them honestly, not least because I don’t have the skill to say “I’m not answering that” without it sending some completely different set of messages. Saying a bunch of stuff in private without giving anyone a chance to respond to what I’m guessing about them is deontologically weighed against by my rules, though not forbidden depending on circumstances. I do not do this in hopes that any good thing results, but then acts with good consequences are few and far between in any case, these days.
Why, that’s my job too! But it’s a very different job depending on whether you consider it an indispensable requirement to have people coming away with a roughly accurate picture of reality, or if your job is to be an entertainer.
I think if they sponsored Cotra’s work and cited it, this reflects badly on them. More on them than on Cotra, really; I am not a fan of the theory that you blame the people who were selected to have an opinion or incentivized to have an opinion, so much as the people who did the selection and incentivization. See https://www.lesswrong.com/posts/ax695frGJEzGxFBK4/biology-inspired-agi-timelines-the-trick-that-never-works, which I think stands out as clearly correct in retrospect, for why their analysis was obviously wrong at the time. And I did in that case take the trouble to explain why their whole complicated analysis was bogus, and my model is that this clearly-correct-in-retrospect critique had roughly zero impact or effect on OpenPhil; and that is what I expected and predicted in advance, which is why I did not spend more effort trying to redeem an organization I modeled as irredeemably broken.
I expect it’s a combination of selection effects and researchers knowing implicitly where their bread is buttered; I have no particular estimate of the relative share of these effects, except that they are jointly sufficient that, e.g., a grantmaker can hire what advertises itself as a group of superforecasters, and get back 1% probability on AI IMO gold by 2025.
Well, there sure is a simple story for how it looked from outside. What’s the complicated real truth that you only get to know about from the inside, where everything is, like, not ignorantly handwaved off as incredibly standard bureaucratic organizational dynamics of grantees telling the grantmaker what it wants to hear?
If you imagine the very serious person wearing the expensive suit saying, “But of course we must prepare for cases where the ship sinks sooner and there is a possibility of some passengers drowning”, whether or not this is Very Exculpatory depends on the counterfactual for what happens if the guy is not there. I think OpenPhil imagines that if they are not there, even fewer people take MIRI seriously. To me this is not clear and it looks like the only thing that broke the logjam was ChatGPT, after which the weight and momentum of OpenPhil views was strongly net negative.
One issue among others is that the kind of work you end up funding when the funding bureaucrats go to the funding-seekers and say, “Well, we mostly think this is many years out and won’t kill everyone, but, you know, just in case, we thought we’d fund you to write papers about it” tends to be papers that make net negative contributions.
...amazing.
I would of course have a different response to someone who asked the incredibly different question, “Any learnable tricks for not feeling like crap while the world ends?”
(This could be seen as the theme of a couple of other brief talks at the Solstice. I don’t have a 30-second answer that doesn’t rely on context, and don’t consider myself much of an expert on that question, versus the part of the problem that is the constraint of maintaining epistemic health while you do whatever. That said, being less completely unwilling to spend small or even medium amounts of money made a difference to my life, and so did beginning a romantic relationship in the frame of mind that we might all be dead soon and therefore I ought to do more fun things and worry less about preserving the relationship, which led to a much stronger relationship than I would have gotten from the wrong things I otherwise do by default.)
It’s fancy and indirect, compared to getting out of bed.
They didn’t need to deal with social media informing them that they need to be traumatized now, and form a conditional prediction of extreme and self-destructive behavior later.
That does sound similar to me! But I haven’t gotten a lot of mileage out of TAPs, and if you’re referring to some specific advanced version of it, maybe I’m off. But the basic concept of mentally rehearsing the trigger, the intended action, and (in some variations) the later sequence of events leading up to an outcome you feel is good, sure sounds to me like trying to load a plan into a predictorlike thing that has been repurposed to output plan images.
This is just straight-up planning and doesn’t require doing weird gymnastics to deal with a biological brain’s broken type system.
Nope. Breaks the firewall. Exactly as insane.
Beliefs are for being true. Use them for nothing else.
If you need a good thing to happen, use a plan for that.
When somebody at least pretending to humility says, “Well, I think this here estimator is the best thing we have for anchoring a median estimate”, and I stroll over and proclaim, “Well I think that’s invalid”, I do think there is a certain justice in them demanding of me, “Well, would you at least like to say then in what direction my expectation seems to you to be predictably mistaken?”