The post is about what “adulthood” means for goal engines, and where the vector from baby to adulthood points. Current AI safety work is only relevant to a “system that is still sufficiently baby-like”. But we should expect goal engines to be extremely mature. When you are negotiating with a human adult who is trying to maximize their company’s profit, there is no need to study the phenotype of the 3-month embryo that once scaffolded that human.
Liron
Interview with Steven Byrnes on His Mainline Takeoff Scenario
Humans at 1000x speed still retain many properties of immature goal engines, full of abstraction-breaking silly quirks, the same way ENIAC at 1000x speed can still get a literal bug (like a moth) in it. The direction of progress after ENIAC did not point toward ENIAC at 1000x speed.
P.S. Thanks for being the only one so far to engage with my claim.
I took the liberty of exaggerating “a 2-digit number of people” into a “nonexistent field” :)
The idea explained in the post, in a way that I don’t know of any other reference explaining, is that there is a disconnect between the expected character of a mature goal engine and the nature of the tools being developed under the name “AI safety”.
The Facade of AI Safety Will Crumble
If the post itself was ambiguous, I think there has been a ton of evidence in the 3+ years since that post that this community has a VERY non-fatalistic attitude about the situation.
I interpreted Eliezer’s message in that piece not as it being inevitable, but as there being many layers of problems that would need to be fixed, but with very little evidence that most of the layers had much hope of being fixed. In my view Eliezer has consistently been nimble about updating on evidence, but he thinks the path to extinction is vastly overdetermined unless many surprising updates come his way.
How Dario Amodei’s “The Adolescence of Technology” Delegitimizes AI X-Risk Concerns
PSA to those with flat or otherwise imperfect feet:
I finally got custom-made orthotics, and they’re very different from off-the-shelf orthotics, with way more correction than I expected, in a good way. Highly recommended!
Amazing post. Meta-level it’s very well argued and good-faith, and object-level these arguments are spot on IMO, especially how you unpacked the details of exactly how his post falls victim to the Multiple Stage Fallacy.
I debated BB a couple days ago for an upcoming episode of Doom Debates, and while I warned him that MSF in complex domains is a huge trap that makes arguments like his almost never work, I wasn’t able to pin down the problem with his stages the way you did here.
I’m really happy with the meta-level quality of BB’s original post and your reply (and with BB’s conduct in our Doom Debate). I wish discourse of this caliber among the various AI x-risk positions was much more common.
Here’s my recent interview with Tsvi about the Berkeley Genomics Project. I asked him what I think are the cruxy questions about whether it’s worth supporting, and I think the conclusion is yes!
I suspect the real disagreement between you and Anthropic-blamers like me is downstream of a P(Doom) disagreement (where yours is pretty low and others’ is high), since I’ve often seen that be the root of disagreements between smart people.
Realistically/pragmatically balanced moves in a lowish-P(Doom) world are unacceptable in a high-P(Doom) world.
AI Corrigibility Debate: Max Harms vs. Jeremy Gillen
I just noticed the LessWrong site loads a lot faster than it used to. Very cool!
Makes sense. Only problem is, bear fat + sugar + salt seems qualitatively pretty similar to ice cream. It doesn’t seem like it neglected the qualitative spirit of why ice cream is good, which just adds to the fine parsing needed to get value out of this.
The fact still stands that ice cream is what we mass produce and send to grocery stores.
Yeah, I guess this exact observation is critical to making Eliezer’s analogy accurate.
IMO “predicting that bear fat with honey and salt tastes good” is analogous to “predicting that harnessing a star’s power will be an optimization target” — something we probably can successfully do.
And “predicting bear fat (or some kind of rendered animal fat) with honey and salt will be a popular treat”—the thing we couldn’t have done a priori—is analogous to “predicting solar-to-electricity generator panels will be a popular fixture on many planets” (since the details probably will turn out to have some unpredictable twists), and also to “predicting that making humans satisfied with outcomes will be an optimization target for AIs in the production environment as a result of their training”.
I think this analogy is probably right, but the sense in which it’s right seems sufficiently non-obvious/detailed/finicky that I don’t think we can expect most people to get it?
Plus IMO it further undermines the pedagogical value of this example to observe that a drinkable form of ice cream (shakes) is also popular, plus there’s gelato / frozen yogurt / soft serve, and then thick sweet yogurts and popsicles… it’s a pretty continuous treat-fitness landscape.
I do think Eliezer is importantly right that the exact market-winning peak in this landscape would be hard to predict a priori. But is the hardness also explained by the peak being dependent on chaotic historical/cultural forces?
And that’s why I personally don’t bring up the bear fat thing in my AI danger explanations.
Seems like the rapid-fire nature of an InkHaven writing sprint is a poor fit for a public post under a personally-charged summary bullet like “Oliver puts personal conflict ahead of shared goals”.
High-quality discourse means making an effort to give people the benefit of the doubt when making claims about their character. It’s worth taking time to carefully follow our rationalist norms of epistemic rigor, productive discourse, and personal charity.

I’d expect a high-evidence post about a very non-consensus topic like this to start out in a more norm-calibrated and self-aware epistemic tone, e.g. “I have concerns about Oliver’s decisionmaking as leader of Lightcone based on a pattern of incidents I’ve witnessed in his personal conflicts (detailed below)”.
Maybe Lightcone Infrastructure can just allow earmarking donations for LessWrong, if enough people care about that criticism.
Thanks for that. Should be fixed now.