Thanks for that. Should be fixed now.
Liron
The post is about what “adulthood” means for goal engines, and where the vector from baby to adulthood points. Current AI safety work is only relevant to a “system that is still sufficiently baby-like”. But we should expect goal engines to be extremely mature. When you are negotiating with a human adult who is trying to maximize their company’s profit, there is no need to study the phenotype of the 3-month embryo that once scaffolded that human.
Humans at 1000x speed still retain many properties of immature goal engines, full of abstraction-breaking silly quirks, the same way ENIAC at 1000x speed can still get a literal bug (like a moth) in it. The direction of progress after ENIAC did not point toward ENIAC at 1000x speed.
P.S. Thanks for being the only one so far to engage with my claim.
I took the liberty to exaggerate “a 2-digit number of people” as a “nonexistent field” :)
The idea explained in the post, in a way that I don’t know what other reference already explains, is that there is a disconnect between the expected character of a mature goal engine, and the nature of the tools that are being developed under the name “AI safety”.
If the post itself was ambiguous, I think there has been a ton of evidence in the 3+ years since that post that this community has a VERY non-fatalistic attitude about the situation.
I interpreted Eliezer’s message in that piece not as it being inevitable, but as there being many layers of problems that would need to be fixed, but with very little evidence that most of the layers had much hope of being fixed. In my view Eliezer has consistently been nimble about updating on evidence, but he thinks the path to extinction is vastly overdetermined unless many surprising updates come his way.
PSA to those with flat or otherwise imperfect feet:
I finally got custom-made orthotics, and they’re very different from off-the-shelf orthotics, with way more correction than I expected, in a good way. Highly recommended!
Amazing post. Meta-level it’s very well argued and good-faith, and object-level these arguments are spot on IMO, especially how you unpacked the details of exactly how his post falls victim to the Multiple Stage Fallacy.
I debated BB a couple days ago for an upcoming episode of Doom Debates, and while I warned him that MSF in complex domains is a huge trap that makes arguments like his almost never work, I wasn’t able to pin down the problem with his stages the way you did here.
I’m really happy with the meta-level quality of BB’s original post and your reply (and with BB’s conduct in our Doom Debate). I wish discourse of this caliber among the various AI x-risk positions were much more common.
Here’s my recent interview with Tsvi about the Berkeley Genomics Project. I asked him what I think are cruxy questions about whether it’s worth supporting, and I think the conclusion is yes!
I suspect the real disagreement between you and Anthropic-blamers like me is downstream of a P(Doom) disagreement (where yours is pretty low and others’ is high), since I’ve seen that this is often what’s going on when smart people disagree.
Realistically/pragmatically balanced moves in a lowish-P(Doom) world are unacceptable in a high-P(Doom) world.
I just noticed the LessWrong site loads a lot faster than it used to. Very cool!
Makes sense. Only problem is, bear fat + sugar + salt seems qualitatively pretty similar to ice cream. It doesn’t seem like it neglected the qualitative spirit of why ice cream is good, which just adds to the fine parsing needed to get value out of this.
The fact still stands that ice cream is what we mass produce and send to grocery stores.
Yeah, I guess this exact observation is critical to making Eliezer’s analogy accurate.
IMO “predicting that bear fat with honey and salt tastes good” is analogous to “predicting that harnessing a star’s power will be an optimization target” — something we probably can successfully do.
And “predicting bear fat (or some kind of rendered animal fat) with honey and salt will be a popular treat”—the thing we couldn’t have done a-priori—is analogous to “predicting solar-to-electricity generator panels will be a popular fixture on many planets” (since the details probably will turn out to have some unpredictable twists), and also to “predicting that making humans satisfied with outcomes will be an optimization target for AIs in the production environment as a result of their training”.
I think this analogy is probably right, but the sense in which it’s right seems sufficiently non-obvious/detailed/finicky that I don’t think we can expect most people to get it?
Plus IMO it further undermines the pedagogical value of this example to observe that a drinkable form of ice cream (shakes) is also popular, plus there’s gelato / frozen yogurt / soft serve, and then thick sweet yogurts and popsicles… it’s a pretty continuous treat-fitness landscape.
I do think Eliezer is importantly right that the exact peak market-winning point in this landscape would be hard to predict a-priori. But is the hardness also explained by the peak being dependent on chaotic historical/cultural forces?
And that’s why I personally don’t bring up the bear fat thing in my AI danger explanations.
Seems like the rapid-fire nature of an InkHaven writing sprint is a poor fit for a public post under a personally-charged summary bullet like “Oliver puts personal conflict ahead of shared goals”.
High-quality discourse means making an effort to give people the benefit of the doubt when making claims about their character. It’s worth taking time to carefully follow our rationalist norms of epistemic rigor, productive discourse, and personal charity. I’d expect a high-evidence post about a very non-consensus topic like this to start out in a more norm-calibrated and self-aware epistemic tone, e.g. “I have concerns about Oliver’s decisionmaking as leader of Lightcone based on a pattern of incidents I’ve witnessed in his personal conflicts (detailed below)”.
Maybe Lightcone Infrastructure can just allow earmarking donations for LessWrong, if enough people care about that criticism.
Thanks. The reactions to such a post would constitute a stronger common knowledge signal of community agreement with the book (to the degree that such agreement is in fact present in the community).
I wonder if it would be better to make the agree-voting anonymous (like LW post voting) or with people’s names attached to their votes (like react-voting).
I’m sure this is going too far for you, but I also personally wish LW could go even further toward turning a sufficient amount of mutual support expressed in that form (if it turns out to exist) into a frontpage that actually looks like what most humans expect a supportive front page around a big event to look like (moreso than having a banner mentioning it and discussion mentioning it).
> nor is my argument even “mutual knowledge is bad”.
For example, I really like the LessWrong surveys! I take those every year!
What’s the minimally modified version of posting this “Statement of Support for IABIED” you’d feel good about? Presumably the upper bound for your desired level of modification would be if we included a yearly survey question about whether people agree with the quoted central claim from the book?
Again, the separate tweet about LW crab-bucketing in my Twitter thread wasn’t meant as a response to you in this LW thread.
I agree that “room for disagreement does not imply any disagreement is valid”, and am not seeing anything left to respond to on that point.
Stuck an iPhone camera mount to the wall of my home gym to record my form on exercises (trap bar squats, pull-ups, glute bridges, etc) then share frame screenshots with GPT-5.4 Thinking and get form analysis. This has been a game changer for me because I have hypermobile ligaments and if I don’t have perfect form I’m getting a much worse (even net negative) result, but I suspect everyone can benefit from trying this a bit.
Obvious but various health/medical problem solving, like “hey my eyes are constantly kind of dry, it’s not an emergency but how would I actually go about not having this problem”
In general I think there’s a lot of value in consciously building a habit of constantly taking advice, since high quality expert advice is now available on everything.