Thanks for writing this, Buck. I’m not going to try to reply to your whole post, because I think some of it is stuff I should chew on for longer and see whether I agree with it. But going through some of your points:
I definitely apologize for making it sound like I was making a harsher criticism of (the relevant parts of) EA than I intended. My tweet was originally written as a quick follow-up comment to someone who asked why I thought EA’s impact on AI x-risk was only ~55% likely to be positive. I turned it into a top-level tweet because I didn’t want to hide it deep in an existing discussion, but this was an error given I didn’t add extra context.
I also apologize for anything I said that made it sound like I was universally criticizing past or present Open Phil / cG staff (or centrally basing my views on first-hand conversations, for that matter). I already believed that tons of past and present rank-and-file OP/cG staff have very reasonable views, and I happily further update in that direction based on your and Oliver’s statements to that effect (e.g., Ollie’s “I have since updated that more people who are a level below Alexander, Dustin and Dario have more reasonable beliefs”).
I agree that my characterization of “Dario and a cluster of Open-Phil-ish people” was phrased in a needlessly confusing and sloppy way. I wanted to talk about a mix of ‘present-day views that seem to be endorsed by Dario and some other key figures’ and ‘general tendencies and memes that seem pretty widespread and that seem suspiciously related to choices EA leadership made many years ago’, but blurring these together is unnecessarily confusing. Also, it didn’t help that I was sarcastically embedding my criticisms into my summaries of the views.

Insofar as my broad criticism of EA cultural trends/memes is correct (which I think it substantially is), I still feel a fair bit of uncertainty about how to divvy up responsibility between more Open-Phil-ish people, more Oxford-ish people, MIRI / the rats, etc. And of course, some of the problem may stem from broader social or demographic factors that no EA leaders tried to engineer, and that even run counter to how leadership has tried to optimize. (I too remember the early speeches themed around “Keep EA Weird”, the early EA-leader conversations fretting about overly naive EA consequentialism, etc.)
When Freeman Dyson originally said “Dyson sphere”, I believe he had a Dyson swarm in mind, so it strikes me as oddly unfair to Dyson to treat “spheres” and “swarms” as disjoint. But “swarm” might be better language, just to avoid the misconception that a Dyson sphere is supposed to be a single solid structure.
Quoting from a follow-up conversation I had with Buck after this exchange:
__________________________________________________________
Buck: So following up on your Will post: It sounds like you genuinely didn’t understand that Will is worried about AI takeover risk and thinks we should try to avert it, including by regulation. Is that right?
I’m just so confused here. I thought your description of his views was a ridiculous straw man, and at first I thought you were just being some combination of dishonest and rhetorically sloppy, but now my guess is you’re genuinely confused about what he thinks?
(Happy to call briefly if that would be easier. I’m interested in talking about this a bit because I was shocked by your post and want to prevent similar things happening in the future if it’s easy to do so.)
Rob: I was mostly just going off Will’s mini-review; I saw that he briefly mentioned “governance agendas” but otherwise everything he said seemed to me to fit ‘has some worries that AI could go poorly, but isn’t too worried, and sees the current status quo as basically good—alignment is going great, the front-running labs are sensible, capabilities and alignment will by default advance in a way that lets us ratchet the two up safely without needing to do anything special or novel’
so I assumed if he was worried, it was mainly about things that might disrupt that status quo
Buck: what about his line “I think the risk of misaligned AI takeover is enormously important.”
alignment is going great, the front-running labs are sensible
This is not my understanding of what Will thinks.
[added by Buck later: And also I don’t think it’s an accurate reading of the text.]
Rob: 🙏
that’s helpful to know!
Buck: I am not confident I know exactly what Will thinks here. But my understanding is that his position is something like: The situation is pretty scary (hence him saying “I think the risk of misaligned AI takeover is enormously important.”). There is maybe a 5% overall chance of AI takeover, which is a bad and overly large number. The AI companies are reckless and incompetent with respect to these risks, compared to what you’d hope given the stakes. Rushing through superintelligence would be extremely dangerous, for AI takeover and other reasons.
[added/edited by Buck later: I interpret the review as saying:
He thinks the probability of AI takeover and of human extinction due to AI takeover is substantially lower than you do.
This is not because he thinks “AI companies/humanity are very thoughtful about mitigating risk from misaligned superintelligence, and they are clearly on track to develop techniques that will give developers justified confidence that AIs powerful enough that their misalignment poses risk of AI takeover are aligned”. It’s because he is more optimistic about what will happen if AI companies and humanity are not very thoughtful and competent.
He thinks that the arguments given in the book have important weaknesses.
He disagrees with the strategic implications of the worldview described in the book.
For context, I am less optimistic than he is, but I directionally agree with him on both points.]
In general, MIRI people often misunderstand someone saying, “I think X will probably be fine because of consideration Y” to mean “I think that plan Y guarantees that X will be fine”. And often, Y is not a plan at all, it’s just some purported feature of the world.
Another case is people saying “I think that argument A for why X will go badly fails to engage with counterargument Y”, which MIRI people round off to “X is guaranteed to go fine because of my plan Y”
Rob: my current guess is that my error is downstream of (a) not having enough context from talking to Will or seeing enough other AI Will-writing, and (b) Will playing down some of his worries in the review
I think I was overconfident in my main guess, but I don’t think it would have been easy for me to have Reality as my main guess instead
Buck: When I asked the AIs, they thought that your summary of Will’s review was inaccurate and unfair, based just on his review.
It might be helpful to try checking this way in the future.
I’m still interested in how you interpreted his line “I think the risk of misaligned AI takeover is enormously important.”
Rob: I think that line didn’t stick out to me at all / it seemed open to different interpretations, and mainly trying to tell the reader ‘mentally associate me with some team other than the Full Takeover Skeptics (eg I’m not LeCun), to give extra force to my claim that the book’s not good’.
like, I still associate Will to some degree with the past version of himself who was mostly unconcerned about near-term catastrophes and thought EA’s mission should be to slowly nudge long-term social trends. “enormously important” from my perspective might have been a polite way of saying ‘it’s 1 / 10,000 likely to happen, but that’s still one of the most serious risks we face as a society’
it sounds like Will’s views have changed a lot, but insofar as I was anchored to ‘this is someone who is known to have oddly optimistic views and everything-will-be-pretty-OK views about the world’ it was harder for me to see what it sounds like you saw in the mini-review
(I say this mainly as autobiography since you seemed interested in debugging how this happened; not as ‘therefore I was justified/right’)
Buck: Ok that makes sense
Man, how bizarre
Claude had basically the same impression of your summary as I did
Which makes me feel like this isn’t just me having more context as a result of knowing Will and talking to him about this stuff.
Rob: I mean, I still expect most people who read Will’s review to directionally update the way I did—I don’t expect them to infer things like
“The situation is pretty scary.”
“The AI companies are reckless and incompetent wrt these risks.”
“Rushing through superintelligence would be extremely dangerous, for AI takeover and other reasons.”
or ‘a lot of MIRI-ish proposals like compute governance are a great idea’ (if he thinks that)
or ‘if the political tractability looked 10-20x better then it would likely be worth seriously looking into a global shutdown immediately’ (if he thinks something like that??)
I think it was reasonable for me to be confused about what he thinks on those fronts and to press him on it, since I expect his review to directionally make people waaaaaaay more misinformed and confused about the state of the world
and I think some of his statements don’t make sense / have big unresolved tensions, and a lot of his arguments were bad and misinformed. (not that him strawmanning MIRI a dozen different ways excuses me misrepresenting his view; but I still find it funny how uninterested people apparently are in the ‘strawmanning MIRI’ side of things? maybe they see no need to back me up on the places where my post was correct, because they assume the Light of Truth will shine through and persuade people in those cases, so the only important intervention is to correct errors in the post?)
but I should have drawn out those tensions by posing a bunch of dilemmas and saying stuff like ‘seems like if you believe W, then bad consequence X; and if you believe Y, then bad consequence Z. which horn of the dilemma do you choose, so I know what to argue against?’, rather than setting up a best-guess interpretation of what Will was saying (even one with a bunch of ‘this is my best guess’ caveats)
I think Will was being unvirtuously cagey or spin-y about his views, and this doesn’t absolve me of responsibility for trying to read the tea leaves and figure out what he actually thinks about ‘should government ever slow down or halt the race to ASI?’, but it would have been a very easy misinterpretation for him to prevent (if his views are as you suggest)
it sounds like he mostly agrees about the parts of MIRI’s view that we care the most about, eg ‘would a slowdown/halt be good in principle’, ‘is the situation crazy’, ‘are the labs wildly irresponsible’, ‘might we actually want a slowdown/halt at some point’, ‘should govs wake up to this and get very involved’, ‘is a serious part of the risk rogue AI and not just misuse’, ‘should we do extensive compute monitoring’, etc.
it’s not 100% of what we’re pushing but it’s overwhelmingly more important to us than whether the risk is more like 20-50% or more like ‘oh no’
I think most readers wouldn’t come away from Will’s review thinking we agree on any of those points, much less all of them
Buck:
I expect his review to directionally make people waaaaaaay more misinformed and confused about the state of the world
I disagree
and I think some of his statements don’t make sense / have big unresolved tensions, and a lot of his arguments were bad and misinformed.
I think some of his arguments are dubious, but I don’t overall agree with you.
I think Will was being unvirtuously cagey or spin-y about his views, and this doesn’t absolve me of responsibility for trying to read the tea leaves and figure out what he actually thinks about ‘should government ever slow down or halt the race to ASI?’, but it would have been a very easy misinterpretation for him to prevent (if his views are as you suggest)
I disagree for what it’s worth.
it sounds like he mostly agrees about the parts of MIRI’s view that we care the most about, eg ‘would a slowdown/halt be good in principle’, ‘is the situation crazy’, ‘are the labs wildly irresponsible’, ‘might we actually want a slowdown/halt at some point’, ‘should govs wake up to this and get very involved’, ‘is a serious part of the risk rogue AI and not just misuse’, ‘should we do extensive compute monitoring’, etc.
it’s not 100% of what we’re pushing but it’s overwhelmingly more important to us than whether the risk is more like 20-50% or more like ‘oh no’
I think that the book made the choice to center a claim that people like Will and me disagree with: specifically, “With the current trends in AI progress, building superintelligence is overwhelmingly likely to lead to misaligned AIs that kill everyone.”
It’s true that much weaker claims (e.g. all the stuff you have in quotes in your message here) are the main decision-relevant points. But the book chooses to not emphasize them and instead emphasize a much stronger claim that in my opinion and Will’s opinion it fails to justify.
I think it’s reasonable for Will to substantially respond to the claim that you emphasize, rather than different claims that you could have chosen to emphasize.
I think a general issue here is that MIRI people seem to me to be responding at a higher simulacrum level than the one at which criticisms of the book are operating. Here you did that partly because you interpreted Will as himself operating at a higher simulacrum level than the plain reading of the text.
I think it’s a difficult situation when someone makes criticisms that, on the surface level, look like straightforward object level criticisms, but that you suspect are motivated by a desire to signal disagreement. I think it is good to default to responding just on the object level most of the time, but I agree there are costs to that strategy.
And if you want to talk about the higher simulacra levels, I think it’s often best to do so very carefully and in a centralized place, rather than in a response to a particular person.
I also agree with Habryka’s comment that Will chose a poor phrasing of his position on regulation.
Rob: If we agree about most of the decision-relevant claims (and we agree about which claims are decision-relevant), then I think it’s 100% reasonable for you and Will to critique less-decision-relevant claims that Eliezer and Nate foregrounded; and I also think it would be smart to emphasize those decision-relevant claims a lot more, so that the world is likely to make better decisions. (And so people’s models are better in general; I think the claims I mentioned are very important for understanding the world too, not just action-relevant.)
I especially think this is a good idea for reviews sent to a hundred thousand people on Twitter. I want a fair bit more of this on LessWrong too, though I can see an argument for different norms on LW; LW is also a place where misunderstandings are less likely, because a lot more people here have context.
Re simulacra levels: I agree that those are good heuristics. For what it’s worth, I still have a much easier time mentally generating a review like Will’s when I imagine the author as someone who disagrees with that long list of claims; I have a harder time understanding how none of those points of agreement came up in the ensuing paragraphs if Will tacitly agreed with me about most of the things I care about.
Possibly it’s just a personality or culture difference; if I wrote “This is a shame, because I think the risk of misaligned AI takeover is enormously important” (especially in the larger context of the post it occurred in) I might not mean something all that strong (a lot of things in life can be called “enormously important” from one perspective or another); but maybe that’s the Oxford-philosopher way of saying something closer to “This situation is insane, we’re playing Russian roulette with the world, this is an almost unprecedented emergency.”
(Flagging that this is all still speculative because Will hasn’t personally confirmed what his views are someplace I can see it. I’ve been mostly deferring to you, Oliver, etc. about what kinds of positions Will is likely to endorse, but my actual view is a bit more uncertain than it may sound above.)
(I also would have felt dramatically more positive about Will’s review if he’d kept everything else unchanged but just added the sentence “I definitely think it will be extremely valuable to have the option to slow down AI development in the future.” anywhere in his review. XP If he agrees with that sentence, anyway!)
I definitely think it will be extremely valuable to have the option to slow down AI development in the future.
What are the mechanisms you find promising for causing this to occur? If we all agree on “it will be extremely valuable to have the option to slow down AI development in the future”, then I feel silly for arguing about other things; it seems like the first priority should be to talk about ways to achieve that shared goal, whatever else we disagree about.
(Unless there’s a fast/easy way to resolve those disagreements, of course.)
banning anyone from having more than 8 GPUs
I assume you know this, but I’ll say out loud that this is a straw man, since I expect this to be a common misunderstanding. The book suggests “[more than] eight of the most advanced GPUs from 2024” as a possible threshold at which international monitoring efforts come online and the world starts caring whether you’re using those GPUs to push the world closer to superintelligence (insofar as that’s possible).
“More than 8 GPUs” is also potentially confusing because people are likely to anchor to consumer hardware. From the book’s online appendices:
The most advanced AI chips are also quite specialized, so tracking and monitoring them would have few spillover effects. NVIDIA’s H100 chip, one of the most common AI chips as of mid-2025, costs around $30,000 per chip and is designed to be run in a datacenter due to its cooling and power requirements. These chips are optimized for doing the numerical operations involved in training and running AIs, and they’re typically tens to thousands of times more performant at AI workloads than standard computers (consumer CPUs).
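To put rough numbers on that “tens to thousands of times” figure, here’s a quick back-of-the-envelope sketch. The FLOP/s values are ballpark assumptions I’m plugging in for illustration; they aren’t numbers from the book:

```python
# Back-of-the-envelope check on the "tens to thousands of times" claim.
# Both throughput numbers below are rough ballpark assumptions, not
# authoritative specs.

h100_flops = 1e15  # assumed: ~1,000 TFLOP/s dense FP16 for an H100 (order of magnitude)
cpu_flops = 5e11   # assumed: ~0.5 TFLOP/s for a typical consumer CPU

ratio = h100_flops / cpu_flops
print(f"H100 vs. consumer CPU at AI workloads: roughly {ratio:,.0f}x")
# -> roughly 2,000x, which lands comfortably inside "tens to thousands of times"
```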
I wasn’t exclusively looking at that line; I was also assuming that if Will liked some of the book’s core policy proposals but disliked others, then he probably wouldn’t have expressed such a strong blanket rejection. And I was looking at Will’s proposal here:
[IABIED skips over] what I see as the crucial period, where we move from the human-ish range to strong superintelligence[1]. This is crucial because it’s both the period where we can harness potentially vast quantities of AI labour to help us with the alignment of the next generation of models, and because it’s the point at which we’ll get a much better insight into what the first superintelligent systems will be like. The right picture to have is not “can humans align strong superintelligence”, it’s “can humans align or control AGI-”, then “can {humans and AGI-} align or control AGI” then “can {humans and AGI- and AGI} align AGI+” and so on.
This certainly sounds like a proposal that we advance AI as fast as possible, so that we can reach the point where productive alignment research is possible sooner.
The next paragraph then talks about “a gradual ramp-up to superintelligence”, which makes it sound like Will at least wants us to race to the level of superintelligence as quickly as possible, i.e., he wants the chain of humans-and-AIs-aligning-stronger-AIs to go at least that far:
Elsewhere, EY argues that the discontinuity question doesn’t matter, because preventing AI takeover is still a ‘first try or die’ dynamic, so having a gradual ramp-up to superintelligence is of little or no value. I think that’s misguided.
… Unless he thinks this “gradual ramp-up” should be achieved via switching over at some point from the natural continuous trendlines he expects from industry, to top-down government-mandated ratcheting up of a capability limit? But I’d be surprised if that’s what he had in mind, given the rest of his comment.
Wanting the world to race to build superintelligence as soon as possible also seems like it would be a not-that-surprising implication of his labs-have-alignment-in-the-bag claims.
And although it’s not totally clear to me how seriously he’s taking this hypothetical (versus whether he mainly intends it as a proof of concept), he does propose that we could build a superintelligent paperclip maximizer and plausibly be totally fine (because it’s risk averse and promise-keeping), and his response to “Maybe we won’t be able to make deals with AIs?” is:
I agree that’s a worry; but then the right response is to make sure that we can.
Not “in that case maybe we shouldn’t build a misaligned superintelligence”, but “well then we’d sure better solve the honesty problem!”.
All of this together makes me extremely confused if his real view is basically just “I agree with most of MIRI’s policy proposals but I think we shouldn’t rush to enact a halt or slowdown tomorrow”.
If his view is closer to that, then that’s great news from my perspective, and I apologize for the misunderstanding. I was expecting Will to just straightforwardly accept the premises I listed, and for the discussion to proceed from there.
I’ll add a link to your comment at the top of the post so folks can see your response, and if Will clarifies his view I’ll link to that as well.
Twitter says that Will’s tweet has had over a hundred thousand views, so if he’s a lot more pro-compute-governance, pro-slowdown, and/or pro-halt than he sounded in that message, I hope he says loud stuff in the near future to clarify his views to folks!
yeah, I left off this part but Nate also said
[people having trouble separating them] does maybe enhance my sense that the whole community is desperately lacking in nate!courage, if so many people have such trouble distinguishing between “try naming your real worry” and “try being brazen/rude”. (tho ofc part of the phenomenon is me being bad at anticipating reader confusions; the illusion of transparency continues to be a doozy.)
Nate messaged me a thing in chat and I found it helpful and asked if I could copy it over:
fwiw a thing that people seem to me to be consistently missing is the distinction between what i was trying to talk about, namely the advice “have you tried saying what you actually think is the important problem, plainly, even once? ideally without broadcasting signals of how it’s a socially shameful belief to hold?”, and the alternative advice that i was not advocating, namely “have you considered speaking to people in a way that might be described as ‘brazen’ or ‘rude’ depending on who’s doing the describing?”.
for instance, in personal conversation, i’m pretty happy to directly contradict others’ views—and that has nothing to do with this ‘courage’ thing i’m trying to describe. nate!courage is completely compatible with saying “you don’t have to agree with me, mr. senator, but my best understanding of the evidence is [thing i believe]. if ever you’re interested in discussing the reasons in detail, i’d be happy to. and until then, we can work together in areas where our interests overlap.” there are plenty of ways to name your real worry while being especially respectful and polite! nate!courage and politeness are nearly orthogonal axes, on my view.
FWIW, as someone who’s been working pretty closely with Nate for the past ten years (and as someone whose preferred conversational dynamic is pretty warm-and-squishy), I actively enjoy working with the guy and feel positive about our interactions.
(Considering how little cherry-picking they did.)
From my perspective, FWIW, the endorsements we got would have been surprising even if they had been maximally cherry-picked. You usually just can’t find cherries like those.
(That was indeed my first thought when Bernanke said he liked the book; no dice, though.)
Yep. And equally, the blurbs would be a lot less effective if the title were more timid and less stark.
Hearing that a wide range of respected figures endorse a book called If Anyone Builds It, Everyone Dies: Why Superhuman AI Would Kill Us All is a potential “holy shit” moment. If the same figures were endorsing a book with a vaguely inoffensive title like Smarter Than Us or The AI Crucible, it would spark a lot less interest (and concern).
Yeah, I think people usually ignore blurbs, but sometimes blurbs are helpful. I think strong blurbs are unusually likely to be helpful when your book has a title like If Anyone Builds It, Everyone Dies: Why Superhuman AI Would Kill Us All.
Aside from the usual suspects (people like Tegmark), we mostly sent the book to people following the heuristic “would an endorsement from this person be helpful?”, much more so than “do we know that this person would like the book?”. If you’d asked me individually about Church, Schneier, Bernanke, Shanahan, or Spaulding in advance, I’d have put most of my probability on “this person won’t be persuaded by the book (if they read it at all) and will come away strongly disagreeing and not wanting to endorse”. They seemed worth sharing the book with anyway, and then they ended up liking it (at least enough to blurb it) and some very excited MIRI slack messages ensued.
(I’d have expected Eddy to agree with the book, though I wouldn’t have expected him to give a blurb; and I didn’t know Wolfsthal well enough to have an opinion.)
Nate has a blog post coming out in the next few days that will say a bit more about “How filtered is this evidence?” (along with other topics), but my short answer is that we haven’t sent the book to that many people, we’ve mostly sent it to people whose AI opinions we didn’t know much about (and who we’d guess on priors would be skeptical to some degree), and we haven’t gotten many negative reactions at all. (Though we’ve gotten people who just didn’t answer our inquiries, and some of those might have read the book and disliked it enough to not reply.)
Now, how much is that evidence about the correctness of the book? Extremely little!
It might not be much evidence for LWers, who are already steeped in arguments and evidence about AI risk. It should be a lot of evidence for people newer to this topic who start with a skeptical prior. Most books making extreme-sounding (conditional) claims about the future don’t have endorsements from Nobel-winning economists, former White House officials, retired generals, computer security experts, etc. on the back cover.
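As a toy illustration of why the same blurbs can be weak evidence for one audience and strong evidence for another, here’s a minimal Bayes calculation. The prior and likelihood-ratio numbers are invented purely for the example:

```python
# Toy Bayesian update: the same evidence (credible blurbs) produces very
# different updates depending on the reader's starting credence.
# All numbers here are invented for illustration.

def posterior(prior: float, likelihood_ratio: float) -> float:
    """Posterior probability after evidence with the given likelihood ratio."""
    odds = prior / (1 - prior)
    post_odds = odds * likelihood_ratio
    return post_odds / (1 + post_odds)

LR = 5.0  # assumed: these blurbs are 5x likelier if the book's thesis is credible

for prior in (0.01, 0.30, 0.80):
    print(f"prior {prior:.0%} -> posterior {posterior(prior, LR):.0%}")
# prior 1%  -> ~5%   (a fivefold jump for the skeptical newcomer)
# prior 30% -> ~68%
# prior 80% -> ~95%  (little left to learn for someone already mostly persuaded)
```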
We’re still working out some details on the preorder events; we’ll have an announcement with more info on LessWrong, the MIRI Newsletter, and our Twitter in the next few weeks.
You don’t have to do anything special to get invited to preorder-only events. :) In the case of Nate’s LessOnline Q&A, it was a relatively small in-person event for LessOnline attendees who had preordered the book; the main events we have planned for the future will be larger and online, so more people can participate without needing to be in the Bay Area.
(Though we’re considering hosting one or more in-person events at some point in the future; if so, those would be advertised more widely as well.)
“Inventor” is correct!
Hopefully a German pre-order from a local bookstore will make a difference.
Yep, this counts! :)
I wrote a reply to Scott on Twitter, before seeing the discussion here; I think it’s a lot clearer than my original (IMO sloppy) tweet.
I’ve copied the reply below; see also my reply to Buck.
_____________________________________________________
To clarify the claim I’m making: I’m not trying to throw EA under a bus. This thread spun off from a discussion where I said I thought EA’s net impact on AI x-risk was probably positive, but I was highly uncertain.
Somebody asked what the bad components of EA’s impact were, and I went off on Anthropic, and on EA’s (and especially OpenPhil’s) entanglement with the company and their support for Anthropic’s operations. (To the extent that a lot of x-risk-adjacent EA seems to function, in practice, as a talent pipeline for Anthropic.)
I also said that I think OpenPhil’s bet on OpenAI was a disaster. And I said that there’s a culture of caginess, soft-pedaling, and trying-to-sound-reassuringly-mundane that I think has damaged AI risk discourse a fair amount, and that various people in and around OpenPhil have contributed to.
I’m restating this partly to be clear about what my exact claims are. E.g., I’m not claiming that items 1+2+3 are things OpenPhil and Anthropic leadership would happily endorse as stated. I deliberately phrased them in ways that highlight what I see as the flaws in these views and memes, in the hope that this could help wake up some people in and around OpenPhil+Anthropic to the road they’re walking.
This may have been the wrong conversational tack, but my vague sense is that there have been a lot of milder conversations about these topics over the years, and they don’t seem to have produced a serious reckoning, retrospective, or course change of the kind I would have expected.
I hoped it was obvious from the phrasing that 1-3 were attempting to embed the obvious critiques into the view summary, rather than attempting to phrase things in a way that would make the proponent go “Hell yeah, I love that view, what a great view it is!” If this confused anyone, I apologize for that.
I wasn’t centrally thinking of Holden’s public communication in the OP, though I think if he were consistently solid at this, Aysja Johnson wouldn’t have needed to write this in response to Holden’s defense of Anthropic ditching its core safety commitments.
I feel like this is a case in point. Like, sure, counting up from 0 (“the average corporation building the average product doesn’t try to warn the public about their product, except in ways mandated by law!”), Anthropic’s doing great. Or if the baseline is “is Anthropic doing better than pathological liar Sam Altman?”, then sure, Anthropic is doing better than OpenAI on candor.
If we’re instead anchoring to “trying to build a product that massively endangers everyone in the world is an incredibly evil sort of thing to do by default, and to even begin to justify it you need to be doing a truly excellent job of raising the loudest possible alarm bells alongside dozens of other things”, then I don’t think Anthropic is coming close to clearing that bar.
“Things go really, really badly”? Nobody outside the x-risk ecosystem has any idea what that means. And this is not the kind of claim Anthropic or Dario has ever tried to spotlight. You won’t find a big urgent-looking banner on the front page of Anthropic loudly warning the public, in plain terms, about this technology, and asking them to write their congressman about it. You won’t even find it tucked away in a press release somewhere. Dario gave a number when explicitly asked, in an on-stage interview.
If we’re setting the bar at 0, then maybe we want to call this an amazing act of courage, when he could have ducked the question entirely. But why on earth would we set the bar at 0? Is the social embarrassment of talking about AI risk in 2025 so great that we should be amazed when Dario doesn’t totally dodge the topic, while running one of the main companies building the tech?
I think Dario has been more reasonable on this issue than Gary Marcus. I also don’t think “clearing Gary Marcus” is the criterion we should be using to judge the CEO of Anthropic.
Specifically, this debate (from my perspective) isn’t about whether Anthropic or others have ever said anything scary-sounding, if an x-risk person goes digging for cherry-picked quotes to signal-boost. The question is whether the average statement from Anthropic, weighted by how visible Anthropic tries to make that statement, is adequate for informing the uninformed about the insane situation we’re in.
Is the average statement from Dario or Anthropic communicating, “Holy shit, the technology we and our competitors are building has a high chance of killing us all or otherwise devastating the world, on a timescale of years, not decades. This is terrifying, and we urgently call on policymakers and researchers to help find a solution right now”? Or is it communicating, “Mythos is our most aligned model yet! ☺️ Powerful AI could have benefits, but it could have costs too. AI is a big deal, and it could have impacts and pose challenges! We are taking these very seriously! Also, unlike our competitors, Claude will always be ad-free! We’re a normal company talking about the importance of safety and responsibility in this transformative period. ☺️”
(Case in point: https://x.com/HumanHarlan/status/2031981447377273273)
If Anthropic’s messaging were awful, but Dario’s personal communications were reliably great, then I’d at least give partial credit. But Dario’s messaging is often even worse than that. Dario has been the AI CEO agitating the earliest and loudest for racing against China. He’s the one who’s been loudest about there being no point in trying to coordinate with China on this issue. “The Adolescence of Technology” opens with a tirade full of strawmen of what seems to be Yudkowsky/Soares’ position (https://x.com/robbensinger/status/2016607060591595924), and per Ryan Greenblatt, the essay sends a super misleading message about whether Anthropic “has things covered” on the technical alignment side (https://x.com/RyanPGreenblatt/status/2016553987861000238):
I also strongly agree with Ryan re:
“I think it’s important to emphasize the severity of outcomes and I think people skimming the essay may not realize exactly what Dario thinks is at stake. A substantial possibility of the majority of humans being killed should be jarring.”
“I wish Dario more clearly distinguished between what he thinks a reasonable government should do given his understanding of the situation and what he thinks should happen given limited political will. I’d guess Dario thinks that very strong government action would be justified without further evidence of risk (but perhaps with evidence of capabilities) if there was high political will for action (reducing backlash risks).”
(And I claim that Anthropic leadership has been doing this for years; “The Adolescence of Technology” is not a one-off.)
On podcast interviews, Dario sometimes lets slip an unusually candid and striking statement about how insane and dangerous the situation is, without couching it in caveats about how Everything Is Uncertain and More Evidence Is Needed and It’s Premature For Governments To Do Much About This. Sometimes, he even says it in a way that non-insiders are likely to understand. But when he talks to lawmakers, he says things like:
Never mind the merits of “the policy world should totally ignore superintelligence”. Even if you agree with that (IMO extreme and false) claim, there is no justifying calling these risks “long-term”, “abstract”, and “distant” when you have timelines a fraction as aggressive as Dario’s!!
See also Jack Clark’s communication on this issue, and my criticism at the time (https://x.com/robbensinger/status/1834325868032012296). This was in 2024. I don’t think it’s great for Dario to be systematically making the same incredibly misleading elisions two years after this pretty major issue was pointed out to his co-founder.
I’m not criticizing Anthropic or Open Phil for being “careful how they phrase things”. I’m criticizing them for being careful in exactly the wrong direction. Any communication they send out that sends a “we have things covered, this is business-as-usual, no need to worry” signal is potentially not just factually misleading, but destructive of society’s ability to orient to what’s happening and course-correct. Anthropic is the “Machines of Loving Grace” company; it’s exactly the company that has put way more effort, early and often, into communicating how powerful and cool this technology is, while being consistently nervous and hedged about alerting others to the hazards.
This is exactly the opposite of what “being careful how you phrase things” should look like. Anthropic should have internal processes for catching any tweet that risks implicitly sending a “this is business-as-normal” or “we have everything handled” message, to either filter those out or flag them for evaluation. Sending that kind of message is much more dangerous than any ordinary reputational risk a company faces.
Re ‘MIRI is saying strategy is bad, but if MIRI had been strategic then they might not have started the deep learning revolution’: I think that this just didn’t happen. Per the https://x.com/allTheYud/status/2042362484976468053 thread, I think this is just a myth that propagates because it’s funny. (And because Sam Altman is good at spreading narratives that help him out.)
I don’t think MIRI accelerated timelines on net, and if it did, I don’t think the effect was large. I’d also say that if this happened, it was in spite of one of MIRI’s top obsessions for the last 20+ years being “be ultra cautious around messaging that could shorten AI timelines”.
(Like, as someone who’s been at MIRI for 13 years, this is literally one of the top annoying things constraining everything I’ve written and all the major projects I’ve seen my colleagues work on. Not because we think we’re geniuses sitting on a trove of capabilities insights, but just because we take the responsibility of not-accidentally-contributing-to-the-race extraordinarily seriously.)
But whatever, sure. If you want to accuse MIRI of hypocrisy and say that we’re just as culpable as the AI labs, go for it. You can think MIRI is terrible in every way and also think that the Anthropic cluster is not handling AI risk in a remotely responsible way.
Set aside the years of Anthropic poisoning the commons with its public messaging, poisoning efforts at international coordination by being the top lab preemptively shitting on the possibility of US-China coordination, and poisoning the US government’s ability to orient to what’s happening by selling half-truths and absurd frames to Senate committees.
Even without looking at their broad public communications, and without critiquing what passes for a superintelligence alignment or deployment plan in Anthropic’s public communications, Anthropic has behaved absurdly irresponsibly, lying to the public about their RSP being a binding commitment, lying to their investors re ‘we’re not going to accelerate capabilities progress’, and specifically targeting the most dangerous and difficult-to-control AI capabilities (recursive self-improvement) in a way that may burn years off of the remaining timeline.
Just to be clear: nowhere in this thread, or anywhere else, have I asked Anthropic to say something like that. Everything I’ve said above is compatible with thinking that Anthropic has a chance at solving superintelligence alignment. “I think I have a chance at solving superintelligence alignment!” is not an excuse for Anthropic or Dario’s behavior.
I agree it’s too glib as an argument for “international coordination to ban superintelligence is easy”. It isn’t easy. In the context of a conversation where most people are seriously underweighting the possibility, “governments have been known to ban scary or weird tech” and “governments have been known to enact policies that cost them money” are useful correctives, but they should be correctives pointing toward “this seems hard but maybe doable”, not “this seems easy”.
How are we doing that, exactly?
Like, this is one of the most foregrounded claims in Dario’s essay. He repeats a bunch of easily-checked falsehoods about the MIRI argument, at the very start of the essay, while warning that this view’s skepticism about alignment tractability is a “self-fulfilling belief”. He then proceeds to shit on the possibility of the US coordinating with China to avoid building superintelligence, which seems like a much more classic example of “belief that could easily be self-fulfilling”.
What is the mechanism whereby Dario criticizing MIRI is “cooperating” (is it that he didn’t mention us by name, preventing people from fact-checking any of his claims?), and MIRI staff criticizing Dario is “defecting”? What, specifically, is the wrench I’m throwing in Anthropic’s plans by tweeting about this? Is a key researcher on Chris Olah’s team going to get depressed and stop doing interpretability research unless I contribute to the “Anthropic is the Good Guys and OpenAI is the Bad Guys” narrative? Is Anthropic at risk of losing its lead in the race if MIRI people are open about their view that all the labs are behaving atrociously? Should I have dropped in a claim that everyone who disagrees with me is “quasi-religious”, the same way Dario’s cooperative essay begins?
If you think I’m factually mistaken, as you said at the start of your reply, then that makes sense. But surely that would be an equally valid criticism whether I were saying pro-Anthropic stuff or anti-Anthropic stuff. Why this separate “MIRI is defecting” idea?
Yeah. And when MIRI voiced early skepticism of OpenAI in private conversation, we were told that it was crucial to support Sam and Elon’s effort because Demis was untrustworthy. Counting up from zero, OpenAI could be framed as amazing progress: a nonprofit! Run by people vocally alarmed about x-risk! And they’re struggling for cash in the near term (in spite of verbal promises of funding from Musk), which gives us an opportunity to buy seats on the board!
Anthropic may or may not be slightly better than OpenAI. OpenAI may or may not be slightly better than DeepMind. I don’t think the lesson of history is that OpenPhil-cluster people are good at telling the difference between “this is marginally better than what the other guys are doing” and “this is good enough to actually succeed”.
But nothing I’ve said above depends on that claim. You can disagree with me about how likely Anthropic is to save the world, and still think there’s an egregious candor gap between the average Anthropic public statement and the scariest paragraphs buried in “The Adolescence of Technology”, and a further egregious candor gap between “The Adolescence of Technology” and e.g. Ryan Greenblatt’s post or https://x.com/MaskedTorah/status/2040270860846768203.
I don’t think the “circle-the-wagon” approach has served EA well throughout its history, and I don’t think people self-censoring to that degree is good for governments’ or labs’ ability to orient to reality.