Nick Bostrom comments on Optimal Timing for Superintelligence: Mundane Considerations for Existing People

Nick Bostrom 13 Feb 2026 17:22 UTC
56 points
7
Yes the post explicitly considers things only from a mundane person-affecting stance, and I would not argue that this is the correct stance or the one I would all-things-considered endorse. However, it may be a component in a more plausible complex view, or a caucus in a ‘moral parliament’; so I think it is worth investigating what it implies. If ethics is complicated, we may need a divide and conquer strategy, where we isolate and analyze one element at a time.
Existing ordinary people usually have preferences and values not captured by the QALY calculation, such as wanting their kids having happy lives, or wanting the world not to end, or even wanting to live in a world which they understand.
I agree that people also have such preferences. Regarding people wanting their kids to have happy lives, the post does discuss that (under ‘other-focused prudential concerns’). Ceteris paribus, this pushes towards longer timelines being optimal. People might want their parents to have happy lives, which pushes in the opposite direction.
Preferences for not wanting the world to end: I think this would need to be discussed in the context of an analysis of which timelines are optimal from an xrisk-minimization perspective, which was (lamely) set aside for possible future work.
“live in a world they understand”—If this is something people want, it seems plausible AI could help a lot, by providing better explanations and perhaps cognitive enhancements. In the status quo, I think most technological devices (and to some extent our institutions) are, to most people, black boxes that offer somewhat understandable affordances. And even highly educated people are ignorant about the most fundamental aspects of how the world works, since we e.g. lack a theory of quantum gravity or a proper understanding of simulation theory and whatever other such ultimate parameters define reality.
Using the “undergo risky surgery” analogy, this analysis assumes the patient’s preferences have no place in the analysis, and if and when the surgery should happen should be decided by a utilitarian social planner.
I don’t think so. In the surgery example, the patient presumably gets to make the final decision; yet one may want to analyze the relevant tradeoffs to help inform their choice. In the case of AI, one can analyze the consequences for various interests or values of different choices that could be made—independently of who (or what institution) one thinks ultimately ought to make the decision. In any case, it seems likely that the outcome will be determined by processes that are not morally optimal or maximally legitimate; so the main question of interest may be how, on the margin, various stakeholders might wish to try to nudge things.
The analysis hedges by “not supporting any particular policy prescriptions” but I do expect in practice to be quoted in informal motte-and-bailey way
That is plausible. (This is also the case with much of my other writings, alas—I have no comparative advantage in being an ‘on message’ communicator; and I suspect any tiny specks of value there might be in my work would be washed away if I tried much harder to exercise message discipline.)
- habryka 13 Feb 2026 20:21 UTC
  73 points
  49
  Parent
  Yes the post explicitly considers things only from a mundane person-affecting stance, and I would not argue that this is the correct stance or the one I would all-things-considered endorse.
  I do feel confused about the centrality of the person-affecting stance in this paper. My relationship to the person-affecting stance is approximately the same as the stance I would have to an analysis of the form “what should someone do if they personally don’t want to die and are indifferent to killing other people, or causing large amounts of suffering to other people, in the pursuit of that goal”. And my stance to that goal is “that is deeply sociopathic and might make a fun fiction story, but it obviously shouldn’t be the basis of societal decision-making, and luckily also isn’t”.
  But when I read this paper, I don’t get that you relate to it anything like that. You say a bit that you are only doing this analysis from a person-affecting view, but most of the frame of the paper treats that view as something that is reasonable to maybe be the primary determinant of societal decision-making, and that just doesn’t really make any sense to me.
  Like, imagine applying this analysis to any previous generation of humans. It would have resulted in advocating previous generations of humans to gamble everything on extremely tenuous chances of immortality, and probably would have long resulted in extinction, or at least massively inhibited growth. It obviously doesn’t generalize in any straightforward sense. And I feel like in order for the paper to leave the reader not with a very confused model of what is good to do here, that kind of analysis needed to be a very central part of the paper.
  What links here?
  - AI #156 Part 2: Errors in Rhetoric by Zvi (20 Feb 2026 14:31 UTC; 45 points)
  - Canaletto 14 Feb 2026 12:14 UTC
    8 points
    0
    Parent
    Isn’t CEV pretty much person-affecting view implemented? It’s not like you consider including dead people or future people in CEV or animals or aliens? They would receive consideration from preferences of people you do, but not directly.
    - habryka 14 Feb 2026 17:25 UTC
      3 points
      −9
      Parent
      I am here referencing person-affecting welfare-utilitarian views (this is pretty clear in the context of the paper, and also “person-affecting views” practically always refers to a subset of welfare-utilitarian views).
      We could go into the tricky details of what CEV might be, or how the game theory plays into it, but the paper is referencing a much narrower moral perspective (in which the only things that matter are the experiences, not the preferences of the people currently alive).
      - Matthew Barnett 14 Feb 2026 22:19 UTC
        11 points
        −15
        Parent
        the paper is referencing a much narrower moral perspective (in which the only things that matter are the experiences, not the preferences of the people currently alive).
        Note that you could hold the view that the vast majority of people care mostly, even if not entirely, about the lives of people who currently exist: themselves, their immediate family, their children, and their friends. This is highly plausible when you consider that birth rates are crashing worldwide. Most people clearly prioritize their family’s material well-being over maximizing their future descendants who will be born many decades or centuries from now. Most people are not longtermists, or total utilitarians.
        If this is the case, and I believe it is, then the welfare version of person-affecting views and the preference version largely coincide.
        habryka 14 Feb 2026 22:46 UTC
        8 points
        0
        Parent
        Yep, agree if people did not care about having children and the preferences of those children in the future, or about leaving a legacy to future humans, or about future generations, and in general were indifferent to any suffering or happiness to anyone not currently alive, then these two would coincide (but this strikes me as exceedingly unlikely).
        Matthew Barnett 14 Feb 2026 22:49 UTC
        7 points
        6
        Parent
        People do care about having children, and they care especially strongly about their living children. But their concern for future unborn descendants, particularly in the distant future, is typically weaker than their concern for everyone who is currently alive.
        habryka 14 Feb 2026 22:52 UTC
        12 points
        0
        Parent
        I am certainly not saying people’s behavior is well-approximated by people caring about future people as much as they care about people today! Indeed, I would be very surprised if people’s caring factors much through welfare considerations at all. Mostly people have some concern about “the future of humanity”, and that concern is really quite strong. I don’t think it’s particularly coherent (as practically no ethical behavior exhibited by broad populations is), but it clearly is quite strong.
        Matthew Barnett 14 Feb 2026 23:16 UTC
        28 points
        11
        Parent
        Mostly people have some concern about “the future of humanity”, and that concern is really quite strong. I don’t think it’s particularly coherent (as practically no ethical behavior exhibited by broad populations is), but it clearly is quite strong.
        How would we test the claim that people have a strong concern about the long-term future of humanity? Almost every way I can think of measuring this seems to falsify it.
        The literature on time discounting and personal finance behavior doesn’t support it. Across the world people are having fewer children than ever, suggesting they are placing less and less priority on having a posterity at all. Virtually all political debate concerns the lives of currently living people rather than abstract questions about humanity’s distant future. The notable exceptions, such as climate change, seem to reinforce my point: climate concern has been consistently overshadowed by our material interest in cheap fossil fuels, as evidenced by the fact that emissions and temperatures keep rising every year despite decades of debate.
        One might argue that in each of these cases people are acting irrationally, and that we should look at their stated values rather than their revealed behavior. But the survey data doesn’t clearly demonstrate that people are longtermists either. Schubert et al. asked people directly about existential risk, and one of their primary findings was: “Thus, when asked in the most straightforward and unqualified way, participants do not find human extinction uniquely bad. This could partly explain why we currently invest relatively small resources in reducing existential risk.” We could also look at moral philosophers, who have spent thousands of years debating what we should ultimately value, and among whom explicit support for longtermism remains a minority position. This fact is acknowledged by longtermist philosophers like Hilary Greaves and William MacAskill, who generally emphasize that longtermist priorities are “neglected”, both within their field and by society at large.
        I acknowledge that most people have some concern for the future of humanity. But “some concern” is not what we’re arguing about here. This concern would need to be very strong to override people’s interests in their own lives, such as whether they will develop Alzheimer’s or whether their parents will die. Even if people do have strong feelings about the future of humanity upon reflection, that concern is not “clear” but rather speculative. How could we actually know what people ultimately value upon reflection? In any case, the strong concern people have for their actual, living family is already pretty clear given the ordinary behavior that they engage in: how they spend their money, how many children they have, etc.
        Vugluscr Varcharka 15 Feb 2026 4:09 UTC
        0 points
        0
        Parent
        That, I suppose, depends strongly on whether one has or has not been fortunate. Threshold by my intuitive view is located around having accumulated enough resources to safeguard one’s AND one’s children future wellbeing/striving. Which is sooooooo stupid and sad to be happening on LW of all places. The asymmetry of mutual understanding between two groups goes again common sense though—i mean most fortunate ones should’ve had been the unfortunate ones in the past. Not so for us unfortunates. Fs should understand UFs mindsets better, but they seem not to. Like it’s the main thing of being LW—noticing our own biases, but Fs seem to have fallen victims of this self-directed warfare… I probably won’t be allowed to comment again this year—karma here bites—just in case anybody wonders))))
      - Canaletto 14 Feb 2026 18:56 UTC
        1 point
        0
        Parent
        There might be a distinction here in considering CEV in near VS far mode. As this is one of the pretty strong considerations that would be included, I believe. Did you hope the CEV would be good by your lights? But you are just ¹⁄_8000000000 constituent of it, can go in many ways. And I’m not very sure if current (mixed) attitudes towards it would be amplified in one direction or another.
  - Stephen McAleese 13 Feb 2026 23:53 UTC
    6 points
    4
    Parent
    I think what he’s saying is that he and others have been promoting the idea of an impersonal longtermist view a lot over the past few years but that he has moral uncertainty and wants to consider other views. So he wrote a paper about the future of AI using a radically different perspective (the person-affecting view) even though he may not agree with it.
    
    Though as you said if he really favors a more impersonal view then he could have done a better job at communicating that in the paper.
    - habryka 14 Feb 2026 0:03 UTC
      32 points
      29
      Parent
      Yeah, and as I said, I think choosing the person-affecting view for this seems about as useful as choosing the “sociopath willing to kill every person in their path to get to their goal” point of view. I don’t understand this choice. It has obvious reductio-ad-absurdum cases.
      And most importantly, the choice of ethical framework of course trivially overdetermines the answer. If your assumption is “assume humanity should gamble arbitrarily much on the immortality of the current generation” and “assume AI could provide immortality and nothing else can”, the answer of course becomes “humanity should gamble everything on AI providing immortality”. The rest is just fancy dress-up work.
      Like, I think a persuasive or reasonable paper would have put its central load-bearing assumptions up front. It might still be worth chasing out the implications of the assumptions, but this is such a weird set of modeling assumptions, that explaining how sensitive the conclusion to this modeling assumption is, is the central thing a reader needs to understand when reading this. I think almost no reader would naively understand this, and indeed the conclusion seems to actively contradict the perspective, with most sentences not being phrased in the form “if you had belief X, then Y would make sense” but just directly phrased in pithy ways like “The appropriate analogy for the development of superintelligence is not Russian roulette but surgery for a serious condition that would be fatal if left untreated.”.
      - Nick Bostrom 14 Feb 2026 3:24 UTC
        17 points
        1
        Parent
        Like, I think a persuasive or reasonable paper would have put its central load-bearing assumptions up front.
        More up front than in the title?
        it obviously shouldn’t be the basis of societal decision-making, and luckily also isn’t”.
        Societal decision-making typically uses a far narrower basis than the general person-affecting stance that this paper analyzes. For example, not only do voters and governments usually not place much weight on not-yet-conceived people that might come into existence in future millennia, but they care relatively little about what happens to currently existing persons in other countries.
        habryka 14 Feb 2026 3:35 UTC
        46 points
        42
        Parent
        It’s not in the title, which is “Optimal Timing for Superintelligence: Mundane Considerations for Existing People”. My guess is you were maybe hoping that people would interpret “considerations for existing people”, to be equivalent to “person-affecting views” but that IMO doesn’t make any sense. A person-affecting assumption is not anywhere close to equivalent to “considerations for existing people”.
        Existing people care about the future, and the future of humanity! If existing people (including me) didn’t care about future people, then the person-affecting view would indeed be correct, but people do!
        For example, not only do voters and governments usually not place much weight on not-yet-conceived people that might come into existence in future millennia, but they care relatively little about what happens to currently existing persons in other countries.
        Voters and governments put enormous weight on not-yet-conceived people! The average planning horizon for climate change regulation is many decades in the future. Nuclear waste management policies are expected to contain waste for hundreds of years. If anything, I think current governance tends to put far too much weight on the future relative to their actual ability to predict the future (as indeed, I expect neither nuclear waste nor climate change to still be relevant when they are forecasted to have large impacts).
        It’s true that governments care less about what happens to people outside of their country, but that just seems like an orthogonal moral issue. They do very much care about their own country and routinely make plans that extend beyond the median life-expectancy of the people within their country (though usually this is a bad idea because actually they aren’t able to predict the future well enough to make plans that far out, but extinction risk is one of the cases where you can actually predict what will happen that far out, because you do know that you don’t have a country anymore if everyone in your country is dead).
        Caring about future generations seems common, if not practically universal in policymaking. All the variance in why policymaking tends to focus on short-term effects is explained by the fact the future is hard to predict, not a lack of caring by governance institutions about the future of their countries or humanity at large. But that variance simply doesn’t exist for considering extinction risks. Maybe you have some other evidence that convinced you that countries and policy-makers operate on person-affecting views?
        I am very confident that if you talk to practically any elected politician and ask them “how bad is it for everyone in the world to become infertile but otherwise they would lead happy lives until their deaths?”, their reaction would be “that would be extremely catastrophic and bad, humanity would be extinct soon, that would be extremely terrible” (in as much as you can get them to engage with the hypothetical seriously, which is of course often difficult).
        Nick Bostrom 14 Feb 2026 13:58 UTC
        16 points
        1
        Parent
        The average planning horizon for climate change regulation is many decades in the future. Nuclear waste management policies are expected to contain waste for hundreds of years. … Maybe you have some other evidence that convinced you that countries and policy-makers operate on person-affecting views?
        Of course they don’t consistently operate on any specific moral view. But I would claim that they are less badly approximated by ‘benefit currently existing citizens’ than ‘neutrally benefit all possible future people (or citizens) that might be brought into existence over future eons’. Much less is spent on things like nuclear waste management and preventing climate change than on providing amenities for the current population. In fact, they may be spending a net negative amount of resources on trying to benefit future generations, since they are often saddling future generations with vast debt burdens in order to fund current consumption. (FHI—particularly Toby Ord—was involved in some efforts to try to infuse a little bit more consideration of future generations in UK policymaking, but I think only very limited inroads where made on that front.)
        habryka 14 Feb 2026 17:09 UTC
        12 points
        4
        Parent
        Yep, I am definitely not saying that current governance cares about future people equally as they do for current people! (My guess is I don’t either, but I don’t know, morality is tricky and I don’t have super strong stances on population ethics)
        But “not caring equally strongly about future people” and “being indifferent to human extinction as long as everyone alive gets to spend the rest of their days happy” are of course drastically different. You are making the second assumption in the paper, which even setting aside whether it’s a reasonable assumption on moral grounds, is extremely divorced from how humanity makes governance decisions (and even more divorced from how people would want humanity to make policy decisions, which would IMO be the standard to aspire to for a policy analysis like this).
        Nick Bostrom 14 Feb 2026 20:12 UTC
        18 points
        4
        Parent
        In other papers (e.g. Existential Risks (2001), Astronomical Waste (2003), and Existential Risk Prevention as a Global Priority (2013)) I focus mostly on what follows from a mundane impersonal perspective. Since that perspective is even further out of step with how humanity makes governance decisions, is it your opinion that those paper should likewise be castigated? (Some people who hate longtermism have done so, quite vehemently.) But my view is that there can be value in working out what follows from various possible theoretical positions, especially ones that have a distinguished pedigree and are taken seriously in the intellectual tradition. Certainly this is a very standard thing to do in academic philosophy, and I think it’s usually a healthy practice.
        habryka 14 Feb 2026 22:17 UTC
        6 points
        14
        Parent
        Since that perspective is even further out of step with how humanity makes governance decisions, is it your opinion that those paper should likewise be castigated?
        I am not fully sure what you are referring to by “mundane impersonal perspective”, but I like all of those papers. I both think they are substantially closer to capturing actual decision-making, and also are closer to what seems to me like good decision-making. They aren’t perfect (I could critique them as well), but my relationship to the perspective brought up in those papers is not the same as I would have to the sociopathic example I mention upthread, and I don’t think they have that many obvious reductio-ad-absurdum cases that obviously violate common-sense morality (and I do not remember these papers advocating for those things, but it’s been a while since I read them).
        But my view is that there can be value in working out what follows from various possible theoretical positions, especially ones that have a distinguished pedigree and are taken seriously in the intellectual tradition. Certainly this is a very standard thing to do in academic philosophy, and I think it’s usually a healthy practice.
        Absolute agree there is value in mapping out these kinds of things! But again, your paper really unambiguously to me does not maintain the usual “if X then Y” structure. It repeatedly falls back into making statements from an all-things-considered viewpoint, using the person-affecting view as a load-bearing argument in those statements (I could provide more quotes of it doing such).
        And then separately, the person-affecting view just really doesn’t seem very interesting to me as a thing to extrapolate. I don’t know why you find it interesting. It seems to me like an exceptionally weak starting point with obvious giant holes in its ethical validity, that make exploring its conclusions much less interesting than the vast majority of other ethical frameworks (like, I would be substantially more interested in a deontological analysis of AI takeoff, or a virtue ethical analysis of AI risk, or a pragmatist analysis, all of which strike me as more interesting and more potentially valid starting point than person-affecting welfare-utilitarianism).
        And then beyond that, even if one were to chase out these implications, it seems like a huge improvement to include an analysis of the likelihood of the premises of the perspective you are chasing out, and how robust or likely to be true they are. It has been a while since I read the papers you linked, but much of at least some of them is arguing for and evaluating the validity of the ethical assumptions behind caring about the cosmic endowment. Your most recent paper seems much weaker on this dimension (though my memory might be betraying me, and plausible I will have the same criticism if I were to read your past work, though even then, arguing from the basis of an approximately correct premise, even if the premise is left unevaluated, clearly is better than arguing from the basis of an IMO obviously incorrect premise, without evaluating it as such).
        TsviBT 15 Feb 2026 5:11 UTC
        4 points
        0
        Parent
        
        And then separately, the person-affecting view just really doesn’t seem very interesting to me as a thing to extrapolate.
        
        [didn’t read original, just responding locally] My impression is that often people do justify AGI research using this sort of view. (Do you disagree?) That would make it an interesting view to extrapolate, no?
      - Lukas_Gloor 14 Feb 2026 11:34 UTC
        10 points
        0
        Parent
        Yeah, and as I said, I think choosing the person-affecting view for this seems about as useful as choosing the “sociopath willing to kill every person in their path to get to their goal” point of view.
        I really disagree. See here (or some of Michael St Jules’ comments in the same thread) for why person-affecting views aren’t “obviously dumb,” as some people seem to think. Or just read the flow-chart towards the end of this post (no need to read the whole post, the flow-chart will give you the basics of the idea.) More relevantly here, they are certainly not selfish, and it reflects poorly on you to make that insinuation. People who dedicate much of their life to EA hold them sincerely as a formalization of what it means to be “maximally altruistic.” It’s just not at all obvious that creating new happy people when there’s suffering in the world is the altruistic thing to do, and you’re begging the question by pretending it’s universally obvious.
        
        I think what’s a lot more fair as a criticism is Jan Kulveit’s point: that person-affecting views combined with only crude welfare utilitarianism—without even factoring in things like existing people wanting their children to survive or wanting humanity as a whole to survive (though maybe less so now with recent world events making more people disappointed in humanity) -- is a weird and unrepresentative combination that would amount to basically only acting on the views of selfish people.
        (And maybe that’s what drove your harsh judgment and you’d be more lenient if the paper made person-affecting assumptions that still put some indirect value on humanity surviving through the step of, “if Oliver as an existing person strongly cares about humanity’s long-run survival, then on a person-affecting view that gives weight to people’s life goals, humanity surviving the long run now gets +1 votes.” Like, maybe you’d still think such a view seems dumb to you, but you wouldn’t feel like it’s okay to proclaim that it’s socially inappropriate for others to have such views.)
        
        But even that—the combination of views Jan Kulveit criticized—seems defensible to me to write a paper about, as long as the assumptions are clearly laid out. (I haven’t read the paper, but I asked Claude to tell me the assumptions it’s based on, and Claude seemed to get it correct in 4 seconds.) Bostrom said the points in this paper could be a building block, not that it’s his view of all we should consider. This sort of thing is pretty standard in philosophy, so much so that it often doesn’t even need to be explicitly stated and proactively contextualized at length, and I think we should just decouple better.
        
        Instead of saying things like “this is a bad paper,” I feel like the fairer criticism would be something more like, “unfortunately we live in an age of twitter were stupid influencers take things out of context, and you could have foreseen that and proactively prevented certain obvious misinterpretations of the paper.” That would be fair criticism, but it makes clear that the paper might still be good for what it aims to be, and it at least puts part of the blame on twitter culture.
        
        On the substance-level, btw, one “arcane consideration” that I would put a lot of weight on, even on person-affecting views, is stuff like what Bostrom talks about in the Cosmic Host paper. (It matters on person-affecting views because other civs in the multiverse exist independently of our actions.) In that paper too I get weird accelerationist vibes and I don’t agree with them either. I think pretty strongly that this cosmic host stuff is an argument for quality over speed when it comes to introducing a new technologically mature civilization to the multiverse-wide commons. It’s pretty bad form to bring an antisocial kid to a birthday party when you could just take extra efforts to first socialize the kid. If a planet just recklessly presses “go” on something that is 80% or 95% likely to be uncontrolled Moloch-stuff/optimization, that’s really shitty. Even if we can’t align AIs to human values, I feel like we at least have a duty to make them good at the building of peaceful coalitions/it being an okay thing to add to the cosmic host.
        What links here?
        Lukas_Gloor's comment on Optimal Timing for Superintelligence: Mundane Considerations for Existing People by Nick Bostrom (14 Feb 2026 13:39 UTC; 8 points)
        habryka 14 Feb 2026 17:20 UTC
        4 points
        2
        Parent
        I think what’s a lot more fair as a criticism is Jan Kulveit’s point: that person-affecting views combined with only crude welfare utilitarianism
        Yep, by person-affecting views I here meant person-affecting welfare utilitarianism views. Sorry for the confusion! I don’t think what I said was super unclear in the context of the criticism of the paper (and even in general, as person-affecting view in my experience almost exclusively gets used in the context of welfare utilitarianism).^[1]
        I haven’t read the paper
        Please read the paper before you criticize my criticism of it then! The paper repeatedly makes claims about optimal policy in an uncaveated fashion, saying things like “The appropriate analogy for the development of superintelligence is not Russian roulette but surgery for a serious condition that would be fatal if left untreated.”
        That sentence has no “ifs” or “buts” or anything. It doesn’t say “from the perspective of a naive welfare utilitarian taking a person-affecting view, the appropriate analogy for the development of superintelligence is...”. It just says “the appropriate analogy is...”.
        It’s clear the paper is treating a welfare utilitarian person-affecting view as a reasonable guide to global policy decisions. The paper does not spend a single paragraph talking about limitations of this view, or explains why one might not want to take it seriously. If this is common in philosophy, then it is bad practice and I don’t want it repeated here.
        And I also don’t buy this is a presentational decision. I am pretty (though not overwhelmingly confident) that Nick does think that this person-affecting moral view should play a major role in the moral parliament of humanity, and that the arguments in the paper are strong arguments for accelerating in many worlds where risks are very but not overwhelmingly high. Do you want to bet with me on this not being the case? And I think doing so is making a grave mistake, and the paper is arguing many people straightforwardly into the grave mistake.
        ^
        If you take a preference utilitarianism perspective, then I defend the optimal morality actually only cares about the extrapolated preferences of exactly me! Moral realism seems pretty obviously false, so a person-affecting preference utilitarianism perspective seems also pretty silly, though I do think that for social coordination reasons, optimizing the preferences of everyone alive (and maybe everyone who was alive in the past) is the right choice, but for me that’s a contingent fact on what will cause the future to go best by my own values.
        What links here?
        Lukas_Gloor's comment on Optimal Timing for Superintelligence: Mundane Considerations for Existing People by Nick Bostrom (14 Feb 2026 18:33 UTC; 5 points)
        Lukas_Gloor 14 Feb 2026 18:33 UTC
        5 points
        0
        Parent
        Sorry for the confusion! I don’t think what I said was super unclear in the context of the criticism of the paper (and even in general, as person-affecting view in my experience almost exclusively gets used in the context of welfare utilitarianism).^[1]
        I see why you have that impression. (I feel like this is an artefact of critics of person-affecting views tending to be classical welfare utilitarians quite often, and they IMO have the bad habit of presenting opposing views inside their rigid framework and then ridiculing them for seeming silly under those odd assumptions. I would guess that most people who self-describe as having some sort of person-affecting view care very much about preferences, in one way or another.)
        Please read the paper before you criticize my criticism of it then! The paper repeatedly makes claims about optimal policy in an uncaveated fashion, saying things like “The appropriate analogy for the development of superintelligence is not Russian roulette but surgery for a serious condition that would be fatal if left untreated.”
        That’s fair, sorry!
        
        It bothered me that people on twitter didn’t even label that the paper explicitly bracketed a lot of stuff and laid out its very simplistic assumptions, but then I updated too far in the direction of “backlash lacked justification.”
        And I think doing so is making a grave mistake, and the paper is arguing many people straightforwardly into the grave mistake.
        I agree it would be a mistake to give it a ton of weight, but I think this view deserves a bit of weight.
        Indirectly related to that, I think some of the points people make of the sort of “if you’re so worried about everyone dying, let’s try cryonics” or “let’s try human enhancement” are unfortunately not very convincing. I think that “everything is doomed unless we hail mary bail ourselves out with magic-like AI takeoff fixing it all for us” is unfortunately quite an accurate outlook. (I’m still open to being proven wrong if suddenly a lot of things were to get more hopeful, though.) Civilization has seemed pretty fucked even just a couple of years ago, and it hasn’t gotten any better more recently. Still, on my suffering-focused views, that makes it EVEN LESS appealing that we should launch AI, not more appealing.
        
        To be clear, I agree that it’s a failure mode to prematurely rule things out just because they seem difficult. And I agree that it’s insane to act as though global coordination to pause AI is somehow socially or politically impossible. It clearly isn’t. I think pausing AI is difficult but feasible. I think “fixing the sanity of civilization so that you have competent people in charge in many places that matter” seems much less realistic? Basically, I think you can build local bubbles of sanity around leaders with the right traits and groups with the right culture, but it’s unfortunately quite hard given human limitations (and maybe other aspects of our situation) to make these bubbles large enough to ensure things like cryonics or human enhancement goes well for many decades without somehow running into a catastrophe sooner or later. (Because progress moves onwards in certain areas even with an AI pause.)
        I’m just saying that, given what I think is the accurate outlook, it isn’t entirely fair to shoot down any high-variance strategies with “wtf, why go there, why don’t we do this other safer thing instead ((that clearly isn’t going to work))?”
        
        If I didn’t have suffering-focused values, I would be sympathetic to the intuition of “maybe we should increase the variance,” and so, on an intellectual level at least, I feel like Bostrom deserves credit for pointing that out.
        
        But I have a suffering-focused outlook, so, for the record, I disagree with the conclusions. Also, I think even based on less suffering-focused values, it seems very plausible to me that civilizations that don’t have their act together enough to proceed into AI takeoff with coordination and at least a good plan, shouldn’t launch AI at all. It’s uncooperative towards possibly nearby other civilizations or towards the “cosmic host.” Bostrom says he’s concerned about scenarios where superintelligence never gets built. It’s not obvious to me that this is very likely, though, so if I’m right that earth would rebuild even after a catastrophe, and if totalitarianism or other lock ins without superintelligence wouldn’t last all that long before collapsing in one way or another, then there’s no rush from a purely longtermist perspective. (I’m not confident in these assumptions, but I partly have these views from deferring to former FHI staff/affiliates, of all people (on the rebuilding point).)
        habryka 14 Feb 2026 18:53 UTC
        5 points
        3
        Parent
        I’m just saying that, given what I think is the accurate outlook, it isn’t entirely fair to shoot down any high-variance strategies with “wtf, why go there, why don’t we do this other safer thing instead ((that clearly isn’t going to work))?”
        While I disagree with your outlook^[1], I agree that we shouldn’t dismiss high variance strategies lightly. I am not criticizing the paper on the grounds of the policy it advocates. If someone were to wrote a paper that had as shaky foundations, and treated those foundations with as little suspicion as this paper, I would react the same way (e.g. if someone wrote a paper arguing against developing AI for job loss reasons, without once questioning whether job loss is actually bad, I would object on similar grounds).
        Bostrom says he’s concerned about scenarios where superintelligence never gets built.
        That is also a concern I have much more sympathy towards than this paper. I think it’s quite unlikely, but I can see the argument. I don’t feel that way about the arguments in this paper.
        ^
        indeed, I think in the absence of developing AI we would quickly develop alternative, much safer technologies which would most likely cause humanity to very substantially become better at governing itself, and to navigate the future reasonably
        Nick Bostrom 14 Feb 2026 16:08 UTC
        3 points
        0
        Parent
        On the substance-level, btw, one “arcane consideration” that I would put a lot of weight on, even on person-affecting views, is stuff like what Bostrom talks about in the Cosmic Host paper. … Even if we can’t align AIs to human values, I feel like we at least have a duty to make them good at the building of peaceful coalitions/it being an okay thing to add to the cosmic host.
        Yes, although, as that paper discusses, speed may also be important insofar as it reduces the risk of us failing to add anything at all, since that’s also something the cosmic host may care about—the risk that we fail ever to produce superintelligence. (My views about those things are quite tentative, and they fall squarely into the ‘arcane’. I agree on their importance.)
- Wei Dai 14 Feb 2026 12:44 UTC
  11 points
  2
  Parent
  Nick, I’m afraid that a faction^[1] of your moral parliament may have staged a (hopefully temporary) coup or takeover, because if all of the representatives were still in a cooperative mood it seems like you’d probably have inserted at least a few more sentences to frame it differently to mitigate potential risks. You have enough people around you who would presumably be happy to help you with this even if you “have no comparative advantage” in it. (Comparative advantage is supposed to be an argument for trade, not an excuse for ignoring risks/downsides to your other values!)
  1. ^
    perhaps a coalition of egoism, person-affecting altruism, and intellectual pursuit for its own sake
  - Lukas_Gloor 14 Feb 2026 13:39 UTC
    8 points
    1
    Parent
    I agree with the concern generally, but I think we very much should not concede the point (to people with EPOCH-type beliefs, for instance) that AI accelerationism is an okay conclusion for people with person-affecting views (as you imply a bit in your endnote). For one thing, even on Bostrom’s analysis, pausing for multiple years makes sense under quite a broad class of assumptions (personally I think it’s clearly bad thinking to put only <15% on risk of AI ruin, and my own credence is >>50%). Secondly, as Jan Kulveit’s top-level comment here pointed out, more things matter on person-affecting views than crude welfare-utilitarian considerations (it also matters that some people want their children to grow up or for humanity to succeed in the long run even at some personal cost). Lastly, see the point in the last paragraph of my reply to habryka: Other civs in the multiverse matter also on person-affecting views, and it’s quite embarrassing and bad form if our civilization presses “go” on something that is 80% or 95% likely to get out of control and follow Moloch dynamics, when we could try to take more care and add a more-likely-to-be cooperative and decent citizen to the “cosmic host”.