Yeah, and as I said, I think choosing the person-affecting view for this seems about as useful as choosing the “sociopath willing to kill every person in their path to get to their goal” point of view. I don’t understand this choice. It has obvious reductio-ad-absurdum cases.
And most importantly, the choice of ethical framework of course trivially overdetermines the answer. If your assumption is “assume humanity should gamble arbitrarily much on the immortality of the current generation” and “assume AI could provide immortality and nothing else can”, the answer of course becomes “humanity should gamble everything on AI providing immortality”. The rest is just fancy dress-up work.
Like, I think a persuasive or reasonable paper would have put its central load-bearing assumptions up front. It might still be worth chasing out the implications of the assumptions, but this is such a weird set of modeling assumptions that explaining how sensitive the conclusion is to this modeling assumption is the central thing a reader needs to understand when reading this. I think almost no reader would naively understand this, and indeed the conclusion seems to actively contradict the perspective, with most sentences not being phrased in the form “if you had belief X, then Y would make sense” but just directly phrased in pithy ways like “The appropriate analogy for the development of superintelligence is not Russian roulette but surgery for a serious condition that would be fatal if left untreated.”
Like, I think a persuasive or reasonable paper would have put its central load-bearing assumptions up front.
More up front than in the title?
it obviously shouldn’t be the basis of societal decision-making, and luckily also isn’t”.
Societal decision-making typically uses a far narrower basis than the general person-affecting stance that this paper analyzes. For example, not only do voters and governments usually not place much weight on not-yet-conceived people that might come into existence in future millennia, but they care relatively little about what happens to currently existing persons in other countries.
It’s not in the title, which is “Optimal Timing for Superintelligence: Mundane Considerations for Existing People”. My guess is you were maybe hoping that people would interpret “considerations for existing people” to be equivalent to “person-affecting views”, but that IMO doesn’t make any sense. A person-affecting assumption is not anywhere close to equivalent to “considerations for existing people”.
Existing people care about the future, and the future of humanity! If existing people (including me) didn’t care about future people, then the person-affecting view would indeed be correct, but people do!
For example, not only do voters and governments usually not place much weight on not-yet-conceived people that might come into existence in future millennia, but they care relatively little about what happens to currently existing persons in other countries.
Voters and governments put enormous weight on not-yet-conceived people! The average planning horizon for climate change regulation is many decades in the future. Nuclear waste management policies are expected to contain waste for hundreds of years. If anything, I think current governance tends to put far too much weight on the future relative to their actual ability to predict the future (as indeed, I expect neither nuclear waste nor climate change to still be relevant when they are forecasted to have large impacts).
It’s true that governments care less about what happens to people outside of their country, but that just seems like an orthogonal moral issue. They do very much care about their own country and routinely make plans that extend beyond the median life-expectancy of the people within their country (though usually this is a bad idea because actually they aren’t able to predict the future well enough to make plans that far out, but extinction risk is one of the cases where you can actually predict what will happen that far out, because you do know that you don’t have a country anymore if everyone in your country is dead).
Caring about future generations seems common, if not practically universal, in policymaking. All the variance in why policymaking tends to focus on short-term effects is explained by the fact that the future is hard to predict, not a lack of caring by governance institutions about the future of their countries or humanity at large. But that variance simply doesn’t exist for considering extinction risks. Maybe you have some other evidence that convinced you that countries and policy-makers operate on person-affecting views?
I am very confident that if you talk to practically any elected politician and ask them “how bad is it for everyone in the world to become infertile but otherwise they would lead happy lives until their deaths?”, their reaction would be “that would be extremely catastrophic and bad, humanity would be extinct soon, that would be extremely terrible” (in as much as you can get them to engage with the hypothetical seriously, which is of course often difficult).
The average planning horizon for climate change regulation is many decades in the future. Nuclear waste management policies are expected to contain waste for hundreds of years. … Maybe you have some other evidence that convinced you that countries and policy-makers operate on person-affecting views?
Of course they don’t consistently operate on any specific moral view. But I would claim that they are less badly approximated by ‘benefit currently existing citizens’ than ‘neutrally benefit all possible future people (or citizens) that might be brought into existence over future eons’. Much less is spent on things like nuclear waste management and preventing climate change than on providing amenities for the current population. In fact, they may be spending a net negative amount of resources on trying to benefit future generations, since they are often saddling future generations with vast debt burdens in order to fund current consumption. (FHI—particularly Toby Ord—was involved in some efforts to try to infuse a little bit more consideration of future generations in UK policymaking, but I think only very limited inroads were made on that front.)
Yep, I am definitely not saying that current governance cares about future people equally as they do for current people! (My guess is I don’t either, but I don’t know, morality is tricky and I don’t have super strong stances on population ethics)
But “not caring equally strongly about future people” and “being indifferent to human extinction as long as everyone alive gets to spend the rest of their days happy” are of course drastically different. You are making the second assumption in the paper, which even setting aside whether it’s a reasonable assumption on moral grounds, is extremely divorced from how humanity makes governance decisions (and even more divorced from how people would want humanity to make policy decisions, which would IMO be the standard to aspire to for a policy analysis like this).
In other papers (e.g. Existential Risks (2001), Astronomical Waste (2003), and Existential Risk Prevention as a Global Priority (2013)) I focus mostly on what follows from a mundane impersonal perspective. Since that perspective is even further out of step with how humanity makes governance decisions, is it your opinion that those papers should likewise be castigated? (Some people who hate longtermism have done so, quite vehemently.) But my view is that there can be value in working out what follows from various possible theoretical positions, especially ones that have a distinguished pedigree and are taken seriously in the intellectual tradition. Certainly this is a very standard thing to do in academic philosophy, and I think it’s usually a healthy practice.
Since that perspective is even further out of step with how humanity makes governance decisions, is it your opinion that those papers should likewise be castigated?
I am not fully sure what you are referring to by “mundane impersonal perspective”, but I like all of those papers. I both think they are substantially closer to capturing actual decision-making, and also are closer to what seems to me like good decision-making. They aren’t perfect (I could critique them as well), but my relationship to the perspective brought up in those papers is not the same as I would have to the sociopathic example I mention upthread, and I don’t think they have that many obvious reductio-ad-absurdum cases that obviously violate common-sense morality (and I do not remember these papers advocating for those things, but it’s been a while since I read them).
But my view is that there can be value in working out what follows from various possible theoretical positions, especially ones that have a distinguished pedigree and are taken seriously in the intellectual tradition. Certainly this is a very standard thing to do in academic philosophy, and I think it’s usually a healthy practice.
Absolutely agree there is value in mapping out these kinds of things! But again, your paper, to me, really unambiguously does not maintain the usual “if X then Y” structure. It repeatedly falls back into making statements from an all-things-considered viewpoint, using the person-affecting view as a load-bearing argument in those statements (I could provide more quotes of it doing so).
And then separately, the person-affecting view just really doesn’t seem very interesting to me as a thing to extrapolate. I don’t know why you find it interesting. It seems to me like an exceptionally weak starting point with obvious giant holes in its ethical validity, that make exploring its conclusions much less interesting than the vast majority of other ethical frameworks (like, I would be substantially more interested in a deontological analysis of AI takeoff, or a virtue ethical analysis of AI risk, or a pragmatist analysis, all of which strike me as more interesting and more potentially valid starting point than person-affecting welfare-utilitarianism).
And then beyond that, even if one were to chase out these implications, it seems like a huge improvement to include an analysis of the likelihood of the premises of the perspective you are chasing out, and how robust or likely to be true they are. It has been a while since I read the papers you linked, but at least some of them spend much of their length arguing for and evaluating the validity of the ethical assumptions behind caring about the cosmic endowment. Your most recent paper seems much weaker on this dimension (though my memory might be betraying me, and it’s plausible I would have the same criticism if I were to read your past work, though even then, arguing from the basis of an approximately correct premise, even if the premise is left unevaluated, is clearly better than arguing from the basis of an IMO obviously incorrect premise, without evaluating it as such).
And then separately, the person-affecting view just really doesn’t seem very interesting to me as a thing to extrapolate.
[didn’t read original, just responding locally] My impression is that often people do justify AGI research using this sort of view. (Do you disagree?) That would make it an interesting view to extrapolate, no?
Yeah, and as I said, I think choosing the person-affecting view for this seems about as useful as choosing the “sociopath willing to kill every person in their path to get to their goal” point of view.
I really disagree. See here (or some of Michael St Jules’ comments in the same thread) for why person-affecting views aren’t “obviously dumb,” as some people seem to think. Or just read the flow-chart towards the end of this post (no need to read the whole post, the flow-chart will give you the basics of the idea.) More relevantly here, they are certainly not selfish, and it reflects poorly on you to make that insinuation. People who dedicate much of their life to EA hold them sincerely as a formalization of what it means to be “maximally altruistic.” It’s just not at all obvious that creating new happy people when there’s suffering in the world is the altruistic thing to do, and you’re begging the question by pretending it’s universally obvious.
I think what’s a lot more fair as a criticism is Jan Kulveit’s point: that person-affecting views combined with only crude welfare utilitarianism—without even factoring in things like existing people wanting their children to survive or wanting humanity as a whole to survive (though maybe less so now with recent world events making more people disappointed in humanity) -- is a weird and unrepresentative combination that would amount to basically only acting on the views of selfish people.
(And maybe that’s what drove your harsh judgment and you’d be more lenient if the paper made person-affecting assumptions that still put some indirect value on humanity surviving through the step of, “if Oliver as an existing person strongly cares about humanity’s long-run survival, then on a person-affecting view that gives weight to people’s life goals, humanity surviving the long run now gets +1 votes.” Like, maybe you’d still think such a view seems dumb to you, but you wouldn’t feel like it’s okay to proclaim that it’s socially inappropriate for others to have such views.)
But even that—the combination of views Jan Kulveit criticized—seems defensible to me to write a paper about, as long as the assumptions are clearly laid out. (I haven’t read the paper, but I asked Claude to tell me the assumptions it’s based on, and Claude seemed to get it correct in 4 seconds.) Bostrom said the points in this paper could be a building block, not that it’s his view of all we should consider. This sort of thing is pretty standard in philosophy, so much so that it often doesn’t even need to be explicitly stated and proactively contextualized at length, and I think we should just decouple better.
Instead of saying things like “this is a bad paper,” I feel like the fairer criticism would be something more like, “unfortunately we live in an age of twitter where stupid influencers take things out of context, and you could have foreseen that and proactively prevented certain obvious misinterpretations of the paper.” That would be fair criticism, but it makes clear that the paper might still be good for what it aims to be, and it at least puts part of the blame on twitter culture.
On the substance-level, btw, one “arcane consideration” that I would put a lot of weight on, even on person-affecting views, is stuff like what Bostrom talks about in the Cosmic Host paper. (It matters on person-affecting views because other civs in the multiverse exist independently of our actions.) In that paper too I get weird accelerationist vibes and I don’t agree with them either. I think pretty strongly that this cosmic host stuff is an argument for quality over speed when it comes to introducing a new technologically mature civilization to the multiverse-wide commons. It’s pretty bad form to bring an antisocial kid to a birthday party when you could just take extra efforts to first socialize the kid. If a planet just recklessly presses “go” on something that is 80% or 95% likely to be uncontrolled Moloch-stuff/optimization, that’s really shitty. Even if we can’t align AIs to human values, I feel like we at least have a duty to make them good at the building of peaceful coalitions/it being an okay thing to add to the cosmic host.
I think what’s a lot more fair as a criticism is Jan Kulveit’s point: that person-affecting views combined with only crude welfare utilitarianism
Yep, by person-affecting views I here meant person-affecting welfare utilitarianism views. Sorry for the confusion! I don’t think what I said was super unclear in the context of the criticism of the paper (and even in general, as person-affecting view in my experience almost exclusively gets used in the context of welfare utilitarianism).[1]
I haven’t read the paper
Please read the paper before you criticize my criticism of it then! The paper repeatedly makes claims about optimal policy in an uncaveated fashion, saying things like “The appropriate analogy for the development of superintelligence is not Russian roulette but surgery for a serious condition that would be fatal if left untreated.”
That sentence has no “ifs” or “buts” or anything. It doesn’t say “from the perspective of a naive welfare utilitarian taking a person-affecting view, the appropriate analogy for the development of superintelligence is...”. It just says “the appropriate analogy is...”.
It’s clear the paper is treating a welfare utilitarian person-affecting view as a reasonable guide to global policy decisions. The paper does not spend a single paragraph talking about limitations of this view, or explaining why one might not want to take it seriously. If this is common in philosophy, then it is bad practice and I don’t want it repeated here.
And I also don’t buy this is a presentational decision. I am pretty (though not overwhelmingly) confident that Nick does think that this person-affecting moral view should play a major role in the moral parliament of humanity, and that the arguments in the paper are strong arguments for accelerating in many worlds where risks are very high but not overwhelmingly high. Do you want to bet with me on this not being the case? And I think doing so is making a grave mistake, and the paper is arguing many people straightforwardly into the grave mistake.
If you take a preference utilitarianism perspective, then I’d defend that the optimal morality actually only cares about the extrapolated preferences of exactly me! Moral realism seems pretty obviously false, so a person-affecting preference utilitarianism perspective seems also pretty silly, though I do think that for social coordination reasons, optimizing the preferences of everyone alive (and maybe everyone who was alive in the past) is the right choice, but for me that’s contingent on what will cause the future to go best by my own values.
Sorry for the confusion! I don’t think what I said was super unclear in the context of the criticism of the paper (and even in general, as person-affecting view in my experience almost exclusively gets used in the context of welfare utilitarianism).[1]
I see why you have that impression. (I feel like this is an artefact of critics of person-affecting views tending to be classical welfare utilitarians quite often, and they IMO have the bad habit of presenting opposing views inside their rigid framework and then ridiculing them for seeming silly under those odd assumptions. I would guess that most people who self-describe as having some sort of person-affecting view care very much about preferences, in one way or another.)
Please read the paper before you criticize my criticism of it then! The paper repeatedly makes claims about optimal policy in an uncaveated fashion, saying things like “The appropriate analogy for the development of superintelligence is not Russian roulette but surgery for a serious condition that would be fatal if left untreated.”
That’s fair, sorry!
It bothered me that people on twitter didn’t even acknowledge that the paper explicitly bracketed a lot of stuff and laid out its very simplistic assumptions, but then I updated too far in the direction of “backlash lacked justification.”
And I think doing so is making a grave mistake, and the paper is arguing many people straightforwardly into the grave mistake.
I agree it would be a mistake to give it a ton of weight, but I think this view deserves a bit of weight.
Indirectly related to that, I think some of the points people make of the sort of “if you’re so worried about everyone dying, let’s try cryonics” or “let’s try human enhancement” are unfortunately not very convincing. I think that “everything is doomed unless we hail mary bail ourselves out with magic-like AI takeoff fixing it all for us” is unfortunately quite an accurate outlook. (I’m still open to being proven wrong if suddenly a lot of things were to get more hopeful, though.) Civilization has seemed pretty fucked even just a couple of years ago, and it hasn’t gotten any better more recently. Still, on my suffering-focused views, that makes it EVEN LESS appealing that we should launch AI, not more appealing.
To be clear, I agree that it’s a failure mode to prematurely rule things out just because they seem difficult. And I agree that it’s insane to act as though global coordination to pause AI is somehow socially or politically impossible. It clearly isn’t. I think pausing AI is difficult but feasible. I think “fixing the sanity of civilization so that you have competent people in charge in many places that matter” seems much less realistic? Basically, I think you can build local bubbles of sanity around leaders with the right traits and groups with the right culture, but it’s unfortunately quite hard given human limitations (and maybe other aspects of our situation) to make these bubbles large enough to ensure things like cryonics or human enhancement go well for many decades without somehow running into a catastrophe sooner or later. (Because progress moves onwards in certain areas even with an AI pause.)
I’m just saying that, given what I think is the accurate outlook, it isn’t entirely fair to shoot down any high-variance strategies with “wtf, why go there, why don’t we do this other safer thing instead ((that clearly isn’t going to work))?”
If I didn’t have suffering-focused values, I would be sympathetic to the intuition of “maybe we should increase the variance,” and so, on an intellectual level at least, I feel like Bostrom deserves credit for pointing that out.
But I have a suffering-focused outlook, so, for the record, I disagree with the conclusions. Also, I think even based on less suffering-focused values, it seems very plausible to me that civilizations that don’t have their act together enough to proceed into AI takeoff with coordination and at least a good plan, shouldn’t launch AI at all. It’s uncooperative towards possibly nearby other civilizations or towards the “cosmic host.” Bostrom says he’s concerned about scenarios where superintelligence never gets built. It’s not obvious to me that this is very likely, though, so if I’m right that Earth would rebuild even after a catastrophe, and if totalitarianism or other lock-ins without superintelligence wouldn’t last all that long before collapsing in one way or another, then there’s no rush from a purely longtermist perspective. (I’m not confident in these assumptions, but I partly have these views from deferring to former FHI staff/affiliates, of all people (on the rebuilding point).)
I’m just saying that, given what I think is the accurate outlook, it isn’t entirely fair to shoot down any high-variance strategies with “wtf, why go there, why don’t we do this other safer thing instead ((that clearly isn’t going to work))?”
While I disagree with your outlook[1], I agree that we shouldn’t dismiss high variance strategies lightly. I am not criticizing the paper on the grounds of the policy it advocates. If someone were to write a paper that had as shaky foundations, and treated those foundations with as little suspicion as this paper, I would react the same way (e.g. if someone wrote a paper arguing against developing AI for job loss reasons, without once questioning whether job loss is actually bad, I would object on similar grounds).
Bostrom says he’s concerned about scenarios where superintelligence never gets built.
That is also a concern I have much more sympathy towards than this paper. I think it’s quite unlikely, but I can see the argument. I don’t feel that way about the arguments in this paper.
indeed, I think in the absence of developing AI we would quickly develop alternative, much safer technologies which would most likely cause humanity to very substantially become better at governing itself, and to navigate the future reasonably
On the substance-level, btw, one “arcane consideration” that I would put a lot of weight on, even on person-affecting views, is stuff like what Bostrom talks about in the Cosmic Host paper. … Even if we can’t align AIs to human values, I feel like we at least have a duty to make them good at the building of peaceful coalitions/it being an okay thing to add to the cosmic host.
Yes, although, as that paper discusses, speed may also be important insofar as it reduces the risk of us failing to add anything at all, since that’s also something the cosmic host may care about—the risk that we fail ever to produce superintelligence. (My views about those things are quite tentative, and they fall squarely into the ‘arcane’. I agree on their importance.)