Yeah, and as I said, I think choosing the person-affecting view for this seems about as useful as choosing the “sociopath willing to kill every person in their path to get to their goal” point of view.
I really disagree. See here (or some of Michael St Jules’ comments in the same thread) for why person-affecting views aren’t “obviously dumb,” as some people seem to think. Or just read the flow-chart towards the end of this post (no need to read the whole post; the flow-chart will give you the basics of the idea). More relevantly here, they are certainly not selfish, and it reflects poorly on you to make that insinuation. People who dedicate much of their lives to EA hold them sincerely as a formalization of what it means to be “maximally altruistic.” It’s just not at all obvious that creating new happy people when there’s suffering in the world is the altruistic thing to do, and you’re begging the question by pretending it’s universally obvious.
I think what’s a lot more fair as a criticism is Jan Kulveit’s point: that person-affecting views combined with only crude welfare utilitarianism—without even factoring in things like existing people wanting their children to survive or wanting humanity as a whole to survive (though maybe less so now with recent world events making more people disappointed in humanity)—is a weird and unrepresentative combination that would amount to basically only acting on the views of selfish people.
(And maybe that’s what drove your harsh judgment, and you’d be more lenient if the paper made person-affecting assumptions that still put some indirect value on humanity surviving through the step of, “if Oliver as an existing person strongly cares about humanity’s long-run survival, then on a person-affecting view that gives weight to people’s life goals, humanity surviving the long run now gets +1 vote.” Like, maybe such a view would still seem dumb to you, but you wouldn’t feel like it’s okay to proclaim that it’s socially inappropriate for others to hold such views.)
But even that—the combination of views Jan Kulveit criticized—seems defensible to me to write a paper about, as long as the assumptions are clearly laid out. (I haven’t read the paper, but I asked Claude to tell me the assumptions it’s based on, and Claude seemed to get it correct in 4 seconds.) Bostrom said the points in this paper could be a building block, not that it’s his view of all we should consider. This sort of thing is pretty standard in philosophy, so much so that it often doesn’t even need to be explicitly stated and proactively contextualized at length, and I think we should just decouple better.
Instead of saying things like “this is a bad paper,” I feel like the fairer criticism would be something more like, “unfortunately we live in an age of twitter where stupid influencers take things out of context, and you could have foreseen that and proactively prevented certain obvious misinterpretations of the paper.” That would be fair criticism, but it makes clear that the paper might still be good for what it aims to be, and it at least puts part of the blame on twitter culture.
On the substance-level, btw, one “arcane consideration” that I would put a lot of weight on, even on person-affecting views, is stuff like what Bostrom talks about in the Cosmic Host paper. (It matters on person-affecting views because other civs in the multiverse exist independently of our actions.) In that paper too I get weird accelerationist vibes, and I don’t agree with them either. I think pretty strongly that this cosmic host stuff is an argument for quality over speed when it comes to introducing a new technologically mature civilization to the multiverse-wide commons. It’s pretty bad form to bring an antisocial kid to a birthday party when you could just take extra efforts to first socialize the kid. If a planet just recklessly presses “go” on something that is 80% or 95% likely to be uncontrolled Moloch-stuff/optimization, that’s really shitty. Even if we can’t align AIs to human values, I feel like we at least have a duty to make them good at building peaceful coalitions/an okay thing to add to the cosmic host.
I think what’s a lot more fair as a criticism is Jan Kulveit’s point: that person-affecting views combined with only crude welfare utilitarianism
Yep, by person-affecting views I here meant person-affecting welfare utilitarianism views. Sorry for the confusion! I don’t think what I said was super unclear in the context of the criticism of the paper (and even in general, as person-affecting view in my experience almost exclusively gets used in the context of welfare utilitarianism).[1]
I haven’t read the paper
Please read the paper before you criticize my criticism of it then! The paper repeatedly makes claims about optimal policy in an uncaveated fashion, saying things like “The appropriate analogy for the development of superintelligence is not Russian roulette but surgery for a serious condition that would be fatal if left untreated.”
That sentence has no “ifs” or “buts” or anything. It doesn’t say “from the perspective of a naive welfare utilitarian taking a person-affecting view, the appropriate analogy for the development of superintelligence is...”. It just says “the appropriate analogy is...”.
It’s clear the paper is treating a welfare utilitarian person-affecting view as a reasonable guide to global policy decisions. The paper does not spend a single paragraph talking about the limitations of this view, or explaining why one might not want to take it seriously. If this is common in philosophy, then it is bad practice and I don’t want it repeated here.
And I also don’t buy that this is a presentational decision. I am pretty (though not overwhelmingly) confident that Nick does think this person-affecting moral view should play a major role in the moral parliament of humanity, and that the arguments in the paper are strong arguments for accelerating in many worlds where risks are very but not overwhelmingly high. Do you want to bet with me on this not being the case? And I think doing so is making a grave mistake, and the paper is arguing many people straightforwardly into the grave mistake.
If you take a preference utilitarianism perspective, then I’d defend the claim that the optimal morality actually only cares about the extrapolated preferences of exactly me! Moral realism seems pretty obviously false, so a person-affecting preference utilitarianism perspective also seems pretty silly. I do think that, for social coordination reasons, optimizing the preferences of everyone alive (and maybe everyone who was alive in the past) is the right choice, but for me that’s contingent on what will cause the future to go best by my own values.
Sorry for the confusion! I don’t think what I said was super unclear in the context of the criticism of the paper (and even in general, as person-affecting view in my experience almost exclusively gets used in the context of welfare utilitarianism).[1]
I see why you have that impression. (I feel like this is an artefact of critics of person-affecting views tending to be classical welfare utilitarians quite often, and they IMO have the bad habit of presenting opposing views inside their rigid framework and then ridiculing them for seeming silly under those odd assumptions. I would guess that most people who self-describe as having some sort of person-affecting view care very much about preferences, in one way or another.)
Please read the paper before you criticize my criticism of it then! The paper repeatedly makes claims about optimal policy in an uncaveated fashion, saying things like “The appropriate analogy for the development of superintelligence is not Russian roulette but surgery for a serious condition that would be fatal if left untreated.”
That’s fair, sorry!
It bothered me that people on twitter didn’t even acknowledge that the paper explicitly bracketed a lot of stuff and laid out its very simplistic assumptions, but then I updated too far in the direction of “backlash lacked justification.”
And I think doing so is making a grave mistake, and the paper is arguing many people straightforwardly into the grave mistake.
I agree it would be a mistake to give it a ton of weight, but I think this view deserves a bit of weight.
Indirectly related to that, I think some of the points people make of the sort of “if you’re so worried about everyone dying, let’s try cryonics” or “let’s try human enhancement” are unfortunately not very convincing. I think that “everything is doomed unless we hail mary bail ourselves out with magic-like AI takeoff fixing it all for us” is unfortunately quite an accurate outlook. (I’m still open to being proven wrong if suddenly a lot of things were to get more hopeful, though.) Civilization seemed pretty fucked even just a couple of years ago, and it hasn’t gotten any better since. Still, on my suffering-focused views, that makes it EVEN LESS appealing that we should launch AI, not more appealing.
To be clear, I agree that it’s a failure mode to prematurely rule things out just because they seem difficult. And I agree that it’s insane to act as though global coordination to pause AI is somehow socially or politically impossible. It clearly isn’t. I think pausing AI is difficult but feasible. I think “fixing the sanity of civilization so that you have competent people in charge in many places that matter” seems much less realistic? Basically, I think you can build local bubbles of sanity around leaders with the right traits and groups with the right culture, but it’s unfortunately quite hard given human limitations (and maybe other aspects of our situation) to make these bubbles large enough to ensure things like cryonics or human enhancement go well for many decades without somehow running into a catastrophe sooner or later. (Because progress moves onwards in certain areas even with an AI pause.)
I’m just saying that, given what I think is the accurate outlook, it isn’t entirely fair to shoot down any high-variance strategies with “wtf, why go there, why don’t we do this other safer thing instead ((that clearly isn’t going to work))?”
If I didn’t have suffering-focused values, I would be sympathetic to the intuition of “maybe we should increase the variance,” and so, on an intellectual level at least, I feel like Bostrom deserves credit for pointing that out.
But I have a suffering-focused outlook, so, for the record, I disagree with the conclusions. Also, I think even based on less suffering-focused values, it seems very plausible to me that civilizations that don’t have their act together enough to proceed into AI takeoff with coordination and at least a good plan shouldn’t launch AI at all. It’s uncooperative towards possibly nearby other civilizations or towards the “cosmic host.” Bostrom says he’s concerned about scenarios where superintelligence never gets built. It’s not obvious to me that this is very likely, though. So if I’m right that earth would rebuild even after a catastrophe, and if totalitarianism or other lock-ins without superintelligence wouldn’t last all that long before collapsing in one way or another, then there’s no rush from a purely longtermist perspective. (I’m not confident in these assumptions, but I partly have these views from deferring to former FHI staff/affiliates, of all people (on the rebuilding point).)
I’m just saying that, given what I think is the accurate outlook, it isn’t entirely fair to shoot down any high-variance strategies with “wtf, why go there, why don’t we do this other safer thing instead ((that clearly isn’t going to work))?”
While I disagree with your outlook[1], I agree that we shouldn’t dismiss high-variance strategies lightly. I am not criticizing the paper on the grounds of the policy it advocates. If someone were to write a paper that had foundations as shaky, and treated those foundations with as little suspicion, as this paper, I would react the same way (e.g. if someone wrote a paper arguing against developing AI for job-loss reasons, without once questioning whether job loss is actually bad, I would object on similar grounds).
Bostrom says he’s concerned about scenarios where superintelligence never gets built.
That is also a concern I have much more sympathy towards than this paper. I think it’s quite unlikely, but I can see the argument. I don’t feel that way about the arguments in this paper.
indeed, I think in the absence of developing AI we would quickly develop alternative, much safer technologies which would most likely cause humanity to very substantially become better at governing itself, and to navigate the future reasonably
On the substance-level, btw, one “arcane consideration” that I would put a lot of weight on, even on person-affecting views, is stuff like what Bostrom talks about in the Cosmic Host paper. … Even if we can’t align AIs to human values, I feel like we at least have a duty to make them good at the building of peaceful coalitions/it being an okay thing to add to the cosmic host.
Yes, although, as that paper discusses, speed may also be important insofar as it reduces the risk of us failing to add anything at all, since that’s also something the cosmic host may care about—the risk that we fail ever to produce superintelligence. (My views about those things are quite tentative, and they fall squarely into the ‘arcane’. I agree on their importance.)