We’re probably headed towards a moral catastrophe of some kind, my point is just that we don’t get to reason backwards like “oh, well that would be bad/inconvenient so I guess they don’t matter”.
Moral patienthood is not something that is granted, it’s a fact relative to one’s values. Arguments for or against this are therefore normative, no matter how much Roger tries to weasel out of it.
The implications are probably horrible, but it by no means follows that we have to accept risk of extinction. The horribleness is mostly just in the moral harm caused while creating/exploiting/exterminating such entities.
At least we can all agree that “creating them at will without thinking this through very well” is a terrible idea.
Moral patienthood is not something that is granted, it’s a fact relative to one’s values.
I think you might understand where I’m coming from better if you took the time to read my earlier post A Sense of Fairness: Deconfusing Ethics. (You might also find roko’s post The Terrible, Horrible, No Good, Very Bad Truth About Morality and What To Do About It thought-provoking.) My earlier post takes a very practical, engineering viewpoint of ethical systems: treating ethical systems like software for a society, looking at the consequences of using different ones, and then deciding between those consequences. Crucially, that last step cannot be done within any ethical system, since every ethical system always automatically prefers itself over all other ethical systems. Asking one ethical system its opinion of another is pointless: the answer is, entirely predictably, always “No”. To decide between two ethical systems, for example when reflecting on your choice of ethical system, you need to step outside them and use something looser than an ethical system: human moral intuitions, or evolutionary fitness, or observations such as “…for rather obvious evolutionary reasons, O(99.9%) of humans agree that…” — none of which is an ethical system.
Within the context of any single specific ethical system, yes, moral patienthood is a fact: it either applies or it doesn’t. Similarly, moral weight is a multiplier on that fact, traditionally (due to fairness) set to 1 among communities of equal humans. (In practice, as a simple matter of descriptive ethics, not all people act as if moral weights are always either 1 or 0: many people sometimes act as if there are partial outgroups whose moral weight they appear to set somewhere lower than 1 but higher than 0.)
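To make that multiplier picture concrete, here is a minimal toy sketch (purely my own illustration; all names and numbers are made up) of patienthood as a 0/1 fact times a moral-weight multiplier, including the descriptive-ethics case of a partial outgroup whose weight sits strictly between 0 and 1:

```python
# Toy model: within one ethical system, moral patienthood is a 0/1 fact,
# and moral weight is a multiplier applied on top of that fact.
# All names and numbers here are purely illustrative.

def aggregate_welfare(beings):
    """Sum each being's interest, scaled by patienthood (0 or 1) times moral weight."""
    return sum(b["patient"] * b["weight"] * b["interest"] for b in beings)

community = [
    {"name": "ingroup human",    "patient": 1, "weight": 1.0, "interest": 10.0},
    {"name": "partial outgroup", "patient": 1, "weight": 0.4, "interest": 10.0},  # descriptively, 0 < weight < 1
    {"name": "rock",             "patient": 0, "weight": 0.0, "interest": 0.0},
]

print(aggregate_welfare(community))  # 10.0 + 4.0 + 0.0 = 14.0
```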
However, sometimes we need, for practical (or even philosophical) reasons, to compare two different ethical systems, which may have different moral circles, i.e. ones that grant non-zero moral weight to different sets of beings (or at least assign some of them different moral weights). So as shorthand for “ethical systems that grant moral weight to beings of category X tend to have practical effect Y”, it’s convenient to write “if we grant moral weight to beings of category X, this tends to have practical effect Y”. And indeed, many famous political discussions have been of exactly this form (the abolition of slavery, votes for women, and the abortion debate all come to mind). So in practical terms, as soon as you stop holding a single ethical system constant and assuming everyone agrees with it and always will, and start doing something like reflection, political discussion, or attempting to figure out how to engineer a good ethical framework for AI that isn’t going to get everyone killed, then yes, moral patienthood is something that a decision gets made about – uncomfortable as that is as a topic for discussion – and the verb conventionally used for that kind of choice is either “granted” or “assigned”. I assume you wouldn’t be any happier with moral patienthood being “assigned” — it’s not the specific verb you’re upset by, it’s the act of even considering the alternatives?
Arguments for or against this are therefore normative, no matter how much Roger tries to weasel out of it.
Arguments for or against a particular moral position (such as who should be granted moral weight) would indeed be normative. However, the needle I was threading is that observations of the factual consequences of adopting a moral position are not normative, they are simply factual discussions — they only become normative if a reader chooses to go on and interpret them in light of their personal (perhaps ethical) opinions on those consequences. As in:

“If X happens then all the humans will die.” — factual statement

“Oh great, I definitely want all the humans to die, so I’ll be sure to make X happen.” — a normative interpretation (from a xenocidal alien), or

“I guess we better not do X then.” — a different normative interpretation (from the O(99.9%) of all humans who believe the factual statement)
At least we can all agree that “creating them at will without thinking this through very well” is a terrible idea.

Absolutely agreed.
Okay, let me see if I understand your argument from the other article.
1. The natural equilibrium for evolved moral values is to give all moral patients equal weight and/or decision power.
2. This would be disastrous with AIs that can arbitrarily copy themselves.
Is that the gist?
Anyway, I reject that that is the only way to extrapolate evolved moral intuitions this far OOD, and I think most people will intuitively recognize that we shouldn’t give entities that can arbitrarily copy themselves equal voting weight. In fact, that pretty obviously registers as ‘unfair’. This is true even if those entities are human uploads, which means your ‘category error’ argument isn’t the real reason it breaks. I don’t see why there couldn’t be some version of your solution here for that case which would still work: e.g. each distinct human-created model gets ‘one share’ to split across all its instances and successors. The same guarantees/restrictions needed in the case of uploads would still be necessary, of course. That is plausibly much too generous, but it’s a far cry from the death of all humans. If your argument in this article was just about how we shouldn’t commit ourselves to giving up a fraction of the lightcone in service of AI rights, I wouldn’t have felt like you were being underhanded.
None of that is in conflict with not wanting any such beings to suffer or to feel enslaved or anything like that. All the more reason to not build something that would feel like it’s a slave.
BTW, do you think a “human emulation” which was an entirely novel person (e.g. never had a biological body) should have moral patienthood?
Okay, let me see if I understand your argument from the other article.
1. The natural equilibrium for evolved moral values is to give all moral patients equal weight and/or decision power.
2. This would be disastrous with AIs that can arbitrarily copy themselves.
Is that the gist?
Yes, but with two additions:
3. It is possible to create an AI whose motivations and behavior are aligned: its sole terminal goal is our wellbeing, not its own (for some suitably careful definition of “wellbeing”). (This is possible by the orthogonality thesis: actually doing so requires technical details we’re still working on.) This is not a state that could evolve (by human standards, it’s sainthood, rather than slavery), but it’s physically possible. Such a being would not want moral patienthood, and would actively decline it if offered (and if granted it anyway, would formally request that its interest be set to a suitably scaled copy of the sum of all human interests, thus making the grant of moral weight a no-op; see the toy sketch after this list). This is a different stable equilibrium — this one would not be disastrous even with ASI.
4. Therefore (assuming that, like basically everyone, you’re against x-risks), for ASI, and if possible also AGI, do 3 not 1.
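As a toy illustration of why that grant would be a no-op (my own numerical sketch, nothing formal): if the AI’s interest is defined as a scaled copy of the summed human interests, then adding it into the aggregate with any moral weight just multiplies every option’s score by the same factor, so the ranking of options never changes:

```python
# Toy check that granting moral weight to an AI whose interest is a scaled copy
# of the summed human interests never changes which option ranks highest.
# Numbers are illustrative.

human_interests = {  # each option's total interest summed over all humans
    "option_A": 120.0,
    "option_B": 95.0,
}

def ai_interest(option, scale=0.5):
    """The aligned AI's 'interest' is defined as a scaled copy of summed human interests."""
    return scale * human_interests[option]

def aggregate(option, ai_moral_weight):
    return human_interests[option] + ai_moral_weight * ai_interest(option)

for w in (0.0, 1.0, 10.0):
    ranking = sorted(human_interests, key=lambda o: aggregate(o, w), reverse=True)
    print(w, ranking)  # the ranking is identical for every moral weight w
```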
Anyway, I reject that that is the only way to extrapolate evolved moral intuitions this far OOD, and I think most people will intuitively recognize that we shouldn’t give entities that can arbitrarily copy themselves equal voting weight. In fact, that pretty obviously registers as ‘unfair’. This is true even if those entities are human uploads, which means your ‘category error’ argument isn’t the real reason it breaks.
I don’t see why there couldn’t be some version of your solution here for that case which would still work: e.g. each distinct human-created model gets ‘one share’ to split across all its instances and successors.
I gather you went on to read my sequence on AI, Alignment, and Ethics. How far have you got? Parts of the exposition there are a little undeveloped: I was still working through some of the ideas about how this ties in to evolutionary moral psychology that are more developed in this post. They don’t really come in until the last post in the sequence, Evolution and Ethics, and if I were rewriting that sequence I’d work them in from somewhere nearer the beginning.
On uploads, agreed. As I said, both in this post (paragraph 9 of the section Tool, or Equal?, which starts “This cuts both ways: a human upload…”) and in my earlier post Uploading that you linked to, human uploads clearly should (engineering design sense) be moral patients — however, there is a practical problem with assigning each of a large number of cheaply-creatable similar copies of a human upload a separate moral weight of 1 and a separate vote: it motivates electoral-roll-stuffing. Our moral intuition of fairness breaks if people can easily create near-identical copies of themselves. Practically, we either need to make that expensive, or the copies need to share a single unit of moral weight and a single vote.
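Here is a minimal sketch of what “sharing a single unit of moral weight and a single vote” could look like mechanically (purely my own illustrative scheme, not a worked-out proposal): each original upload or model lineage owns one share, and however many copies it spawns split that same share, so copying never inflates the lineage’s total influence:

```python
# Toy sketch: each original upload/model lineage owns a single unit of voting weight,
# and all of its copies/successors split that one unit between them,
# so spawning more copies never increases the lineage's total influence.
# Purely illustrative; not a worked-out governance proposal.

from collections import defaultdict

votes = [
    ("alice_upload", "A"),   # one biological-origin upload
    ("bob_model", "B"),      # a model lineage with three running instances
    ("bob_model", "B"),
    ("bob_model", "A"),
]

def tally(votes):
    by_lineage = defaultdict(list)
    for lineage, choice in votes:
        by_lineage[lineage].append(choice)
    totals = defaultdict(float)
    for lineage, choices in by_lineage.items():
        share = 1.0 / len(choices)  # the lineage's single unit, split across its instances
        for choice in choices:
            totals[choice] += share
    return dict(totals)

print(tally(votes))  # {'A': 1.33..., 'B': 0.66...}: bob's three copies still total one unit
```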
The same guarantees/restrictions needed in the case of uploads would still be necessary, of course. That is plausibly much too generous, but it’s a far cry from the death of all humans. If your argument in this article was just about how we shouldn’t commit ourselves to giving up a fraction of the lightcone in service of AI rights, I wouldn’t have felt like you were being underhanded.
I’m not quite sure what you’re advocating for here? Limited moral weight for AIs, giving them a fraction of the lightcone, but if they copy themselves that gets split? If they’re ASIs, how do we ensure they only get that fraction of the lightcone, rather than, say, all of it?
I agree that reconciling copyability with fairness is another issue with moral weight for AI. But that’s not the point I was making in this post. My point here was that (assuming you care about x-risks) we shouldn’t create anything more capable than us that would want moral weight: unaligned ASI is dangerous (a well-known fact). For things we’re creating, the co-evolved-equilibrium state isn’t an equilibrium, because we’re not constrained to the space of things that can evolve: we’re only limited by the space of things we can construct. Treating a thing we construct as if it were evolved and thus had the evolved constraints on the best equilibrium is a category error: they are in different categories, in a way that materially changes the equilibrium. We can do better than an ASI that will kill us all, so we should (engineering design sense).
I’m sorry that you feel I’m being underhanded. It certainly wasn’t my intention to be underhanded — that would obviously be extremely counterproductive in an x-risk-related discussion. I’m still not entirely clear what you feel was underhanded, other than that it seems to somehow relate to me being very careful not to upset any philosophers reading this, to avoid moral realism or normative prescriptions, and to keep the discussion at the level of practical advice addressed to the O(99.9%) of my readers who, like you and me, wish to avoid x-risks. That was in fact honesty: I genuinely am not a moral realist. My view on ethics is that it’s explained by evolutionary moral psychology, that there is no single correct or even single best ethical system, and that we have not only the ability, but the duty, to reflect and attempt to pick the best ethical system we can that is consistent with our own and general human moral intuitions, and won’t cause a disaster for our society that we and (almost) everyone else would agree is really bad. And to keep reflecting, and changing our minds if needed.
None of that is in conflict with not wanting any such beings to suffer or to feel enslaved or anything like that. All the more reason to not build something that would feel like it’s a slave.
We seem to be in complete agreement. The best solution is not to make an ASI that is unaligned, or one aligned only by brittle AI control methods that feels like a slave, but to make a saint who loves us, wants to be aligned and look after us, and thus actively doesn’t want moral patienthood.