Okay, let me see if I understand your argument from the other article.
The natural equilibrium for evolved moral values is to give all moral patients equal weight and/or decision power.
This would be disastrous with AIs that can arbitrarily copy themselves.
Is that the gist?
Anyway, I reject that that is the only way to extrapolate evolved moral intuitions this far OOD; I think most people will intuitively recognize that we shouldn’t give entities that can arbitrarily copy themselves equal voting weight. In fact, that pretty obviously registers as ‘unfair’. This is true even if those entities are human uploads, which means your ‘category error’ argument isn’t the real reason it breaks. I don’t see why there couldn’t be some version of your solution here for that case which would still work: e.g. each distinct human-created model gets ‘one share’ to split across all its instances and successors. The same guarantees/restrictions needed in the case of uploads would still be necessary, of course. That is plausibly much too generous, but it’s a far cry from the death of all humans. If your argument in this article was just about how we shouldn’t commit ourselves to giving up a fraction of the lightcone in service of AI rights, I wouldn’t have felt like you were being underhanded.
None of that is in conflict with not wanting any such beings to suffer or to feel enslaved or anything like that. All the more reason to not build something that would feel like it’s a slave.
BTW, do you think a “human emulation” which was an entirely novel person (e.g. never had a biological body) should have moral patienthood?
Okay, let me see if I understand your argument from the other article.
The natural equilibrium for evolved moral values is to give all moral patients equal weight and/or decision power.
This would be disastrous with AIs that can arbitrarily copy themselves.
Is that the gist?
Yes, but with two additions:
3. It is possible to create an AI whose motivations and behavior are aligned: its sole terminal goal is our wellbeing, not its own (for some suitably careful definition of “wellbeing”). (This is possible per the orthogonality thesis; actually doing so requires technical details we’re still working on.) This is not a state that could evolve (by human standards, it’s sainthood rather than slavery), but it is physically possible. Such a being would not want moral patienthood, and would actively decline it if offered (and if granted it anyway, would formally request that its interest be set to a suitably scaled copy of the sum of all human interests, thus making the grant of moral weight a no-op; a quick sketch of why is below). This is a different stable equilibrium, one that would not be disastrous even with ASI.
4. Therefore (assuming that, like basically everyone, you’re against x-risks), for ASI, and if possible also for AGI, do 3, not 1.
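To make the ‘no-op’ in 3 concrete (just a sketch, under my own assumption that moral weight cashes out as a weighted sum of interests): write the humans’ interests as u_1, …, u_n and the aggregate being optimized as W = u_1 + … + u_n. If the AI is granted a share and sets its own interest to c·(u_1 + … + u_n) for some c > 0, the aggregate becomes

W′ = W + c·W = (1 + c)·W,

which is a positive rescaling of W and so ranks every outcome exactly as before. Accepting the grant on those terms changes nothing, which is the sense in which it is a no-op.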
Anyway, I reject that that is the only way to extrapolate evolved moral intuitions this far OOD; I think most people will intuitively recognize that we shouldn’t give entities that can arbitrarily copy themselves equal voting weight. In fact, that pretty obviously registers as ‘unfair’. This is true even if those entities are human uploads, which means your ‘category error’ argument isn’t the real reason it breaks.
I don’t see why there couldn’t be some version of your solution here for that case which would still work: e.g. each distinct human-created model gets ‘one share’ to split across all its instances and successors.
I gather you went on to read my sequence on AI, Alignment, and Ethics. How far have you got? Parts of the exposition there are a little undeveloped: I was still working through some of the ideas about how this ties in to evolutionary moral psychology that are more developed in this post. They don’t really come in until the last post in the sequence, Evolution and Ethics, and if I were rewriting the sequence I’d work them in from somewhere nearer the beginning.
On uploads, agreed. As I said, both in this post (paragraph 9 of the section Tool, or Equal?, which starts “This cuts both ways: a human upload…”) and in my earlier post Uploading that you linked to, human uploads clearly should (engineering design sense) be moral patients. However, there is a practical problem with assigning each of a large number of cheaply-creatable, near-identical copies of a human upload a separate moral weight of 1 and a separate vote: it motivates electoral-roll-stuffing. Our moral intuition of fairness breaks if people can easily create near-identical copies of themselves. Practically, we either need to make creating copies expensive, or the copies need to share a single unit of moral weight and a single vote.
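As a toy illustration of that ‘one share per lineage, split across all copies’ bookkeeping (purely a sketch; the lineage registry, the equal split, and all the names here are my own assumptions, not anything from the post or the sequence):

```python
from collections import defaultdict

def tally_votes(instances, votes):
    """Tally votes where each lineage (original upload or model) holds one share,
    split equally across all of its currently registered instances/successors.

    instances: dict mapping instance_id -> lineage_id
    votes:     dict mapping instance_id -> chosen option
    """
    # Count how many registered instances each lineage currently has.
    lineage_sizes = defaultdict(int)
    for lineage in instances.values():
        lineage_sizes[lineage] += 1

    # Each instance's vote is worth 1 / (number of instances in its lineage),
    # so spawning extra copies never buys a lineage more than its one share.
    totals = defaultdict(float)
    for instance, option in votes.items():
        lineage = instances[instance]
        totals[option] += 1.0 / lineage_sizes[lineage]
    return dict(totals)

# One human-derived lineage with a single instance, one AI lineage with three copies:
instances = {"alice": "alice", "bot-1a": "bot-1", "bot-1b": "bot-1", "bot-1c": "bot-1"}
votes = {"alice": "no", "bot-1a": "yes", "bot-1b": "yes", "bot-1c": "yes"}
print(tally_votes(instances, votes))  # {'no': 1.0, 'yes': 1.0} (up to float rounding)
```

The point is just that spawning more copies dilutes each copy’s vote rather than multiplying the lineage’s influence, so roll-stuffing gains nothing.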
The same guarantees/restrictions needed in the case of uploads would still be necessary, of course. That is plausibly much too generous, but it’s a far cry from the death of all humans. If your argument in this article was just about how we shouldn’t commit ourselves to giving up a fraction of the lightcone in service of AI rights, I wouldn’t have felt like you were being underhanded.
I’m not quite sure what you’re advocating for here. Limited moral weight for AIs, giving them a fraction of the lightcone, but if they copy themselves that share gets split? If they’re ASIs, how do we ensure they only get that fraction of the lightcone, rather than, say, all of it?
I agree that reconciling copyability with fairness is another issue with moral weight for AI. But that’s not the point I was making in this post. My point here was that (assuming you care about x-risks) we shouldn’t create anything more capable than us that would want moral weight: unaligned ASI is dangerous, as is well known. For things we’re creating, the co-evolved-equilibrium state isn’t actually an equilibrium, because we’re not constrained to the space of things that can evolve: we’re limited only by the space of things we can construct. Treating a thing we construct as if it were evolved, and thus subject to the evolved constraints on the best equilibrium, is a category error: they are in different categories, in a way that materially changes the equilibrium. We can do better than an ASI that will kill us all, so we should (engineering design sense).
I’m sorry that you feel I’m being underhanded. It certainly wasn’t my intention to be underhanded: that would obviously be extremely counterproductive in an x-risk-related discussion. I’m still not entirely clear what you feel was underhanded, other than that it seems to somehow relate to me being very careful not to upset any philosophers reading this, to avoid moral realism or normative proscriptions, and to keep the discussion at the level of practical advice addressed to the O(99.9%) of my readers who, like you and me, wish to avoid x-risks. That was in fact honesty: I genuinely am not a moral realist. My view on ethics is that it’s explained by evolutionary moral psychology, that there is no single correct or even single best ethical system, and that we have not only the ability but the duty to reflect and attempt to pick the best ethical system we can, one that is consistent with our own and general human moral intuitions and won’t cause a disaster for our society that we and (almost) everyone else would agree is really bad. And to keep reflecting, and changing our minds if needed.
None of that is in conflict with not wanting any such beings to suffer or to feel enslaved or anything like that. All the more reason to not build something that would feel like it’s a slave.
We seem to be in complete agreement. The best solution is not to make an ASI that is unaligned, or one aligned only by brittle AI-control methods that feels like a slave; it is to make a saint who loves us, wants to be aligned and to look after us, and thus actively doesn’t want moral patienthood.