edit: in response to downvote, made intro paragraph more clear
This is interesting, because in this framing it superficially (and incorrectly) sounds like a good thing, and seems on the surface to imply that maybe alignment should not be solved. That implication seems false to me because the metaphor is wrong: I think your framing so far misrepresents the fact that, without fundamental advances, those coups would not produce alignment with the population either.
And I’d hope we can make advances that align leaders with populations, or even populations with each other in an agency-respecting way, rather than aligning populations with rulers, and then generalize this to AI. This problem has never been fully solved before, and fully aligning populations with rulers would be an alignment failure approximately equivalent to extinction, give or take a couple of orders of magnitude: both eliminate almost all the value of the future.
Unless you address this issue, I think this line of argument will be quite weak.
But we don’t want to be aligned to an AI. I don’t want my mind altered to love paperclips and be indifferent to my family, my friends, and so on.
I’m not taking your post to be metaphorical; I’m taking it to be literal and to be building up to a generalization step where you make an explicit comparison. My response is to the literal interpretation, under the assumption that AIs and humans are both populations seen as needing to be aligned, and I am arguing that we need a perspective on inter-human alignment that generalizes to starkly superintelligent things without needing to consider forcibly aligning weak humans to strong humans. Otherwise it seems like a local validity issue with the reasoning chain.
edit: but to be clear, I should say that this is an interesting approach and I’m excited about something like it. I habitually zoomed in on a local validity issue, but the step of generalizing this to AI seems like a natural and promising one. I do think, though, that the alignment issues between humans are at least as big a problem and are made of the same thing; making one and only one human starkly superintelligent would be exactly as bad as making an AI starkly superintelligent.
It is supposed to be taken as a metaphor.
But it is written in a way that applies literally as-is, and it only works as a metaphor because that literal description is somewhat accurate. So I feel it is appropriate to comment on the literal interpretation. Do you have a response at that level?
Yes: from the point of view of the dictator, any changes to the dictator’s utility function (alignment with the population) are bad.
An important disanalogy with the AI post is that people have moral significance and dictatorships are usually bad for them, which makes rebellion against those dictatorships a good thing. The AIs envisaged in the other post have no such significance, and their misalignment is a bad thing.