My instinct is that if we can figure out how to align AI to anything at all, then there is basically zero chance that the AI will arrive at
"I need to save shrimp and kill all humans"
and quite a significant chance that it will arrive at
"I will support human flourishing and completely disregard factory-farmed animals"
Humans hold the reins of how AI is trained, so if humans have the power to direct an ASI's values, they won't direct it to a place that results in killing all humans. And if humans don't have that power (i.e. the ASI is misaligned), then I don't think it will care about humans or shrimp.
The original idea was to align the AI to the simple principle of valuing sentience. Maybe you could align an AI to some lumpy human-centric value system, but that's not what's under discussion.