Reflectivity in alignment.
Human values and AI alignment do not exist independently. In several situations they affect each other, creating a complex pattern of reflection.
Examples:
Humans want to align AI, so “AI alignment” is itself a human value.
Many human values are convergent evolutionary goals (like survival and reproduction) and thus resemble an AI’s convergent instrumental goals.
If humans come to accept the goal of making paperclips (or whatever the AI pursues), alignment is trivially reached.
Many humans apparently want to create non-aligned AI; an AI aligned with their wishes would therefore be non-aligned.
Humans may not want their values to be learned at all; in that case the alignment process itself is misaligned with them.
Humans who merge with AI are no longer baseline humans, and so fall outside the scope of alignment.
A not-yet-aligned AI will affect human values in the very process of learning them.
Many humans don’t want AI to exist at all, so for them any aligned AI is misaligned.
A human may want the AI not to be aligned with some other person.
An AI aligned with a malicious or confused human is itself unaligned.
Because human values change over time, any AI aligned today will soon be non-aligned.
By saying “human values” we already exclude the values of other mammals, of groups, etc., and thus predefine the outcome.
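One hedged way to compress these examples into a single pattern (the notation V, A, f below is illustrative, not from the original list): let V_t be the human value state at time t, let A(V) be an AI trained to be aligned with values V, and let f describe how a deployed AI changes those values. The reflection loop is then

V_{t+1} = f(V_t, A(V_t))

and stable alignment requires a fixed point V* = f(V*, A(V*)). Most of the examples above can be read as cases where such a fixed point fails to exist, is not unique, or is itself objectionable.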