I appreciate this post and your previous post. Fwiw, I think these terminology concerns/confusions are harming discourse on AI existential safety, and I expect posts like these to help people talk past each other less, notice subtle distinctions, deconfuse more quickly, etc.
(I especially like the point that increasing intent alignment on the margin doesn't necessarily help much with increasing intent alignment in the limit. Some version of this idea has come up a few times in discussions about OpenAI's alignment plan, and the way you presented it makes the point clearer/crisper imo.)