I am definitely thinking of IF (instruction-following) as it applies to systems with the capability for unlimited autonomy. Intent alignment as a concept doesn’t end at some level of capability—although I think we often assume it would.
How it would understand “the right thing” is the question. But yes, intent alignment as I’m thinking of it does scale smoothly into value alignment plus corrigibility if you can get it right enough.