[Question] References that treat human values as units of selection?

When I read AI alignment literature, the author usually seems to be assuming something like: “Human values are fixed, they’re just hard to write down. But we should build intelligent agents that adhere to them.” Or occasionally: “Human values are messy, mutable things, but there is some (meta-)ethics that clarifies what it is we all want and what kind of intelligent agents we should build.”

This makes it hard for me to engage with the AI alignment discussion, because the assumption I bring to it is: “Values are units of selection, like zebra stripes or a belief in vaccinating your children. You can’t talk sensibly about what values are right, or what we ‘should’ build into intelligent agents. You can only talk about what values win (i.e. persist over more time and space in the future).”

I’ve tried to find references whose authors address this assumption, or write about human values from this relativist/evolutionary perspective, but I’ve come up short. Can anyone point me toward some? Or disabuse me of my assumption?