I don’t know which world we’re in, and I’m worried about people not having the conceptual precision to distinguish.
Evolutionary Moral Psychology has studied where human moral instincts come from. It strongly suggests they’re evolutionary adaptations to being an intelligent social primate that lives and cooperates in large, mostly non-kin groups. That suggests that ideas like “good” are convergent abstractions, given that context: they might not be convergently shared by aliens with a very different social structure, such as sapient eusocial insects, but they are convergent across different human cultures.
Moral instincts are independent of moral concepts. Instinctually, humans are neither completely egoistic nor completely altruistic, but we have the concepts of egoism and altruism. Instincts determine what you are motivated to do, while concepts determine how you categorize the world.
If our abstraction of good is contingent on specifics of the social primate developmental context, should we expect the abstraction of good in LLMs to be substantially different? If so, how could we find that out before handing over our fate to them? Is this the only abstraction where divergence would be a problem?