Defer to the processes that produced the virtues we already have.
The way that cultural evolution happened in the past involved a lot of wars/conquests and people copying powerful cultures in part out of fear. What would “deferring to this process” look like going forward, or would you get rid of this part? What do you think about Robin Hanson’s concerns around “cultural drift”?
Specifically reason about the consequences of different virtues.
Do AIs also need to do this? If so, do we not still need to align them with consequentialist values?
The way that cultural evolution happened in the past involved a lot of wars/conquests and people copying powerful cultures in part out of fear. What would “deferring to this process” look like going forward, or would you get rid of this part?
I model our current culture as trying extremely hard to create a centralized global culture. Deferring to cultural evolution might just mean trying less hard to do so, thereby letting more variation arise. I think wars of conquest are bad but not arbitrarily bad; to some extent they're a mechanism for reallocating resources from dysfunctional to functional cultures (though of course there's some goodharting on what counts as functional).
I think Robin is talking about some interesting stuff re culture, but the whole concept of "cultural drift" seems misguided. If a dictator were in charge of a country and kept imposing bad ideas, you wouldn't call it "policy drift". Analogously, Robin should acknowledge that many of the most self-destructive aspects of modern culture are downstream of the specific worldview that has taken over elite culture near-worldwide, and focus on figuring out how to get rid of that worldview.
Do AIs also need to do this? If so, do we not still need to align them with consequentialist values?
Depends on your definitions. For example, if you feel a deep sense of love for someone, you will naturally want to try to achieve good outcomes for them. So in some sense you could say that aligning to love also involves aligning to consequentialism as a sub-component, but in another sense you could say that aligning to love lets you rederive aspects of consequentialism (just as e.g. aligning to consequentialism might allow an AI to rederive aspects of deontology). Similar for other virtues.