That virtues are so readily thought of as fuzzy or flexible is an unfortunate consequence of our limited ability to properly anticipate and evaluate the consequences of our actions and the definitions of our words. IMO deontological rules aren’t a problem because they’re rigid; they’re a problem because they try to be succinct and thereby draw an inaccurate, crude boundary around the set of behaviors we would ideally want. If we choose to align AIs to virtues, I’d like to make sure they know that virtues are also rigid and unyielding, but fractally complex at their boundaries. What is fuzzy is each mind’s understanding of the virtues it seeks to uphold (along with the world it is operating in), and that fuzziness necessitates flexibility and caution in practice. “Don’t believe everything you think” is critical advice for everyone, and “Don’t optimize too hard without way more evidence than you think you need” is a special case of it, but a well-grounded virtue ethicist can incorporate it into planning and review processes more easily than a deontologist or a naive utilitarian/consequentialist can.
Edit to add: I do think, from a God’s eye view, consequentialism is in some deep sense ‘true’ as a final arbiter of what makes an action good or bad. But the problem that the complete set of results of an action is not computable in advance by any finite agent within the universe is inescapably damning if you want to rely on this kind of reasoning for each decision. We can try to approximate such computations when the decision is sufficiently important and none of our regular heuristics seem adequate. Otherwise, we use deontological rules as heuristics within known contexts, and virtues as a different kind of heuristic in a broader set of less known contexts. Strict adherence to deontological rules, or to deontological definitions of virtues, leads to horrible places out of distribution.