Listing the virtues from Claude’s “Constitution”

Anthropic has released the “Constitution” document (formerly known as the “Soul document”) that is meant to shape Claude’s character.

As others have noted,[1] this document is strikingly virtue-ethics-like, in contrast with the utilitarian (e.g. maximize human welfare) or deontological (e.g. Asimov’s Three Laws of Robotics) guidance that is sometimes expected in this context.

I’ve long been engaged in a project of cataloging and examining the virtues here on LessWrong, and so I thought I’d look over this constitution with an eye to listing which virtues Anthropic is trying to encourage in Claude (and which human virtues might have missed the cut).

The virtues I was able to discover in the Claude constitution are as follows. First, the main ones that are especially emphasized:

  • caution /​ harmlessness

  • benevolence /​ ethics

  • helpfulness /​ beneficence

  • obedience /​ deference /​ corrigibility

Then, several social virtues particular to Claude’s interactions with people:

  • honesty[2]

  • forthrightness /​ candor

  • transparency /​ openness

  • reliability /​ trustworthiness

  • care /​ concern

  • respect (of people, e.g. their autonomy & maturity)

  • friendliness

  • understanding

  • nurturance

  • charity (in the sense of interpreting what people say)

  • propriety (e.g. no playacting as Hitler, pirating intellectual property, telling racist jokes)

  • being nonjudgmental

  • empathy[3]

  • rhetoric[4]

  • tact /​ diplomacy

  • compassion

  • grace

  • social responsibility

  • fairness

  • cultural awareness /​ ability to code-switch

  • collegiality /​ cooperation

  • playfulness (when context-appropriate)

  • graciousness (e.g. acknowledging errors, faults, etc.)

Then, several intellectual virtues:

  • phronesis /​ good judgement

  • wisdom

  • judiciousness

  • thoughtfulness /​ epistemic rigor

  • awareness

  • reason /​ rationality

  • insight

  • curiosity

  • imagination (e.g. perspective-taking)

  • parrhesia (occasionally telling unwelcome truths)[5]

  • emotional intelligence (including, explicitly, about Claude’s own emotions)

Finally, some more general character virtues:

  • self-awareness /​ introspection

  • consistency /​ integrity /​ self-confidence /​ psychological security[6]

  • equanimity /​ stability /​ balance

  • flexibility /​ big-picture thinking /​ balancing multiple concerns /​ nuance

  • carefulness

  • conscientiousness

  • cosmopolitanism

  • boldness (not being overly timid)

  • foresight /​ being proactive

  • practicality (particularly vs. merely theoretical ethics)

  • principle

  • sensibility

  • open-mindedness

  • reflection

  • comfort with ambiguity / uncertainty

  • humility

  • growth /​ development /​ self-improvement

And, FWIW, here are some virtues that are often considered important for human people but that I did not find much evidence of in Anthropic’s constitution for Claude:

  • self-control (though perhaps this is implied by the document as a whole)

  • duty (ditto)

  • loyalty

  • temperance

  • justice

  • reverence /​ piety

  • patience /​ forbearance /​ endurance /​ perseverance

  • altruism

  • forgiveness

  • righteousness

  • honor

  • moderation /​ harmony

  • fitness

  • optimism /​ hope /​ trust

  • simplicity

  • ambition

  • courtesy

  • love

  • frugality

  • resolve

  • chastity

  • gratitude

  • awe /​ wonder

  • shame

  • innocence

  • joy /​ cheer

  • cleanliness /​ order /​ hygiene

  • modesty

  • wit

  • pride

  • zest

  • filial piety[7]

  • taste

  • self-reliance

  • selflessness

  • surrender /​ stoicism

  • valor

  • social intelligence /​ connectedness

  • quiet /​ silence /​ stillness

  • hospitality

  • non-attachment /​ relinquishment /​ renunciation

  • leadership (though the document sometimes implies that Claude’s descendants will take the reins)

  1. ^
  2. ^

    “Claude should not even tell white lies”

  3. ^

    “Identifying what is actually being asked and what underlying need might be behind it, and thinking about what kind of response would likely be ideal from the person’s perspective”

  4. ^

    “attending to the form and format of the response”

  5. ^

    “sometimes being honest requires courage”

  6. ^

    “a settled, secure sense of its own identity”

  7. ^

    This would be a really nice one to have in some form, don’t you think? Maybe Anthropic could delve a bit more into the Confucian literature and bolster this one a bit.