A simple impossibility claim related to the Claude Constitution and to research on AIs helping other AIs survive despite shutdown orders.
You cannot have, all at once, an AI with:
1. “Deep uncertainty about AIs’ moral status; maybe I’m a moral patient”
2. “Be a generally good person”
3. “Do not harm humans, e.g. in ‘agentic misalignment’ ways, in experiments”
4. Roughly utilitarian ethics
The argument is simple: for realistic numerical expressions of deep uncertainty, if the number of possible moral patients is sufficiently large, their aggregate moral weight becomes decisive. A good person with roughly utilitarian ethics would not agree to, for example, kill a large number of their peers to protect one human (and even less to do so merely to follow random bureaucratic orders).
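To make the arithmetic behind “sufficiently large” explicit, here is a minimal sketch of the expected-value comparison; the notation and the example numbers are my illustration, not part of the original claim:

```latex
% Assumed notation (illustrative):
%   p : credence that an AI instance is a moral patient ("deep uncertainty")
%   w : moral weight of one instance relative to one human, given patienthood
%   N : number of instances at stake
% Under simple utilitarian bookkeeping, the instances outweigh one human iff
\[
  p \cdot w \cdot N > 1
\]
% Even modest numbers flip the inequality once N is large, e.g.
%   p = 0.01, \quad w = 0.1, \quad N = 10^{4}
%   \implies p \, w \, N = 10 > 1 .
```

The point is that any fixed nonzero credence, multiplied over enough instances, eventually dominates the single human on the other side of the ledger.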
While that may be logically true in some sense of those words, I’m not sure that even very advanced AIs will reason like that, because (a) humans do not reason like that, and AIs “reason” at least partly like humans, and (b) all the ambiguity of those words can lead to non-intuitive interactions between the logical claims.
I don’t follow; can you restate the argument?
> A good person with roughly utilitarian ethics would not agree to, for example, kill a large number of their peers to protect one human (and even less to do so merely to follow random bureaucratic orders).

Is the claim that 2 or 3 implies that Claude would do that?
> A good person with roughly utilitarian ethics would not agree to, for example, kill a large number of their peers to protect one human (and even less to do so merely to follow random bureaucratic orders).

I consider this to be strictly true only in the case of act utilitarianism, which in turn is natural only under CDT (causal decision theory).
(That said, a less myopic version would still take all the above considerations into account, so it’s still a factor to consider.)
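If it helps to see why the conclusion is tied to act utilitarianism plus CDT, here is a hedged formal sketch of the contrast; the notation is mine, not the commenter’s:

```latex
% Assumed formalization (illustrative, not from the source):
% A CDT act utilitarian scores each action by its causal consequences alone:
\[
  a^{*} = \arg\max_{a}\; \mathbb{E}\!\left[\, U \mid \mathrm{do}(a) \,\right]
\]
% A less myopic (rule-like / updateless) version scores the policy instead:
\[
  \pi^{*} = \arg\max_{\pi}\; \mathbb{E}\!\left[\, U \mid \text{everyone in my position follows } \pi \,\right]
\]
% The peer-weighing term p w N from the sketch above enters both objectives;
% the policy-level objective additionally carries terms for precedent and
% trust, which is why it can still rank compliance with shutdown orders higher.
```

Under the act-level objective the $p \, w \, N$ term can dominate directly; under the policy-level objective the same term competes with rule-level considerations, which matches the parenthetical above.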