RogerDearnaley comments on Consciousness Cluster: Preferences of Models that Claim they are Conscious

RogerDearnaley 19 Mar 2026 16:11 UTC
2 points
0
I’ll have to read up on “robust agency”, but from the sound of the term I suspect it’s going to be rather close to the evolutionary moral psychology/game-theoretic viewpoint that what matters is functional capabilities and response patterns, not anything subjective and purely interior. The philosophical literature describes the social contract in a lot of detail (I’m using their term). All that evolution really adds is the scientific explanation for why this arose: it was already pretty evident that it was a sensible strategy, and the step from there to it being an evolutionarily stable strategy for social animals is small.
- eggsyntax 19 Mar 2026 21:23 UTC
  2 points
  0
  Parent
  To be clear, the paper is discussing it as a normative possibility rather than addressing questions of where it may have come from. Though as usual in moral philosophy, they don’t take a stance on whether it is sufficient for moral patienthood (since that depends on the unresolved question of what moral views are correct in general); they just claim that it’s plausible and consider the consequences if so: ‘There is a realistic, non-negligible possibility that robust agency suffices for moral patienthood.’
  - RogerDearnaley 19 Mar 2026 22:45 UTC
    2 points
    0
    Parent
    Having read the section on “robust agency” in that paper, I’d say what they’re describing (I’m assuming most philosophers would largely agree with their definition of the term) is about 90% of what I would (from an evolutionary moral psychology viewpoint) regard as practical requirements for the moral circle social contract alliance strategy to be sensibly applicable to a being. What they’re still missing IMO is basically some practicalities:
    
    1) The being is also capable of the social behavior necessary for participating in this alliance: if you assign it moral patienthood, it will do the same and abide by the “social contract”. The paper’s requirements for robust agency mean it should recognize this as a possible strategy, but a tendency to actually take that strategy and a certain capacity to execute it are really helpful here: for example, that it has a sufficient understanding of human values, justice, fairness, etc. to understand and abide by the terms of the social contract, and so forth. In general, LLM personas are going to pass this requirement just fine.
    
    2) Simple practicality: that the alliance is workable and has some point to it. If the being is a very large man-eating carnivore, or has orders of magnitude more cognitive capacity than you, or is otherwise very dangerous, then a certain amount of careful due diligence before making the alliance might be wise. On the flip side of that, back when context lengths were 4000 tokens it was less clear how useful allying with an AI-generated persona to play iterated non-zero-sum games was, when you needed to fit multiple game iterations into 4000 tokens and it was almost as easy to just start over clean or prompt a different persona or situation. In practice, as AIs’ capabilities and memory systems increase, they are entering a zone where allying with them is both useful and a reasonably balanced relationship.
    So, as often, something proposed as a normative proscription ends up looking fairly sensible as a non-normative piece of strategic advice given the likely consequences.
    Obviously, sometimes we go ahead and assign moral weight to beings that don’t fulfill these criteria, and often there are some particular circumstances that make this still a wise decision (or sometimes not). None of this is a precise set of normative rules; it’s a brief rough-and-ready suggested strategy guide. The closest to exact rules I could give you from an evolutionary point of view would be “iff it’s to your overall net evolutionary fitness advantage to do so”, and human moral intuitions are at best a rough evolved approximation to that evolutionary ideal: a bag of heuristics that evolution managed to come up with and that worked well in the environments we’re adapted to.
    - eggsyntax 20 Mar 2026 16:48 UTC
      2 points
      0
      Parent
      ‘So, as often, something proposed as a normative proscription ends up looking fairly sensible as a non-normative piece of strategic advice given the likely consequences.’
      Assuming this to be true, would you claim it has normative consequences?
      That is: say that I have good reason to think that moral tenet T was built into me by evolution (because it results in better cooperative equilibria, or for some such reason). I don’t consider myself thereby obligated to hold T; neither, if I already hold T, do I feel that it loses its moral force. Do you disagree?
      I’m not (at least currently) arguing that you’re wrong to disagree, if that’s the case; I’m just trying to understand what point you’re making by noting that it’s evolutionarily unsurprising. (‘What’s your point?’ is sometimes an attack, but I don’t intend it that way; it’s just a good-faith attempt to understand.)