OscarGilg comments on AI welfare research needs basic science

OscarGilg 3 Jul 2026 9:29 UTC
3 points
0
Thanks for the comment! I read 6.3. I definitely agree that we want to do Bayesian updating. I totally agree with:
“And when you do this, you’ll need to think about how likely each theory of moral status is, and then how likely AIs are to have status conditional on your bottom up evidence and on that theory of moral status. In this way, there’s no substitute for grappling with questions about which theory of moral status is correct”
However I disagree with this:
“and once you do that, you’re basically doing what you call ‘top down’ methodology”
It could be that saying “top-down” makes it sound stronger than we intend. We’re mainly making the case for far more “upwards” updating than past approaches. My view is roughly this:
- My priors on theories of moral status/consciousness and their indicators are low. In particular, it seems like AI systems are too out-of-distribution for them (§1.2). I also think there are complications with applying them (§1.1).
- This changes how my Bayesian updates are likely to flow. The updates flow disproportionately “upwards” into revising theories compared to past approaches. Doing exploratory basic science is more useful under this scheme.
Maybe this is equivalent to what you mean by a top-down approach, but with a ~50% prior that all theories are wrong.
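To make the shape of that update concrete, here is a minimal sketch in Python. Everything in it is a placeholder assumption rather than anything from the paper or this thread: the theory names (`theory_a`, `theory_b`), the catch-all hypothesis `all_listed_wrong` carrying the ~50% prior, and every likelihood and conditional probability. It only illustrates the two steps being discussed: updating credences over theories on some bottom-up evidence, then averaging the probability of moral status across theories.

```python
def posterior_over_theories(prior, likelihood):
    """Bayes' rule: P(T | E) is proportional to P(E | T) * P(T)."""
    unnormalized = {t: prior[t] * likelihood[t] for t in prior}
    total = sum(unnormalized.values())
    return {t: w / total for t, w in unnormalized.items()}


def prob_moral_status(theory_probs, status_given_theory):
    """Law of total probability: sum over T of P(T | E) * P(status | T, E)."""
    return sum(theory_probs[t] * status_given_theory[t] for t in theory_probs)


# All numbers below are hypothetical, chosen only for illustration.
# "all_listed_wrong" is a catch-all for "none of the named theories is correct".
prior = {"theory_a": 0.25, "theory_b": 0.25, "all_listed_wrong": 0.50}

# P(E | T): how strongly each hypothesis predicted the bottom-up evidence E.
likelihood = {"theory_a": 0.10, "theory_b": 0.40, "all_listed_wrong": 0.50}

# P(status | T, E): chance the AI has moral status under each hypothesis.
status_given_theory = {"theory_a": 0.05, "theory_b": 0.60, "all_listed_wrong": 0.30}

posterior = posterior_over_theories(prior, likelihood)
p_status = prob_moral_status(posterior, status_given_theory)

for theory, p in posterior.items():
    print(f"{theory}: prior {prior[theory]:.2f} -> posterior {p:.2f}")
print(f"P(moral status | E) = {p_status:.2f}")
```

With a large prior on the catch-all, evidence that the named theories predicted poorly moves mass toward "none of these", which is one way to read updates flowing "upwards" into theory revision. Different placeholder numbers would of course give a different picture.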
- Simon Goldstein 4 Jul 2026 2:07 UTC
  1 point
  0
  Yeah that sounds reasonable. I think my main disagreement is that I haven’t seen good examples of updating “upwards” into revising theories based on the basic science stuff. Might be productive to think through some of the potential examples. For each putative example, I’m going to imagine a human that had the relevant structure, and then assess whether the proposed revision of the theory would have insane consequences for humans. I predict that it will.