Basically, we should use the assumption that is most robust to being wrong. It would be easier if there were objective, mind-independent rules of morality (the position known as moral realism), but if that assumption is wrong, your solution can get manipulated.
So in practice, we shouldn't base alignment plans on whether moral realism is correct. In other words, I'd simply go with what values you have and solve the edge cases according to your values.
I feel like we’re talking past each other. I’m trying to point out the difficulty of “simply go with what values you have and solve the edge cases according to your values” as a learning problem: it is too high-dimensional, and you need too many case labels. Part of the idea of the OP is to reduce the number of training cases required, and my question/suspicion is that it doesn’t really help outside of the “easy” stuff.
Yeah, I think this might be a case where we misunderstood each other.