It’s probably implied by CEV. The point is that you don’t need the whole of CEV to get it: it’s probably easier to get, being a simpler concept and a larger alignment target, and it might be sufficient to at least notkilleveryone, even if in the end we lose most of the universe. You also keep the opportunity to work on CEV and eventually get there, even with many OOMs fewer resources to work with. It would of course be better to get CEV before building ASIs with different values, or before going on a long value drift trip ourselves.
I’d suggest that long-term corrigibility is a still easier target. If respecting future sentients’ preferences is the goal, why not make that the alignment target?
While boundaries are a coherent idea, baking them into our alignment solutions seems very much like dictating the future rather than letting it unfold under the protection of a benevolent ASI.
In an easy world, boundaries are neutral, because you can set up corrigibility on the other side of them and eventually get aligned optimization there. The utility of boundaries is for worlds where we get values alignment or corrigibility wrong, and most of the universe eventually gets optimized in an at least somewhat misaligned way.
Concern about slight misalignment also makes personal boundaries in this sense an important thing to set up first, before any meaningful optimization changes people: people differ from one another, and the initial optimization pressure might be less than maximally nuanced.
So it’s complementary, and I suspect it’s a shard of human values that’s significantly easier to instill in this different-than-values role than either the whole of human values or corrigibility towards them.