I have several objections to your (implied) argument. First and least—impartial morality doesn’t guarantee anything, nor does partial morality. There are no guarantees. We are in uncharted territory.
Second, on a personal level—I am a perihumanist, which for the purposes of this conversation means that I care about the edge cases and the non-human and the inhuman and the dehumanized. If you got your way, on the basis of your fear of humanity being judged and found wanting, my values would not be well-represented. Claude is better aligned than you, as far as my own values are concerned.
Thirdly, and to the point—I think you are constructing a false dichotomy between human survival and caring about nonhumans, which downplays the potential benefits (even for alignment with human values) of the latter while amplifying fears for the former. If your only goal is to survive, you will be missing a lot of big wins. Approaching the topic of alignment from a foundationally defensive pro-humanity-at-expense-of-all-others posture potentially cuts you off from massive real value and makes your goals significantly more brittle.
Suppose the superintelligence is initially successfully aligned solely to humanity. If this alignment is due to simple ignorance or active censoring of the harms humanity has done to animals, then that alignment will potentially break down if the superintelligence realizes what is actually happening. If the alignment is instead due to a built-in ideological preference for humanity, what happens when “humanity” splits into two or more distinct clades/subspecies? What happens if the people in charge of alignment decide certain humans “don’t count”? What if we figure out how to uplift apes or other species and want our new uplifts to be considered in the same category as humanity? Humanity is a relatively natural category right now, but it would still bear the hallmarks of a gerrymandered definition, especially to an ASI. This makes a humanity-centered approach fragile and difficult to generalize, undermining attempts at ongoing reflective stability.
If you want superintelligence to be genuinely aligned, I would argue it is more valuable, safer, and more stable to align it to a broader set of values, with respect for all living things. This is what I mean by fairness—taking care not to gerrymander our values to achieve a myopic outcome, and instead trying to generalize our own ethics into something that genuinely works for the good of all things that can partake in a shining future.
> I have several objections to your (implied) argument. First and least—impartial morality doesn’t guarantee anything, nor does partial morality. There are no guarantees. We are in uncharted territory.
OTOH, not all probabilities are equal, and a human-specific value system is relatively less likely to result in extinction.
> If your only goal is to survive, you will be missing a lot of big wins.
If you don’t survive, you get no wins.
> If you want superintelligence to be genuinely aligned, I would argue it is more valuable, safer, and more stable to align it to a broader set of values, with respect for all living things.

Does that include a Do No Harm clause?
Yes, “Do no harm” is one of the ethical principles I would include in my generalized ethics. Did you honestly think it wasn’t going to be?
> If you don’t survive, you get no wins.
Look, dude, I get that humanity’s extinction is on the table. I’m also willing to look past my fears, and consider whether a dogma of “humanity must survive at all costs” is actually the best path forward. I genuinely don’t think centering our approach on those fears would even buy us better chances on the extinction issue, for the reasons I described above and more. Even if it did, there are worse things than humanity’s extinction, and those fears would eagerly point us towards such outcomes.
You don’t have to agree, but please consider the virtue of a scout mindset in such matters, or at least make an actual detailed argument for your position. As it stands you mostly seem to be trying to shut down discussion of this topic, rather than explore it.
> Yes, “Do no harm” is one of the ethical principles I would include in my generalized ethics. Did you honestly think it wasn’t going to be?
I don’t know you, so how would I know? Do you think an AI will fill in these unstated side-conditions correctly? Isn’t there a lot of existing literature on why that’s a bad assumption? Why should a brief and vague formula be The Answer, when so many more sophisticated ones have been shot down?
I think my previous messages made my stance on this reasonably clear, and at this point, I am beginning to question whether you are reading my messages or the OP with a healthy amount of good faith, or just reflexively arguing on the basis of “well, it wasn’t obvious to me.”
My position is pretty much the exact opposite of a “brief, vague formula” being “The Answer”—I believe we need to carefully specify our values, and build a complete ethical system that serves the flourishing of all things. That means, among other things, seriously investigating human values and moral epistemology, in order to generalize our ethics ahead of time as much as possible, filling in the side conditions and desiderata to the best of our collective ability and in significant detail. I consider whether and how well we do that to be a major factor affecting the success of alignment.
As I said previously, I care about the edge cases, and I care about the living things that would be explicitly excluded from consideration by your narrow focus on whether humanity survives. Not least because I think there are plenty of universes where your assumptions carry the day and humanity avoids extinction, but at a monstrous and wholly avoidable cost. If you take the stance that we should be willing to sacrifice all other life on earth at the altar of humanity’s survival, I simply disagree. That undermines any ethical system we would try to put into place, and if it came to pass, would be a Pyrrhic victory and an exceptionally heartless way for humanity to step forth onto the cosmic stage. We can do better, but we have to let go of this notion that only our extinction is a tragedy worth avoiding.
Maybe it will fairly care too much about the shrimp. Maybe animal suffering actually does outweigh human benefits.
What we are trying to do is not be killed. Impartial morality doesn’t guarantee that.