I have several objections to your (implied) argument. First and least: impartial morality doesn’t guarantee anything, nor does partial morality. There are no guarantees. We are in uncharted territory.
OTOH, not all probabilities are equal, and a human-specific value system is relatively less likely to result in extinction.
If your only goal is to survive, you will be missing a lot of big wins.
If you don’t survive, you get no wins.
If you want superintelligence to be genuinely aligned, I would argue it is more valuable, safer, and more stable to align it to a broader set of values, with respect for all living things.
Does that include a Do No Harm clause?
Yes, “Do no harm” is one of the ethical principles I would include in my generalized ethics. Did you honestly think it wasn’t going to be?
> If you don’t survive, you get no wins.
Look, dude, I get that humanity’s extinction is on the table. I’m also willing to look past my fears, and consider whether a dogma of “humanity must survive at all costs” is actually the best path forward. I genuinely don’t think centering our approach on those fears would even buy us better chances on the extinction issue, for the reasons I described above and more. Even if it did, there are worse things than humanity’s extinction, and those fears would eagerly point us towards such outcomes.
You don’t have to agree, but please consider the virtue of a scout mindset in such matters, or at least make an actual detailed argument for your position. As it stands, you mostly seem to be trying to shut down discussion of this topic, rather than explore it.
> Yes, “Do no harm” is one of the ethical principles I would include in my generalized ethics. Did you honestly think it wasn’t going to be?
I don’t know you, so how would I know? Do you think an AI will fill in these unstated side-conditions correctly? Isn’t there a lot of existing literature on why that’s a bad assumption? Why should a brief and vague formula be The Answer, when so many more sophisticated ones have been shot down?
I think my previous messages made my stance on this reasonably clear, and at this point, I am beginning to question whether you are reading my messages or the OP with a healthy amount of good faith, or just reflexively arguing on the basis of “well, it wasn’t obvious to me.”
My position is pretty much the exact opposite of a “brief, vague formula” being “The Answer”—I believe we need to carefully specify our values, and build a complete ethical system that serves the flourishing of all things. That means, among other things, seriously investigating human values and moral epistemology, in order to generalize our ethics ahead of time as much as possible, filling in the side conditions and desiderata to the best of our collective ability and in significant detail. I consider whether and how well we do that to be a major factor affecting the success of alignment.
As I said previously, I care about the edge cases, and I care about the living things that would be explicitly excluded from consideration by your narrow focus on whether humanity survives. Not least because I think there are plenty of universes where your assumptions carry the day and humanity avoids extinction, but at a monstrous and wholly avoidable cost. If you take the stance that we should be willing to sacrifice all other life on earth at the altar of humanity’s survival, I simply disagree. That undermines any ethical system we would try to put into place, and if it came to pass, would be a Pyrrhic victory and an exceptionally heartless way for humanity to step forth onto the cosmic stage. We can do better, but we have to let go of this notion that only our extinction is a tragedy worth avoiding.