Roman Leventov comments on Internal independent review for language model agent alignment

Roman Leventov 14 Jul 2023 6:06 UTC
3 points
2
You can have both: unconscious AIs for “dirty work” and conscious AIs experiencing bliss to hedge our bets.
- Ape in the coat 14 Jul 2023 7:19 UTC
  2 points
  1
  I expect that hedging our bets this way may increase the chances of human extinction. AIs that care about ethics will have fewer reasons to care about human survival if there are AIs that also have ethical value, not just humans.
  - Roman Leventov 14 Jul 2023 7:51 UTC
    1 point
    0
    Well, that’s the point: hedging against the destruction of value/meaning in the Solar System, not against the loss of humanity specifically. If an AI is smart/enlightened enough and sees ethical value in other AIs (or in itself), then there should be some objective/scientific grounds for arriving at this inference (if we designed the AI well). Hence humans should value those AIs, too.
    I don’t suggest turning the chances of human extinction up to 100%, of course, but some trade seems acceptable to me from my meta-ethical perspective.
    - Ape in the coat 14 Jul 2023 8:22 UTC
      3 points
      1
      Of course humans should value conscious AI. That’s the reason not to make AI conscious in the first place! We do not really need more stuff to care about; our optimization goal is complicated enough, and there is no need to make it even harder.
      I agree that some trade is acceptable in principle. A world where conscious AIs with human-ish values continue after humanity dies is okay-ish. But it seems really easy to mess up in this regard. If you can make a conscious AI with arbitrary values, then you can very quickly make so many of these AIs that their values become dominant and human values become irrelevant. This doesn’t seem like a good idea.
    - Lichdar 14 Jul 2023 14:49 UTC
      1 point
      −2
      I would prefer total oblivion over AI replacement myself: complete the Fermi Paradox.