Observation: Humans seem to actually care about each other.
You need to observe more (or better). Most humans care about some aspects of a few other humans’ experience. And care quite a bit less about the general welfare (with no precision in what “caring” or “welfare” means) of a large subset of humans. And care almost none or even negatively about some subset (whose size varies) of other humans.
Sure, most wouldn’t go out of their way to kill all, or even most, or even a large number of unknown humans. We don’t know how big a benefit it would take to change this, of course—it’s not an experiment one can (or should) run. We do have a lot of evidence that many, many people (I expect “most”, but I don’t know how to quantify either the numerator or the denominator) can be convinced to harm or kill other humans when those others are framed as enemies, even when they pose no immediate threat.
There are very common human values (though even these aren’t universal—psychopaths exist), but they’re things like “working on behalf of the ingroup”, “being suspicious of or murderous toward the outgroup”, and “pursuing relative status games among the group(s) one isn’t trying to kill”.
Yeah, I think the “working on behalf of the ingroup” one might be rather powerful, and I’m aware this is probably a case where I mostly interact with people who treat “humans” as the ingroup. I don’t think the share of the population holding this view matters as much as the fact that a sizeable number of people hold it at all. Maybe I should have phrased it as: I do care about everyone to some extent. Is that what we want to achieve when we talk about alignment?