With reference specifically to this:
The happiness of people I will never interact with is a good example of this. There may be people in the world whose happiness will not ever be significantly influenced by my choices. Presumably, then, my choices cannot tell us about how much I value such people’s happiness. And yet, I do value it.
and without considering any other part of the structure, I have an alternate view:
It is possible to determine whether and how much you value the happiness (or any other attribute) of people you will never interact with by calculating the following (a rough code sketch appears after the list):
What are the various things you, personally, could have done in the past [time period], and how would they have affected each of the people, plants, animals, ghosts, etc. that you might care about?
What things did you actually do?
How far from your maximum possible impact per unit of time were you for each entity you could have affected? (Scaled in some way TBD.)
Derive values and weights from that. For example, if I donate $100 to Clean Water for Africa, that implies that I care about Clean Water & Africa more than I care about AIDS and Pakistan, and the level there depends on how much $100 means to me. If that’s ten (or even two) hours of work to earn it, that’s a different level of commitment than if it represents 17 minutes of owning millions in assets.
Run the calculation for all desired moral agents, to average out won’t-ever-see-them effects.
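To make the proposal concrete, here is a minimal Python sketch of the kind of calculation the list above describes, assuming we somehow already had estimates of each possible action’s impact on each entity and of its personal cost to the actor. All names, numbers, and the particular scaling rule are hypothetical placeholders, not anything settled in this discussion:

```python
from dataclasses import dataclass

@dataclass
class Action:
    name: str
    personal_cost: float        # e.g. hours of work the action "costs" the actor
    impact: dict[str, float]    # estimated effect on each entity the actor might care about

def implied_weights(possible: list[Action], chosen: Action) -> dict[str, float]:
    """Crude inversion: credit each entity by how much of the actor's maximum
    possible impact on that entity the chosen action realized, scaled by how
    costly the chosen action was to the actor (one possible TBD scaling)."""
    entities = {e for a in possible for e in a.impact}
    weights = {}
    for entity in entities:
        max_impact = max(a.impact.get(entity, 0.0) for a in possible)
        realized = chosen.impact.get(entity, 0.0)
        weights[entity] = (realized / max_impact) * chosen.personal_cost if max_impact > 0 else 0.0
    return weights

# Hypothetical example: a $100 donation that took ~2 hours of work to earn.
actions = [
    Action("donate_clean_water_africa", personal_cost=2.0,
           impact={"clean_water_africa": 100.0, "aids_pakistan": 0.0}),
    Action("donate_aids_pakistan", personal_cost=2.0,
           impact={"clean_water_africa": 0.0, "aids_pakistan": 100.0}),
    Action("do_nothing", personal_cost=0.0, impact={}),
]
print(implied_weights(actions, actions[0]))
# e.g. {'clean_water_africa': 2.0, 'aids_pakistan': 0.0} (key order may vary)
```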
This will very quickly lead to incorrect conclusions, because people don’t act according to their values (especially on things that don’t impact their day-to-day lives, like international charity). The fact that you donated $100 to Clean Water for Africa does not mean that you value that more than AIDS in Pakistan. You personally may very well care about clean water and/or Africa more than AIDS and/or Pakistan, but if you apply this sort of analysis writ large you will get egregiously wrong answers. Scott Alexander’s “Too Much Dark Money in Almonds” describes one facet of this rather well.
Another facet is that how goods are bundled matters. Did I spend $15 on almonds because I value a) almonds b) nuts c) food d) sources of protein e) snacks I can easily eat while I drive f) snacks I can put out at parties… etc. And more importantly, which of those things do I care about more than I care about Trump losing the election?
Elizabeth Anscombe’s book Intention does a good job analyzing this. When we act, we are not acting based on the state of the world per se; we are acting based on the state of the world under a particular description. One great example she gives is walking into a room and kissing a woman. Did you intend to a) kiss your girlfriend b) kiss the tallest woman in the room c) kiss the woman closest to the door wearing pink d) kiss the person who got the 13th highest mark on her history exam last week e) …
The answer is (typically) a. You intended to kiss your girlfriend. However, to an outside observer who doesn’t already have a good model of humanity at large, if not a model of you in particular, it’s unclear how they’re supposed to tell that. Most people who donate to Clean Water for Africa don’t intend to be choosing that over AIDS in Pakistan. Their actions are consistent with having that intention, but you can’t derive intentionality from brute actions.
I agree with your comment, but I think it’s a scale thing. If I analyze every time you walk into a room, and every time you kiss someone, I can derive that you kiss [specific person] when you see them after being apart. And this is already being done in corporate contexts with Deep Learning for specific questions, so it’s just a matter of computing power, better algorithms, and some guidance as to the relevant questions and variables.
You’ve mostly understood the problem-as-stated, and I like the way you’re thinking about it, but there are some major loopholes in this approach.
First, I may value the happiness of agents who I cannot significantly impact via my actions—for instance, prisoners in North Korea.
Second, the actions we choose probably won’t provide enough data. Suppose there are n different people, and I could give any one of them $1. I value these possibilities differently (e.g. maybe because they have different wealth/cost of living to start with, or just because I like some of them better). If we knew how much I valued each action, then we’d know how much I valued each outcome. But in fact, if I chose person 3, then all we know is that I value person 3 having the dollar more than I value anyone else having it; that’s not enough information to back out how much I value each other person having the dollar (see the toy sketch at the end of this comment). This sort of underdetermination will probably be the usual result, since the choice of action contains far fewer bits than a function mapping the whole action space to values.
Third, and arguably most important: “run the calculation for all desired moral agents” requires first identifying all the “desired moral agents”, which is itself an instance of the problem in the post. What the heck is a “moral agent”, and how does an AI know which ones are “desired”? These are latent variables in your world-model, and would need to be translated to something in the real world.
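A tiny illustration of the underdetermination mentioned in the second point, assuming a toy setup with four people and entirely made-up value assignments: several very different value functions all predict the same observed choice, so the choice alone cannot recover them.

```python
# The single observed action: the dollar went to person 3.
observed_choice = 3

# Three hypothetical value functions over "person i gets the dollar".
candidate_value_functions = [
    {1: 0.1, 2: 0.2, 3: 0.9, 4: 0.0},   # cares a lot about person 3, little about the rest
    {1: 0.8, 2: 0.8, 3: 0.81, 4: 0.8},  # cares about everyone almost equally
    {1: 0.0, 2: 0.0, 3: 0.01, 4: 0.0},  # barely cares about anyone
]

# Every one of these is consistent with the observation: person 3 is the argmax.
for values in candidate_value_functions:
    assert max(values, key=values.get) == observed_choice
print("All candidate value functions are consistent with giving the dollar to person 3.")
```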
I was attempting to answer the first point, so let me rephrase: Even though your ability to affect prisoners in North Korea is minuscule, we can still look at how much of it you’re exercising. Are you spending any time seeking out ways you could be affecting them? Are you voting for, supporting, and lobbying politicians who are more likely to use their greater power to affect the NK prisoners’ lives? Are you doing [unknown thing that the AI figures out would affect them]? And, also, are you doing anything that is making their situation worse? Or any other of the multiple axes of being, since happiness isn’t everything, and even happiness isn’t a one-dimensional scale.
“Who counts as a moral agent? (And should they all have equal weights?)” is a question of philosophy, which I am not qualified to answer. But “who gets to decide the values to teach” is one meta-level up from the question of “how do we teach values”, so I take it as a given for the latter problem.
This analysis falls apart when we take things to their logical extreme: I care about the happiness of humans who are time-like separated from me.