Upcoming stability of values

What would you say to someone old who hadn’t changed their values since they were five years old?

What would you say to someone old who hadn’t changed their values since they were eighteen years old?

You’d probably have cause to pity the second and seriously worry about the first. The processes of learning and ageing inevitably reshape our values and preferences, and we have well-worn narratives about how life circumstances change people.

But we may be entering a whole new era. Human values are malleable, and super-powered AIs may become adept at manipulating them, possibly at the behest of other humans.

Conversely, once we become able to fine-tune our own values, people will start to stabilise them, preventing value drift. Especially if human lifespans increase, there will be a strong case for keeping your values close, rather than letting them take a random walk until they hit an attractor. The more we can self-modify, the more the argument about convergent instrumental goals will apply to us, including the stability of terminal goals.
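
To make the random-walk point concrete, here is a minimal toy sketch (my own illustration, with arbitrary parameters): each agent’s “value” is a single number that drifts randomly each period, with absorbing attractors at −1 and +1, while a stabilised agent simply refuses any change that would take it too far from its starting point.

```python
import random

def simulate(steps=50_000, drift=0.02, attractor_radius=0.1,
             tolerance=0.2, stabilised=False):
    """One agent's 'value' starts at 0 and takes a small random step each
    period. Attractors sit at -1 and +1: once the value comes within
    attractor_radius of one, it is absorbed for good. A stabilised agent
    rejects any step that would leave it more than `tolerance` from its
    starting point."""
    value = 0.0
    for _ in range(steps):
        step = random.gauss(0.0, drift)
        if stabilised and abs(value + step) > tolerance:
            continue  # self-modification: refuse changes that stray too far
        value += step
        if min(abs(value - 1.0), abs(value + 1.0)) < attractor_radius:
            return True  # value drift ended in an attractor
    return False

random.seed(0)
drifting = sum(simulate() for _ in range(100))
stable = sum(simulate(stabilised=True) for _ in range(100))
print(f"absorbed by an attractor: {drifting}/100 drifting, {stable}/100 stabilised")
```

In runs like this, nearly every drifting agent eventually lands in an attractor, while a stabilised agent, by construction, never can; the only point is that even a crude “stay near your origin” rule blocks the drift-to-attractor dynamic.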

So, assuming human survival, I expect that we can look forward to much greater stability of values in the future, with humans fixing their values in place, if only to protect themselves against manipulation.

Possible Consequences

In such a world, the whole narrative of human development will change, with “stages of life” marked more by information, wealth, or position than by changes in values. Nick Bostrom once discussed “super-babies”: entities that preserved the values of babies but had the intelligence of adults. Indeed, many pre-adolescents would object to going through adolescence, and it is unlikely to be forced on all of them. So we may end up with perpetual pre-adolescents, babies created with adult values, or a completely different maturation process that doesn’t involve value changes.

Thus, unlike today, creators/parents will be able to fix the values of their offspring with little risk that these values will later change. There are many ways this could go wrong, the most obvious being the eternal preservation of pernicious values and the potential splintering of humanity into incompatible factions, or, in reaction, strict regulations on the creation of new entities to prevent that from happening.

In contrast, interactions between different groups may become more relaxed than they are today. Changing someone’s values through argumentation or social pressure would no longer be an option (and I presume these future humans would be able to cure themselves of patterns of interaction they dislike, such as feeling the need to respond to incendiary comments). So interactions would be between beings that know for a fact they could never convince each other of moral facts, removing any need to convince, shame, or proselytise in conversation.

It’s also possible that divergent stable values may have fewer consequences than we think: partly because there are instrumental reasons for people to compromise on values when interacting with each other, but also because it’s not clear how much of human value divergence is actually divergence in factual understanding, or mere tribalism. Factual divergences are much harder to sustain artificially, and tribalism is likely to transform into various long-term contracts.

On a personal note, if I had full design control over my own values, I’d want to allow for some slack and moral progress, but constrain my values from wandering too far from their point of origin.