Some values don’t change. Maybe sometimes that’s because a system isn’t “goal-seeking.” For example, AlphaZero doesn’t change its value of “board-state = win.” (Thankfully! Because if that changed to “board-state = not lose,” then a reasonable instrumental goal might be to just kill its opponent.)
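To make that concrete, here’s a toy sketch (my own illustrative code, not AlphaZero’s actual implementation, which learns a value network over self-play) of a fixed terminal evaluation versus a hypothetical “not lose” variant, where a draw suddenly scores as well as a win and loss-avoiding strategies become just as instrumentally optimal:

```python
# Toy illustration (not AlphaZero's real code): the terminal value of a
# finished game is fixed by the game rules and is never updated by training.

def terminal_value_win(outcome: str) -> float:
    """AlphaZero-style terminal value: only winning scores +1."""
    return {"win": 1.0, "draw": 0.0, "loss": -1.0}[outcome]

def terminal_value_not_lose(outcome: str) -> float:
    """Hypothetical changed value: a draw counts the same as a win,
    so any strategy that merely avoids losing becomes optimal."""
    return {"win": 1.0, "draw": 1.0, "loss": -1.0}[outcome]

for outcome in ("win", "draw", "loss"):
    print(outcome, terminal_value_win(outcome), terminal_value_not_lose(outcome))
```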
But I’m a goal-seeking system. Shard theory seems to posit terminal values that constantly pop up and vie for position in humans like me. But certain values of mine seem impossible to change. Like, I can’t decide to desire my own misery/pain/suffering.
So if terminal values aren’t static, what about the values that are? Are these even more terminal? Or is it something else?
There have been many posts trying to answer this question. One candidate is the master-slave model, in which the master sets the slave’s shorter-term, allegedly terminal values so that the slave takes actions satisfying the master’s longer-term values: approval from a circle of people, health, sex, power, and proxies like sweet food.
That being said, “approval from a circle of people” is itself hard to define. For example, it could change when the circle changes. Or the role of the circle could be played by media that shape a person’s opinions on some subjects with no feedback from that person. An additional value could be consistency of one’s worldview with lived experience...
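A toy sketch of how that two-level structure might be formalized (my own illustrative code, not anything from Wei Dai’s post; the class names, situations, and value weights are all hypothetical): the master’s values stay fixed, while the slave’s “terminal” values are simply whatever the master most recently installed, and the slave optimizes them as if they were final.

```python
from dataclasses import dataclass, field

@dataclass
class Master:
    """Holds fixed long-term values; rewrites the slave's goals to serve them."""
    # Hypothetical long-term values and weights.
    values: dict = field(default_factory=lambda: {
        "approval": 1.0, "health": 0.8, "power": 0.6})

    def install_goals(self, situation: str) -> dict:
        """Choose short-term goals for the slave that serve the master's values."""
        if situation == "social_gathering":
            return {"impress_peers": 1.0}   # serves "approval"
        return {"eat_sweet_food": 1.0}      # proxy loosely tied to "health"

@dataclass
class Slave:
    """Treats whatever goals are currently installed as terminal."""
    terminal_values: dict = field(default_factory=dict)

    def act(self, options: list) -> str:
        """Pick the option scoring highest under the currently installed values."""
        return max(options, key=lambda o: self.terminal_values.get(o, 0.0))

master, slave = Master(), Slave()
slave.terminal_values = master.install_goals("social_gathering")
print(slave.act(["impress_peers", "eat_sweet_food"]))  # impress_peers

slave.terminal_values = master.install_goals("alone_at_home")
print(slave.act(["impress_peers", "eat_sweet_food"]))  # eat_sweet_food
```

The point of the sketch: from the slave’s side the installed values feel terminal, yet they are changeable by the master.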
Thanks for the links! In addition to Shard Theory, I have seen Steven’s work and it is helpful. Both approaches seem to suggest human terminal values change...I don’t know what they’d say about the idea that some (human) terminal values are unchanging.
If Evolution is the master and humans are the slaves in Wei Dai’s model, that seems to suggest that we don’t have unchangeable terminal values. But while the concept makes sense at the evolutionary scale, it doesn’t seem to me that it implies within-lifespan terminal value changeability (or really the changeability of any values... if I want pizza for dinner, evolution can’t suddenly make me want burgers). What do you think?