I think if this was going to stick, you would already be seeing other people using it here. The fact that it didn’t quickly spread is a bad sign for how your readers have evaluated it.
For myself, I find the term clunky. I don’t think you’re wrong to want to talk about it, but the term on its own already uses three of your five words for mass communication; they’re rare words, they’re long words, and the meaning of each in context is a bit odd. Also, they rely on people having a habit of trying to generalize. Most of those drawbacks are easy to work around on LessWrong, but then there’s the much more important reason the term doesn’t work, which is simply that it’s not necessary to memorize: if I were to use a three-word phrase to describe consequence-dependent processes, I have an infinite wellspring of rephrases of those three words at hand in my head, and which rephrase I use depends on exactly which combination of subtle meanings I want to refer to right now.
The flipside of this is that I do agree with you that consequence-steering processes are a core source of concern and apply to humans and AIs alike, and that there’s an unsolved problem of how to specify goodness in a way that still means “good things” if put in a spreadsheet (perhaps one that is gigabytes large) and number-go-up’ed about.
I think if this was going to stick, you would already be seeing other people using it here. The fact that it didn’t quickly spread is a bad sign for how your readers have evaluated it.
Unfortunately I am guided by my inside view, so I will continue discussing OISs until people do start using the term or until I come to understand the flaws in the terminology. By discussing it with me you are helping with this process, so thank you : )
I find the term clunky [...] rare words [...] meaning of each in context is a bit odd.
I would love to hear more thoughts on this. I examined many other sets of words before settling on these ones. If you are interested, I can discuss why I think they are better than any of the examples you suggested.
Although I like the way “consequence” implies the involvement of causality, I think “outcome” is preferable to “consequence” because I want to ground the terminology in formal mathematics and would like to leverage the term “outcome” from probability theory.
The term “process” is one that I spent a good amount of time considering, especially in the phrase “decision process”, but I ended up preferring the term “system” because of the implication that we should be thinking not only about actions, but about objects and the actions those objects can perform. An OIS is a physical thing. No OIS exists without being instantiated by some part of reality.
I prefer “influencing” over “steering” because “steering” implies especially competent influence, which is simply incorrect when reasoning about multiple agents operating in the same environment with incompatible goals and similar levels of capability. It is not true that either agent steers; both agents influence.
I find the phrase “consequence-dependent processes (CDP)” very interesting. With “dependent” in the place of “influencing” or “steering”, CDP seems reminiscent of the outcome pump discussed in The Hidden Complexity of Wishes and My Naturalistic Awakening. Notably, though, CDP doesn’t seem to imply that the process causes the consequence to become more likely or certain; rather, it suggests some kind of acausal dependence on the consequence. I don’t know if this is what you meant to imply, but it is certainly different from an OIS. While a CDP is acausal and doesn’t necessarily affect outcome likelihoods, OISs operate according to the causal rules of our world, explicitly making some outcomes more likely than others.
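To make that distinction concrete, here is one illustrative way it could be formalized, borrowing standard probability notation (this sketch is mine, not part of the established terminology):

```latex
% Outcomes are elements of a sample space, as in probability theory:
\[
  \omega \in \Omega, \qquad E \subseteq \Omega, \qquad P(E) \in [0,1]
\]
% One could say a system $S$ is outcome-influencing with respect to an
% event $E$ when its presence changes the probability of $E$:
\[
  P(E \mid S \text{ present}) \neq P(E \mid S \text{ absent})
\]
% whereas a consequence-dependent process could merely correlate with
% $E$ without changing $P(E)$ at all.
```

On this reading, “influencing” is a claim about a causal shift in outcome probabilities, while “dependence” is only a claim about correlation.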
rely on people having a habit of trying to generalize.
The art of abstraction involves both generalization and specification. I love both and wish more people would delight in carefully constructed abstraction. Rather than relying on a habit of generalizing, I would say the OIS terminology is trying to promote it.
I have an infinite wellspring of rephrases of those three words at hand in my head, and which rephrase I use depends on exactly which combination of subtle meanings I want to refer to right now.
I think this is a flaw, not a feature. My goal in creating a standard set of terminology is (among other things) to avoid the ambiguity of subtle rephrasings, and to create the shorthand words “OIS” and “OISs” (pronounced “oh-ee” and “oh-ees”) to make it easier to discuss specific sets of important general phenomena articulately.
there’s an unsolved problem of how to specify goodness in a way that still means “good things” if put in a spreadsheet (perhaps one that is gigabytes large) and number-go-up’ed about.
I strongly agree. I think this is basically Goodhart’s law. My thinking and talking about OISs is very much a result of trying to solve the generalized AI alignment problem: representing what we want in an encoding accurate, articulate, and precise enough that it can serve as the preferences of an arbitrarily capable OIS without that OIS becoming misaligned.
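A toy sketch of the Goodhart failure mode being described (all functions and numbers here are invented for illustration): a proxy metric agrees with the true measure of goodness for small values, but optimizing the proxy hard drives the true value down.

```python
# Hypothetical illustration of Goodhart's law: a proxy that locally tracks
# "goodness" decouples from it under strong optimization pressure.

def true_goodness(x: float) -> float:
    # True value peaks at x = 1 and falls off beyond it.
    return x - 0.5 * x ** 2

def proxy(x: float) -> float:
    # The proxy agrees with the true measure near zero but keeps
    # rewarding larger x forever (the number keeps going up).
    return x

candidates = [i / 10 for i in range(51)]  # x in [0.0, 5.0]

best_by_proxy = max(candidates, key=proxy)          # picks x = 5.0
best_by_truth = max(candidates, key=true_goodness)  # picks x = 1.0

print(best_by_proxy, true_goodness(best_by_proxy))  # 5.0 -7.5
print(best_by_truth, true_goodness(best_by_truth))  # 1.0 0.5
```

The spreadsheet version of the problem is the same: any finite encoding of “good things” that an optimizer can push on acts like `proxy` here unless it keeps meaning the right thing at the extremes.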
Thanks again. I appreciate your critical engagement.