Stuart_Armstrong’s Shortform

Stuart_Armstrong 30 Sep 2019 12:08 UTC

LW: 9 AF: 5

11 comments, 1 min read

Stuart_Armstrong 6 Feb 2020 13:01 UTC
13 points
0
Lexicographical preference orderings seem to come naturally to humans. Sentiments like “no amount of money is worth one human life” are commonly expressed.

Now, that particular sentiment is wrong because money can be used to purchase human lives.

The other problem comes from using probability and expected utility, which makes anything lexicographically second completely worthless in all realistic cases. It’s one thing to say that you prefer apples to pears lexicographically when there are ten of each lying around and everything is deterministic (just take the ten apples first then the ten pears afterwards). But does it make sense to say that you’d prefer one chance in a trillion of extending someone’s life by a microsecond, over a billion euros of free consumption?

So this short post will propose a more sensible, smoothed version of lexicographical ordering, suitable to capture the basic intuition, but usable with expected utility.

If the utility $U$ has lexicographical priority and $V$ is subordinate to it, then choose a value $a$ and maximise:

$W = U + \frac{a}{2} \tanh(V).$

In that case, increases in expected $V$ always cause non-trivial increases in expected $W$, but an increase in $U$ of $a$ will always be more important than any possible increase in $V$.
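As a rough illustration of the formula above, here is a minimal Python sketch. The function names `w` and `expected_w`, the choice $a = 1$, and all the numbers in the two lotteries are hypothetical, picked only to echo the "one chance in a trillion" example; they are not from the original post.

```python
import math

def w(u, v, a):
    # W = U + (a/2) * tanh(V); the V term always lies strictly inside (-a/2, a/2)
    return u + (a / 2.0) * math.tanh(v)

def expected_w(lottery, a):
    # lottery: list of (probability, u, v) triples whose probabilities sum to 1
    return sum(p * w(u, v, a) for p, u, v in lottery)

a = 1.0  # hypothetical scale: a gain of `a` in U outweighs any change in V

# Hypothetical lotteries: a one-in-a-trillion chance of a tiny U gain,
# versus a certain, sizeable gain in the subordinate utility V.
tiny_u_gamble = [(1e-12, 1e-6, 0.0), (1 - 1e-12, 0.0, 0.0)]
sure_v_gain = [(1.0, 0.0, 2.0)]

# Unlike a strict lexicographic ordering, the smoothed version takes the V gain.
prefers_v = expected_w(sure_v_gain, a) > expected_w(tiny_u_gamble, a)

# A gain of `a` in U still beats a swing in V from very bad to very good.
u_step_wins = w(a, -5.0, a) > w(0.0, 5.0, a)
```

One caveat about the sketch rather than the idea: in floating point, `math.tanh` saturates to exactly `1.0` for large arguments, so the strict inequality in the last line only holds numerically for moderate values of $V$.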
What links here?
- If I were a well-intentioned AI… IV: Mesa-optimising by Stuart_Armstrong (2 Mar 2020 12:16 UTC; 26 points)
- Dagon 6 Feb 2020 16:10 UTC
  4 points
  0
  Parent
  This seems related to scope insensitivity and availability bias. No amount of money (that I have direct control of) is worth one human life (in my Dunbar group). No money (which my mind exemplifies as $100k or whatever) is worth the life of my example human, a coworker. Even then, it’s false, but it’s understandable.
  
  More importantly, categorizations of resources (and of people, probably) are map, not territory. The only rational preference ranking is over reachable states of the universe. Or, if you lean a bit far towards skepticism/solipsism, over sums of future experiences.
  - Stuart_Armstrong 7 Feb 2020 11:02 UTC
    4 points
    0
    Parent
    Preferences exist in the map, in human brains, and we want to port them to the territory with the minimum of distortion.
    - Dagon 7 Feb 2020 17:20 UTC
      4 points
      0
      Parent
      Oh, wait. I’ve been treating preferences as territory, though always expressed in map terms (because communication and conscious analysis are map-only). I’ll have to think about what it would mean if they were purely map artifacts.
Stuart_Armstrong 25 Sep 2020 14:05 UTC
11 points
0
This is a link to “An Increasingly Manipulative Newsfeed” about potential social media manipulation incentives (eg Facebook).

I’m putting the link here because I keep losing the original post (since it wasn’t published by me, but I co-wrote it).
Stuart_Armstrong 30 Sep 2019 12:08 UTC
6 points
0
Bayesian agents that knowingly disagree

A minor stub, caveating Aumann’s agreement theorem; put here to reference in future posts, if needed.

Aumann’s agreement theorem states that rational agents with common knowledge of each other’s beliefs cannot agree to disagree. If they exchange their estimates, they will swiftly come to an agreement.

However, that doesn’t mean that agents cannot disagree; indeed, they can disagree, and know that they disagree. For example, suppose that there are a thousand doors; behind $999$ of these there are goats, and behind one there is a flying aircraft carrier. The two agents are in separate rooms, and a host will go into each room and execute the following algorithm: they will choose a door at random among the $999$ that contain a goat. Then, with probability $99\%$, they will tell that door number to the agent; with probability $1\%$, they will instead tell the door number with the aircraft carrier.

Then each agent will have probability $1\%$ of the named door being the aircraft carrier door, and $(99/999)\% = (11/111)\%$ probability on each of the other doors; so the most likely door is the one named by the host.
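The posterior above can be checked with exact fractions. This is a minimal sketch, assuming a uniform prior over the thousand doors (implicit in the setup); the function name `posterior_after_naming` and its parameters are hypothetical.

```python
from fractions import Fraction

def posterior_after_naming(n_doors=1000, p_truth=Fraction(1, 100)):
    """Posterior on (the named door, any one other door) holding the carrier."""
    prior = Fraction(1, n_doors)  # assumed uniform prior on the carrier's location
    # Likelihood that the host names door d, given where the carrier is:
    like_if_carrier_is_named = p_truth
    like_if_carrier_elsewhere = (1 - p_truth) / (n_doors - 1)
    evidence = (prior * like_if_carrier_is_named
                + (n_doors - 1) * prior * like_if_carrier_elsewhere)
    named = prior * like_if_carrier_is_named / evidence
    other = prior * like_if_carrier_elsewhere / evidence
    return named, other

named, other = posterior_after_naming()
```

With the default arguments, `named` is `Fraction(1, 100)` and `other` is `Fraction(11, 11100)`, that is, $1\%$ and $(11/111)\%$, so the named door is roughly ten times as likely as any other single door.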

We can modify the protocol so that the host will never name the same door to each agent (roll a D100; if it comes up 1, tell the truth to the first agent and lie to the second; if it comes up 2, do the opposite; anything else means tell a different lie to each agent). In that case, each agent will have a best guess for the aircraft carrier, and the certainty that the other agent’s best guess is different.

If the agents exchanged information, they would swiftly converge on the same distribution; but until that happens, they disagree, and know that they disagree.
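The modified protocol can be simulated to confirm that the two agents are never told the same door. This is a sketch under assumptions: the function name `host_round`, the zero-based door numbering, and the use of a seeded `random.Random` are all illustrative choices, not part of the original description.

```python
import random

def host_round(rng, n_doors=1000):
    """One round of the modified protocol.

    Returns (carrier_door, door_named_to_agent_1, door_named_to_agent_2).
    """
    carrier = rng.randrange(n_doors)
    goats = [d for d in range(n_doors) if d != carrier]
    roll = rng.randint(1, 100)  # the D100
    if roll == 1:
        # Truth to the first agent, a lie to the second
        return carrier, carrier, rng.choice(goats)
    if roll == 2:
        # The opposite
        return carrier, rng.choice(goats), carrier
    # Otherwise two different lies: two distinct goat doors
    lie_1, lie_2 = rng.sample(goats, 2)
    return carrier, lie_1, lie_2

rng = random.Random(0)  # seeded only so the run is reproducible
rounds = [host_round(rng) for _ in range(2000)]
never_same_door = all(n1 != n2 for _, n1, n2 in rounds)
```

Each agent individually still hears the truth with probability $1\%$ and a uniformly random goat door otherwise, so the single-agent posterior is unchanged; only the joint behaviour differs.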
What links here?
- A test for symbol grounding methods: true zero-sum games by Stuart_Armstrong (26 Nov 2019 14:15 UTC; 22 points)
Stuart_Armstrong 5 Feb 2021 14:12 UTC
4 points
0
AF
Partial probability distribution

A concept that’s useful for some of my research: a partial probability distribution.

That’s a $Q$ that defines $Q (A ∣ B)$ for some but not all $A$ and $B$ (with $Q (A) = Q (A ∣ Ω)$ for $Ω$ being the whole set of outcomes).

This $Q$ is a partial probability distribution iff there exists a probability distribution $P$ that is equal to $Q$ wherever $Q$ is defined. Call this $P$ a full extension of $Q$.

Suppose that $Q (C ∣ D)$ is not defined. We can, however, say that $Q (C ∣ D) = x$ is a logical implication of $Q$ if every full extension $P$ has $P (C ∣ D) = x$.

Eg: $Q (A)$, $Q (B)$, $Q (A \cup B)$ will logically imply the value of $Q (A \cap B)$.
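The example above follows from inclusion-exclusion, $P(A \cap B) = P(A) + P(B) - P(A \cup B)$, which holds in every full extension. A small sketch, with hypothetical function names and made-up numbers, contrasts this with the case where $Q(A \cup B)$ is left undefined, so that full extensions can disagree and no value is logically implied:

```python
from fractions import Fraction

def implied_intersection(q_a, q_b, q_union):
    """Value of Q(A ∩ B) forced by inclusion-exclusion in every full extension."""
    value = q_a + q_b - q_union
    if q_union > 1 or not (0 <= value <= min(q_a, q_b)):
        raise ValueError("no probability distribution extends these values")
    return value

def intersection_range(q_a, q_b):
    """Lowest and highest P(A ∩ B) among full extensions when only Q(A), Q(B) are given."""
    return max(Fraction(0), q_a + q_b - 1), min(q_a, q_b)

# Hypothetical values, chosen only for illustration
q_a, q_b, q_union = Fraction(1, 2), Fraction(1, 3), Fraction(2, 3)
forced = implied_intersection(q_a, q_b, q_union)
low, high = intersection_range(q_a, q_b)
```

Here `forced` is `Fraction(1, 6)`, whereas without $Q(A \cup B)$ the intersection could be anywhere from `0` to `Fraction(1, 3)`, so nothing is implied about it.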
- Vanessa Kosoy 16 Feb 2021 23:44 UTC
  4 points
  0
  Parent
  This is a special case of a crisp infradistribution: $Q (A | B) = t$ is equivalent to $Q (A \cap B) = t Q (B)$, a linear equation in $Q$, so the set of all $Q$’s satisfying it is convex closed.
  - Stuart_Armstrong 17 Feb 2021 9:31 UTC
    2 points
    0
    Parent
    Thanks! That’s useful to know.
- Diffractor 4 Apr 2021 1:04 UTC
  LW: 3 AF: 3
  0
  AF Parent
  Sounds like a special case of crisp infradistributions (ie, all partial probability distributions have a unique associated crisp infradistribution)
  
  Given some $Q$, we can consider the (nonempty) set of probability distributions equal to $Q$ where $Q$ is defined. This set is convex (clearly, a mixture of two probability distributions which agree with $Q$ about the probability of an event will also agree with $Q$ about the probability of an event).
  
  Convex (compact) sets of probability distributions = crisp infradistributions.
Stuart_Armstrong 21 Jul 2021 10:44 UTC
2 points
0
Here are a few examples of model splintering in the past:
1. The concept of honour, which includes concepts such as “nobility of soul, magnanimity, and a scorn of meanness [...] personal integrity [...] reputation [...] fame [...] privileges of rank or birth [...] respect [...] consequence of power [...] chastity”. That is a grab-bag of different concepts, but in various times and social situations, “honour” was seen as a single, clear concept.
2. Gender. We’re now in a period where people are questioning and redefining gender, but gender has been splintering for a long time. In middle class Victorian England, gender would define so much about a person (dress style, acceptable public attitudes, genitals, right to vote, right to own property if married, whether they would work or not, etc.). In other times (and in other classes of society, and other locations), gender is far less informative.
3. Consider a Croat, communist, Yugoslav nationalist in the 1980s. They would be clear in their identity, which would be just one thing. Then the 1990s come along, and all these aspects come into conflict with each other.
Here are a few that might happen in the future; the first two could result from technological change, while the last could come from social change:
1. A human subspecies is created whose members want to be left alone without interactions with others, but who are lonely and unhappy when solitary. This splinters preferences and happiness (more than they are today), and changes the standard assumptions about personal freedom and
2. A brain, or parts of a human brain, that loop forever through feelings of “I am happy” and “I want this moment to repeat forever”. This splinters happiness-and-preferences from identity.
3. We have various ages of consent and responsibility; but, by age 21, most people are taken to be free to make decisions, are held responsible for their actions, and are seen to have a certain level of understanding about their world. With personalised education, varying subcultures, and more precise psychological measurements, we might end up in a world where “maturity” splinters into lots of pieces, with people having different levels of autonomy, responsibility, and freedom in different domains—and these might not be particularly connected with their age.