Thomas Kwa comments on The Hidden Complexity of Wishes

Thomas Kwa 24 Jan 2024 2:00 UTC
4 points
3
Even beyond that, I think “prior probability of a thing happening” is one kind of outcome pump, but the post does not specify that as the kind of outcome pump it’s talking about.
Disagree. The Outcome Pump is explicitly described as conditioning the future trajectory of the universe according to the reset function:
The Outcome Pump is not sentient. It contains a tiny time machine, which resets time unless a specified outcome occurs. For example, if you hooked up the Outcome Pump’s sensors to a coin, and specified that the time machine should keep resetting until it sees the coin come up heads, and then you actually flipped the coin, you would see the coin come up heads. (The physicists say that any future in which a “reset” occurs is inconsistent, and therefore never happens in the first place—so you aren’t actually killing any versions of yourself.)
Also, because the Outcome Pump is not sentient, it cannot be actively interested in subverting your wish. Eliezer claims “The Outcome Pump is a genie of the second class. No wish is safe.”, implying that the subversion effect will happen even with the non-sentient, quantilizer-like Outcome Pump. It may happen that future AIs are unsafe, but this will be because they apply too much optimization.
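The reset mechanism quoted above behaves like rejection sampling: futures are drawn from some underlying distribution and discarded until one satisfies the specified outcome. Below is a minimal Python sketch of that reading, using the coin example from the post. The names `outcome_pump`, `flip_coin`, and `max_resets` are hypothetical and chosen only for illustration; nothing here claims to capture the physics of the fictional device.

```python
import random

def outcome_pump(sample_future, outcome_occurred, rng, max_resets=1_000_000):
    """Draw futures until one satisfies the outcome; return it with the reset count.

    Assumption: each "reset" is modelled as an independent redraw from the
    same distribution over futures, which is ordinary rejection sampling.
    """
    for resets in range(max_resets):
        future = sample_future(rng)
        if outcome_occurred(future):
            return future, resets
    raise RuntimeError("no acceptable future found within max_resets")

def flip_coin(rng):
    """Toy 'future': a single fair coin flip."""
    return rng.choice(["heads", "tails"])

rng = random.Random(0)  # fixed seed so the run is repeatable
future, resets = outcome_pump(flip_coin, lambda f: f == "heads", rng)
print(future, "after", resets, "resets")
```

Under this reading, the futures the user observes are distributed as the underlying distribution conditioned on the outcome. The Pump itself does no optimizing beyond that conditioning.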
- habryka 24 Jan 2024 2:05 UTC
  4 points
  0
  Parent
  Yeah, see my response to Richard. I was wrong about the Outcome Pump not being specified, but I think that your use of “probability” in the top-level comment is still wrong. Clearly the Outcome Pump would not sample from your prior over likely events.
  It would sample from some universal prior over events (this is playing fast-and-loose with quantum mechanics, but a reasonable interpretation might be sampling from the quantum wave-function, if you take a more Copenhagen perspective). Almost any universal prior here would be very oddly shaped, so that indeed you would observe the kinds of things that Eliezer is talking about.
  - Thomas Kwa 24 Jan 2024 2:25 UTC
    7 points
    1
    Parent
    I thought it was sampling from the quantum wavefunction, and still I think my argument works, unless this was a building that was basically deterministically going to kill your mother if you run physics from that point forward, or already had hazardous materials with a significant chance of exploding. I agree that you can’t use your own prior probabilities.
    Maybe I’m wrong about how much quantum randomness can influence events at a 5 minute timescale and the universe is actually very deterministic? If it’s very little such that you have to condition very hard to get anything to happen, then maybe the building does explode, but I’m not really sure what would happen.
    - Martin Randall 13 Dec 2024 3:56 UTC
      6 points
      3
      Parent
      I liked this discussion but I’ve reread the text a few times now, and I don’t think this fictional Outcome Pump can be sampling from the quantum wavefunction. The post gives examples that work with classical randomness, and not so much with quantum randomness. Most strikingly:
      
      … maybe a powerful enough Outcome Pump has aliens coincidentally showing up in the neighborhood at exactly that moment.
      
      The aliens coincidentally showing up in the neighborhood is a surprise to the user of the Outcome Pump, but not to the aliens who have been traveling for a thousand years to coincidentally arrive at this exact moment. They could be from the future, but the story allows time rewinding, not time travel. It’s not sampling from the user’s prior, because the user didn’t even consider the gas main blowing up.
      
      I think the simplest answer consistent with the text is that the Outcome Pump is magic, and is sampling from what the user’s prior “should be”, given their observations.
    - habryka 24 Jan 2024 2:29 UTC
      2 points
      0
      Parent
      As I said, the best approximation I have is “move particles the smallest joint distance from my highest prior configuration”. Some particles are in people’s brains, but changing people’s beliefs or intentions seems like it’s very unlikely to happen via this operation, since my guess is the brain is highly redundant and works on ion channels that would require actually a quite substantial amount of matter to be displaced (comparatively). Very locally causing a chemical chain reaction somewhere seems easier, though that’s just a guess.
      I am not really sure what happens here. I think physics overall is highly deterministic even taking quantum effects into account, and my guess is that to get a macro-level outcome by sampling from the wave-function you would very quickly need to go into astronomically low probabilities. I don’t trust my reasoning about what happens in 0.00000000000000000000001% scenarios.
      My best guess is that something pretty close to what Eliezer describes happens, but I couldn’t prove it to you.
      - Richard_Ngo 25 Jan 2024 1:25 UTC
        2 points
        0
        Parent
        my guess is the brain is highly redundant and works on ion channels that would require actually a quite substantial amount of matter to be displaced (comparatively)
        Neurons are very small, though, compared with the size of a hole in a gas pipe that would be necessary to cause an explosive gas leak. (Especially because you then can’t control where the gas goes after leaking, so it could take a lot of intervention to give the person a bunch of away-from-building momentum.)
        I would probably agree with you if the building happened to have a ton of TNT sitting around in the basement.
        - habryka 25 Jan 2024 2:15 UTC
          2 points
          0
          Parent
          Oh, I was definitely not thinking of a hole in a gas pipe. I was expecting something much much subtler than that (more like very highly localized temperature-increases which then chain-react). You are dealing with omniscient levels of consequence-control here.
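
The disagreement in this thread turns on which distribution the Outcome Pump conditions, and on how much probability each route to the outcome has under it. Here is a small Python sketch of the conditioning step. The three futures and their probabilities are invented purely for illustration and come from no one in the thread; `condition` is a hypothetical helper name.

```python
from fractions import Fraction

def condition(prior, accepted):
    """Renormalize a prior over futures onto those satisfying the outcome."""
    total = sum(p for future, p in prior.items() if accepted(future))
    if total == 0:
        raise ValueError("outcome has zero probability under this prior")
    return {future: p / total for future, p in prior.items() if accepted(future)}

# Assumption: made-up numbers, used only to show how the arithmetic works.
prior = {
    "mother stays inside": Fraction(989, 1000),
    "firefighter carries her out": Fraction(10, 1000),
    "explosion throws her out": Fraction(1, 1000),
}

posterior = condition(prior, lambda future: future != "mother stays inside")
for future, p in posterior.items():
    print(f"{future}: {p}")
```

With these made-up numbers the benign route dominates after conditioning. Swapping the two small probabilities makes the explosion dominate instead, so the conclusion depends entirely on the relative weights, which is the point the commenters disagree about.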