If your argument is, “if it is possible for humans to produce some (verbal or mechanical) output, then it is possible for a program/machine to produce that output”, then that’s true, I suppose?
I don’t see why you specified “finite depth boolean circuit”.
While it does seem like the number of states for a given region of space is bounded, I’m not sure how relevant this is. Not all possible functions from states to {0,1} (or to some larger discrete set) are implementable as some possible state, for cardinality reasons.
I guess maybe that’s why you mentioned the thing along the lines of “assume that some amount of wiggle room is tolerated”?
One thing you say is that the set of superintelligences is a subset of the set of finite-depth boolean circuits. Later, you say that a lookup table is implementable as a finite-depth boolean circuit, and say that some such lookup table is the aligned superintelligence. But just because it can be expressed as a finite-depth boolean circuit, it does not follow that it is in the set of possible superintelligences. How are you concluding that such a lookup table constitutes a superintelligence? It seems like a step is missing there.
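For what it’s worth, the circuit-expressibility half seems fine to me. Here is a standard sketch, in my own notation rather than anything from your post, of why any finite lookup table f : {0,1}^n -> {0,1} is a finite-depth boolean circuit:

\[
  % depth-2 DNF: one AND-term per input a with f(a) = 1,
  % each term checking x = a literal by literal
  f(x_1,\dots,x_n) \;=\;
  \bigvee_{\substack{a \in \{0,1\}^n \\ f(a)=1}}
  \;\bigwedge_{i=1}^{n} \ell^{a}_{i}(x_i),
  \qquad
  \ell^{a}_{i}(x_i) =
  \begin{cases}
    x_i & \text{if } a_i = 1,\\
    \lnot x_i & \text{if } a_i = 0,
  \end{cases}
\]

a depth-2 circuit with at most 2^n AND-terms (multi-bit outputs use one such circuit per output bit). So my question is only about the step from “circuit” to “superintelligence”.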
Now, I don’t think that “aligned superintelligence” is logically impossible, or anything like that, and so I expect that there mathematically exists a possible aligned superintelligence (if it isn’t logically impossible, then by the model existence theorem, there exists a model in which one exists… I guess that doesn’t establish that we live in such a model, but whatever).
But I don’t find this argument a compelling proof(-sketch).
Until I wrote this proof, it was a live possibility that aligned superintelligence is in fact logically impossible.
All cardinalities here are finite. The set of generically realizable states is a finite set because each one has a description of finite, bounded information content (a list of instructions to realize that state, which is no greater in bits than the number of neurons in all the human brains on Earth).
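To make the finiteness step explicit, here is a counting sketch in my own notation (b stands for an assumed bit budget; it is not a number taken from the comment above). Every realizable state is realized by some description of at most b bits, and there are only finitely many such descriptions:

\[
  % at most one new state per binary description of length <= b bits
  |S| \;\le\; \sum_{k=0}^{b} 2^{k} \;=\; 2^{b+1} - 1 ,
\]

so the set S of generically realizable states is indeed finite.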
Yes, I knew the cardinalities in question were finite. The point applies regardless though. For any set X, there is no injection from 2^X to X. In the finite case, this is 2^n > n for all natural numbers n.
If there are N possible states, then the number of functions from possible states to {0,1} is 2^N, which is more than N, so there is some function from the set of possible states to {0,1} which is not implemented by any state.
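And the counting argument even gives an explicit witness. As an illustration (this diagonal construction is my own, not something from the thread): enumerate the states s_1, …, s_N, let f_{s_i} be the function, if any, implemented by state s_i, and define

\[
  % g disagrees with f_{s_i} on the diagonal input s_i
  g(s_i) \;=\; 1 - f_{s_i}(s_i), \qquad i = 1, \dots, N ,
\]

so that g differs from every implemented function at some input, and is therefore a function from states to {0,1} implemented by no state.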
I never said it had to be implemented by a state. That is not the claim: the claim is merely that such a function exists.
(Sorry for the late response, I hadn’t checked my LW inbox much since my previous comments.)
If it were the case that such a function exists but cannot possibly be implemented (any implementation would be implementation as a state), and no other function satisfying the same constraints could possibly be implemented, that seems like it would be a case of it being impossible to have an aligned ASI. (Again, not that I think this is the case, just considering the validity of the argument.)
The function that is being demonstrated to exist is the lookup table that produces the appropriate actions, yes? The one that is supposed to be implementable by a finite-depth circuit?
Isn’t it enough that it achieves the best possible outcome? What other criteria do you want a “superintelligence” to have?
Not if the point of the argument is to establish that a superintelligence is compatible with achieving the best possible outcome.
Here is a parody of the issue, which is somewhat unfair and leaves out almost all of your argument, but which I hope makes clear the issue I have in mind:
“Proof that a superintelligence can lead to the best possible outcome: Suppose by some method we achieved the best possible outcome. Then, there are no properties we would want a superintelligence to have beyond that, so let’s call however we achieved the best possible outcome ‘a superintelligence’. Then, it is possible to have a superintelligence produce the best possible outcome, QED.”
In order for an argument to be compelling for the conclusion “It is possible for a superintelligence to lead to good outcomes”, you need to use a meaning of “a superintelligence” in the argument such that the statement, when interpreted with that meaning, has the meaning you want the sentence to have. If I argue “it is possible for a superintelligence, by which I mean a computer with a clock speed faster than N, to lead to good outcomes”, then, even if I convincingly argue that a computer with a clock speed faster than N can lead to good outcomes, that shouldn’t convince people that a superintelligence, in the sense that they have in mind (presumably not defined as “a computer with a clock speed faster than N”), is compatible with good outcomes.
Now, in your argument you say that a superintelligence would presumably be some computational process. True enough! If you then showed that some predicate is true of every computational process, you would then be justified in concluding that that predicate is (presumably) true of every possible superintelligence. But instead, you seem to have argued that a predicate is true of some computational process, and then concluded that it is therefore true of some possible superintelligence. This does not follow.
The problem with this is that people use the word “superintelligence” without a precise definition. Clearly they mean some computational process. But nobody who uses the term colloquially defines it.
So, I will make the assertion that if a computational process achieves the best possible outcome for you, it is a superintelligence. I don’t think anyone would disagree with that.
If you do, please state what other properties you think a “superintelligence” must have, other than being a computational process that achieves the best possible outcome.
If you are interested in convincing people who so far think “It is impossible for the existence of an artificial superintelligence to produce desirable outcomes” otherwise, you should have a meaning of “an artificial superintelligence” in mind that is like what they mean by it.
If one suspects that it is impossible for an artificial superintelligence to produce desirable outcomes, then when one considers “among possible futures, the one(s) that have as good or better outcomes than any other possible future”, one would suppose that these are perhaps not ones that contain superintelligences. And so, one would suppose that the computational process that achieves the best outcome would perhaps not be a superintelligence.
To convince such a person otherwise, you would have to establish that some properties that they consider characteristic of something being a superintelligence (which would probably be something like “is more intelligent and competent than any human”, for some specified sense of “intelligent”) are compatible with achieving good (or maximally good) outcomes.
If someone suspects that [insert name of some not-particularly-well-defined political ideology here] can’t ever lead to good outcomes, it would not convince them otherwise to go through the same argument, with “government procedure” or whatever in place of the actuators and such of the computer program, and say:

“Clearly it is possible for a government process to do the best that can be done by any government process. Then, such a government must count as a [insert aforementioned name of a not-very-well-defined ideology] government, as it achieves all the things that someone who wants a [insert aforementioned name of a not-very-well-defined ideology] government could hope it would achieve. Therefore, it is possible for a [insert aforementioned name of a not-very-well-defined ideology] government to do the best that can be done by any government procedure.”

They would not find this compelling in the slightest! They would object that [insert aforementioned name of a not-very-well-defined ideology] generally has properties P and Q, and that you haven’t established that P or Q are compatible with achieving the best that a government can achieve.
This would still be the case if P and Q are somewhat fuzzy concepts without a clear consensus on how to make them precise.
And they would be right to object to this. As, indeed, the argument does not demonstrate, for even one single particular way of making P or Q precise, that such a precisification is compatible with the government reaching the best results that a government can obtain.
______
To answer your question: for something to count as an ASI, in a reasonable sense of “ASI”, it must be, for some reasonable sense of “more intelligent”, more intelligent than any human.
If someone picked a sense of “more intelligent” that I considered reasonable, and demonstrated that having a computer program which is, in that sense, more intelligent than all humans isn’t incompatible with achieving the best possible outcomes, then I would say that, for a reasonable sense of “ASI”, they have demonstrated that there being an ASI is compatible with achieving the best possible outcome. (I might even say that they have demonstrated, for that sense of ASI, that it is possible for an ASI to be aligned, though for that I think I might require that it be possible for the ASI, in that sense of ASI, to produce the outcome, not just be around at the same time.)