The core claim is not what I thought it was when I first read the above sources, and I notice that my skepticism has decreased as I have come to better understand the nature of the argument.
The amount of control necessary for an ASI to preserve goal-directed subsystems against the constant push of evolutionary forces is strictly greater than the maximum degree of control available to any system of any type.
I think point 5 is the main crux.
Please click agree or disagree on this comment (check or cross mark), since this is useful guidance for which parts of the argument people should prioritise when clarifying further.
I also agree that point 5 is the main crux.
In the description of point 5, the OP says, “Proving this assertion is beyond the scope of this post.” I presume that the proof of the assertion is made elsewhere. Can someone post a link to it?
This answer will sound unsatisfying:
If a mathematician or analytical philosopher wrote a bunch of squiggles on a whiteboard, and said it was a proof, would you recognise it as a proof?
Say that unfamiliar new analytical language and means of derivation are used (which is not uncommon for impossibility proofs by contradiction; see Gödel’s incompleteness theorems and Bell’s theorem).
Say that it directly challenges technologists’ beliefs about their capacity to control technology, particularly their capacity to constrain a supposedly “dumb local optimiser”: evolutionary selection.
Say that the reasoning is not only about a formal axiomatic system, but needs to make empirically sound correspondences with how real physical systems work.
Say that the reasoning is not only about an interesting theoretical puzzle, but has serious implications for how we can and cannot prevent human extinction.
This is high stakes.
We were looking for careful thinkers who had the patience to spend time on understanding the shape of the argument, and how the premises correspond with how things work in reality. Linda and Anders turned out to be two of these people, and we have done three long calls so far (the first call has an edited transcript).
I wish we could short-cut that process. But if we cannot manage to convey the overall shape of the argument and the premises, then there is no point in moving on to how the reasoning is formalised.
I get that people are busy with their own projects, and want to give their own opinions about what they initially think the argument entails. And, if the time they commit to understanding the argument is not at least 1⁄5 of the time I spend on conveying the argument specifically to them, then in my experience we usually lack the shared bandwidth needed to work through the argument.
Saying, “guys, big inferential distance here” did not help. People will expect it to be a short inferential distance anyway.
Saying it’s a complicated argument that takes time to understand did not help. A smart, busy researcher did some light reading, tracked down a claim that seemed “obviously” untrue within their mental framework, and thereby confidently dismissed the entire argument. BTW, they’re a famous research insider, and we’re just outsiders whose response got downvoted, so we must be wrong, right?
Saying everything in this comment does not help. It’s a long-winded plea for your patience.
If I’m so confident about the conclusion, why am I not passing you the proof clean and clear now?!
Feel free to downvote this comment and move on.
Here is my best attempt at summarising the argument intuitively and precisely, though it still prompts some misinterpretations from well-meaning commenters. I appreciate the people who realised what is at stake and were therefore willing to continue syncing up on the premises and reasoning, as Will did:
I agree that point 5 is the main crux:
To answer it takes careful reasoning. Here’s my take on it:
1. We need to examine the degree to which there would necessarily be changes to the connected functional components constituting self-sufficient learning machinery (including ASI):
   - Changes by learning/receiving code through environmental inputs, and through introduced changes in the assembled molecular/physical configurations (of the hardware).
   - “Necessary” in the sense of “must change to adapt (so as to continue to exist as self-sufficient learning machinery),” or “must change because of the nature of being in physical interactions (with/in the environment over time).”
2. We need to examine how changes to the connected functional components result in shifts in actual functionality (in terms of how the functional components receive input signals and process those into output signals that propagate as effects across surrounding contexts of the environment).
3. We need to examine the span of evolutionary selection (covering effects that, in their degrees/directivity, feed back into the maintained/increased existence of any functional component).
4. We need to examine the span of control-based selection (the span covering detectable, modellable, simulatable, evaluatable, and correctable effects).
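One toy way to see the gap between the two spans in points 3 and 4: every component keeps being perturbed by physical interaction, selection acts on all the resulting effects, but the controller can only detect and correct the effects it can model. The sketch below is my own illustration, not the argument itself; the component counts, the noise model, and the simple reset-style correction rule are all assumptions chosen for clarity.

```python
import random

random.seed(0)

# Toy model: a "machine" is a list of component parameters.
# Each step, physical interaction perturbs every component (necessary change),
# but the controller can only detect and correct the components inside the
# span it can model -- the rest keep drifting under unmodelled effects.
N_COMPONENTS = 20
MODELLED_SPAN = 8   # components the controller can detect/model/correct
STEPS = 200

machine = [0.0] * N_COMPONENTS
intended = [0.0] * N_COMPONENTS  # goal configuration the controller preserves

drift_outside = []
for t in range(STEPS):
    # 1. Necessary change: every component is perturbed by interaction.
    for i in range(N_COMPONENTS):
        machine[i] += random.gauss(0, 0.1)
    # 2. Control-based selection: correct only the modelled span.
    for i in range(MODELLED_SPAN):
        machine[i] = intended[i]
    # Track the worst deviation among unmodelled components.
    drift_outside.append(max(abs(machine[i])
                             for i in range(MODELLED_SPAN, N_COMPONENTS)))

print(max(abs(machine[i]) for i in range(MODELLED_SPAN)))  # 0.0: control holds inside the span
print(drift_outside[-1])  # outside the span, deviation accumulates unchecked
```

Components inside the modelled span stay exactly on target, while components outside it accumulate unchecked deviation; the sketch only illustrates the shape of the claim, namely that the controllable span is a strict subset of the span over which selection acts.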