In the post, I wanted to distinguish between two things you’re now combining: how hard alignment is, and how long we have. And yes, combining these, we get the question of how hard it will be to solve alignment in the time frame we have until we need to solve it. But they are conceptually distinct.
And neither of these directly relates to takeoff speed, which in the current framing is something like the time frame from when we have systems that are near-human until they hit a capability discontinuity. You said “First off, takeoff speed and timing are correlated: if you think HLMI is sooner, you must think progress towards HLMI will be faster, which implies takeoff will also be faster.” This last implication might be true, or might not. I agree that there are many worlds in which they are correlated, but there are plausible counter-examples. For instance, we may continue with fast progress and get to HLMI and a utopian freedom from almost all work, but then hit a brick wall on scaling deep learning, and have another AI winter until we figure out how to make actual AGI, which can then scale to ASI—and that new approach could lead to either a slow or a fast takeoff. Or progress may slow to a crawl due to the costs of scaling inputs and compute until we get to AGI, at which point a self-improvement takeoff could be near-immediate, or could continue glacially.
And I agree with your claims about why Eliezer is pessimistic about prosaic alignment—but that’s not why he’s pessimistic about governance, which is a mostly unrelated pessimism.
As I said in my first comment, the in-practice difficulty of alignment is obviously connected to timelines and takeoff speed.
But you’re right that in this post you’re talking about the intrinsic difficulty of alignment vs. takeoff speed, not the in-practice difficulty.
But those are also still correlated, for the reasons I gave—mainly that a discontinuity is an essential step in Eliezer-style pessimism and fast-takeoff views. I’m not sure how close this correlation is.
Do these views come apart in other possible worlds? That is, could you believe in a discontinuity to a core of general intelligence but still think prosaic alignment can work?
I think that potentially you can—if you think pre-HLMI AI (pre-discontinuity) will still have enough capabilities to help you do alignment research before dangerous HLMI shows up. But prosaic alignment seems to require more assumptions to be feasible given a discontinuity—for instance, that the discontinuity doesn’t occur before the AI has all the important capabilities you need to do good alignment research.
I’m not sure I agree that discontinuity and prosaic alignment are compatible, though you make a reasonable case. But I do think slower governance approaches are compatible with a discontinuity, if it is far enough away.