johnswentworth comments on So You Want To Make Marginal Progress...

johnswentworth 8 Feb 2025 1:45 UTC
12 points
12
It’s not clear to me we’ll have (or will “need”) new paradigms before fully handing over all technical and strategic work to AIs which are capable enough to obsolete humans at all cognitive tasks.
If you want to not die to slop, then “fully handing over all technical and strategic work to AIs which are capable enough to obsolete humans at all cognitive tasks” is not a thing which happens at all until the full superintelligence alignment problem is solved. That is how you die to slop.
- ryan_greenblatt 8 Feb 2025 2:24 UTC
  9 points
  0
  
  “fully handing over all technical and strategic work to AIs which are capable enough to obsolete humans at all cognitive tasks”
  
  Suppose we replace “AIs” with “aliens” (or even, some other group of humans). Do you agree that doesn’t (necessarily) kill you due to slop if you don’t have a full solution to the superintelligence alignment problem?
  - johnswentworth 8 Feb 2025 2:43 UTC
    13 points
    7
    Aliens kill you due to slop; humans depend on the details.
    The basic issue here is that the problem of slop (i.e. outputs which look fine upon shallow review but aren’t fine) plus the problem of aligning a parent-AI in such a way that its more-powerful descendants will robustly remain aligned, is already the core of the superintelligence alignment problem. You need to handle those problems in order to safely do the handoff, and at that point the core hard problems are done anyway. Same still applies to aliens: in order to safely do the handoff, you need to handle the “slop/nonslop is hard to verify” problem, and you need to handle the “make sure agents the aliens build will also be aligned, and their children, etc” problem.