“fully handing over all technical and strategic work to AIs which are capable enough to obsolete humans at all cognitive tasks”
Suppose we replace “AIs” with “aliens” (or even some other group of humans). Do you agree that this doesn’t (necessarily) kill you due to slop if you don’t have a full solution to the superintelligence alignment problem?
Aliens kill you due to slop; whether humans do depends on the details.
The basic issue here is that the problem of slop (i.e., outputs which look fine on shallow review but aren’t actually fine), together with the problem of aligning a parent AI such that its more-powerful descendants robustly remain aligned, already constitutes the core of the superintelligence alignment problem. You need to handle both problems in order to safely do the handoff, and at that point the core hard problems are done anyway. The same applies to aliens: in order to safely do the handoff, you need to handle the “slop/nonslop is hard to verify” problem, and you need to handle the “make sure the agents the aliens build will also be aligned, and their children, etc.” problem.