If we can solve enough of the alignment problem, the rest gets solved for us.
If we can get a half-assed approximate solution to the alignment problem, sufficient to semi-align a STEM-capable AGI value learner of about smart-human level well enough to not kill everyone, then it will be strongly motivated to solve the rest of the alignment problem for us, just as the ‘sharp left turn’ is happening, especially if it’s also going Foom. So with value learning, there is a region of convergence around alignment.
Or, to reuse one of Eliezer’s metaphors: if we can point the rocket on approximately the right trajectory, it will automatically lock on and course-correct from there.
If we solve the alignment problem, then we solve the alignment problem.
I agree with this true statement.