All of those fields use math but don’t heavily rely on rigorously provable formulations of their problems.
Chicken-and-egg question: is this evidence that they are not mature enough to make friendly AI, or evidence that friendly AI can be made at their current level of rigor?
I agree; practices in other fields aren’t evidence for the right approach to AGI.
My point is that there’s no evidence that math IS the right approach, just loose intuitions and preferences.
And the arguments for it are increasingly outdated. Yudkowsky originated those arguments, and he now thinks the best approach is to stop current AGI research, start over, and do better math, yet even that plan he estimates as >99% likely to fail.
Arguments against less-rigorous, more-ML-and-cogsci-like approaches are loose and weak. Therefore, those approaches are fairly likely to offer better odds of success than Yudkowsky's plan, with its estimated 99%-plus failure rate. This is a big claim, but I'm prepared to make and defend it; that's the post I'm working on. In short, claims that values are fragile and that capabilities generalize better than alignment rest on intuitions, and the opposite conclusions are just as easy to argue for. That doesn't tell us which side is right; it tells us we don't yet know how hard alignment is for deep-network-based AGI.