I personally don’t think human intelligence enhancement is necessary for solving AI alignment (though I may be wrong). I think we just need more time, money and resources to make progress.
In my opinion, the reason AI alignment hasn’t been solved yet is that the field has only been around for a few years and has been operating on a relatively small budget.
My prior is that AI alignment is roughly as difficult as any other technical field like machine learning, physics or philosophy (though philosophy specifically seems hard). I don’t see why humanity can make rapid progress on fields like ML while not having the ability to make progress on AI alignment.
OK, I see how you could think that, but I disagree that more time and resources would have helped alignment much, if at all, especially before GPT-4. See here: https://www.lesswrong.com/posts/7zxnqk9C7mHCx2Bv8/beliefs-and-state-of-mind-into-2025

Diminishing returns kick in, and actual data from ever more advanced AI is essential to stay on the right track and eliminate incorrect assumptions. I also disagree that alignment could be “solved” before ASI is invented; we would just think we had solved it, and we could be wrong. If it is just as hard as physics, then we would have untested theories that are probably wrong. For example, SUSY was supposed to help solve various problems and to be found at the LHC, which didn’t happen.
I don’t see why humanity can make rapid progress on fields like ML while not having the ability to make progress on AI alignment.
The reason normally given is that AI capability is much easier to test and optimise than AI safety. Much like philosophy, it’s very unclear when you are making progress on alignment, and sometimes unclear whether progress is even possible. It doesn’t help that AI alignment isn’t particularly profitable in the short term.