The value of constitutional AI is that it uses simulations of humans, rather than actual humans, to rate an AI’s outputs. This is a lot cheaper and allows for more iteration, but I don’t think it will work once AIs become smarter than humans. At that point, the human simulations will have trouble evaluating AIs, just like humans do.
Of course, getting really cheap human feedback is useful, but I want to point out that constitutional AI will likely run into novel problems as AI capabilities surpass human capabilities.