What if some groups of (intelligent, motivated, conscientious, etc.) humans could solve the alignment problem while other groups cannot? As I understand it, your idea is to have the AIs mimic one of these groups in order to scale up its progress. But the group you choose to mimic might be one of the ones that can't solve the problem, even though other groups could.

Also, I don't find it obvious that an AI trained to mimic a human's actions will produce the same consequences that human would, because the AI might not deploy those actions in a strategically appropriate way.