AI companies would probably answer that they only need to align weaker models and then bootstrap to an aligned superintelligence. You didn’t talk about that in your meta plan though.
I’ve seen that, and the claims that it can be done. It seems to be essentially horseshit, though. Also, what I read of the ‘superalignment’ paper claiming this seemed to be basically a method for getting better synthetic data while vibing out about safety to make employees and recruits feel good.
Yeah I agree that it’s not a good plan. I just think that if you’re proposing your own plan, your plan should at least mention the “standard” plan and why you prefer to do something different. Like give some commentary on why you don’t think alignment bootstrapping is a solution. (And I would probably agree with your commentary.)