Generally a very solid distillation; you seem to get the problem unusually well!
I'd suggest checking these references:
https://www.lesswrong.com/posts/DJnvFsZ2maKxPi7v7/what-s-up-with-confusingly-pervasive-goal-directedness
https://www.lesswrong.com/posts/GY49CKBkEs3bEpteM/parametrically-retargetable-decision-makers-tend-to-seek
https://www.lesswrong.com/posts/LWfYjZgXHN5GYtYpH/intrinsic-power-seeking-ai-might-seek-power-for-power-s-sake
and maybe some of
The Parable of Predict-O-Matic, Why Tool AIs Want to Be Agent AIs, Averting the convergent instrumental strategy of self-improvement, Averting instrumental pressures, and the other articles under Arbital's corrigibility section.
(the people who think ASI won’t be a long-range agent are missing key bits)
also, relatedly, might you be interested in helping design, or maybe even write material for, a course meant to give people a strong grounding in the technical challenges of alignment? current version: https://agentfoundations.study/