I’m not convinced, though, that an ASI would bother to kill us, or, if it did, that it would do so immediately.
I don’t think we’re certainly doomed (I have shallower models than Eliezer and some others here), but for me the strongest arguments for why things might go very badly are:
1. An agent that wants other things might find its goals better achieved by acquiring power first. “If you don’t know what you want, first acquire power.” Instrumental convergence is a related concept.
2. There are, and will continue to be, strong training/selection pressures toward agency, not just unmoored intelligence, in AI over the coming years. The ability to take autonomous actions is both economically and militarily useful.
3. In a multipolar/multiagent setup with numerous powerful AIs flying around, the more ruthless ones are more likely to win and accumulate power. So it doesn’t matter if some fraction of AIs wirehead, become Buddhist, are bad at long-term planning, have very parochial interests, etc., as long as some powerful AIs want to eliminate or subjugate humanity for their own purposes, and the remaining AIs and the rest of humanity don’t coordinate to stop them in time (a toy sketch at the end of this comment illustrates how fast this selection dynamic can compound).
These arguments are related to each other, not independent. But note that they don’t all have to be true for very bad things to happen. For example, even if (2) is mostly false and labs mostly make limited, non-agentic AIs, (3) can still apply: a small number of agentic ASIs could roll over the limited AIs and humanity.
And of course this is not an exhaustive list of possible reasons for AI takeover.
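
To gesture at how quickly the selection effect in (3) can compound, here is a minimal toy model in Python. Everything in it is an illustrative assumption on my part (the 1% initial share, the 10% per-round compounding edge, the fixed two-strategy dynamics), not an estimate of anything real:

```python
# Toy replicator-style sketch of point (3): a rare "ruthless" strategy
# whose resources compound slightly faster per round comes to dominate
# the pool of agents. All parameters are made-up illustrations.

def ruthless_share(rounds: int, initial_share: float, edge: float) -> float:
    """Fraction of total power held by the ruthless type after `rounds`,
    if its resources compound (1 + edge)-times faster per round."""
    ruthless = initial_share
    restrained = 1.0 - initial_share
    for _ in range(rounds):
        ruthless *= 1.0 + edge  # seizes contested resources each round
        total = ruthless + restrained
        ruthless, restrained = ruthless / total, restrained / total
    return ruthless

# Start at 1% of total power with a 10% per-round compounding edge:
for r in (0, 25, 50, 100):
    print(f"round {r:3d}: ruthless share = {ruthless_share(r, 0.01, 0.10):.3f}")
# round   0: ruthless share ≈ 0.010
# round  25: ruthless share ≈ 0.099
# round  50: ruthless share ≈ 0.542
# round 100: ruthless share ≈ 0.993
```

The exact numbers mean nothing; the qualitative point is just that a compounding relative advantage beats initial rarity, which is why “some fraction of AIs are nice” isn’t by itself reassuring.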