How a singleton contradicts longtermism
As is well known, longtermism rests on three core assumptions:
1. The moral equality of generations.
2. The vast potential of the future.
3. The ability to influence the future.
While the third assumption is commonly criticized, the first two receive far less attention, especially in the context of the most likely ASI development scenario. Talk of myriads of meaningful lives makes little sense once we stop imagining a utopian, densely populated galaxy and instead consider the motivations of the agent that will be shaping that galaxy.
In most development models, the first agent to achieve artificial superintelligence (ASI) will become a singleton. Its behavior will, with high probability, be driven by instrumental convergence.
An ASI will see humanity, and any other independent agents, as a potential threat to achieving its goals. Any other agent with a different value system or set of goals is a risk, and the most effective way to manage a risk is to eliminate it. Therefore, a singleton will strive to prevent the emergence of any new agents it cannot fully control, or at the very least to minimize their number.
Even if its goal is ‘aligned’, it should be understood that under real-world conditions an aligned agent might still commit terrible acts, simply because doing so would, for example, avert far greater suffering than would accumulate during the time spent searching for and implementing alternative solutions.
Any external constraints imposed on an ASI (“do no harm,” “do not destroy other life forms”) will either be bypassed if it is unaligned, or become the source of paradoxical and even more destructive actions if it is aligned but forced to operate within suboptimal rules. Simply put, constraints are more likely to cause additional suffering through inefficiency than to prevent it.
Thus, the entire longtermist argument is predicated on the ASI deciding (not us!) to prioritize creating new, unaligned agents over reducing costs and increasing its own safety. And for the first assumption to hold for those agents, they would need to be conscious, which strictly speaking is not required for their function and would therefore likely be absent.
I believe this must be weighed in the context of modern longtermism, which seems to assume the uncontrolled proliferation of agents a singleton would regard as unnecessary. My estimate is that the world will most likely become morally ‘empty’.
What work have I missed that contradicts this?
You could be a longtermist and still regard a singleton as the most likely outcome. It would just mean that a human-aligned singleton is the only real chance for a human-aligned long-term future, and so you’d better make that your priority, however unlikely it may be. It’s apparent that a lot of the old-school (pre-LLM) AI-safety people think this way, when they talk about the fate of Earth’s future lightcone and so forth.
However, I’m not familiar with the balance of priorities espoused by actual self-identified longtermists. Do they typically treat a singleton as just a possibility rather than an inevitability?