One lens that seems useful here is the idea of a negative alignment tax.
Some alignment work increases the reliability, observability, and control of AI systems. Those properties increase the economic value of deploying AI, which gives organizations an incentive to invest in alignment as systems scale, and that in turn creates positive selection pressure for alignment work itself.
This dynamic also produces an ecosystem effect. As alignment-driven companies scale, alignment knowledge compounds inside teams, talent pipelines form around safety work, and capital flows toward technologies that make AI systems more understandable and governable.
Safety products matter partly because of the tools they create and partly because they change the selection pressures shaping the AI ecosystem.
I wrote about the negative alignment tax here and about alignment-driven startups here. The combination of those two ideas seems like one of the strongest arguments for AGI safety products.
Great point!
The reasoning in the original comment, while coming from a place of genuine moral seriousness, substitutes moral purity for causal modeling. Advanced AI development continues under competitive pressure regardless of whether alignment researchers participate. Opting out just weakens the alignment properties of the systems that get deployed anyway. This is differential technological development with the selection effect running in exactly the direction we should least want.
There is a world where alignment researchers refuse to touch anything funded by the Department of War. In that world, does the Department of War stop building AI systems? No. Obviously not. They build them anyway, with whatever alignment properties the remaining talent pool manages to produce, which is to say, fewer and worse ones. You have now brought about the exact outcome you were trying to prevent, and you did it by optimizing for the feeling of clean hands.
The Department of War has both the incentive and the budget to solve alignment in ways the frontier labs currently don't, because it cannot afford to deploy systems that pursue hidden objectives or behave unpredictably under distribution shift.
The proposal to instead endow AI "with strong and firm moral principles, like the values of peace and lawful behavior" is great and all, but it is not a technical proposal. It is a wish. Wishes do not constrain optimization processes. If they did, we would not need an alignment research community at all. We could simply write "be good" in the loss function and go home.
Instead of optimizing for clean hands, we should be asking “does this research, if successful, reduce the probability of catastrophic outcomes from advanced AI systems?” At this point, that’s really all that matters.