Imagine writing a policy for an AI company.
The CEO trusts you and will approve your policy if you show it’s based on these 8 premises:
1. AI can be scaled up in capability. It can offer benefits and power to those who wield it, but it could also become unsafe for all of humanity.
2. There are ‘bad guys’ scaling AI: people who keep scaling for temporary benefits or strategic power advantages, even if it leads to everyone dying.
3. There are also ‘good guys’: people who are developing AI in ways that keep humanity safe.
4. If we pause scaling because we think AI is becoming less safe, the ‘bad guys’ will keep scaling regardless.
5. If we fall behind, we lose the power to try to make the most advanced AI safer.
6. If we keep scaling ahead of the ‘bad guys’, we can try to make and distribute a safer version of the most advanced AI than the one those ‘bad guys’ would make.
7. But the ‘bad guys’ will learn from us and catch up, developing their own version of the most advanced AI faster than they otherwise would have.
8. If anyone ever builds an existentially unsafe version, everyone dies.
The CEO wants their company to be the ‘good guys’. They want a policy not only to follow themselves, but also to advocate and lobby hard for, so that other companies start acting as the ‘good guys’ too.
What policy do you write?
Fast-follow on frontier capabilities; compete on useful applications and convenient integration. This works all the way up to recursive self-improvement or superintelligence, at which point none of this matters.
Lobby for regulation that slows down everyone. Justify this practice to your stockholders by pointing out that your competitive advantage is in applications, so fair regulation hits your competitors harder.
Publish scary demos and evals. Show them in the context of your competitors’ models first, with an acknowledgement that the issues also apply to your own products. This won’t change anything directly, but it will be useful ammunition for external activists who can push for things you can’t.
Encourage (or fail to effectively discourage) union organizing among your employees, so that you are not bound to hyper-optimizing for short-term stockholder ROI. Maintain a collaborative relationship with union leadership, since you both have an interest in the company’s success. There is no guarantee that union leadership will be as safety-conscious as you are, but it is impossible for them to be less safety-conscious than stockholders, so this is a net win regardless.
Write a list of voluntary commitments, not just for slowing down but also regarding lobbying, based on what you would like to see if you had full consensus with your significant competitors. Include effective measures for monitoring and enforcement. Don’t worry about competitiveness; that doesn’t matter, because none of this is binding until a critical mass has signed on. When it looks good, circulate the agreement to your competitors and invite their feedback, as long as that feedback remains within the spirit of the agreement. Let them actually reject the agreement rather than just assuming that they will. Maintain a paper trail. If possible, publicize any rejection and use it against them.
Continue to scale, at or slightly behind the frontier. Complain loudly about the race dynamics you’re in, and advocate for legislation to slow things down. Make scary capability demos to support your case.
My sad guess is that a demo won’t be enough. Some actual, real harm will have to happen to create the crisis necessary to precipitate action like a pause.