Could an AI company legally pre-commit not to race, ensuring that their models were never more than second best and self-destructing the company if its models take the lead?
I think probably not. It’s really hard to prevent the owners of a company from doing what they want, especially if the company is important to the economy and/or national security (and I assume any near-frontier AI lab would be).
Some pre-commitment methods and their problems:
If you make the pre-commitment part of the charter, the board can just vote to change the charter. Even if the charter says they can’t, a judge would probably let them change it anyway, as long as the shareholders agreed.
If the company is owned by a non-profit tasked with enforcement, the board of the non-profit can just decide not to enforce the pre-commitment.
If the pre-commitment method triggers the destruction of model weights or other assets (like GPUs), the government probably won’t allow it, especially if it prevents creditors from getting repaid.
A pre-commitment method that transfers value to creditors might work, but is easily defeated by restructuring the relevant debt.
Anything that destroys the value of current shareholders’ equity is risky in front of a judge, because companies generally aren’t allowed to intentionally destroy shareholder value[1].
The only thing I think might work legally is to issue a bunch of non-voting, non-dilutable restricted shares (say, 90% of the company) to someone like Eliezer, locked up with the racing condition[2] as the trigger that converts them to normal shares. Legally, Eliezer is the owner of the company the whole time, so a judge would probably allow his shares to unlock.
The problem is that now Eliezer has billions of reasons to talk himself into why racing would be good this time (even before the trigger event, since he can always make a deal with the board), so we’re back to ownership by another entity that might change its mind[3].
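For concreteness, here is a toy sketch of the restricted-share mechanism. It’s purely illustrative: the class names, the 90/10 split, and the trigger are my own stand-ins, not anything prescribed by corporate law.

```python
# Toy model of the pre-commitment cap table: a trustee holds 90% of the
# company as non-voting, non-dilutable restricted shares that convert to
# ordinary voting shares only if the "racing" trigger fires.
from dataclasses import dataclass

@dataclass
class ShareClass:
    holder: str
    count: int
    voting: bool

class Company:
    def __init__(self):
        # Founders/investors hold 10% as ordinary voting shares;
        # the trustee holds 90% as locked, non-voting restricted shares.
        self.classes = [
            ShareClass("founders", 10, voting=True),
            ShareClass("trustee", 90, voting=False),
        ]
        self.triggered = False

    def fire_trigger(self):
        # The racing condition: once it fires, the trustee's restricted
        # shares convert to ordinary voting shares, handing over control.
        self.triggered = True
        for c in self.classes:
            if c.holder == "trustee":
                c.voting = True

    def voting_power(self, holder):
        total = sum(c.count for c in self.classes if c.voting)
        mine = sum(c.count for c in self.classes
                   if c.voting and c.holder == holder)
        return mine / total

co = Company()
print(co.voting_power("trustee"))   # 0.0 before the trigger
co.fire_trigger()
print(co.voting_power("founders"))  # 0.1 after conversion
```

The legally relevant feature is that the trustee owns the equity the whole time; the trigger changes only the voting rights attached to shares he already holds, which is why a judge would plausibly let the conversion stand.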
[1] Contrary to popular belief, companies aren’t required to maximize shareholder value, but minimizing shareholder value is still frowned upon.
[2] Oh, did I mention that you need the pre-commitment trigger to be unambiguous while ensuring that it never triggers by mistake? That’s actually pretty hard too.
[3] Plus I suspect any entity you’d actually trust as the anchor to this pre-commitment mechanism would be unwilling to take part.
I can think of plenty of reasons for the normal downvote, but I’m confused about the disagree vote. Does someone think there is a way to make this work? I’m guessing “start another AI company but better this time” is still a bad idea for the obvious reasons, but I got nerd-sniped by the legal question.