[Question] Is there an analysis of the common consideration that splitting an AI lab into two (e.g. the founding of Anthropic) speeds up the development of TAI and therefore increases AI x-risk?

I’m asking because I can think of arguments going both ways.

Note: this post is focused on the generic question “what to expect from an AI lab splitting into two” more than on the specifics of the OpenAI vs Anthropic case.

Here’s the basic argument: after splitting, the two groups of individuals are now competing instead of cooperating, with two consequences:

  • they will race toward TAI, shortening TAI timelines (while outside safety work isn’t correspondingly sped up);

  • in doing so, they will neglect their own safety work relative to their capabilities work.

However, there are some possible considerations against this frame:

  1. the orgs are competing on safety as well as capabilities, both in anticipation of future regulations and potentially in order to attract talent (especially future talent);

  2. in general, there is a case to be made that splitting a company into two will decrease the total productivity of its individuals (consider the limiting case where every individual works alone), though this will depend on many attributes of the company and of the split;

  3. intuitively, I would expect a competition mindset to speed up straightforward engineering (scaling), but to reduce the rate of serendipitous discoveries (e.g. discovering a new, more powerful AI paradigm). So the net effect on AI x-risk might depend crucially on whether reaching TAI is only a matter of scaling or whether we are still at least one breakthrough away;

  4. the existence of a competing lab is inevitable (so, setting aside the fact that it is founded earlier, the split will be good if the resulting competing lab has a better safety-to-capabilities profile than the one that would have been created otherwise);

  5. the competition mindset results in an earlier end to openness in capabilities research, which slows down the rest of the world.

I expect that there exists empirical literature that could help dig deeper into points 2 and 3.

Is there an existing, more thorough analysis of this, or can you offer other arguments in either direction?
