Labs should be explicit about why they are building AGI

Three of the big AI labs say that they care about alignment and that they think misaligned AI poses a potentially existential threat to humanity. These labs continue to try to build AGI. I think this is a very bad idea.

The leaders of the big labs are clear that they do not know how to build safe, aligned AGI. The current best plan is to punt the problem to a (different) AI,[1] and hope that it can solve it. It seems like a clearly bad idea to try to build AGI when you don’t know how to control it, especially if you readily admit that misaligned AGI could cause extinction.

But there are certain reasons that could make trying to build AGI more reasonable, for example:

  • They want to build AGI first because they think this is better than if a less safety-focused lab builds it

  • They are worried about multi-polar scenarios

  • They are worried about competition from other nations, specifically from China

  • They think one needs to be able to work with today’s big models in order to align the bigger models to come, and that some other factor means we will soon have bigger models that need to be aligned

I think the labs should be explicit that they are attempting to build AGI,[2] that this is not safe, and that there are specific reasons which lead them to think it is nonetheless the best course of action. They should also commit that if those specific reasons no longer hold, they will stop scaling or attempting to build AGI, and they should be clear about what these reasons are. The labs should say all of this explicitly to the public and to policy makers.

I want a statement like:

We are attempting to build AGI, which is very dangerous and could cause human extinction. We are doing this because of the specific situation we are in.[3] We wish we didn’t have to do this, but given the state of the world, we feel we have to, and we believe that doing so reduces the chance of human extinction. If we were not in this specific situation, then we would stop attempting to build AGI. If we noticed [specific, verifiable observations about the world], then we would strongly consider stopping our attempt to build AGI.

Without statements like this, I think labs should not be surprised if others think they are recklessly trying to build AGI.

  1. ^

    Either an automated alignment researcher, or something to do with scalable oversight

  2. ^

    Or scale AI systems to levels that are not known to be safe

  3. ^

    It is important that they actually specify what the situation is that forces them to build AGI.