If no near-term alignment strategy, research should aim for the long-term

This is a small point about alignment strategy that gets mentioned occasionally but hasn’t been stated explicitly as far as I can tell [1].

If there are no paths to alignment that can be implemented in the near term, research should focus on building for the long term. This holds regardless of how short AI timelines are or how high the existential risk is.

In other words, if a researcher were convinced that:

  1. AI will lead to a high chance of existential catastrophe in the near term

  2. There are no approaches that can reduce the chance of existential catastrophe in the near term

Then they are better off focusing on bigger projects that may pan out over a longer time period, even if they are unlikely to complete those projects due to an existential catastrophe. This is because, by assumption, work on near-term approaches is useless [2].
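To make the tradeoff concrete, here is a toy expected-value comparison (the numbers are made up purely for illustration):

$$
\begin{aligned}
\mathrm{EV}(\text{near-term work}) &= 0 && \text{(by assumption 2)} \\
\mathrm{EV}(\text{long-term work}) &= p \cdot V && \text{e.g. } 0.1 \cdot V > 0
\end{aligned}
$$

where $p$ is the (possibly small) chance that the long-term project gets completed before a catastrophe and $V > 0$ is its value if completed. Under these assumptions, any nonzero $p$ is enough for the long-term work to come out ahead.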

I think this implies that establishing a research community, recruiting researchers, building infrastructure for the field, and doing foundational work all have value even if we are pessimistic about AI risk.

That being said, it’s uncertain how promising different approaches are, and some may plausibly be implemented in the near term, so it makes sense to fund a diverse portfolio of projects [3]. It’s also better to attempt near-term projects, even if they are unpromising, than to do nothing to try to avert a catastrophe.

Notes:

  1. For an example of this point being made in passing, consider the end of rvnnt’s comment here.

  2. I don’t actually hold this view.

  3. Assuming these projects don’t interfere with each other, but that’s a problem for another day.