I don’t think this contradicts your stated opinion as I understand it, but I think a few things are worth noting (though on some of these I speak partly from ignorance):
1. The mindset that goes into high reliability engineering (HRE) could carry over to applications of AI that are somewhat narrow, but not so narrow that the AI fails to add significant utility. For example, a lot of general-ish agent deployments are AFAIK catastrophically unsafe/insecure by default right now (openclaw, a bunch of agentic apps that can be prompt-injected into changing upstream state that should not change, etc.). If the general culture changed, this would mitigate risk (at least the banal risks caused by humans, which can still be pretty bad: bad enough to cause extinction).
2. Narrow AI can still have a lot of utility. I’m thinking here of AI that is bounded in what it can do in space, but which can accelerate progress in time. For example, an AI that is superhuman at mathematical proofs (that are verifiable) can accelerate mathematical research, but it doesn’t have much wider impact on other parts of the world (unless someone acts on its proofs). An AI that can quickly implement software in sandboxes is analogous. AI that can search across all of human knowledge, or simulate outcomes without taking actions, would also fall into this category and be immensely useful, increasing economic output. Obviously, the AI has to “not break out”. By bounded in “space” I mean that the set of things the AI can impact in the physical world (i.e., “space”) is very small (e.g., limited to the memory/disk of a computer system). I think there are enough such settings that you could have a future (if we had better coordination) where everyone gets access to great AIs, but only a few people get access to super duper genius AIs and only apply them very carefully in controlled, narrow settings (where they would use HRE).
3. HRE is probably critical to the “not breaking out” part of (2), and probably important for the best possible initial deployments of the not-quite-AGI-but-almost-there AIs that we are likely to see in the near future.
4. It’s reasonable to think that “not breaking out” is hopeless, but making the effort may delay a breakout long enough that alignment (and other salient technologies) will have progressed to the point that there is a lower chance things go dreadfully wrong.
I think this is accurate. Relatedly (not as a retort or anything; I’m curious what people think): is it bad or immoral (or something along those lines) for individuals who are passionate (via intellectual interest, etc.) about specific AI safety verticals to pursue them?
For example, there are plenty of people who like the science of deep learning/interpretability/vaguely anything that involves understanding how DNNs “work”. It’s probably pretty easy to argue that this is unlikely to be the highest-impact way to reduce existential risk. Even if I’m wrong in thinking that this is easy to argue, you could probably argue it for a lot of the specific projects people are engaging in. However, I think it’s easy to tell yourself that you are doing such research because of, and to mitigate, existential risk.
In such a (hypothetical) set of scenarios, let’s assume these individuals would be actively unhappy doing direct/straightforward DIP-like policy/politics/outreach work, and slightly worse at it than their peers. So basically these hypothetical people need to pick between something useful and something fun.
So the question becomes: what should these people do?
More broadly, I think doing straightforward work that is more likely to solve the problems of our society just tends not to be what people are individually interested in doing, and thus we see “The Spectre” and its cousins in AI safety and probably other fields too (my generalization here is to situations where people do X because of Y but claim it’s because of Z, where Y provides a bit more individual utility and Z provides a bit more societal utility).
EDIT: I don’t actually know much about Control AI, and I see some people are contesting some of the assumptions here (such as the claim that Control AI has been successful). I don’t have time to read through that. I think it broadly doesn’t matter to my post here, because my post is (1) not about Control AI in particular, but about cases in AIS where you have X, Y, and Z as described above, and (2) on priors, this 4-step-plan stuff seems kind of self-evidently what needs to be done to get a lot of needed actions to occur (i.e., regulation, etc.).