Nice post! In particular, I like your reasoning about picking research topics:
The main way I can see present-day technical research benefiting existential safety is by anticipating, legitimizing and fulfilling governance demands for AI technology that will arise over the next 10-30 years. In short, there often needs to be some amount of traction on a technical area before it’s politically viable for governing bodies to demand that institutions apply and improve upon solutions in those areas.
I like this as a guiding principle, and have used it myself, though my choices have also been driven in part by more open-ended scientific curiosity. But when I apply the above principle, I get to quite different conclusions about recommended research areas.
As a specific example, take the problem of oversight of companies that want to create or deploy strong AI: the problem of getting to a place where society has accepted and implemented policy proposals that demand significant levels of oversight for such companies. In theory, such policy proposals might be held back by a lack of traction in a particular technical area, but I do not believe this is a significant factor in this case.
To illustrate, here are some oversight measures that apply right now to companies that create medical equipment, including diagnostic equipment that contains AI algorithms. (Disclosure: some years ago I worked at such a company.) If a company wants to release any such medical technology to the public, it has to comply with a whole range of requirements about documenting all steps taken in development and quality assurance. A significant paper trail has to be created, which is subject to auditing by the regulator. The regulator can block market entry if the processes are not considered good enough. Exactly the same paper-trail and auditing measures could be applied to companies that develop powerful non-medical AI systems that interact with the public. No technical innovation would be necessary to implement such measures.
So if any activist group or politician wants to propose measures to improve oversight of AI development and use by companies (either motivated by existential safety risks or by a more general desire to create better outcomes in society), there is no need for them to wait for further advances in Interpretability in ML (IntML), Fairness in ML (FairML) or Accountability in ML (AccML) techniques.
To lower existential risks from AI, it is absolutely necessary to locate proposals for solutions which are technically tractable. But to find such solutions, one must also look at low-tech and different-tech solutions that go beyond the application of even more AI research. The existence of tractable alternative routes to making massive progress leads me to down-rank the three AI research areas I mention above, at least when considered from a pure existential safety perspective. The non-existence of such alternatives also leads me to up-rank other areas (like corrigibility) which are not even mentioned in the original post.
I like the idea of recommending certain fields for their educational value to existential-safety-motivated researchers. However, I would also recommend that such researchers read broadly beyond the CS field, to learn how other high-risk fields have managed (or failed to manage) their safety and governance problems.
I believe that the most promising research approach for lowering AGI safety risk is to find solutions that combine AI-specific technical mechanisms with more general mechanisms from other fields, such as processes run by humans.