In terms of decision relevance, the update towards “Automate AI R&D → Explosive feedback loop of AI progress specifically” seems significant for research prioritization. Under such a scenario, getting the tools that automate AI R&D to be honest and transparent is more likely to be a prerequisite for aligning TAI. Here’s my speculation as to what an automated-AI-R&D scenario implies for prioritization:
Candidates for increased priority:
- ELK for code generation
- Interpretability for transformers (a concrete sketch follows the lists below)
- …
Candidates for decreased priority:
- Safety of real-world training of RL models, e.g. impact regularization, assistance games, etc.
- Safety assuming the infinite intelligence/knowledge limit
- …
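To make the “interpretability for transformers” item in the first list concrete, here is a minimal sketch of one common starting point: reading attention patterns out of a pretrained model. The choice of GPT-2 and of which layer/head to inspect are illustrative assumptions on my part, not claims about which models or components matter most:

```python
# Minimal interpretability sketch: extract per-head attention maps
# from a small pretrained transformer. GPT-2 and the layer/head
# choice here are illustrative assumptions, not recommendations.
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModel.from_pretrained("gpt2", output_attentions=True)

# A code-like prompt, since code generation is the setting above.
inputs = tokenizer("def add(a, b): return a + b", return_tensors="pt")
outputs = model(**inputs)

# outputs.attentions is a tuple with one tensor per layer,
# each shaped (batch, num_heads, seq_len, seq_len).
layer0_head0 = outputs.attentions[0][0, 0]
print(layer0_head0.shape)  # (seq_len, seq_len) attention map
```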
Of course, each of these potential consequences requires further argument to justify. For instance, I could imagine becoming convinced that AI R&D will find improved RL algorithms more quickly than other areas—in which case things like impact regularization might be particularly valuable.
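For reference, here is a toy sketch of what I have in mind by impact regularization, under the simple (assumed) formulation that penalizes deviation from a no-op baseline. The environment, distance metric, and penalty weight are all illustrative choices, not a reference implementation:

```python
# Toy impact regularization: task reward minus a penalty on how far
# the agent's state has diverged from a do-nothing baseline rollout.
from dataclasses import dataclass

LAMBDA = 0.5  # penalty weight (assumed; tuning is problem-specific)

@dataclass
class ToyEnv:
    """1-D world: the agent's action shifts its position."""
    pos: float = 0.0

    def step(self, action: float) -> float:
        self.pos += action
        return -abs(self.pos - 10.0)  # task reward: get near x = 10

def impact_penalty(state: float, baseline_state: float) -> float:
    """Distance between the actual state and the state a no-op
    policy would have produced (a simple assumed deviation measure)."""
    return abs(state - baseline_state)

def regularized_step(env: ToyEnv, baseline: ToyEnv, action: float) -> float:
    task_reward = env.step(action)
    baseline.step(0.0)  # the baseline evolves without intervention
    return task_reward - LAMBDA * impact_penalty(env.pos, baseline.pos)

env, baseline = ToyEnv(), ToyEnv()
for action in (3.0, 3.0, 3.0):
    print(regularized_step(env, baseline, action))
```

The point of the sketch is just that the penalty term competes with the task reward, so an agent is discouraged from large side effects even when they would help the task; whether this formulation carries over to automated AI R&D settings is exactly the kind of question the prioritization above turns on.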