Curated. I’ve recently felt like there has been a lot of confusion about “use AIs to help solve alignment” as a “strategy”, because in fact it is multiple strategies that look very different from one another.
The strategy described here does not “hide” the hard part of solving alignment, the way “build an AI that solves alignment for you” does. Existing tools, such as calculators, enable human-driven cognitive effort that would be intractable (or merely much more difficult) without them. Similarly, it seems possible that LLMs will be able to substitute for or extend human cognition in ways that let us make intellectual progress faster, and maybe even differentially advantage alignment research.
I think this post does a great job of describing what I consider to be the “less doomed” strategy, without eliding relevant warnings and caveats.