I once claimed that I thought building a comprehensive inside view on technical AI safety was not valuable, and I should spend more time grinding maths/ML/CS to start more directly contributing.
I no longer endorse that view. I’ve come around to the position that:
Many alignment researchers are just fundamentally confused about important questions/topics, and are trapped in inadequate ontologies
Considerable conceptual engineering is needed to make progress
Large bodies of extant technical AI safety work are simply inapplicable to making advanced ML systems existentially safe
Becoming less confused, developing better frames, enriching my ontology, adding more tools to my conceptual toolbox, and just generally thinking more clearly about technical AI safety is probably very valuable. Probably more valuable than rushing to execute on a technical agenda (one that, from my current outside view, would probably end up being useless).
Most of the progress I’ve made on becoming a better technical AI safety researcher this year has come from trying to think more clearly about the problem.