Very true.
In my own experience, the feeling of urgency actively detracts from useful AI alignment work. Instead of forward-chaining towards a better understanding of intelligence/corrigibility/how-minds-work/etc., you end up back-chaining from whatever results you imagine would maybe do something.
Which isn't at all the right frame for approaching the problem, since the effectiveness of what you get is capped by the quality of that first imagined result.
On a related note, I think the overwhelming majority of alignment work isn't helping to address the core problems. Even great researchers like John Wentworth can somewhat miss the point.
Has anyone thought of putting a prize on the alignment problem? I imagine the submissions would be swamped by cranks/LLM slop, but that feels solvable (e.g. charging a small fee per submission to pay the evaluator, or requiring >1k LW karma).