Where do you think the impact comes from? And is this coming from a background belief that most current alignment work is useless?
Most of the impact comes from helping people better understand what’s up with AI, what the stakes of the whole situation are, and what ways we might have to mitigate the risk (which at the moment mostly don’t look like technical safety research, though the effect of that is not zero), and from generally helping people grapple with what’s going on. Another big avenue for positive impact is building tools, or deploying AI systems, in contexts where they might help end the acute risk period (e.g. by facilitating coordination among nations, companies, or politicians).
I do think most current alignment work is useless, but that’s actually not really upstream of my beliefs here. Most work in any field is useless, and it would be surprising if alignment were any different. I do think marginal technical alignment work is quite unlikely to make a big difference, though there are some things that I think are worth quite a bit of investment but seem relatively neglected. (People have a surprising skill for taking pointers to promising approaches and co-opting them to mean something else, in an attempt to recruit talent or investment or fend off political attacks, so pointing at them is quite tricky.)