Organizers may promote arguments for AI safety that may be (comparatively*) compelling yet flawed
I feel like there’s an asymmetry here. “10% of researchers believe AI extinction is a possibility” isn’t somehow offset by “but 90% don’t”. For such an outrageous claim, 10% is a huge number! Similarly, “maybe AIs won’t be instrumentally convergent” is not enough here. “We are absolutely positive that we can build AIs that are not instrumentally convergent, and that no amount of unavoidable successive dumbass tinkering will suffice to change that” would be. Which is kind of what alignment research is about? Whenever people have a P(doom) lower than 100% (which is most people besides Yud), the remaining margin usually rests on possibilities like these. But even a P(doom) of 1% is stupid high and worth spending effort reducing further.