Another fairly common argument and motivation at OpenAI in the early days was the risk of “hardware overhang”: that slower development of AI would result in AI being built with less hardware, at a time when it could be scaled up more explosively, with massively disruptive consequences. I think that in hindsight this effect seems like it was real, and I would guess that it is larger than the entire positive impact of the additional direct work that would have been done by the AI safety community if AI progress had been slower 5 years ago.
Could you clarify this bit? It sounds like you’re saying that OpenAI’s capabilities work around 2017 was net-positive for reducing misalignment risk, even if the only positive we count is this effect. (Unless you think there’s a substantial reason acceleration is bad other than giving the AI safety community less time.) But then in the next paragraph you say that this argument was wrong, even before GPT-3 was released, which roughly corresponds to the “around 2017” period. I don’t see how those are compatible.
One positive consideration is: AI will be built at a time when it is more expensive (slowing later progress). One negative consideration is: there was less time for AI-safety-work-of-5-years-ago. I think that this particular positive consideration is larger than this particular negative consideration, even though other negative considerations are larger still (like less time for growth of AI safety community).
Are you saying that the AI safety community gets less effective at advancing SOTA interpretability/etc. as it gets more funding/interest, or that the negative consideration is the fact that the AI safety community has had less time to grow, or something else? It seems odd to me that AI safety research progress would be negatively correlated with the size of, and volunteer hours in, the field, though I can imagine reasons why someone might think that.
I’m saying that faster progress gives less time for the AI safety community to grow. (I added “less time for” to the original comment to clarify.)
Ahh, ok.