A common view is that the timelines to risky AI are largely driven by hardware progress and deep learning progress occurring outside of OpenAI. Many people (both at OpenAI and elsewhere) believe that questions of who builds AI and how are very important relative to acceleration of AI timelines. This view is tied to lower estimates of alignment risk, higher estimates of the importance of geopolitical conflict, and (perhaps most importantly of all) radically lower estimates of how much useful alignment progress would occur this far in advance of AI if progress were slowed down. Below I’ll also discuss two arguments, which I often encountered at OpenAI, for why delaying AI progress would not on net reduce alignment risk.
I think that OpenAI has had a meaningful effect on accelerating AI timelines, and that this was a significant cost that the organization did not adequately consider (plenty of safety-focused folks pushed back on various accelerating decisions, and this is ultimately related to many departures, though not directly to my own). I also think that OpenAI is significantly driven by the desire to do something impactful and to reap the short-term benefits of AI. In significant part that’s about wanting to be involved in altruistic benefits (though it’s also based on a more basic, and generally scary, desire to just do something impactful). I think that OpenAI folks’ views on altruistic benefits are based on some claims I agree with about possible impacts, but also on their caring less than I do about future generations and on holding what I regard as mistaken empirical views (which partly persist because many folks have underinvested in careful thinking about the future).
That said, I think that the LW community significantly overestimates the negative impact of OpenAI’s timeline-accelerating effects to date, and I suspect that these do not dominate OpenAI’s net impact (nor do the claims about disrupting a relatively flimsy “only DeepMind works on AGI” equilibrium). That still leaves room for debate about whether the other impacts are positive or negative.
It’s worth being aware of some common arguments that acceleration is less bad than it looks or even net positive:
I think it’s basically reasonable to think that MIRI and the broader AI safety community made very little meaningful progress over the last 10 years, and to hold the view that the overwhelmingly dominant drivers of accelerating alignment progress have been, and will continue to be, increased interest and investment as AI improves (this seems wrong to me in large part because the AI safety community and EA community more broadly have been growing independently of increased interest in AI). If that were the case, then cutting one month off of AI timelines would not have much direct effect on our ability to manage AI risk by giving us more time for alignment research, and the calculus would instead be dominated by whether other trends in the world are positive or negative (e.g. how much do you think general institutional capacity is improving vs. deteriorating over the coming decades, how worried are you about the rise of China relative to the West, etc.).
Another fairly common argument and motivation at OpenAI in the early days was the risk of “hardware overhang”: that slower development of AI would result in building AI with less hardware, at a time when it could be more explosively scaled up, with massively disruptive consequences. I think that in hindsight this effect seems like it was real, and I would guess that it is larger than the entire positive impact of the additional direct work that would have been done by the AI safety community if AI progress had been slower 5 years ago. I think the LW community considers this argument non-serious, but in my opinion (and, I expect, in the judgment of most independent observers) the empirical track record of this community and Eliezer on the relevant AI forecasts is bad enough that no one should be taking community consensus on that point as a source of independent evidence.
I think that both of those arguments were significantly more plausible in the past, particularly before the release of GPT-3, though I still think they were wrong, and likely in significant part the result of motivated cognition (or, more realistically, of memetic and political selection within OpenAI and the adjacent communities).
At this point I think it’s fairly clear that if OpenAI were focused on making the long-term future good, they should not be disclosing or deploying improved systems (and it seems likely to me that they should not even be developing them), so the main point of debate is exactly how bad it is. I think it’s less obvious whether acceleration is good or bad under a certain kind of myopic altruism, since I’d guess that the cost of 1 year of acceleration is less than a 1% reduction in survival probability (while ~1% of people die each year, and people might reasonably value the profound suffering that occurs over a single year at another 1% of survival).
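To make that comparison concrete, here is a back-of-envelope sketch of the myopic calculus (the specific weights are illustrative assumptions lifted from the sentence above, not a real model and not anything OpenAI has endorsed):

```python
# Back-of-envelope sketch of the "myopic altruism" comparison above.
# All numbers are illustrative assumptions, not a real model.

annual_mortality = 0.01        # ~1% of people die each year
suffering_weight = 0.01        # a year of profound suffering valued at ~1% of survival
survival_cost_of_accel = 0.01  # assumed upper bound: <1% survival probability
                               # lost per year of acceleration

# Myopic benefit of AI's benefits arriving one year earlier: deaths averted
# plus suffering averted, measured in fractions of survival probability.
myopic_benefit = annual_mortality + suffering_weight   # ~0.02

print(f"myopic benefit of 1 year of acceleration: ~{myopic_benefit:.0%}")
print(f"assumed survival cost of 1 year of acceleration: <{survival_cost_of_accel:.0%}")
# Under these (debatable) weights the myopic calculus can favor acceleration,
# while a longtermist calculus is dominated by the survival-probability term.
```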
Overall I think the LW community tends to be kind of deontological about this, and when its members do make quantitative estimates, those estimates tend to be at best debatable (and at worst wildly overconfident and aggressive). I’d guess this overall decreases the effectiveness of the LW community as an influence on labs or as a force for good in the world.
Another fairly common argument and motivation at OpenAI in the early days was the risk of “hardware overhang”: that slower development of AI would result in building AI with less hardware, at a time when it could be more explosively scaled up, with massively disruptive consequences. I think that in hindsight this effect seems like it was real, and I would guess that it is larger than the entire positive impact of the additional direct work that would have been done by the AI safety community if AI progress had been slower 5 years ago.
Could you clarify this bit? It sounds like you’re saying that OpenAI’s capabilities work around 2017 was net-positive for reducing misalignment risk, even if the only positive we count is this effect. (Unless you think there’s some substantial reason acceleration is bad other than giving the AI safety community less time.) But then in the next paragraph you say that this argument was wrong (even before GPT-3 was released, which roughly covers that “around 2017” period). I don’t see how those are compatible.
One positive consideration is: AI will be built at a time when it is more expensive (slowing later progress). One negative consideration is: there was less time for AI-safety-work-of-5-years-ago. I think that this particular positive consideration is larger than this particular negative consideration, even though other negative considerations are larger still (like less time for growth of AI safety community).
Are you saying that the AI safety community gets less effective at advancing SOTA interpretability/etc. as it gets more funding and interest, or that the negative consideration is that the AI safety community has had less time to grow, or something else? It seems odd to me that AI safety research progress would be negatively correlated with the size of the field and the amount of volunteer hours in it, though I can imagine reasons why someone would think that.
I’m saying that faster progress gives less time for the AI safety community to grow. (I added “less time for” to the original comment to clarify.)
Ahh, ok.
A common view is that the timelines to risky AI are largely driven by hardware progress and deep learning progress occurring outside of OpenAI.
What’s the justification for this view? It seems like significant deep learning progress happens inside of OpenAI.
Many people (both at OpenAI and elsewhere) believe that questions of who builds AI and how are very important relative to acceleration of AI timelines.
If who builds AI is such an important question for OpenAI, then why would they publish capabilities research, thereby giving up the majority of their control over who builds AI and how?
At this point I think it’s fairly clear that if OpenAI were focused on making the long-term future good, they should not be disclosing or deploying improved systems (and it seems likely to me that they should not even be developing them), so the main point of debate is exactly how bad it is.
To a layman, it seems like they’re on track to deploy GPT-4, as well as publish all the capabilities research related to it, soon. Is there any reason to hope they won’t be doing that?
~1% of people die each year, and people might reasonably value the profound suffering that occurs over a single year at another 1% of survival.
How is the harm caused by 1% of people dying even remotely equivalent to a 1% reduction in survival probability, even without considering the value lost in the future lightcone?
It seems highly doubtful to me that OpenAI’s dedication to doing and publishing capabilities research is a deliberate choice to accelerate timelines stemming from a deep philosophical adherence to myopic altruism.
I don’t think they would be doing this if they actually thought they were increasing p(doom) by 1% per year of timeline acceleration (which is already an optimistic estimate). A much simpler explanation is that they’re at least somewhat longtermist (like most humans) but don’t really think there’s a significant p(doom) (at least among the capabilities researchers and the leadership team).
I think Paul was speaking in the 3rd person for parts of it, which you didn’t realize.
This seems wrong to me in large part because the AI safety community and EA community more broadly have been growing independently of increased interest in AI
Agreed; this is one of the biggest considerations missed, in my opinion, by people who think accelerating progress was good. (TBH, if anyone was attempting to accelerate progress to reduce AI risk, I think they were trying to be too clever by half, or just rationalizing.)