Surely you could give an elaborate answer yourself.
I worry that this style of discussion is adversarial, and leads to conflict and polarization rather than advancing understanding.
Science is usually adversarial or advocacy based. It’s also usually very slow to arrive at consensus and thereby drive policy cleanly. We need to do better than traditional science.
If the outside perspective is just “experts disagree and argue,” decision-makers will be free to pick whatever perspective matches their motivated reasoning and confirmation bias.
Taking a collaborative and problem-solving approach may produce consensus more rapidly. It’s worth a shot. Arguing and pointing fingers is known to be slow and cause disagreement among experts.
And just to be clear, I think the field’s consensus should be that alignment could be very tricky, so we should slow down progress if at all possible, and devote a lot more resources to alignment work.
Surely you could give an elaborate answer yourself.
I really genuinely couldn’t!
I can list reasons that it’s difficult, and I can list reasons that it’s quite difficult in the long run, assuming compute isn’t a big bottleneck. But I don’t believe that the assumption of long-run, non-compute-bottlenecked research being the main driver of AGI progress is an assumption held by Evan or by Anthropic. I could be incorrect about this, happy to be corrected.
I don’t know it to be infeasible to globally prevent AGI research that involves significant resources. Do you know this? I’m genuinely asking. I’m not aware of any serious case laid out for this. That’s my ignorance, someone could just link such to me! (I did run an AI search for such, without relevant results.)
Like, in my ignorance, as far as I’ve personally seen / tracked, Anthropic seems to be almost entirely, though not entirely, messaging on the presumption that a global stop is infeasible; and they use that presumption as justification for leading frontier AGI research including RSI. Is this presumption seriously defended anywhere?
I see. I thought it was pretty clear why requirements to do more alignment work would be a lot easier and more attainable ask than a pause.
To answer your question: I don’t know where anyone has tried to carefully make the case that a pause is impossible. It’s more that there’s an assumption that it’s quite hard, so the burden of proof falls more on those calling for one to propose a plan that could work. I don’t think a pause is impossible, although I think we’re unlikely to get one anytime soon.
So I think the assumption is that it would be easier to get alignment work made a requirement for US labs. With that, they could still stay ahead of Chinese development. That lowers the political bar substantially.
I personally hope that the government will take AGI, and therefore misalignment risks, more seriously as more people in government see superhuman AI with their own eyes. I hope this will prompt them to slow down US progress and try to negotiate cooperation with China. I’ve written about this in Whether governments will control AGI is important and neglected, and have a mostly-finished draft followup with the working title “The government will assert control of AGI”.
I also think it’s not obvious that China is a less cautious actor. There have been good recent posts questioning this, such as China won’t win the AI race but would it be much worse if it did?, and others, including me, have previously pointed out that Chinese leadership seems much more cautious by nature. Others have also pointed out that, in the case of successful intent alignment, having the Chinese government take over the future probably wouldn’t be much worse in expectation than having the US government take it over. But that’s a tougher sell for political purposes. We probably can’t get through a US government commitment to unilaterally pausing, even once they do see the danger of misalignment more clearly. But I suspect the Chinese would be pretty eager for any deal other than “we’ll race as fast as we can and maybe get everyone killed and pretty much take over the world if we get aligned ASI first”.
On Anthropic, even though that’s not the focus of your question: I don’t think Anthropic have sounded like they’ve been claiming that a pause is impossible, just that there’s not one on offer. Them pausing won’t make others pause. This does raise the question of why they’re not mentioning that they’d pause if others did. I believe they’re now on record stating they would (although of course this isn’t a firm commitment). In their recent post When AI builds itself on RSI and the implications, they explicitly say that it would be wise to slow down the rush to ASI if “less cautious” actors don’t keep going full speed. And they say the Anthropic Institute will be studying how to do that. This is somewhat beside the point, except that it is a somewhat more explicit statement of their position.
I thought it was pretty clear why requirements to do more alignment work would be a lot easier and more attainable ask than a pause.
Well, Evan wrote:
I think this is much better than other slowdown proposals I’ve seen, since I think it’s very concrete, verifiable, and enforceable
It seems more attainable, sure, but also way way less useful, to the point where I don’t understand how it’s better than even something as sketchy as a napkin with “Advocate for a global stop on AGI research” written on it. Do you disagree?
We probably can’t get through a US government commitment to unilaterally pausing
I’m not especially advocating for that, though it might be good, IDK. I’m advocating for Anthropic to advocate for a global agreement.
I don’t know where anyone has tried to carefully make the case that a pause is impossible. It’s more that there’s an assumption that it’s quite hard, so the burden of proof falls more on those calling for one to propose a plan that could work. I don’t think a pause is impossible, although I think we’re unlikely to get one anytime soon.
Do you agree that this assumption is a founding pillar of Anthropic’s strategy, at least as they present it? If this assumption is justifying doing frontier AGI research, should it be, like, argued somewhere? Don’t you think it’s weird that Anthropic as a whole big entity seems to be acting on this assumption, without a public argument for it? Do you think there’s no good argument for it, or do you think that it’s all in private somehow?
Something I’ve thought before is that it seems like most people are rolling their own conclusion about the political feasibility of pausing. They think about it for five minutes or less, and then they’re done; they decide whether building a US pause coalition sounds reasonable, or whether China could be cooperated with, mostly on priors. There’s no Rootclaim for the politics of an AI pause. No one org owns a pipeline for doing this research, not even for the narrow version of message testing. There’s just PauseAI, ControlAI, and StopAI doing their own scattered advocacy efforts live, with close to no support. Vastly influential decisions that caused hundreds of millions to flow to technical safety, and at most a few million to advocacy, were made mostly on vibes. No one even did A/B testing or focus groups for pause messaging until last year!
Mind that I don’t mean the technical implementation or effects of a pause, like with what MIRI does or what the 2023 AI Pause Debate did. I mean whether it seems politically achievable, whether it’s even in the Overton window, or if the window can be moved there. We saw with how PEPFAR was founded that sometimes a political miracle can just happen if there’s the will for it.
I think the question of “is a pause politically feasible” should have an adversarial collaboration done on it. That’s one of the best methods of truth-finding I know of, and it’s sad we don’t do more of it.
I don’t know it to be infeasible to globally prevent AGI research that involves significant resources. Do you know this?
By “infeasible”, do you mean strictly “technically infeasible / infeasible even if there’s the will to make it happen”, or something that also includes “tractable / politically palatable”?
BTW, do you work at Anthropic?
Feasible overall, so including political feasibility.