You can do alignment research without doing current-flavored capabilities research. For example, no one has the concepts needed to understand values, which would probably be needed to do alignment; this can, and probably must, be investigated conceptually.
An intervention has many effects and differentially affects many processes, not just capabilities and alignment. Cf. https://www.lesswrong.com/posts/K4K6ikQtHxcG49Tcn/hia-and-x-risk-part-2-why-it-hurts#An_ontology_of_effects_of_interventions_on_world_processes For example, a stop gives everyone more time to understand the problem better and figure out ways to continue stopping. It also gives more time for human intelligence amplification, and other ways that humanity can make itself more able to survive the possibility of AGI.
Before we get to these points I want to re-highlight and briefly expand on the other two points in my original comments.
Stopping all research would be much more difficult politically in the U.S.
This would erase trillions of dollars of value, potentially tens of trillions, something the U.S. government has a lot of incentives not to do.
As Kokotajlo points out, this could feasibly be done just in the U.S. without the U.S. losing its lead.
To get anything close to a real global pause the U.S. or others would have to commit to (and probably take) military action to enforce it, again something the U.S. government has a lot of incentives not to do.
To address your points: I agree with you on 1 to an extent, but think that the amount of work we can get done in a hard pause is probably fairly limited. This is mostly an intuition based on the fact that folks have been working on the theoretical questions for decades, and that many of the remaining non-technical questions rely on people and governments taking AGI very seriously, something that is not the case right now and that I believe is unlikely to become the case under a pause.
For point 2, I need to read more about the feasibility of human intelligence amplification. My first reaction, though, is that either we are very close to RSI/AGI/ASI, in which case I think a global pause would not slow progress enough (due to enforcement issues) for long-term plans to pay off, or we are not very close to RSI/AGI/ASI, in which case we should hold off on a hard pause because the case for it will be stronger in the future. I think “giving everyone more time to understand the problem better and figure out ways to continue stopping” is a pretty strong argument, although I do worry that a stop would lower the salience of the issue.
By a “pause”, the main thing I mean is “an international treaty to stop AGI globally”. I assume you’d think that’s unlikely because it’s unlikely that people and governments would take it seriously enough. I don’t want to make a strong claim about it being likely feasible, and presumably even if it’s doable it would be a huge amount of hard work. But are you strongly claiming that it’s infeasible? If so, that’s the position I’d like to understand: why do you think that, if you do? Is this a case that’s been worked out and explained somewhere? Has it been debated seriously?
I mean, I agree that it’s kinda hard, but if AGI is very close, it probably involves lots of compute, right? Big piles of compute seem plausibly regulable. That doesn’t seem like an infeasible enforcement issue.
Neither of these arguments makes sense to me; both seem quite opposite to the truth. Like, we should stop ASAP so that we’re not in a terrible time crunch, right?
I don’t think you can talk about feasibility in a vacuum. Is it feasible for Congress to pass a constitutional amendment removing the Electoral College and for the states to ratify it? In a technical sense, of course. From my perspective as someone who wants that to happen, not really.
Let me try to lay out my overall position.
I think right now you can break players into three categories:
1. Those who model the situation as a stag hunt.
2. Those who model the situation as a prisoner’s dilemma.
3. Those who model the situation as a deadlock.
If all players are in group 1 (and knew the others were in group 1) a global pause is trivial. If all players are in groups 1 and 2 a global pause is a technical question: Can you create a treaty/system strong enough that secret defection is not possible? Players in group 3 have no interest in a pause and would have to be brought to the table with other incentives.
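One way to make the three categories concrete is a minimal sketch using the standard two-player textbook version of each game. The `pause`/`race` action labels, the ordinal payoff numbers, and the function names are illustrative assumptions, not something taken from the comment; only the ordering of the payoffs matters.

```python
from itertools import product

ACTIONS = ("pause", "race")

# One player's ordinal payoffs (4 = best outcome, 1 = worst), keyed by
# (my action, their action). The games are symmetric, so the same table
# serves both players. The numbers are assumptions chosen to match the
# textbook ordering of each game.
GAMES = {
    "stag hunt": {
        ("pause", "pause"): 4, ("pause", "race"): 1,
        ("race", "pause"): 3, ("race", "race"): 2,
    },
    "prisoner's dilemma": {
        ("pause", "pause"): 3, ("pause", "race"): 1,
        ("race", "pause"): 4, ("race", "race"): 2,
    },
    "deadlock": {
        ("pause", "pause"): 2, ("pause", "race"): 1,
        ("race", "pause"): 4, ("race", "race"): 3,
    },
}


def pure_nash_equilibria(payoff):
    """Return action pairs where neither player gains by deviating alone."""
    equilibria = []
    for a, b in product(ACTIONS, repeat=2):
        a_is_best = all(payoff[(a, b)] >= payoff[(alt, b)] for alt in ACTIONS)
        b_is_best = all(payoff[(b, a)] >= payoff[(alt, a)] for alt in ACTIONS)
        if a_is_best and b_is_best:
            equilibria.append((a, b))
    return equilibria


def prefers_mutual_pause(payoff):
    """True if a player ranks both pausing above both racing."""
    return payoff[("pause", "pause")] > payoff[("race", "race")]


for name, payoff in GAMES.items():
    print(name, pure_nash_equilibria(payoff), prefers_mutual_pause(payoff))
```

Under these assumed payoffs, mutual pausing is a stable outcome only in the stag hunt, which is why mutual knowledge is enough there. In the prisoner’s dilemma both players prefer mutual pausing to mutual racing but each is tempted to defect, so the pause has to be enforced. In deadlock both players prefer mutual racing, so a treaty alone offers them nothing.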
It’s very hard to know who is in which camp. I’d guess lab employees are largely split between 1, 2, and 3. I think the companies themselves are probably in groups 2 or 3. I think the U.S. government is in group 3 (caveat that there are lots of competing voices here and I have a very hard time modelling Trump). I have no idea where China or other world powers are.
The groups are more of a spectrum than binary. Someone in group 2 who thinks the status quo is very negative will be willing to spend more resources on a pause than someone who more loosely prefers a pause.
So I think the question of feasibility is: how feasible is it for people in groups 1 and 2 to get the U.S. government into group 2, and strongly enough in group 2 that it is willing to expend lots of resources on a treaty (and I think, since there are other players in group 3, you would need lots of resources)? I don’t have a rock-solid case that this is infeasible (I don’t think you ever really could), but I think it likely is at this time.
EDIT: Adding in a little more here because I saw your conversations with others and think this may be our central disagreement. I think many people overrate how politically salient AI is. Anti-AI sentiment is all over the place, but I think it’s a mile wide and an inch deep. When you look at Gallup polling of the most important issues facing our country, AI doesn’t show up. I think any politicians that did serious damage to the U.S. economy and potentially started wars to pause AI would be electorally punished. I think if there is anti-AI legislation that is electorally valuable, it will probably be surface-level stuff that doesn’t move the needle on AI risk (e.g. data center water regulations). Now, voters are not an immovable force, but I think it will be hard to move them quickly, and it may take some other shock to the system (impactful warning shot, widespread job loss, etc.).
As for why we might want to wait for a pause if AGI is further away: if we could have a strong pause now, I’d definitely agree with you. However, I think if you were to push the world into a weakly enforced pause, it may be harder to get a strong pause down the line. In the status quo I hope we can slow down capabilities work and build up chip control systems and other monitoring options, and that as AI gets more powerful (hopefully before we all die) people will move towards a real pause. My concern is that a weak pause drives AI development underground, differentially hurts safety, and doesn’t allow people to update in the direction of a real pause. Like, I think a world where AI development is nominally illegal, but the Chinese and U.S. governments both have well-funded secret programs, is much worse than Evan’s proposal and likely worse than the status quo.
FWIW if I were a world dictator I would implement a global pause followed by heavily regulated and monitored technical alignment research. I also weakly hold the belief that people who think we are in an emergency situation should state that clearly and strongly, but have found some arguments for being more strategic compelling (e.g. Holden Karnofsky on the 80,000 Hours podcast).
I would say that much of my intuition that a pause (yes, driven by various national governments, and public desire) is plausibly doable comes from “the trajectory” of sentiments, rather than the total amount. I agree as a fraction of total political discourse it’s quite small.
This vaguely matches my impression for the most part, in the sense that if you look at discussion on AI in general, much of it may be negative but most of that isn’t people who deeply worry + care about x-risk. But to refine the picture, it’s a mile wide and an inch deep, but getting full of holes: it seems like there’s an ever-growing number of people, including higher-ups both in AI and in government, who seem to take x-risk seriously, largely in words but also in more meaningful ways like drafting bills and stuff like that.
Pardon my stupid question, but what goes wrong concretely? If you ban AGI, but let people keep running existing LLMs (say), does this really cause big and legible enough economic damage that voters would actually move? I mean, I’d think there are lots of economically damaging things that don’t get credit-assigned into many actual vote shifts.
I may bow out, and feel free to take the (maybe) last word, but I’ll just note that this still doesn’t make sense to me basically at all. Like, yes, there are kinda-plausible scenarios where a global pause surprisingly ends up worse. But it would still be surprising, right? Like, there’s probably less human cloning right now, compared to the counterfactual where it wasn’t banned, right?? Someone could go underground with it, but that’s really hard and takes work!
I agree that slightly/somewhat unprecedentedly much access (official or espionage) might have to be somehow granted + enforced for treaty implementation. Maybe this is a strong defeater, I just don’t see it. Like, I agree it seems potentially kinda hard or quite hard in some scenarios. But this “global ban is actually worse than hugely resourced companies going full tilt, or even slightly less full tilt because of a ‘slowdown’” seems galaxy-brained and false.
I’d also point out that a treaty isn’t a thing that happens, and then whatever’s written in the treaty determines how well the treaty helps the situation. A treaty is a step in a broader process that can continue to adapt and develop to avoid the dangerous stuff happening. Cf. https://www.lesswrong.com/posts/Sdrzo7z3STzdrnwKW/what-exactly-would-an-international-ai-treaty-say-is-a-bad