AI coordination needs clear wins
Thanks to Kate Woolverton and Richard Ngo for useful conversations, comments, and feedback.
EA and AI safety have invested a lot of resources into building our ability to get coordination and cooperation between big AI labs. Despite that investment, it doesn't seem to me like we've had that many big coordination "wins" yet. I don't mean to imply that our efforts have failed; the obvious other hypothesis is just that we don't really have that much to coordinate on right now, other than the very nebulous goal of improving our general coordination and cooperation capabilities.
However, I think our lack of clear wins is actually a pretty big problem, not just because I think there are useful things we can plausibly coordinate on right now, but also because I expect our lack of clear wins now to limit our ability to get the sort of cooperation we need in the future.
In the theory of political capital, it is a fairly well-established fact that "Everybody Loves a Winner." That is: the more you succeed at leveraging your influence to get things done, the more influence you get in return. This phenomenon has been most thoroughly studied in the context of U.S. presidents' ability to get their agendas through Congress. Contrary to a naive model that might predict that legislative success uses up a president's influence, what is actually found is the opposite: legislative success engenders future legislative success, greater presidential approval, and long-term gains for the president's party.
I think many people who think about the mechanics of leveraging influence don't really understand this phenomenon: they conceptualize their influence as a finite resource to be saved up over time so it can all be spent down when it matters most. But that is just not how it works: if people see you successfully leveraging influence to change things, you come to be seen as someone who has influence, has the ability to change things, and can get things done, which gives you more influence in the future, not less.
Of course, you do have to actually succeed to make this work: if you try to spend your influence to make something happen and fail, you get the opposite effect. This suggests the obvious strategy of starting with small but nevertheless clear coordination wins and working our way up towards larger ones, which is exactly the strategy I think we should be pursuing.[1]
[1] In that vein, in a follow-up post, I will propose a particular clear, concrete coordination task that I think might be achievable soon given the current landscape, would generate a clear win, and that I think would be highly useful in and of itself.