Note: I have not read the linked posts yet, will do that later
a) I don’t see many posts to the tune of “What do you think of [some strategy that doesn’t involve direct research on alignment]?” (e.g. getting influence in your local town hall, university, etc.). Perhaps you can point me to such posts? Likewise, I don’t see many experience reports like “I paused alignment research and pursued this other route instead, hoping for an efficiency multiplier. Here’s what worked and here’s what didn’t”.
I am not saying these posts never happen, but given the possible leverage, I would expect to see more of them. I think it’s fair to say that there are far more posts about direct research than about other (leveraged) ways to approach the issue. For example, in my current LW feed there are 3.5 posts about alignment (highlighted), 3.5 about AI, and none about other strategies (the post “Lessons from the Iraq War for AI policy” is still pretty far from that, as it does not discuss something like a career path or actions an individual can take).
You say these have happened a lot, but I don’t see this discussed much on LW. LW itself can be characterized as Eliezer’s very successful leveraged strategy to bring more people into alignment research, so maybe the leveraged strategies end up being discussed more outside LW? But in any case this at least shows that some leveraged strategies work, so maybe it’s worth discussing them more.
b) I think this can be summarized as “we don’t know how to put more resources into alignment without this having (sometimes very) negative unintended outcomes”. Okay, fair enough, but this seems like a huge issue, and maybe there should be more posts about exploring and finding leveraged strategies that won’t backfire. The same goes for power-seeking: there is a reason power is an instrumental goal for an ASI, namely that it’s useful for accomplishing almost any goal, so it’s important to figure out good ways to acquire and use power.
Now maybe your answer is something like “we tried, it didn’t work out that well, so we re-prioritized accordingly”. But it’s not obvious to me that we shouldn’t try more and develop a better map of all the available options. Anyway, I will read up on what you linked; if you have more links that you think would clarify what was tried and what worked or didn’t, don’t hesitate to share.
I think these mostly don’t take the form of “posts”, because the work involves actually going out and forming organizations, coordinating, and running stuff. (Maybe see Dark Forest Theories: most of the discussion is happening in places you can’t see, because it’s pretty high-context and not that useful to have randos in it.)
There was a lot more explicit discussion of this sort of thing 10 years ago, during the early days of the EA movement. Right now I think it’s a combo of a) most of those conversations turned into professional orgs doing stuff, and b) we’re also in a period where it’s more obvious that there were significant problems with this focus, so there’s a bit of a reaction against it.
Also, note: if your plan to recruit more people is working, you should still expect to see mostly posts on the object level. Like, if you didn’t successfully get 10x or 100x the people working on the object level, that would indicate your plan to scale had failed.
LW itself can be characterized as Eliezer’s very successful leveraged strategy to bring more people into alignment research
My understanding is that Eliezer himself does not view it as hugely successful. MIRI thinks that ~nobody in LW-adjacent communities is doing useful alignment work, and my expectation is that Eliezer would agree with this post of John’s regarding the state of the field.
Simultaneously, the proliferation of talk about the AI Alignment problem, which was ~necessary to kickstart the field, potentially dramatically decreased the time-to-omnicide. It attracted the attention of various powerful people whose contributions were catastrophically anti-helpful, from those who were probably well-meaning but misunderstood the problem (Elon Musk) to straight-up power-hungry psychopaths (Sam Altman).
I overall agree that “getting dramatically more people to work on alignment” is a good initial idea. But it seems that what actually happens when you try to spread talk about the problem is that most people end up misunderstanding it and either working on the wrong problems or actively making things worse. This is of course fundamentally a skill issue on the part of the proliferators, but the level of skill at which this doesn’t happen may be really high, and as you’re trying to get better at it, you’re leaving net-negative memetic infections in your wake. Plus, you may not actually get to iterate indefinitely on this: there are only so many Silicon Valleys and so many billionaires.
So the “recruit more people to work on the problem” strategy that would actually be effective in practice probably looks more like “look for promising people and recruit them manually, one by one”, rather than anything higher-leverage and higher-profile. One wonders whether the counterfactual timeline in which MIRI instead quietly focused on research and this more targeted recruiting is doing better than this one.
Possibly not. Possibly that primordial awareness-raising effort is going to provide the foundation for an international AGI-research ban. But I don’t think it’s clear that this was the better plan, in hindsight.