This is a bit of a rant, but I notice that I am confused.
Eliezer said in the original Sequences that rationality is systematized winning, that rationalists should win.
But it’s pretty obvious that LessWrong is not about winning (and Eliezer provides a more accurate definition of what he means by rationality here). As far as I can tell, LW is mostly about cognitive biases and algorithms/epistemology (the topic of Eliezer’s Sequences), self-help, and a lot of AI alignment.
But LW should be about winning! LW has the important goal of solving alignment, so it should care a lot about the most efficient way to go about it, in other words, about how to win, right?
So what would it look like if LW had a winning attitude towards alignment?
Well, I think this is where the distinction between the two styles of rationality (cognitive algorithm development vs. winning) matters a lot. If you want to solve alignment and want to be efficient about it, it seems obvious that there are better strategies than researching the problem yourself: rather than spend 3+ years on a PhD (cognitive rationality), get 10 other people to work on the issue (winning rationality). That alone 10xes your efficiency.
My point is that we should consider all strategies when solving a problem. Not only the ones that focus directly on the problem (cognitive rationality/researching alignment), but also the ones that involve acquiring a lot of resources and spending these to solve the problem (winning rationality/getting 10 other people to research alignment).
This is especially true when other strategies get you orders of magnitude more leverage on the problem. To pick an extreme example, who do you think has more capacity to solve alignment, Paul Christiano, or Elon Musk? (hint: Elon Musk can hire a lot of AI alignment researchers).
I am confused because LW teaches cognitive rationality, so it should notice all this, recognize that epistemology, cognitive biases, and a direct approach are not the most efficient way to tackle alignment (or any ambitious goal), and start studying how people actually win in the real world.
But it’s not happening (well, not much at least).
As far as I can tell, cognitive rationality helps, but winning seems to be mostly about agency and power. So maybe LW should talk more about these (and how to use them for good)?
a) what makes you think this isn’t already what’s happening? (I think it’s actually happened a lot)
b) I think we’ve historically actually overindexed on the kinds of things you talk about here; much of it has turned out to be very bad for the world IMO, and the good parts are still much harder and more complicated than you’re implying here.
(This comment ended up a little more aggro than I meant it to; I think it’s fairly reasonable to come in with the question you have here, but I do think the assumption behind it is fairly wrong on two levels.)
There’s been a bunch of fieldbuilding work, starting with the MIRI (then Singularity Institute) Summer Fellows program, and in many ways the founding of CFAR, as well as AIRCS, MATS, and PIBBSS. (CFAR both included a fairly major focus on “winning” and was in significant part an effort to recruit people capable of working on the alignment problem.)
In 2014, this included getting Elon Musk involved, which AFAICT contributed nontrivially to OpenAI getting created, which was IMO very bad for the world. Later, the person who seemed maybe on track to have a lot of real-world power/winning was Sam Bankman-Fried, who turned out to destroy $8 billion, burn a lot of bridges, and be hugely net negative.
It’s not enough to say “work on AI and alignment”; you need to successfully convey the subtleties of what that actually means. Today, there are programs that scale the reasonably-scalable parts of the AI safety field, but those parts generally aren’t the most difficult and bottlenecky ones. And it’s still a fairly common outcome for people in those programs to end up joining frontier labs, doing work that is IMO net negative.
The work that needs doing for alignment is just actually very hard; many people working on the harder parts have tried and failed to scale it.
(Also, note, the whole reason I wrote Rationality is not (exactly) Winning is that this was a very common focus that needed to be argued against. It turns out that when you focus on winning, you get power-seeking and bad epistemics fairly quickly.)
None of this is to say winning isn’t important or even in some sense the most important part of rationality, just that overly focusing on it has predictable problems.
See:
The Field of AI Alignment: A Postmortem, and What To Do About It
AI alignment researchers don’t (seem to) stack
What Money Cannot Buy
Note: I have not read the linked posts yet, will do that later
a) I don’t see many posts along the lines of “What do you think of [some strategy that doesn’t involve direct research on alignment]?” (maybe getting influence in your local town hall, or university, etc.). Perhaps you can point me to such posts? Likewise, I don’t see many experience reports like “I paused alignment research and went down this other route instead, hoping for an efficiency multiplier. Here’s what worked and here’s what didn’t.”
I am not saying these posts never happen, but given the possible leverage, I would expect to see more of them. I think it’s fair to say there are a lot more posts about direct research than about other (leveraged) ways to approach the issue. For example, in my LW feed right now there are 3.5 posts about alignment (highlighted), 3.5 about AI, and none about other strategies (the post “Lessons from the Iraq War for AI policy” is still pretty far from that, as it does not discuss something like a career path or actions an individual can take).
You say these have happened a lot, but I don’t see them discussed much on LW. LW itself can be characterized as Eliezer’s very successful leveraged strategy to bring more people into alignment research, so maybe the leveraged strategies end up discussed more outside LW? But in any case, this at least shows that some leveraged strategies work, so maybe they are worth discussing more.
b) I think this can be summarized as “we don’t know how to put more resources into alignment without this having (sometimes very) negative unintended outcomes”. Okay, fair enough, but this seems like a huge issue, and maybe there should be more posts about exploring and finding leveraged strategies that won’t backfire. The same goes for power-seeking: power is an instrumental goal of ASI precisely because it is useful for accomplishing almost any goal, so it’s important to figure out good ways to get and use it.
Now maybe your answer is something like “we tried, it didn’t work out that well, so we re-prioritized accordingly”. But it’s not obvious to me that we shouldn’t try more and develop a better map of all the available options. Anyway, I will read what you linked; if you have more links that would clarify what was tried and what did or didn’t work, don’t hesitate to share.
I think these mostly don’t take the form of “posts” because they mostly involve actually going and forming organizations, coordinating, and running stuff. (Maybe see Dark Forest Theories: most of the discussion is happening in places you can’t see, because it’s pretty high-context and not that useful to have randos in it.)
There was a lot more explicit discussion of this sort of thing 10 years ago, during the early days of the EA movement. Right now I think it’s a combo of a) those conversations mostly turned into professional orgs doing stuff, and b) we’re in a period where it’s more obvious that there were significant problems with this focus, so there’s a bit of a reaction against it.
Also, note: if your plan to recruit more people is working, you should still expect to see mostly posts on the object level. Like, if you didn’t successfully get 10x or 100x the people working on the object level, that would indicate your plan to scale had failed.
My understanding is that Eliezer himself does not view it as hugely successful. MIRI thinks that ~nobody in LW-adjacent communities is doing useful alignment work, and my expectation is that Eliezer would agree with this post of John’s regarding the state of the field.
Simultaneously, the proliferation of talk about the AI alignment problem, which was ~necessary to kickstart the field, potentially dramatically decreased the time-to-omnicide. It attracted the attention of various powerful people whose contributions were catastrophically anti-helpful, from those who were probably well-meaning but misunderstood the problem (Elon Musk) to straight-up power-hungry psychopaths (Sam Altman).
I overall agree that “getting dramatically more people to work on alignment” is a good initial idea. But it seems that what actually happens when you try to proliferate talk about the problem is that most people end up misunderstanding it and either working on the wrong problems or actively making things worse. This is of course fundamentally a skill issue on the part of the proliferators, but the level of skill at which this doesn’t happen may be really high, and as you try to get better at it, you leave net-negative memetic infections in your wake. Plus, you may not actually get to iterate indefinitely: there are only so many Silicon Valleys and so many billionaires.
So the “recruit more people to work on the problem” strategy that would actually be effective in practice probably looks more like “look for promising people and recruit them manually, one by one”, instead of anything higher-leverage and higher-profile. One wonders whether the counterfactual timeline in which MIRI instead quietly focused on research and this more targeted recruiting is doing better than this one.
Possibly not. Possibly that primordial awareness-raising effort is going to provide the foundation for an international AGI-research ban. But I don’t think it’s clear that this was the better plan, in hindsight.
I agree in theory, but this has happened a lot. You’re arguing “the amount that should happen is non-zero”, which is uncontroversially true IMO, but the hard question is “what fraction of the community’s effort should be spent on field building vs. directly doing things”. I think the answer is clearly not 100%, so the key question becomes figuring out whether it currently happens too much or too little. It sounds like you’re maybe not aware of a bunch of the field building work that has happened/is happening; I think you need to get a better sense of that in order to have an informed take on strategic errors or the correct allocation of resources.
This style of thinking is roughly the right way to approach the problem, although I also agree with John Wentworth’s comment that not enough people are directly trying to solve it.
You are getting fairly negative reactions because you appear to be unfamiliar with a lot of the existing ways in which people are already trying to win. For example, some people are now trying to influence government policy. But if you can think of a good alternative that people are not already doing, then you should suggest it.
Isn’t the most upvoted curated post right now about winning? “A case for courage, when speaking of AI danger” is about strategy, not technical research.
If you’re looking for people interested in personal strategies for individuals (e.g. earning to give), I think most of them are on the Effective Altruism Forum rather than LessWrong. The network effect means that everyone interested in a topic tends to cluster in one forum, even if they are given two choices initially.
Another speculative explanation is that the upvote system allows the group of people interested in one particular topic (e.g. technical research, or conceptual theorization) to upvote every post on that topic without running out of upvotes. This rewards people for repeatedly writing posts on the most popular topics, since it’s much easier to get net positive upvotes that way.
PS: I agree that earning to give is reasonable
I’m considering this myself right now :)
I mostly agree with you that hiring experts and having a great impact is feasible. Many of the technical alignment researchers who lament “money isn’t what we need, what we need is to be going in the right direction instead of having so much fake research!” fail to realize that their own salaries also come from those flawed but nonetheless vital funding sources. If it weren’t for the flawed funding sources, they would have nothing at all.
Some of them might be wealthy enough to fund themselves, but that’s effectively still making money to hire experts (the expert is themselves).
And yes, some people use AI safety careers as a stepping stone to AI capabilities careers. But realistically, the whole world spends less than $0.2 billion on AI safety and hundreds of billions on AI capabilities; AI safety salaries are negligible here. One might argue that the non-monetary moral motivation of working on AI safety has caused people to end up working on AI capabilities, but in that case increasing AI safety salaries should reduce this flow rather than increase it.
But Raemon is so right about the great danger of being net negative. Don’t follow an “ends justify the means” strategy like Sam Bankman-Fried, and beware of your ego convincing you that AI is safer so long as you’re the guy in charge (like Sam Altman or Elon Musk). These biases are insidious, because we are machines programmed by evolution, not to seek truth for the sake of truth, but to:
Arrive at the truth when it increases inclusive fitness
Arrive at beliefs which get us to do evil while honestly believing we are doing good (when it increases inclusive fitness)
Arrive at said beliefs, despite wholly believing we are seeking the truth