If you want to solve alignment and want to be efficient about it, it seems obvious that there are better strategies than researching the problem yourself: don’t spend 3+ years on a PhD (cognitive rationality), but instead get 10 other people to work on the issue (winning rationality). That already 10x’s your efficiency.
Alas, approximately every single person entering the field has either that idea, or the similar idea of getting thousands of AIs to work on the issue instead of researching it themselves. We have thus ended up with a field in which nearly everyone is hoping that somebody else is going to solve the hard parts, and the already-small set of people who are just directly trying to solve it has, if anything, shrunk somewhat.
It turns out that, no, hiring lots of other people is not actually how you win when the problem is hard.
I think a big part of the problem is that people fundamentally misunderstand what the funnel is. The way to get people into a field isn’t rousing arguments, it’s cool results, accessible entry research, and opportunity.
As a kid, I didn’t go into pure mathematics because someone convinced me that it was a good use of my time; it was because I saw cool videos about mathematical theorems and decided that it looked fun. I didn’t move into applied maths because someone convinced me, but because there was interesting, non-trivial modelling that I could pick up and work on. And I didn’t move into the trading industry because someone convinced me that options liquidity is the primary measure of a civilization’s virtue; it was because nobody else would hire me in Australia, but a trading firm offered me a shit tonne of money.
Doing interesting work is itself an important part of the recruitment funnel, keeping some easy problems on hand for grads is another important part, and (imo) diversifying the industry out of like 2 cities (London and San Francisco) would be a great way to widen the top of the funnel.
Some people are going to go into whatever field they think is maximum utility, but I reckon they’re the exception. Most scientists are fundamentally amoral people who will go into whatever they find interesting, and whatever they can get work in. I’ve seen people change fields from climate research into weapons manufacturing because the opportunity wasn’t there, and ML Safety is squandering most of the world’s talent.
I agree hiring lots of people is rarely the solution, especially for this type of problem. Adding more people to a team actually slows it down at some point; generally, you want a small team of extremely competent people.
I read both of the posts you link to; I interpret the main claim as “you can’t use money to hire experts because you can’t reliably identify them”.
But the reality is that knowledge companies do manage to hire experts and acquire expertise. This implies that alignment research organizations should be able to do the same, and I think that’s enough to make the strong version of the claim irrelevant.
I agree with a weaker version, which is that some amount of money is wasted because hiring is unreliable; but again, it’s the same for all knowledge companies, and society already has many mechanisms, such as reputation, diplomas, and tests, to navigate these issues.
Edit: your argument about Jeff Bezos rings very wrong to me
Last I heard, Jeff Bezos was the official richest man in the world. He can buy basically anything money can buy. But he can’t buy a cure for cancer. Is there some way he could spend a billion dollars to cure cancer in five years? Probably, yes. But Jeff Bezos does not know how to do that. Even if someone somewhere in the world does know how to turn a billion dollars into a cancer cure in five years, Jeff Bezos cannot distinguish that person from the thousands of other people who claim to know (and may even believe it themselves) but are wrong.
I was not sure how to articulate why, so I pasted the first paragraph of the essay into Claude, and I think it did a good job of pointing out the deficiencies. Here is the full transcript:
LLM transcript
Me: evaluate that argument with critical thinking: """First Puzzle Piece
By and large, the President of the United States can order people to do things, and they will do those things. POTUS is often considered the most powerful person in the world. And yet, the president cannot order a virus to stop replicating. The president cannot order GDP to increase. The president cannot order world peace.
Are there orders the president could give which would result in world peace, or increasing GDP, or the end of a virus? Probably, yes. Any of these could likely even be done with relatively little opportunity cost. Yet no president in history has known which orders will efficiently achieve these objectives. There are probably some people in the world who know which orders would efficiently increase GDP, but the president cannot distinguish them from the millions of people who claim to know (and may even believe it themselves) but are wrong.
Last I heard, Jeff Bezos was the official richest man in the world. He can buy basically anything money can buy. But he can’t buy a cure for cancer. Is there some way he could spend a billion dollars to cure cancer in five years? Probably, yes. But Jeff Bezos does not know how to do that. Even if someone somewhere in the world does know how to turn a billion dollars into a cancer cure in five years, Jeff Bezos cannot distinguish that person from the thousands of other people who claim to know (and may even believe it themselves) but are wrong.
When non-experts cannot distinguish true expertise from noise, money cannot buy expertise. Knowledge cannot be outsourced; we must understand things ourselves."""
Claude: This argument presents an interesting paradox about power and knowledge, but it contains several logical flaws and oversimplifications that weaken its conclusions.
Strengths of the argument: The core insight is valuable—that raw power (political authority, wealth) doesn’t automatically translate into the ability to solve complex problems. The distinction between having resources and knowing how to deploy them effectively is important. The argument correctly identifies that information asymmetry creates genuine challenges for decision-makers.
Critical weaknesses:
False dichotomy on expertise identification: The argument assumes leaders must either perfectly identify true experts or remain completely helpless. In reality, there are established methods for evaluating expertise: peer review, track records, institutional credentials, reproducible results, and consensus among qualified professionals. While imperfect, these mechanisms work reasonably well—we don’t see random people performing surgery or designing bridges.
Ignoring existing institutional solutions: The argument overlooks how real-world leaders actually operate. Presidents don’t personally evaluate every expert claim—they rely on established institutions (CDC, Federal Reserve, Pentagon) that have developed systematic approaches to vetting expertise. Bezos doesn’t personally screen cancer researchers—he funds established institutions and relies on scientific peer review processes.
Overstated claims about knowledge and outsourcing: The conclusion that “knowledge cannot be outsourced” is demonstrably false. We successfully outsource knowledge constantly—when you visit a doctor, fly in an airplane, or use a smartphone, you’re benefiting from outsourced expertise. The key is having reliable mechanisms for credentialing and accountability.
Survivorship bias in examples: The argument cherry-picks unsolved problems (cancer cure, world peace) while ignoring areas where wealth and power have successfully purchased expertise to solve complex problems—like developing COVID vaccines, creating global supply chains, or landing on the moon.
Conflating different types of problems: The argument treats technical problems (virus replication), economic problems (GDP growth), and political problems (world peace) as equivalent, when they have fundamentally different characteristics and solution pathways.
Better framework: Rather than concluding that power is useless without perfect knowledge, a more nuanced view would recognize that effective leadership involves building systems and institutions that can reliably identify and deploy expertise, while acknowledging the inherent uncertainties and limitations in complex problem-solving.
The argument raises important questions about the relationship between power and knowledge, but its binary thinking and sweeping conclusions undermine its analytical value.
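To make the disputed claim concrete, here is a minimal base-rate sketch (the numbers are illustrative assumptions, not taken from either post) of why a funder with decent but imperfect vetting can still be swamped by confident pretenders:

```python
# Minimal sketch, assuming illustrative numbers: P(true expert | passed vetting)
# when real experts are rare and confident pretenders are common.

def posterior_true_expert(base_rate: float, sensitivity: float, false_positive_rate: float) -> float:
    """Bayes' rule: probability that someone who passes vetting actually knows."""
    p_pass = sensitivity * base_rate + false_positive_rate * (1 - base_rate)
    return sensitivity * base_rate / p_pass

# Suppose 1 in 1,000 claimants actually knows how to spend the money well,
# and vetting passes 90% of real experts but also 10% of pretenders.
print(posterior_true_expert(0.001, 0.90, 0.10))  # ~0.009: under 1% of passers are real
```

On these numbers, vetting improves the odds roughly ninefold, yet the funder still backs a pretender 99 times out of 100; whether real institutions do much better than this is exactly what the two sides above disagree about.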
I feel like you haven’t actually updated on “we’ve tried this a bunch, lots of people have this idea, and are currently doing it a bunch” in addition to “and it didn’t work nearly as well as you might think.” Like, you might be right that we didn’t try it right or something, but, your initial post was built on a mistaken premise.
My above comment (not focusing on the main post for a moment) does not claim that it’s easy to hire alignment researchers, but that “you can’t use money to hire experts because you can’t reliably identify them” is the wrong causal model to explain why hiring for alignment is difficult because it’s false: if that causal model were true, you’d expect no companies to be able to hire experts, which is not the case. Anyway, maybe this is nitpicking but to me something like “AI alignment is in its infancy so it’s harder to hire for it than for other fields” would be more convincing.
your initial post was built on a mistaken premise
I do miss a lot of background on what has been discussed and tried so far; in retrospect, most of what I’ve read on LW so far is Rationality: A-Z and the Codex, plus some of the posts in my feed.
If the library had a “Short History of AI Alignment” section I probably would have read it; maybe pinning something like that somewhere visible would help new users get up to speed on the subject more reliably? I do understand that this is a big time investment, though.
Nod. But, I think you are also wrong about the “you can hire experts” causal model, and “we tried this and it’s harder than you think” is entangled with why, and it didn’t seem that useful to argue the point more if you weren’t making more of an explicit effort to figure out where your model was wrong.
Normally, people can try to hire experts, but it often doesn’t work very well. (I can’t find the relevant Paul Graham essay, but if you don’t have the good taste to know what expertise looks like, you are going to end up hiring people who are good at persuading you they are experts, rather than actual experts.)
It can work in very well understood domains where it’s obvious what success looks like. It doesn’t work in domains where there is no consensus on what an expert would look like (and, since no one has solved the problem, expertise basically “doesn’t exist”).
(Note you didn’t actually argue that hiring experts works, just asserted it)
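A toy model of that failure mode (my illustration, not something from the essay): if the evaluator’s score only weakly tracks real skill, picking the top scorer out of a large pool mostly selects the best persuader:

```python
# Toy simulation, with assumed parameters: an evaluator scores candidates as a
# blend of true skill and unrelated persuasiveness, then hires the top scorer.
import random

random.seed(0)

def skill_of_hire(n_candidates: int, taste: float) -> float:
    """True skill of the hired candidate; taste = weight on real skill (0..1)."""
    candidates = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(n_candidates)]
    hired = max(candidates, key=lambda c: taste * c[0] + (1 - taste) * c[1])
    return hired[0]

for taste in (1.0, 0.5, 0.1):
    avg = sum(skill_of_hire(100, taste) for _ in range(2000)) / 2000
    print(f"taste={taste}: average true skill of hire ~ {avg:.2f}")
# With perfect taste you hire ~2.5 sigma of skill out of 100 candidates;
# with taste=0.1 the "winner" is mostly just the most persuasive candidate,
# and their real skill is close to the population average.
```

The point is not the specific numbers but the shape: without domain taste, adding more candidates (or more money) barely improves the hire.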
I agree it’d be nice to have a clearly written history of what has been tried. An awful lot of things have been tried though, and different people coming in would probably want different histories tailored for different goals, and it’s fairly hard to summarize. It could totally be done, but the people equipped to do a good job of it often have other important things to do and it’s not obviously the right call.
If you want to contribute to the overall situation I do think you should expect to need to have a pretty good understanding of the object level problem as well as what meta-level solutions have been tried. A lot of the reason meta-level solutions have failed is that people didn’t understand the object level problem well enough and scaled the wrong thing.
(Try searching “postmortem” and maybe skim some of the things that come up, especially higher-karma ones?)
It turns out that, no, hiring lots of other people is not actually how you win when the problem is hard.
Using money to hire people is only one strategy among many; if it does not work, then maybe starting an alignment course at your local university does? Maybe talking to your mayor is useful? What about becoming the mayor? But how would we know, since LW mostly does not talk about that?
These are just examples; surely among all the possible strategies we can find a few that are worth doing and give leverage over direct research? (LessWrong itself would be an example.) This seems worth exploring.
There was a post on a university class less than two weeks ago.