Why does the US spend less than $0.1 billion/year on AI alignment/safety?
Because no one knows how to spend any more? What has come out of $0.1 billion a year?
I am not connected to work on AI alignment, but I do notice that every chatbot gets jailbroken immediately, and I have not noticed any success stories.
That’s not right. You could easily spend a billion dollars just on better evals and better interpretability.
As for the real alignment problem: the fact that $0.1 billion a year hasn’t yielded returns doesn’t mean $100 billion won’t. It’s one problem, and no one has gotten much traction on it, so you’d expect progress to look like a step function, not a smooth curve.
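I completely agree!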
The Superalignment team at OpenAI kept complaining that they did not get the 20% of compute they were promised, and this was a major cause of the OpenAI drama. This shows how important resources are for alignment.
A lot of alignment researchers stayed at OpenAI despite the drama, but still quit some time later, citing poor productivity. Maybe they consider working somewhere with better resources more important than having access to OpenAI’s newest models, etc.
Alignment research costs money and resources just like capabilities research. Better-funded AI labs like OpenAI and DeepMind are racing ahead of poorly funded labs in poorer countries that you never hear about. Likewise, if alignment research were better funded, it would have a better chance of winning the race.
Note: after I agreed with your comment, the score dropped back to 0 because someone else disagreed. Maybe they disagree that you can easily spend a fraction of a billion on evals?
I know very little about AI evals. Are these like IQ tests for AIs? Why would a good eval cost millions of dollars?
This is an important point. AI alignment/safety organizations take money as input and produce very abstract papers as their output, which usually have no immediate applications. I agree that this can appear very unproductive.
However, if we think from first principles, a lot of other things are like that. For instance, when you go to school, you study the works of Shakespeare, you learn to play the guitar, and you learn how Spanish pronouns work. These things appear to be a complete waste of time. If 50 million students in the US spend 1 hour a day on these kinds of activities, and each hour is valued at only $10, that’s $180 billion/year.
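A quick sanity check of that arithmetic (the 365 days/year is my own assumption, chosen so the stated total comes out; using ~180 school days instead would give roughly $90 billion/year):

```python
# Back-of-envelope check of the figure above.
# 50 million students, 1 hour/day, and $10/hour are from the comment;
# 365 days/year is my assumption.
students = 50_000_000
hours_per_day = 1
dollars_per_hour = 10
days_per_year = 365

total = students * hours_per_day * dollars_per_hour * days_per_year
print(f"${total / 1e9:.0f} billion/year")  # -> $182 billion/year, i.e. ~$180B
```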
But we know these things are not a waste of time, because in hindsight, when you follow students as they grow up, this work somehow helps them later in life.
Lots of things appear useless but turn out to be valuable in hindsight, for reasons beyond the intuitive set of reasons we evolved to understand.
Studying the atomic nucleus might have looked like a useless curiosity if you didn’t know it would lead to nuclear energy. There are no real-world applications for a long time, and then suddenly there are enormous ones.
Pasteur’s studies of fermentation might have appeared limited to modest winemaking improvements, but they led to the discovery of germ theory, which saved countless lives.
Stone Age people studying weird rocks may have discovered obsidian and copper. Those who studied the strange seeds that plants produce may have discovered agriculture.
We don’t know how valuable this alignment work is. We should cope with this uncertainty probabilistically: if there is a 50% chance it will help us, the expected benefit per dollar is halved, but that doesn’t reduce the ideal spending to zero.
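A minimal sketch of that expected-value point, where both the 50% and the benefit-per-dollar figure are placeholder assumptions:

```python
# Expected-value framing: uncertainty scales the benefit down,
# it doesn't zero it out. Both numbers below are placeholder assumptions.
p_helps = 0.5             # assumed chance the research actually helps
benefit_per_dollar = 2.0  # hypothetical benefit per dollar if it helps

expected_benefit = p_helps * benefit_per_dollar
print(expected_benefit)   # 1.0 -- half the optimistic value, not zero
```

As long as the expected benefit per dollar still beats the marginal alternative use of that dollar, the ideal budget stays above zero.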