Thanks for writing this post! I’m curious to hear more about this bit of your beliefs going in:
> The existential risk argument is suspiciously aligned with the commercial incentives of AI executives. It simultaneously serves to hype up capabilities and coolness while also directing attention away from the real problems that are already emerging. It’s suspicious that the apparent solution to this problem is to do more AI research as opposed to doing anything that would actually hurt AI companies financially.
Are there arguments or evidence that would have convinced you the existential risk worries in the industry were real / sincere?
For context, I work at a frontier AI lab and from where I sit it’s very clear to me that the x-risk worries aren’t coming from a place of hype, and people who know more about the technology generally get more worried rather than less. (The executives still could be disingenuous in their expressed concern, but if so they’re doing it in order to placate their employees who have real concerns about the risks, not to sound cool to their investors.)
I don’t know what sorts of things would make that clearer from the outside, though. Curious if any of the following arguments would have been compelling to you:
- The AI labs most willing to take costly actions now (like hire lots of safety researchers or support AI regulation that the rest of the industry opposes or make advance commitments about the preparations they’ll take before releasing future models) are also the ones talking the most about catastrophic or existential risks.
- Like if you thought this stuff was an underhanded tactic to drum up hype and get commercial success by lying to the public, then it’s strange that Meta AI, not usually known for its tremendous moral integrity, is so principled about telling the truth that they basically never bring these risks up!
- People often quit their well-paying jobs at AI companies in order to speak out about existential risk or for reasons of insufficient attention paid to AI safety from catastrophic or existential risks.
- The standard trajectory is for lab executives to talk about existential risk a moderate amount early on, when they’re a small research organization, and then become much quieter about it over time as they become subject to more and more commercial pressure. You actually see much more discussion of existential risk among the lower-level employees, whose statements are less scrutinized for being commercially unwise. This is a weird pattern for something whose main purpose is to attract hype and investment!
> The AI labs most willing to take costly actions now (like hire lots of safety researchers or support AI regulation that the rest of the industry opposes or make advance commitments about the preparations they’ll take before releasing future models) are also the ones talking the most about catastrophic or existential risks.
Are these actually costly actions to any meaningful degree? In the context of the amount of money sloshing around the AI space, hiring even “lots” of safety researchers seems like a rounding error.
I may misunderstand the commitments you’re referring to, but I think these are all purely internal? And thus not really commitments at all.
> Like if you thought this stuff was an underhanded tactic to drum up hype and get commercial success by lying to the public, then it’s strange that Meta AI, not usually known for its tremendous moral integrity, is so principled about telling the truth that they basically never bring these risks up!
This seems to presume that I have some well-formed views on how AI labs compare, and I don’t have those. All I really know about Meta is that they’re behind and doing open source. I wouldn’t even know where to start an analysis of their relative level of moral integrity. So far as it goes (and, again, this is just the view of someone who reads what breaks through in mainstream news coverage), I have a very clear sense that OpenAI is run by compulsive liars, but not much more to go on beyond that other than a general sense that people in the industry do a lot of hype.
> People often quit their well-paying jobs at AI companies in order to speak out about existential risk or for reasons of insufficient attention paid to AI safety from catastrophic or existential risks.
I’m deliberately not looking this up and just giving you my impression of this phenomenon. I can come up with three cases (my recollection is maybe garbled) that broke through into my media universe:
- My understanding is that Anthropic was formed by people who broke away from OpenAI based on “safety” concerns. But then they just founded another company doing the same thing? And they got very rich doing it. So that all has roughly zero credibility.
- There was an engineer at one of the big tech companies (Google? Microsoft?) who got a lot of attention for claiming that AI had achieved sentience and deserved personhood and either quit or got fired. The universal take seemed to be that he was insane.
- One of the people involved in AI 2027 had quit or gotten fired from OpenAI(?) and refused to sign an NDA that would have come with a big payday so that he could go public with criticism. That seems pretty sincere and credible so far as it goes, but it’s also one person. And then AI 2027 was so overwrought that I couldn’t take it seriously.
And then, beyond that, you seem to have a lot of people signing these open letters with no cost attached. For something like this to break through, it needs to be (in my estimation at least) large numbers of people acting in a coordinated way and leaving the industry entirely.
I’d analogize it to politics. In any given presidential administration, you have one or two people who get really worked up and resign angrily and then go on TV attacking their former bosses. That’s just to be expected and doesn’t really reflect anything beyond the fact that sometimes people have strong reactions or particularized grievances or whatever. The thing that should wake you up is when this is happening at scale.
> Are there arguments or evidence that would have convinced you the existential risk worries in the industry were real / sincere?
Only steps that carry meaningful financial consequences. I agree that any individual researcher can send a credible signal by quitting and giving up their stock, at least to the extent they don’t just immediately go into a similarly compensated position. But you’re always left with the counter-signal from all the other researchers not doing that.
On a more institutional level, it would have to be something that actually threatens the valuation of the companies.