I think this is straightforwardly true and basically hard to dispute in any meaningful way. A lot of this is downstream of AI research being part of a massive market/profit-generating endeavour (the broader tech industry), which straightforwardly optimises for more and more “capabilities” (of various kinds) in the name of revenue. Indeed, one could argue that long before the current wave of LLMs, the tech industry was developing powerful agentic systems that actively worked to subvert human preferences, disempowering and manipulating people in the name of extracting revenue from intelligent work… we just called the AI system the Google/Facebook/YouTube/Twitter Algorithm.
The trend was always clear: an idealistic mission to make good use of global telecommunication/information networks finds initial success and delivers a good service. Eventually, pressure to make profits causes the core service to be degraded in favour of revenue generation (usually ads). Eventually the company accrues enough shaping power to actively reshape the information network in its favour, and begins dragging everything down with it. In the face of this, AI/LLMs are just another product to be used as a revenue engine in the digital economy.
AI safety, by its nature, resists the idea of creating powerful new information technologies to exploit mercilessly for revenue without care for downstream consequences. However, many actors in the AI safety movement are themselves tied to the digital economy, and depend on it for their power, status, and livelihoods. Thus, it is not that there are no genuine concerns being expressed, but that at every turn these concerns must be resolved in a way that keeps the massive tech machine going. Those who don’t agree with this approach are efficiently selected against. For example:
Race dynamics are bad? ~~Maybe we should slow down.~~ We just need to join the race and be the more morally-minded actor. After all, there’s no stopping the race, we’re already locked in.
Our competitors/other parties are doing dangerous things? ~~Maybe we could coordinate and share our concerns and research with them.~~ We can’t fall behind, we’ve got to fly the AI safety flag at the conference of AI Superpowers. Let’s speed up too.
New capabilities are unknown and jagged? ~~Let’s just leave well enough alone.~~ Let’s invest more in R&D so we can safely understand and harness them.
Here’s a new paradigm that might lead to a lot of risk and a lot of reward. ~~We should practice the virtue of silence and buy the world time.~~ We should make lots of noise so we can get funding. To study it. Not to use it, of course. Just to understand the safety implications.
Maybe progress in AI is slower than we thought. ~~Hooray! Maybe we can chill for a bit.~~ That’s time for us to exploit our superior AI knowledge and accelerate progress to our benefit.
We’ve seen this before.
To be honest, though, I’m not sure what to do about this. So much has been invested by now that it truly feels like history is moving with a will of its own, rather than individuals steering the ship. Every time I look at what’s going on I get the sense that maybe I’m just the idiot who hasn’t gotten the signal to hammer that “exploit” button. After all, it’s what everyone else is doing.
> Our competitors/other parties are doing dangerous things? Maybe we could coordinate and share our concerns and research with them
What probability do you put on the claim that, if Anthropic had really tried, they could have meaningfully coordinated with OpenAI and Google? Mine is pretty low.
I think many of these are predicated on the belief that it would be plausible to get everyone to pause now. In my opinion this is extremely hard and pretty unlikely to happen. I think that, even in worlds where actors continue to race, there are actions we can take to lower the probability of x-risk, and taking them is a reasonable position.
I separately think that many of the historical actions you describe were dumb/harmful, but they are equally consistent with “25% of safety people act like this” and with “100% do”.
> What probability do you put on the claim that, if Anthropic had really tried, they could have meaningfully coordinated with OpenAI and Google? Mine is pretty low.
Not GP but I’d guess maybe 10%. Seems worth it to try. IMO what they should do is hire a team of top negotiators to work full-time on making deals with other AI companies to coordinate and slow down the race.
ETA: What I’m really trying to say is I’m concerned Anthropic (or some other company) would put in a half-assed effort to cooperate and then give up, when what they should do is Try Harder. “Hire a team to work on it full time” is one idea for what Trying Harder might look like.
Fair. My probability is more like 1-2%. I do think that having a team of professional negotiators seems a reasonable suggestion, though. I predict the Anthropic position would be that this is really hard to achieve in general, and that actually getting a slowdown would require much stronger evidence of safety issues. In addition to all the commercial pressure, slowing down now could be considered to violate antitrust law. And it seems way harder to get all the other actors like Meta or DeepSeek or xAI on board, meaning I don’t even know if I think it’s good for some of the leading actors to unilaterally slow things down now (I predict mildly net good, but with massive uncertainty and downsides).