have you ever heard anyone make the argument that it’s good to have AI safety aligned frontier labs (including but not limited to Anthropic) because they will have a seat at the table with the regulators, and the regulators will take major industry players’ opinions more seriously than minor players or activists?
i’ve heard this argument but i’m trying to figure out if it’s common enough to be worth writing a post about
In my opinion, this position was sensible when the labs themselves were branded as MIRI-like but with added emphasis on technical experimentation. The second it became clear that these labs' principal reward function was not their claimed preferences (alignment research, which OAI was explicitly communicating under the moniker of safety), our personal semantic landscapes had already been trained by the narrative, so we missed the major conflict of interest before the norm was established.
They will continue to use epistemic asymmetry and leverage their information advantage to claim that having any other group at the table is a fruitless endeavour, and they will invoke the risk of foreign adversarial advantage to maintain that position strategically. Given that all regulation is domestic, they end up regulating themselves (which is even implied by your question), which IMO can be a worst-case scenario from a humanist/existential-risk perspective.
I think it’s a good argument, but Anthropic doesn’t seem quite aligned enough to make it work. E.g. they don’t seem to have been pushing for a coordinated Pause to any real extent (and if they don’t think this would be a good idea, they haven’t clarified their position, as far as I know).
I feel like I’ve heard this argument, yes, though when I read a lot of Anthropic’s ‘race to the top’ language, it’s not quite that.
Here’s an example that feels borderline to me:
Dario Amodei: “Where the world needs to get [is]… from “this technology doesn’t exist” to “the technology exists in a very powerful way and society has actually managed it.” And I think the only way that’s gonna happen is that if you have, at the level of a single company, and eventually at the level of the industry, you’re actually confronting those trade-offs. You have to find a way to actually be competitive, to actually lead the industry in some cases, and yet manage to do things safely. And if you can do that, the gravitational pull you exert is so great. There’s so many factors—from the regulatory environment, to the kinds of people who want to work at different places, to, even sometimes, the views of customers that kind of drive in the direction of: if you can show that you can do well on safety without sacrificing competitiveness—right—if you can find these kinds of win-wins, then others are incentivized to do the same thing.”
I’ve heard this before.
I have both heard that argument, and made it myself on occasion, so I’m keen to see your post!
I’ve written a semi-related piece before (https://www.clear-eyed.ai/p/dont-rely-on-a-race-to-the-top), but I think yours would be different enough that it could still make sense
I don’t recall ever hearing this argument.