https://therealartificialintelligence.substack.com/p/the-real-ai-deploys-itself
David Scott Krueger (formerly: capybaralet)
Contra Leicht on AI Pauses
Diary of a “Doomer”: 12+ years arguing about AI risk (part 1)
Yeah, this is a good point. The way I’ve put it before is: when you are thinking about what should happen, you’re basically imagining you have some sort of magic wand that makes it happen. But how powerful is the magic wand? I haven’t thought this through to my satisfaction, so for now I’m just going based on intuitive notions of what is actually realistically achievable.
But one way of trying to define the limits of the “magic wand” here would be: You get to magically choose a policy to be adopted, but you don’t get to magically control people’s behavior afterwards. So if you want to get people to limit AI uses, your policy needs to deal with their potential incentives to do otherwise.
This means, IIUC, that the answer to your final question is “yes”. But it’s more a matter of perceived incentives here, IMO, see: https://therealartificialintelligence.substack.com/p/following-the-incentives
> If someone believes that it will be hard to make international agreements to stop AI because countries will have incentives against this, does that mean that those considerations now fall under “incentives” and thus count for purpose of determining whether stopping is “hard”?
There’s not a lot of demand for human cloning. See https://wiki.aiimpacts.org/doku.php?id=responses_to_ai:technological_inevitability:incentivized_technologies_not_pursued:start
Stopping AI is easier than Regulating it.
Good point RE deskilling of alignment researchers.
You can’t trust violence
Could a single rogue AI destroy humanity?
Inkhaven menu, part 2
Inkhaven: a menu
Alignment vs. Safety, part 2: Alignment
“Alignment” and “Safety”, part one: What is “AI Safety”?
Reflections on the largest AI safety protest in US history
Right, so the response would be “just don’t worry about getting re-elected and try to get some shit done in your term”.
Ten different ways of thinking about Gradual Disempowerment
“Following the incentives”
Is AI a house of cards?
Systematically dismantle the AI compute supply chain.
Thanks for sharing your thoughts.
So your condition is “Severe or willful violation of our RSP, or misleading the public about it”.
My guess is that most people understood the RSP, or at least the part about not releasing dangerous systems, as a COMMITMENT in the sense of “we won’t do this,” not a commitment in the sense of “we won’t do this… unless we publicly change our mind first.” I do think it’s hard to get good data on this, but I wonder if you disagree with my guess? There was at least substantial confusion around this point within the AI safety community (who I’d consider part of “the public”), confusion which mostly could’ve been easily remedied by Anthropic. The failure to do so seems like at least “letting a significant fraction of the public be misled,” which I think counts as “misleading the public.”
Unless, of course, the RSP ought to have been interpreted as a COMMITMENT all along, in which case this update seems like a violation of an implicit “meta-commitment” to honor the COMMITMENT in perpetuity.
If you agree with the thrust of my argument, it seems like you’d have to either 1) agree that your condition is met, 2) argue that it was clear to the public that the commitment was not a COMMITMENT, or 3) argue that there is no such implicit meta-commitment.
I’d appreciate it if you would clarify where exactly our disagreement lies.
I don’t mean to put regulation and stopping in opposition. My point is that stopping is likely a precondition for any form of regulation that would significantly slow down development or deployment. Like you, I am trying to argue against framings that put them in opposition.
I think stopping unlocks a lot of ability for countries to regulate in line with their values and priorities, which might otherwise not be possible because of race dynamics.
I’ve tried to edit my post to make that clearer; please let me know if you have any specific suggestions on that front.