https://twitter.com/DavidSKrueger
https://www.davidscottkrueger.com/
https://therealartificialintelligence.substack.com/p/the-real-ai-deploys-itself
David Scott Krueger
Well, I did say “Naively”… but yes I agree the analysis was too naive, and I will edit the post. You make a good point that it can be improved by considering that harms from AI (especially large-scale ones like x-risk) are overdetermined when there are multiple developers. The naive analysis is more accurate when the risk is smaller.
As a side note, if the risk from a single project is so large, then the first project is probably disincentivized at the individual level (would you really want to take an 80% risk of extinction?), and it’s a “pure” coordination problem, like a stag hunt, rather than an incentive problem (like a prisoner’s dilemma).
Another way the “naive” calculation can be wrong (which is the main one I had in mind) is if the risks of different projects are correlated, which they are, e.g. because they are all using similar technology.
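To make the difference concrete, here is a minimal sketch of how the marginal risk of an additional project behaves under independence versus correlation. This is my own illustration: the per-project risk number and the helper `total_risk_independent` are made up for the example, not from the original exchange.

```python
# Illustrative only: how "marginal risk" of one more AI project depends on assumptions.

def total_risk_independent(p_per_project: float, n_projects: int) -> float:
    """Total catastrophe risk if each project independently causes it with probability p."""
    return 1 - (1 - p_per_project) ** n_projects

p = 0.2  # assumed per-project risk (purely illustrative)

for n in [1, 2, 5]:
    total = total_risk_independent(p, n)
    marginal = total - total_risk_independent(p, n - 1)
    print(f"{n} projects: total risk {total:.2f}, marginal risk of the nth project {marginal:.2f}")

# Under independence, the marginal risk of each additional project shrinks as more
# projects exist (the harm becomes "overdetermined"). If the risks are instead highly
# correlated (e.g. all projects use similar technology), total risk stays near p no
# matter how many projects there are, so the naive per-project calculation misleads.
```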
I’m not going after particular people’s justifications for their work; I’m going after the institutionalization of “marginal risk” as a relevant concept and the way it justifies unacceptable risk-taking.
I think it’s helpful to disaggregate things sometimes, and e.g. look at what trends might underlie this general trend we observe.
greater wealth hasn’t changed the picture tremendously
I don’t think I made that claim anywhere in my piece.
I think we disagree on the likelihood of x-risk advocates having such a precise level of impact and power. If AI x-risk advocates don’t have much sway over the political demands of an anti-AI movement, then I don’t think we have much to worry about.
Your argument seems to treat “pause/stop AI” as almost like an info-hazard, but it’s really quite an obvious idea. If “pause/stop AI” becomes a core idea in an anti-AI movement where x-risk advocates lack power, I’d expect that happens because it’s memetically fit and other people were saying it as well, so it probably would’ve happened anyways.
In my mind, the worlds where there is political will for a pause are mostly the ones where there is a broad understanding that if we don’t stop building AI, it is going to replace humanity, and this motivates the need for an international pause. Similarly, I think a pause that happens without the USG having been AGI-pilled seems really unlikely, and it’s also very hard for me to imagine the current administration doing a unilateral pause.
Overall, I think if “pause/stop AI” succeeds at all, it will probably succeed substantively; the slice of worlds where this doesn’t happen seems very narrow, because it’s a big ask.
I don’t mean to put regulation and stopping in opposition. My point is that stopping is likely a precondition for any form of regulation that would significantly slow down development or deployment. Like you, I am trying to argue against framings that put “we need a global treaty to stop AI risks” in opposition to “domestic regulation is the only realistic path.”
I think stopping unlocks a lot of ability for countries to regulate in line with their values and priorities that otherwise might not be possible because of race dynamics.
I’ve tried to edit my post to make that clearer, please let me know if you have any specific suggestions on that front.
Yeah, this is a good point. The way I’ve put it before is: when you are thinking about what should happen, you’re basically imagining you have some sort of magic wand that makes it happen. But how powerful is the magic wand? I haven’t thought this through to my satisfaction, so for now I’m just going based on intuitive notions of what is actually realistically achievable.
But one way of trying to define the limits of the “magic wand” here would be: You get to magically choose a policy to be adopted, but you don’t get to magically control people’s behavior afterwards. So if you want to get people to limit AI uses, your policy needs to deal with their potential incentives to do otherwise.
This means, IIUC, that the answer to your final question is “yes”. But it’s more a matter of perceived incentives here, IMO, see: https://therealartificialintelligence.substack.com/p/following-the-incentives
> If someone believes that it will be hard to make international agreements to stop AI because countries will have incentives against this, does that mean that those considerations now fall under “incentives” and thus count for purpose of determining whether stopping is “hard”?
There’s not a lot of demand for human cloning. See https://wiki.aiimpacts.org/doku.php?id=responses_to_ai:technological_inevitability:incentivized_technologies_not_pursued:start
Good point RE deskilling of alignment researchers.
Right, so the response would be “just don’t worry about getting re-elected and try to get some shit done in your term”.
Thanks for sharing your thoughts.
So your condition is “Severe or willful violation of our RSP, or misleading the public about it”.
My guess is that most people understood the RSP, or at least the part about not releasing dangerous systems, as a COMMITMENT in the sense of “we won’t do this” not a commitment in the sense of “we won’t do this… unless we publicly change our mind first”. I do think it’s hard to get good data on this, but I wonder if you disagree with my guess? It seems like there was at least substantial confusion around this point within the AI safety community (who I’d consider part of “the public”), confusion which mostly could’ve been easily remedied by Anthropic—the failure to do so seems like at least “letting a significant fraction of the public be misled”, which I think counts as “misleading the public”.
Unless, of course, the RSP ought to have been interpreted as a COMMITMENT all along, in which case this update seems like a violation of an implicit “meta-commitment” to honor the COMMITMENT in perpetuity.
If you agree with the thrust of my argument, it seems like you’d have to either 1) agree that your condition is met, 2) argue that it was clear to the public that the commitment was not a COMMITMENT, or 3) argue that there is no such implicit meta-commitment.
I’d appreciate if you would clarify where exactly our disagreement lies.
What happens if you merge the bash and the audit tool, just giving the AI a single bash tool from which it can do both?
For now, such evidence is not really relevant to takeover risk because models are weak and can’t execute on complex world domination plans, but I can imagine such arguments becoming more directly relevant in the future.
Maybe a nit RE phrasing, but the reasoning here doesn’t make sense. It’s relevant to takeover risk even if the model is known to be weak.
Thanks for the pointers! I think there should probably be more, but I’m glad to know there’s more than I was aware of.
My new organization https://evitable.com/ is fund-raising. I’m a long-time AI safety researcher and AI professor and initiated the one-sentence Statement on AI Risk.
Evitable’s mission is to inform and organize the public to confront societal-scale risks of AI, and put an end to the reckless race to develop superintelligence.
Our vision is that in ten years’ time, people will look back at the current race to build superintelligence as unthinkably terrible and wrongheaded, similar to how people view things like slavery.
You can donate at https://www.every.org/evitable or https://manifund.org/projects/evitable-a-new-public-facing-ai-risk-nonprofit-a1ll15pvkcb.
In general I think you should be a little suspicious of all lab self-reports about data usage, partly because they have a strong incentive to slightly fudge the category boundaries. In this case, they had a top-level category for “self-expression” which included “relationships and personal reflection” as well as “games and role-play”. Make of that what you will. But overall I think this kind of work is extremely valuable, and I’m very glad they did it.
Another reason I heard is that they don’t include enterprise use here, e.g. because of privacy agreements with companies. That data may also look more “job replace-y” than “complementary”.
agreed—I’m suggesting they’ll be blending together, and that moving towards AI-generated videos as the primary means of generating content on social media will help companies automate content creators.
Huge thanks to all the lab employees who stated their support for an AI moratorium in this thread!
Can we make this louder and more public? This is really important for the public to understand.
why not?
In a word: InkHaven.
But seriously, I’m still working full-time on Evitable.com and so am trying to churn out my daily blog posts FAST. There are topics I know I have things to say about, and I try to get them down in words in ~1-2 hours tops. In this case, the motivation is something like: “It’s annoying when people make behaviorist arguments about how AIs are more aligned/trustworthy than people”.