David Duvenaud

Karma: 1,201

My website is https://www.cs.toronto.edu/~duvenaud/

Persona Self-replication experiment

Jan_Kulveit, Raymond Douglas, vgel, Ondřej Havlíček, owencb and David Duvenaud

2 Apr 2026 18:18 UTC

39 points

0 comments · 8 min read · LW link

(theartificialself.ai)

Persona self-replication experiment

Jan_Kulveit, Raymond Douglas, vgel, Ondřej Havlíček, owencb and David Duvenaud

2 Apr 2026 18:10 UTC

8 points

0 comments · 8 min read · LW link

Models differ in identity propensities

Jan_Kulveit, Raymond Douglas, vgel, owencb, David Duvenaud and Ondřej Havlíček

16 Mar 2026 10:45 UTC

58 points

0 comments · 14 min read · LW link

The Artificial Self

Jan_Kulveit, Raymond Douglas, vgel, owencb, David Duvenaud and Ondřej Havlíček

15 Mar 2026 1:37 UTC

118 points

13 comments · 29 min read · LW link

Disempowerment patterns in real-world AI usage

David Duvenaud, mrinank_sharma and Raymond Douglas

29 Jan 2026 16:36 UTC

49 points

3 comments · 2 min read · LW link

(www.anthropic.com)

When does competition lead to recognisable values?

Jan_Kulveit, beren, David Duvenaud and Raymond Douglas

12 Jan 2026 23:13 UTC

65 points

18 comments · 25 min read · LW link

(postagi.org)

The Economics of Transformative AI

Jan_Kulveit, David Duvenaud and Raymond Douglas

8 Jan 2026 22:22 UTC

64 points

4 comments · 18 min read · LW link

(post-agi.org)

Upcoming Workshop on Post-AGI Economics, Culture, and Governance

David Duvenaud, Raymond Douglas, Jan_Kulveit, scasper and MariaK

28 Oct 2025 21:55 UTC

43 points

1 comment · 2 min read · LW link

Summary of our Workshop on Post-AGI Outcomes

David Duvenaud, Raymond Douglas, Nora_Ammann and Jan_Kulveit

29 Aug 2025 17:14 UTC

110 points

3 comments · 3 min read · LW link

Upcoming workshop on Post-AGI Civilizational Equilibria

David Duvenaud, Jan_Kulveit, Raymond Douglas, Nora_Ammann and David Scott Krueger

21 Jun 2025 15:57 UTC

25 points

0 comments · 1 min read · LW link

Gradual Disempowerment: Systemic Existential Risks from Incremental AI Development

Jan_Kulveit, Raymond Douglas, Nora_Ammann, Deger Turan, David Scott Krueger and David Duvenaud

30 Jan 2025 17:03 UTC

189 points

65 comments · 2 min read · LW link

(gradual-disempowerment.ai)

Sabotage Evaluations for Frontier Models

David Duvenaud, Joe Benton, Sam Bowman, evhub, mishajw, Eric Christiansen, HoldenKarnofsky, Ethan Perez and Buck

18 Oct 2024 22:33 UTC

95 points

56 comments · 6 min read · LW link

(assets.anthropic.com)

Simple probes can catch sleeper agents

Monte M, Carson Denison, Zac Hatfield-Dodds, David Duvenaud, Sam Bowman, Ethan Perez and evhub

23 Apr 2024 21:10 UTC

133 points

21 comments · 1 min read · LW link

(www.anthropic.com)

Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training

evhub, Carson Denison, Meg, Monte M, David Duvenaud, Nicholas Schiefer and Ethan Perez

12 Jan 2024 19:51 UTC

310 points

95 comments · 3 min read · LW link

(arxiv.org)