Joe Carlsmith

Karma: 5,727

Former senior advisor at Open Philanthropy. Doctorate in philosophy from the University of Oxford. Opinions my own.

Video and transcript of talk on writing AI constitutions

Joe Carlsmith, 9 Apr 2026 17:14 UTC

18 points

0 comments, 47 min read, LW link

On restraining AI development for the sake of safety

Joe Carlsmith, 19 Mar 2026 16:30 UTC

29 points

6 comments, 50 min read, LW link

Building AIs that do human-like philosophy

Joe Carlsmith, 29 Jan 2026 17:57 UTC

31 points

5 comments, 21 min read, LW link

Video and transcript of talk on human-like-ness in AI safety

Joe Carlsmith, 17 Dec 2025 4:09 UTC

10 points

0 comments, 36 min read, LW link

How human-like do safe AI motivations need to be?

Joe Carlsmith, 12 Nov 2025 5:32 UTC

27 points

9 comments, 52 min read, LW link

Leaving Open Philanthropy, going to Anthropic

Joe Carlsmith, 3 Nov 2025 17:38 UTC

113 points

30 comments, 18 min read, LW link

Controlling the options AIs can pursue

Joe Carlsmith, 29 Sep 2025 17:23 UTC

8 points

0 comments, 35 min read, LW link

Video and transcript of talk on giving AIs safe motivations

Joe Carlsmith, 22 Sep 2025 16:43 UTC

14 points

2 comments, 50 min read, LW link

Giving AIs safe motivations

Joe Carlsmith, 18 Aug 2025 18:00 UTC

36 points

5 comments, 51 min read, LW link

Video and transcript of talk on “Can goodness compete?”

Joe Carlsmith, 17 Jul 2025 17:54 UTC

98 points

19 comments, 34 min read, LW link

(joecarlsmith.substack.com)

Video and transcript of talk on AI welfare

Joe Carlsmith, 22 May 2025 16:15 UTC

24 points

1 comment, 28 min read, LW link

(joecarlsmith.substack.com)

The stakes of AI moral status

Joe Carlsmith, 21 May 2025 18:20 UTC

79 points

65 comments, 14 min read, LW link

(joecarlsmith.substack.com)

Video and transcript of talk on automating alignment research

Joe Carlsmith, 30 Apr 2025 17:43 UTC

31 points

0 comments, 24 min read, LW link

(joecarlsmith.com)

Can we safely automate alignment research?

Joe Carlsmith, 30 Apr 2025 17:37 UTC

65 points

30 comments, 48 min read, LW link

(joecarlsmith.com)

AI for AI safety

Joe Carlsmith, 14 Mar 2025 15:00 UTC

79 points

13 comments, 17 min read, LW link

(joecarlsmith.substack.com)

Paths and waystations in AI safety

Joe Carlsmith, 11 Mar 2025 18:52 UTC

42 points

1 comment, 11 min read, LW link

(joecarlsmith.substack.com)

When should we worry about AI power-seeking?

Joe Carlsmith, 19 Feb 2025 19:44 UTC

30 points

0 comments, 18 min read, LW link

(joecarlsmith.substack.com)

What is it to solve the alignment problem?

Joe Carlsmith, 13 Feb 2025 18:42 UTC

31 points

6 comments, 19 min read, LW link

(joecarlsmith.substack.com)

How do we solve the alignment problem?

Joe Carlsmith, 13 Feb 2025 18:27 UTC

69 points

11 comments, 9 min read, LW link

(joecarlsmith.substack.com)

Fake thinking and real thinking

Joe Carlsmith, 28 Jan 2025 20:05 UTC

117 points

17 comments, 38 min read, LW link