Joe Carlsmith

Karma: 5,068

Senior research analyst at Open Philanthropy. Doctorate in philosophy from the University of Oxford. Opinions my own.

Video and transcript of presentation on Scheming AIs

Joe Carlsmith · Mar 22, 2024, 3:52 PM

32 points

1 comment · 32 min read

On green

Joe Carlsmith · Mar 21, 2024, 5:38 PM

269 points

35 comments · 31 min read

On the abolition of man

Joe Carlsmith · Jan 18, 2024, 6:17 PM

90 points

18 comments · 41 min read

Being nicer than Clippy

Joe Carlsmith · Jan 16, 2024, 7:44 PM

109 points

32 comments · 27 min read

An even deeper atheism

Joe Carlsmith · Jan 11, 2024, 5:28 PM

125 points

47 comments · 15 min read

Does AI risk “other” the AIs?

Joe Carlsmith · Jan 9, 2024, 5:51 PM

60 points

3 comments · 8 min read

When “yang” goes wrong

Joe Carlsmith · Jan 8, 2024, 4:35 PM

73 points

6 comments · 13 min read

Deep atheism and AI risk

Joe Carlsmith · Jan 4, 2024, 6:58 PM

153 points

22 comments · 27 min read

Gentleness and the artificial Other

Joe Carlsmith · Jan 2, 2024, 6:21 PM

313 points

33 comments · 11 min read

Otherness and control in the age of AGI

Joe Carlsmith · Jan 2, 2024, 6:15 PM

43 points

0 comments · 7 min read

Empirical work that might shed light on scheming (Section 6 of “Scheming AIs”)

Joe Carlsmith · Dec 11, 2023, 4:30 PM

8 points

0 comments · 21 min read

Summing up “Scheming AIs” (Section 5)

Joe Carlsmith · Dec 9, 2023, 3:48 PM

2 points

1 comment · 11 min read

Speed arguments against scheming (Section 4.4-4.7 of “Scheming AIs”)

Joe Carlsmith · Dec 8, 2023, 9:09 PM

9 points

0 comments · 15 min read

Simplicity arguments for scheming (Section 4.3 of “Scheming AIs”)

Joe Carlsmith · Dec 7, 2023, 3:05 PM

10 points

1 comment · 19 min read

The counting argument for scheming (Sections 4.1 and 4.2 of “Scheming AIs”)

Joe Carlsmith · Dec 6, 2023, 7:28 PM

10 points

0 comments · 10 min read

Arguments for/against scheming that focus on the path SGD takes (Section 3 of “Scheming AIs”)

Joe Carlsmith · Dec 5, 2023, 6:48 PM

10 points

0 comments · 23 min read

Non-classic stories about scheming (Section 2.3.2 of “Scheming AIs”)

Joe Carlsmith · Dec 4, 2023, 6:44 PM

9 points

0 comments · 20 min read

Does scheming lead to adequate future empowerment? (Section 2.3.1.2 of “Scheming AIs”)

Joe Carlsmith · Dec 3, 2023, 6:32 PM

9 points

0 comments · 17 min read

The goal-guarding hypothesis (Section 2.3.1.1 of “Scheming AIs”)

Joe Carlsmith · Dec 2, 2023, 3:20 PM

8 points

1 comment · 15 min read

How useful for alignment-relevant work are AIs with short-term goals? (Section 2.2.4.3 of “Scheming AIs”)

Joe Carlsmith · Dec 1, 2023, 2:51 PM

10 points

1 comment · 7 min read