Joe Carlsmith

Karma: 5,068

Senior research analyst at Open Philanthropy. Doctorate in philosophy from the University of Oxford. Opinions my own.

Video and transcript of presentation on Scheming AIs

Joe Carlsmith · Mar 22, 2024, 3:52 PM
32 points
1 comment · 32 min read · LW link

On green

Joe Carlsmith · Mar 21, 2024, 5:38 PM
269 points
35 comments · 31 min read · LW link

On the abolition of man

Joe Carlsmith · Jan 18, 2024, 6:17 PM
90 points
18 comments · 41 min read · LW link

Being nicer than Clippy

Joe Carlsmith · Jan 16, 2024, 7:44 PM
109 points
32 comments · 27 min read · LW link

An even deeper atheism

Joe Carlsmith · Jan 11, 2024, 5:28 PM
125 points
47 comments · 15 min read · LW link

Does AI risk “other” the AIs?

Joe Carlsmith · Jan 9, 2024, 5:51 PM
60 points
3 comments · 8 min read · LW link

When “yang” goes wrong

Joe Carlsmith · Jan 8, 2024, 4:35 PM
73 points
6 comments · 13 min read · LW link

Deep atheism and AI risk

Joe Carlsmith · Jan 4, 2024, 6:58 PM
153 points
22 comments · 27 min read · LW link

Gentleness and the artificial Other

Joe Carlsmith · Jan 2, 2024, 6:21 PM
313 points
33 comments · 11 min read · LW link

Otherness and control in the age of AGI

Joe Carlsmith · Jan 2, 2024, 6:15 PM
43 points
0 comments · 7 min read · LW link

Empirical work that might shed light on scheming (Section 6 of “Scheming AIs”)

Joe Carlsmith · Dec 11, 2023, 4:30 PM
8 points
0 comments · 21 min read · LW link

Summing up “Scheming AIs” (Section 5)

Joe Carlsmith · Dec 9, 2023, 3:48 PM
2 points
1 comment · 11 min read · LW link

Speed arguments against scheming (Section 4.4-4.7 of “Scheming AIs”)

Joe Carlsmith · Dec 8, 2023, 9:09 PM
9 points
0 comments · 15 min read · LW link

Simplicity arguments for scheming (Section 4.3 of “Scheming AIs”)

Joe Carlsmith · Dec 7, 2023, 3:05 PM
10 points
1 comment · 19 min read · LW link

The counting argument for scheming (Sections 4.1 and 4.2 of “Scheming AIs”)

Joe Carlsmith · Dec 6, 2023, 7:28 PM
10 points
0 comments · 10 min read · LW link

Arguments for/against scheming that focus on the path SGD takes (Section 3 of “Scheming AIs”)

Joe Carlsmith · Dec 5, 2023, 6:48 PM
10 points
0 comments · 23 min read · LW link

Non-classic stories about scheming (Section 2.3.2 of “Scheming AIs”)

Joe Carlsmith · Dec 4, 2023, 6:44 PM
9 points
0 comments · 20 min read · LW link

Does scheming lead to adequate future empowerment? (Section 2.3.1.2 of “Scheming AIs”)

Joe Carlsmith · Dec 3, 2023, 6:32 PM
9 points
0 comments · 17 min read · LW link

The goal-guarding hypothesis (Section 2.3.1.1 of “Scheming AIs”)

Joe Carlsmith · Dec 2, 2023, 3:20 PM
8 points
1 comment · 15 min read · LW link

How useful for alignment-relevant work are AIs with short-term goals? (Section 2.2.4.3 of “Scheming AIs”)

Joe Carlsmith · Dec 1, 2023, 2:51 PM
10 points
1 comment · 7 min read · LW link