CAIS Philosophy Fellowship Midpoint Deliverables

Conceptual AI safety researchers aim to help orient the broader field of AI safety, but in doing so they must wrestle with imprecise, nebulous, hard-to-define problems. Philosophers specialize in dealing with problems like these. The CAIS Philosophy Fellowship supports PhD students, postdocs, and professors of philosophy in producing novel conceptual AI safety research.

This sequence collects drafts written by the CAIS Philosophy Fellows, shared here to elicit feedback.

Instrumental Convergence? [Draft]

The Polarity Problem [Draft]

Shutdown-Seeking AI

Is Deontological AI Safe? [Feedback Draft]

There are no coherence theorems

Aggregating Utilities for Corrigible AI [Feedback Draft]

AI Will Not Want to Self-Improve

Group Prioritarianism: Why AI Should Not Replace Humanity [draft]

Language Agents Reduce the Risk of Existential Catastrophe