paulfchristiano (Paul Christiano)

Karma: 27,036

Where I agree and disagree with Eliezer

paulfchristiano 19 Jun 2022 19:15 UTC

874 points

219 comments 18 min read LW link 2 reviews

What failure looks like

paulfchristiano 17 Mar 2019 20:18 UTC

401 points

54 comments 8 min read LW link 2 reviews

AI alignment is distinct from its near-term applications

paulfchristiano 13 Dec 2022 7:10 UTC

254 points

21 comments 2 min read LW link

(ai-alignment.com)

My views on “doom”

paulfchristiano 27 Apr 2023 17:50 UTC

242 points

33 comments 2 min read LW link

(ai-alignment.com)

Another (outer) alignment failure story

paulfchristiano 7 Apr 2021 20:12 UTC

241 points

38 comments 12 min read LW link 1 review

Thoughts on the impact of RLHF research

paulfchristiano 25 Jan 2023 17:23 UTC

236 points

101 comments 9 min read LW link

Self-driving car bets

paulfchristiano 29 Jul 2023 18:10 UTC

229 points

41 comments 5 min read LW link

(sideways-view.com)

ARC’s first technical report: Eliciting Latent Knowledge

paulfchristiano, Mark Xu and Ajeya Cotra

14 Dec 2021 20:09 UTC

225 points

90 comments 1 min read LW link 3 reviews

(docs.google.com)

Thoughts on responsible scaling policies and regulation

paulfchristiano 24 Oct 2023 22:21 UTC

214 points

33 comments 6 min read LW link

Hiring engineers and researchers to help align GPT-3

paulfchristiano 1 Oct 2020 18:54 UTC

206 points

13 comments 3 min read LW link

Thoughts on sharing information about language model capabilities

paulfchristiano 31 Jul 2023 16:04 UTC

191 points

34 comments 11 min read LW link

Announcing the Alignment Research Center

paulfchristiano 26 Apr 2021 23:30 UTC

178 points

6 comments 1 min read LW link

(ai-alignment.com)

Impossibility results for unbounded utilities

paulfchristiano 2 Feb 2022 3:52 UTC

166 points

109 comments 8 min read LW link 1 review

IMO challenge bet with Eliezer

paulfchristiano 26 Feb 2022 4:50 UTC

166 points

25 comments 3 min read LW link

Prizes for matrix completion problems

paulfchristiano 3 May 2023 23:30 UTC

163 points

51 comments 1 min read LW link

(www.alignment.org)

Secure homes for digital people

paulfchristiano 10 Oct 2021 15:50 UTC

161 points

37 comments 9 min read LW link 1 review

(sideways-view.com)

My research methodology

paulfchristiano 22 Mar 2021 21:20 UTC

159 points

38 comments 16 min read LW link 1 review

(ai-alignment.com)

Prizes for ELK proposals

paulfchristiano 3 Jan 2022 20:23 UTC

150 points

152 comments 7 min read LW link

Moral public goods

paulfchristiano 26 Jan 2020 0:10 UTC

147 points

74 comments 4 min read LW link

(sideways-view.com)

AI-Written Critiques Help Humans Notice Flaws

paulfchristiano 25 Jun 2022 17:22 UTC

137 points

5 comments 3 min read LW link

(openai.com)