paulfchristiano (Paul Christiano)

Karma: 27,036

Where I agree and disagree with Eliezer

paulfchristiano · 19 Jun 2022 19:15 UTC
874 points
219 comments · 18 min read · LW link · 2 reviews

What failure looks like

paulfchristiano · 17 Mar 2019 20:18 UTC
401 points
54 comments · 8 min read · LW link · 2 reviews

AI alignment is distinct from its near-term applications

paulfchristiano · 13 Dec 2022 7:10 UTC
254 points
21 comments · 2 min read · LW link
(ai-alignment.com)

My views on “doom”

paulfchristiano · 27 Apr 2023 17:50 UTC
242 points
33 comments · 2 min read · LW link
(ai-alignment.com)

Another (outer) alignment failure story

paulfchristiano · 7 Apr 2021 20:12 UTC
241 points
38 comments · 12 min read · LW link · 1 review

Thoughts on the impact of RLHF research

paulfchristiano · 25 Jan 2023 17:23 UTC
236 points
101 comments · 9 min read · LW link

Self-driving car bets

paulfchristiano · 29 Jul 2023 18:10 UTC
229 points
41 comments · 5 min read · LW link
(sideways-view.com)

ARC’s first technical report: Eliciting Latent Knowledge

14 Dec 2021 20:09 UTC
225 points
90 comments · 1 min read · LW link · 3 reviews
(docs.google.com)

Thoughts on responsible scaling policies and regulation

paulfchristiano · 24 Oct 2023 22:21 UTC
214 points
33 comments · 6 min read · LW link

Hiring engineers and researchers to help align GPT-3

paulfchristiano · 1 Oct 2020 18:54 UTC
206 points
13 comments · 3 min read · LW link

Thoughts on sharing information about language model capabilities

paulfchristiano · 31 Jul 2023 16:04 UTC
191 points
34 comments · 11 min read · LW link

Announcing the Alignment Research Center

paulfchristiano · 26 Apr 2021 23:30 UTC
178 points
6 comments · 1 min read · LW link
(ai-alignment.com)

Impossibility results for unbounded utilities

paulfchristiano · 2 Feb 2022 3:52 UTC
166 points
109 comments · 8 min read · LW link · 1 review

IMO challenge bet with Eliezer

paulfchristiano · 26 Feb 2022 4:50 UTC
166 points
25 comments · 3 min read · LW link

Prizes for matrix completion problems

paulfchristiano · 3 May 2023 23:30 UTC
163 points
51 comments · 1 min read · LW link
(www.alignment.org)

Secure homes for digital people

paulfchristiano · 10 Oct 2021 15:50 UTC
161 points
37 comments · 9 min read · LW link · 1 review
(sideways-view.com)

My research methodology

paulfchristiano · 22 Mar 2021 21:20 UTC
159 points
38 comments · 16 min read · LW link · 1 review
(ai-alignment.com)

Prizes for ELK proposals

paulfchristiano · 3 Jan 2022 20:23 UTC
150 points
152 comments · 7 min read · LW link

Moral public goods

paulfchristiano · 26 Jan 2020 0:10 UTC
147 points
74 comments · 4 min read · LW link
(sideways-view.com)

AI-Written Critiques Help Humans Notice Flaws

paulfchristiano · 25 Jun 2022 17:22 UTC
137 points
5 comments · 3 min read · LW link
(openai.com)