paulfchristiano (Paul Christiano)

Karma: 24,593

Prizes for matrix completion problems

paulfchristiano · 3 May 2023 23:30 UTC
154 points
41 comments · 1 min read · LW link
(www.alignment.org)

My views on “doom”

paulfchristiano · 27 Apr 2023 17:50 UTC
232 points
31 comments · 2 min read · LW link
(ai-alignment.com)

Christiano (ARC) and GA (Conjecture) Discuss Alignment Cruxes

24 Feb 2023 23:03 UTC
64 points
7 comments · 47 min read · LW link

Thoughts on the impact of RLHF research

paulfchristiano · 25 Jan 2023 17:23 UTC
231 points
101 comments · 9 min read · LW link

Can we efficiently distinguish different mechanisms?

paulfchristiano · 27 Dec 2022 0:20 UTC
86 points
30 comments · 16 min read · LW link
(ai-alignment.com)

Three reasons to cooperate

paulfchristiano · 24 Dec 2022 17:40 UTC
78 points
14 comments · 10 min read · LW link
(sideways-view.com)

Can we efficiently explain model behaviors?

paulfchristiano · 16 Dec 2022 19:40 UTC
64 points
3 comments · 9 min read · LW link
(ai-alignment.com)

AI alignment is distinct from its near-term applications

paulfchristiano · 13 Dec 2022 7:10 UTC
253 points
21 comments · 2 min read · LW link
(ai-alignment.com)

Finding gliders in the game of life

paulfchristiano · 1 Dec 2022 20:40 UTC
94 points
7 comments · 16 min read · LW link
(ai-alignment.com)

Mechanistic anomaly detection and ELK

paulfchristiano · 25 Nov 2022 18:50 UTC
132 points
18 comments · 21 min read · LW link
(ai-alignment.com)

Decision theory and dynamic inconsistency

paulfchristiano · 3 Jul 2022 22:20 UTC
71 points
33 comments · 10 min read · LW link
(sideways-view.com)

AI-Written Critiques Help Humans Notice Flaws

paulfchristiano · 25 Jun 2022 17:22 UTC
137 points
5 comments · 3 min read · LW link
(openai.com)

Where I agree and disagree with Eliezer

paulfchristiano · 19 Jun 2022 19:15 UTC
838 points
212 comments · 20 min read · LW link

What is causality to an evidential decision theorist?

paulfchristiano · 17 Apr 2022 16:00 UTC
45 points
26 comments · 5 min read · LW link
(sideways-view.com)

ELK prize results

9 Mar 2022 0:01 UTC
133 points
50 comments · 21 min read · LW link

IMO challenge bet with Eliezer

paulfchristiano · 26 Feb 2022 4:50 UTC
163 points
25 comments · 3 min read · LW link

Better impossibility result for unbounded utilities

paulfchristiano · 9 Feb 2022 6:10 UTC
30 points
22 comments · 5 min read · LW link

Impossibility results for unbounded utilities

paulfchristiano · 2 Feb 2022 3:52 UTC
157 points
103 comments · 8 min read · LW link

ELK First Round Contest Winners

26 Jan 2022 2:56 UTC
63 points
6 comments · 1 min read · LW link

Apply for research internships at ARC!

paulfchristiano · 3 Jan 2022 20:26 UTC
61 points
0 comments · 1 min read · LW link