
Quintin Pope

Karma: 4,754

My Objections to “We’re All Gonna Die with Eliezer Yudkowsky”

Quintin Pope · 21 Mar 2023 0:06 UTC
355 points
224 comments · 39 min read · LW link

The shard theory of human values

4 Sep 2022 4:28 UTC
235 points
66 comments · 24 min read · LW link · 2 reviews

Evolution provides no evidence for the sharp left turn

Quintin Pope · 11 Apr 2023 18:43 UTC
193 points
62 comments · 15 min read · LW link

Quintin’s alignment papers roundup—week 1

Quintin Pope · 10 Sep 2022 6:39 UTC
120 points
6 comments · 9 min read · LW link

QAPR 5: grokking is maybe not *that* big a deal?

Quintin Pope · 23 Jul 2023 20:14 UTC
114 points
15 comments · 9 min read · LW link

Evolution is a bad analogy for AGI: inner alignment

Quintin Pope · 13 Aug 2022 22:15 UTC
78 points
15 comments · 8 min read · LW link

Research agenda: Supervising AIs improving AIs

29 Apr 2023 17:09 UTC
76 points
5 comments · 19 min read · LW link

Quintin’s alignment papers roundup—week 2

Quintin Pope · 19 Sep 2022 13:41 UTC
67 points
2 comments · 10 min read · LW link

QAPR 4: Inductive biases

Quintin Pope · 10 Oct 2022 22:08 UTC
67 points
2 comments · 18 min read · LW link

The Case for Radical Optimism about Interpretability

Quintin Pope · 16 Dec 2021 23:38 UTC
66 points
16 comments · 8 min read · LW link · 1 review

QAPR 3: interpretability-guided training of neural nets

Quintin Pope · 28 Sep 2022 16:02 UTC
58 points
2 comments · 10 min read · LW link

Meta learning to gradient hack

Quintin Pope · 1 Oct 2021 19:25 UTC
55 points
11 comments · 3 min read · LW link

Hypothesis: gradient descent prefers general circuits

Quintin Pope · 8 Feb 2022 21:12 UTC
46 points
26 comments · 11 min read · LW link

[Linkpost] A General Language Assistant as a Laboratory for Alignment

Quintin Pope · 3 Dec 2021 19:42 UTC
37 points
2 comments · 2 min read · LW link

[Question] What’s the “This AI is of moral concern.” fire alarm?

Quintin Pope · 13 Jun 2022 8:05 UTC
37 points
56 comments · 2 min read · LW link

New GPT-3 competitor

Quintin Pope · 12 Aug 2021 7:05 UTC
32 points
10 comments · 1 min read · LW link

[Question] How good is security for LessWrong and the Alignment Forum?

Quintin Pope · 4 Oct 2021 22:27 UTC
20 points
4 comments · 1 min read · LW link

[Question] Any prior work on multiagent dynamics for continuous distributions over agents?

Quintin Pope · 1 Jun 2022 18:12 UTC
15 points
2 comments · 1 min read · LW link

Idea: build alignment dataset for very capable models

Quintin Pope · 12 Feb 2022 19:30 UTC
14 points
2 comments · 3 min read · LW link

A simple way to make GPT-3 follow instructions

Quintin Pope · 8 Mar 2021 2:57 UTC
11 points
5 comments · 4 min read · LW link