
Quintin Pope

Karma: 4,754

My Objections to “We’re All Gonna Die with Eliezer Yudkowsky”

Quintin Pope · 21 Mar 2023 0:06 UTC
355 points
224 comments · 39 min read · LW link

The shard theory of human values

4 Sep 2022 4:28 UTC
235 points
66 comments · 24 min read · LW link · 2 reviews

Evolution provides no evidence for the sharp left turn

Quintin Pope · 11 Apr 2023 18:43 UTC
193 points
62 comments · 15 min read · LW link

Quintin’s alignment papers roundup—week 1

Quintin Pope · 10 Sep 2022 6:39 UTC
120 points
6 comments · 9 min read · LW link

QAPR 5: grokking is maybe not *that* big a deal?

Quintin Pope · 23 Jul 2023 20:14 UTC
114 points
15 comments · 9 min read · LW link

Evolution is a bad analogy for AGI: inner alignment

Quintin Pope · 13 Aug 2022 22:15 UTC
78 points
15 comments · 8 min read · LW link

Research agenda: Supervising AIs improving AIs

29 Apr 2023 17:09 UTC
76 points
5 comments · 19 min read · LW link

Quintin’s alignment papers roundup—week 2

Quintin Pope · 19 Sep 2022 13:41 UTC
67 points
2 comments · 10 min read · LW link

QAPR 4: Inductive biases

Quintin Pope · 10 Oct 2022 22:08 UTC
67 points
2 comments · 18 min read · LW link

The Case for Radical Optimism about Interpretability

Quintin Pope · 16 Dec 2021 23:38 UTC
66 points
16 comments · 8 min read · LW link · 1 review

QAPR 3: interpretability-guided training of neural nets

Quintin Pope · 28 Sep 2022 16:02 UTC
58 points
2 comments · 10 min read · LW link

Meta learning to gradient hack

Quintin Pope · 1 Oct 2021 19:25 UTC
55 points
11 comments · 3 min read · LW link

Hypothesis: gradient descent prefers general circuits

Quintin Pope · 8 Feb 2022 21:12 UTC
46 points
26 comments · 11 min read · LW link

[Linkpost] A General Language Assistant as a Laboratory for Alignment

Quintin Pope · 3 Dec 2021 19:42 UTC
37 points
2 comments · 2 min read · LW link

[Question] What’s the “This AI is of moral concern.” fire alarm?

Quintin Pope · 13 Jun 2022 8:05 UTC
37 points
56 comments · 2 min read · LW link

New GPT-3 competitor

Quintin Pope · 12 Aug 2021 7:05 UTC
32 points
10 comments · 1 min read · LW link

[Question] How good is security for LessWrong and the Alignment Forum?

Quintin Pope · 4 Oct 2021 22:27 UTC
20 points
4 comments · 1 min read · LW link

[Question] Any prior work on multiagent dynamics for continuous distributions over agents?

Quintin Pope · 1 Jun 2022 18:12 UTC
15 points
2 comments · 1 min read · LW link

Idea: build alignment dataset for very capable models

Quintin Pope · 12 Feb 2022 19:30 UTC
14 points
2 comments · 3 min read · LW link

A simple way to make GPT-3 follow instructions

Quintin Pope · 8 Mar 2021 2:57 UTC
11 points
5 comments · 4 min read · LW link