Arthur Conmy

Karma: 1,031

Interpretability

Views my own

RLHF does not appear to differentially cause mode-collapse

Arthur Conmy and beren

20 Mar 2023 15:39 UTC

95 points

9 comments · 3 min read · LW link

Three ways interpretability could be impactful

Arthur Conmy

18 Sep 2023 1:02 UTC

47 points

8 comments · 4 min read · LW link

My best guess at the important tricks for training 1L SAEs

Arthur Conmy

21 Dec 2023 1:59 UTC

35 points

4 comments · 3 min read · LW link

OpenAI introduce ChatGPT API at 1/10th the previous $/token

Arthur Conmy

1 Mar 2023 20:48 UTC

28 points

4 comments · 1 min read · LW link

(openai.com)