Arthur Conmy

Karma: 835

Interpretability

Views my own

RLHF does not appear to differentially cause mode-collapse

20 Mar 2023 15:39 UTC
95 points
9 comments · 3 min read · LW link

Three ways interpretability could be impactful

Arthur Conmy · 18 Sep 2023 1:02 UTC
47 points
8 comments · 4 min read · LW link

My best guess at the important tricks for training 1L SAEs

Arthur Conmy · 21 Dec 2023 1:59 UTC
35 points
4 comments · 3 min read · LW link

OpenAI introduce ChatGPT API at 1/10th the previous $/token

Arthur Conmy · 1 Mar 2023 20:48 UTC
28 points
4 comments · 1 min read · LW link
(openai.com)