
Chain-of-thought Alignment

Last edit: 23 Oct 2022 19:13 UTC by elifland

Posts on aligning language models via their chain-of-thought reasoning, including externalized reasoning oversight and risks such as steganographic reasoning.

Podcast: Tamera Lanham on AI risk, threat models, alignment proposals, externalized reasoning oversight, and working at Anthropic

Akash · 20 Dec 2022 21:39 UTC · 18 points · 2 comments · 11 min read · LW link

Externalized reasoning oversight: a research direction for language model alignment

tamera · 3 Aug 2022 12:03 UTC · 106 points · 22 comments · 6 min read · LW link

Steganography in Chain of Thought Reasoning

A Ray · 8 Aug 2022 3:47 UTC · 50 points · 13 comments · 6 min read · LW link

Distilled Representations Research Agenda

18 Oct 2022 20:59 UTC · 15 points · 2 comments · 8 min read · LW link

[Question] Impact of "'Let's think step by step' is all you need"?

yrimon · 24 Jul 2022 20:59 UTC · 20 points · 2 comments · 1 min read · LW link

Paper: Large Language Models Can Self-improve [Linkpost]

Evan R. Murphy · 2 Oct 2022 1:29 UTC · 52 points · 14 comments · 1 min read · LW link (openreview.net)

[ASoT] Simulators show us behavioural properties by default

Jozdien · 13 Jan 2023 18:42 UTC · 26 points · 1 comment · 3 min read · LW link

Imitation Learning from Language Feedback

30 Mar 2023 14:11 UTC · 45 points · 1 comment · 10 min read · LW link