beren (Beren Millidge)

Karma: 1,975

Interested in many things. I have a personal blog at https://www.beren.io/

The Singular Value Decompositions of Transformer Weight Matrices are Highly Interpretable

28 Nov 2022 12:54 UTC
183 points
31 comments, 31 min read

Gradient hacking is extremely difficult

beren, 24 Jan 2023 15:45 UTC
146 points
19 comments, 5 min read

Basic Facts about Language Model Internals

4 Jan 2023 13:01 UTC
122 points
18 comments, 9 min read

Deep learning models might be secretly (almost) linear

beren, 24 Apr 2023 18:43 UTC
98 points
20 comments, 4 min read

Deconfusing Direct vs Amortised Optimization

beren, 2 Dec 2022 11:30 UTC
92 points
14 comments, 10 min read

Basic facts about language models during training

beren, 21 Feb 2023 11:46 UTC
85 points
14 comments, 18 min read

Scaffolded LLMs as natural language computers

beren, 12 Apr 2023 10:47 UTC
78 points
9 comments, 11 min read

The surprising parameter efficiency of vision models

beren, 8 Apr 2023 19:44 UTC
73 points
28 comments, 4 min read

The Computational Anatomy of Human Values

beren, 6 Apr 2023 10:33 UTC
63 points
30 comments, 30 min read

Against ubiquitous alignment taxes

beren, 6 Mar 2023 19:50 UTC
56 points
10 comments, 2 min read

The case for removing alignment and ML research from the training dataset

beren, 30 May 2023 20:54 UTC
46 points
8 comments, 5 min read

Scaling laws vs individual differences

beren, 10 Jan 2023 13:22 UTC
42 points
21 comments, 7 min read

Empathy as a natural consequence of learnt reward models

beren, 4 Feb 2023 15:35 UTC
38 points
27 comments, 13 min read

Human sexuality as an interesting case study of alignment

beren, 30 Dec 2022 13:37 UTC
38 points
26 comments, 3 min read

An ML interpretation of Shard Theory

beren, 3 Jan 2023 20:30 UTC
38 points
5 comments, 4 min read

The ultimate limits of alignment will determine the shape of the long term future

beren, 2 Jan 2023 12:47 UTC
34 points
2 comments, 6 min read

Orthogonality is expensive

beren, 3 Apr 2023 10:20 UTC
33 points
8 comments, 3 min read

AGI will have learnt utility functions

beren, 25 Jan 2023 19:42 UTC
33 points
3 comments, 13 min read

Evidence on recursive self-improvement from current ML

beren, 30 Dec 2022 20:53 UTC
31 points
12 comments, 6 min read

Addendum: basic facts about language models during training

beren, 6 Mar 2023 19:24 UTC
22 points
2 comments, 5 min read