Article Review: Discovering Latent Knowledge (Burns, Ye, et al)

Robert_AIZI · 22 Dec 2022 18:16 UTC
13 points
4 comments · 6 min read · LW link
(aizi.substack.com)

Let’s think about slowing down AI

KatjaGrace · 22 Dec 2022 17:40 UTC
546 points
183 comments · 38 min read · LW link · 3 reviews
(aiimpacts.org)

Some Notes on the mathematics of Toy Autoencoding Problems

Spencer Becker-Kahn · 22 Dec 2022 17:21 UTC
16 points
1 comment · 12 min read · LW link

December 2022 updates and fundraising

AI Impacts · 22 Dec 2022 17:20 UTC
39 points
1 comment · 3 min read · LW link
(aiimpacts.org)

Covid 12/22/22: Reevaluating Past Options

Zvi · 22 Dec 2022 16:50 UTC
30 points
2 comments · 9 min read · LW link
(thezvi.wordpress.com)

China Covid #4

Zvi · 22 Dec 2022 16:30 UTC
50 points
2 comments · 11 min read · LW link
(thezvi.wordpress.com)

Racing through a minefield: the AI deployment problem

HoldenKarnofsky · 22 Dec 2022 16:10 UTC
38 points
2 comments · 13 min read · LW link
(www.cold-takes.com)

Lead in Chocolate?

jefftk · 22 Dec 2022 16:10 UTC
41 points
6 comments · 2 min read · LW link
(www.jefftk.com)

Response to Holden’s alignment plan

Alex Flint · 22 Dec 2022 16:08 UTC
36 points
4 comments · 6 min read · LW link

Staring into the abyss as a core life skill

benkuhn · 22 Dec 2022 15:30 UTC
321 points
20 comments · 12 min read · LW link · 1 review
(www.benkuhn.net)

Secular Solstice for children

22 Dec 2022 14:33 UTC
30 points
1 comment · 3 min read · LW link

Mental acceptance and reflection

22 Dec 2022 14:32 UTC
34 points
1 comment · 2 min read · LW link

Against Diversification

Jack Malde · 22 Dec 2022 13:29 UTC
3 points
0 comments · 3 min read · LW link
(ethicaleconomist.substack.com)

Notes on Meta’s Diplomacy-Playing AI

Erich_Grunewald · 22 Dec 2022 11:34 UTC
9 points
2 comments · 14 min read · LW link
(www.erichgrunewald.com)

Take 13: RLHF bad, conditioning good.

Charlie Steiner · 22 Dec 2022 10:44 UTC
53 points
4 comments · 2 min read · LW link

Applied Linear Algebra Lecture Series

johnswentworth · 22 Dec 2022 6:57 UTC
102 points
7 comments · 1 min read · LW link

Naive Set Theory, Halmos

David Udell · 22 Dec 2022 2:34 UTC
11 points
1 comment · 8 min read · LW link

Not Getting Hacked

jefftk · 21 Dec 2022 21:40 UTC
40 points
14 comments · 7 min read · LW link
(www.jefftk.com)

Metaphor.systems

the gears to ascension · 21 Dec 2022 21:31 UTC
25 points
9 comments · 1 min read · LW link
(metaphor.systems)

[Question] How much is DQC (Dynamic Quantum Clustering) currently looked into in AI Capabilities Research?

macmillan · 21 Dec 2022 20:46 UTC
1 point
0 comments · 1 min read · LW link

Think wider about the root causes of progress

jasoncrawford · 21 Dec 2022 20:05 UTC
49 points
11 comments · 4 min read · LW link
(rootsofprogress.org)

[Question] What readings did you consider best for the happy parts of the secular solstice?

ChristianKl · 21 Dec 2022 15:45 UTC
17 points
0 comments · 1 min read · LW link

Recreating logic in type theory

Thomas Kehrenberg · 21 Dec 2022 15:19 UTC
12 points
0 comments · 13 min read · LW link

You become the UI you use

Viliam · 21 Dec 2022 15:04 UTC
21 points
7 comments · 2 min read · LW link

Price’s equation for neural networks

tailcalled · 21 Dec 2022 13:09 UTC
29 points
4 comments · 2 min read · LW link

Decisions: Ontologically Shifting to Determinism

Chris_Leong · 21 Dec 2022 12:41 UTC
8 points
11 comments · 6 min read · LW link

A Comprehensive Mechanistic Interpretability Explainer & Glossary

Neel Nanda · 21 Dec 2022 12:35 UTC
82 points
6 comments · 2 min read · LW link
(neelnanda.io)

Google Search loses to ChatGPT fair and square

shminux · 21 Dec 2022 8:11 UTC
14 points
17 comments · 1 min read · LW link
(www.surgehq.ai)

Sazen

[DEACTIVATED] Duncan Sabien · 21 Dec 2022 7:54 UTC
275 points
83 comments · 12 min read · LW link · 2 reviews

Podcast: What’s Wrong With LessWrong

Alfred · 21 Dec 2022 7:06 UTC
−32 points
11 comments · 1 min read · LW link
(youtu.be)

New AI risk intro from Vox [link post]

JakubK · 21 Dec 2022 6:00 UTC
5 points
1 comment · 2 min read · LW link
(www.vox.com)

Local Memes Against Geometric Rationality

Scott Garrabrant · 21 Dec 2022 3:53 UTC
85 points
3 comments · 6 min read · LW link

Logging Shell History in Zsh

jefftk · 21 Dec 2022 3:30 UTC
19 points
2 comments · 1 min read · LW link
(www.jefftk.com)

CIRL Corrigibility is Fragile

21 Dec 2022 1:40 UTC
58 points
9 comments · 12 min read · LW link

[Question] [DISC] Are Values Robust?

DragonGod · 21 Dec 2022 1:00 UTC
12 points
9 comments · 2 min read · LW link

Performing an SVD on a time-series matrix of gradient updates on an MNIST network produces 92.5 singular values

Garrett Baker · 21 Dec 2022 0:44 UTC
9 points
10 comments · 5 min read · LW link

Progress links and tweets, 2022-12-20

jasoncrawford · 21 Dec 2022 0:35 UTC
12 points
0 comments · 2 min read · LW link
(rootsofprogress.org)

K-complexity is silly; use cross-entropy instead

So8res · 20 Dec 2022 23:06 UTC
137 points
53 comments · 4 min read · LW link · 2 reviews

Podcast: Tamera Lanham on AI risk, threat models, alignment proposals, externalized reasoning oversight, and working at Anthropic

Akash · 20 Dec 2022 21:39 UTC
18 points
2 comments · 11 min read · LW link

Discovering Language Model Behaviors with Model-Written Evaluations

20 Dec 2022 20:08 UTC
100 points
34 comments · 1 min read · LW link
(www.anthropic.com)

Reflections: Bureaucratic Hell

Haris Rashid · 20 Dec 2022 19:22 UTC
−5 points
1 comment · 1 min read · LW link
(www.harisrab.com)

Proliferating Education

Haris Rashid · 20 Dec 2022 19:22 UTC
−1 points
2 comments · 5 min read · LW link
(www.harisrab.com)

AGI is here, but nobody wants it. Why should we even care?

MGow · 20 Dec 2022 19:14 UTC
−22 points
0 comments · 17 min read · LW link

Properties of current AIs and some predictions of the evolution of AI from the perspective of scale-free theories of agency and regulative development

Roman Leventov · 20 Dec 2022 17:13 UTC
33 points
3 comments · 36 min read · LW link

I believe some AI doomers are overconfident

FTPickle · 20 Dec 2022 17:09 UTC
8 points
15 comments · 2 min read · LW link

Note on algorithms with multiple trained components

Steven Byrnes · 20 Dec 2022 17:08 UTC
23 points
4 comments · 2 min read · LW link

Marvel Snap: Phase 2

Zvi · 20 Dec 2022 14:50 UTC
11 points
1 comment · 13 min read · LW link
(thezvi.wordpress.com)

(Extremely) Naive Gradient Hacking Doesn’t Work

ojorgensen · 20 Dec 2022 14:35 UTC
14 points
0 comments · 6 min read · LW link

An Open Agency Architecture for Safe Transformative AI

davidad · 20 Dec 2022 13:04 UTC
79 points
22 comments · 4 min read · LW link

Un­der-Ap­pre­ci­ated Ways to Use Flash­cards—Part I

Florence Hinder · 20 Dec 2022 12:43 UTC
22 points
5 comments · 5 min read · LW link
(thoughtsaver.ghost.io)