A strange twist on the road to AGI

cveres, 12 Oct 2022 23:27 UTC
−8 points
0 comments · 1 min read · LW link

Help out Redwood Research’s interpretability team by finding heuristics implemented by GPT-2 small

12 Oct 2022 21:25 UTC
50 points
11 comments · 4 min read · LW link

Towards a comprehensive study of potential psychological causes of the ordinary range of variation of affective gender identity in males

tailcalled, 12 Oct 2022 21:10 UTC
52 points
4 comments · 37 min read · LW link

Six (and a half) intuitions for KL divergence

CallumMcDougall, 12 Oct 2022 21:07 UTC
154 points
25 comments · 10 min read · LW link · 1 review
(www.perfectlynormal.co.uk)

[MLSN #6]: Transparency survey, provable robustness, ML models that predict the future

Dan H, 12 Oct 2022 20:56 UTC
27 points
0 comments · 6 min read · LW link

[Question] Previous Work on Recreating Neural Network Input from Intermediate Layer Activations

bglass, 12 Oct 2022 19:28 UTC
1 point
3 comments · 1 min read · LW link

Be more effective by learning important practical knowledge using flashcards

Stenemo, 12 Oct 2022 18:05 UTC
5 points
2 comments · 1 min read · LW link

Article Review: Google’s AlphaTensor

Robert_AIZI, 12 Oct 2022 18:04 UTC
8 points
4 comments · 10 min read · LW link

Alignment 201 curriculum

Richard_Ngo, 12 Oct 2022 18:03 UTC
102 points
3 comments · 1 min read · LW link
(www.agisafetyfundamentals.com)

Progress links and tweets, 2022-10-12

jasoncrawford, 12 Oct 2022 16:59 UTC
8 points
0 comments · 1 min read · LW link
(rootsofprogress.org)

Building a transformer from scratch—AI safety up-skilling challenge

Marius Hobbhahn, 12 Oct 2022 15:40 UTC
42 points
1 comment · 5 min read · LW link

some simulation hypotheses

Tamsin Leake, 12 Oct 2022 13:34 UTC
13 points
3 comments · 5 min read · LW link
(carado.moe)

Instrumental convergence in single-agent systems

12 Oct 2022 12:24 UTC
31 points
4 comments · 8 min read · LW link
(www.gladstone.ai)

Singapore—Small casual dinner in Chinatown #5

Joe Rocca, 12 Oct 2022 8:59 UTC
3 points
1 comment · 1 min read · LW link

A game of mattering

KatjaGrace, 12 Oct 2022 8:50 UTC
28 points
2 comments · 5 min read · LW link
(worldspiritsockpuppet.com)

Calibration of a thousand predictions

KatjaGrace, 12 Oct 2022 8:50 UTC
57 points
7 comments · 5 min read · LW link
(worldspiritsockpuppet.com)

My argument against AGI

cveres, 12 Oct 2022 6:33 UTC
7 points
5 comments · 1 min read · LW link

Actually, All Nuclear Famine Papers are Bunk

Lao Mein, 12 Oct 2022 5:58 UTC
113 points
37 comments · 2 min read · LW link · 1 review

Contingency is not arbitrary

Gordon Seidoh Worley, 12 Oct 2022 4:35 UTC
13 points
0 comments · 3 min read · LW link

That one apocalyptic nuclear famine paper is bunk

Lao Mein, 12 Oct 2022 3:33 UTC
110 points
10 comments · 1 min read · LW link

AstralCodexTen and Rationality Meetup Organisers’ Retreat Asia Pacific region

12 Oct 2022 3:20 UTC
14 points
4 comments · 2 min read · LW link

Abbots Bromley Horn Dance History

jefftk, 12 Oct 2022 2:10 UTC
11 points
0 comments · 2 min read · LW link
(www.jefftk.com)

Power-Seeking AI and Existential Risk

Antonio Franca, 11 Oct 2022 22:50 UTC
6 points
0 comments · 9 min read · LW link

From technocracy to the counterculture

jasoncrawford, 11 Oct 2022 19:37 UTC
28 points
1 comment · 26 min read · LW link
(rootsofprogress.org)

Prettified AI Safety Game Cards

abramdemski, 11 Oct 2022 19:35 UTC
47 points
6 comments · 1 min read · LW link

On the proper piloting of flesh shoots

Mordecai Weynberg, 11 Oct 2022 18:52 UTC
−4 points
6 comments · 1 min read · LW link

Why I think nuclear war triggered by Russian tactical nukes in Ukraine is unlikely

Dave Orr, 11 Oct 2022 18:30 UTC
50 points
7 comments · 3 min read · LW link

Anonymous advice: If you want to reduce AI risk, should you take roles that advance AI capabilities?

Benjamin Hilton, 11 Oct 2022 14:16 UTC
54 points
9 comments · 1 min read · LW link

Misalignment Harms Can Be Caused by Low Intelligence Systems

DialecticEel, 11 Oct 2022 13:39 UTC
11 points
3 comments · 1 min read · LW link

[Sketch] Validity Criterion for Logical Counterfactuals

DragonGod, 11 Oct 2022 13:31 UTC
6 points
0 comments · 4 min read · LW link

[Question] How much does the risk of dying from nuclear war differ within and between countries?

amarai, 11 Oct 2022 11:55 UTC
4 points
7 comments · 1 min read · LW link

Did you enjoy Ramez Naam’s “Nexus” trilogy? Check out this interview on neurotech and the law.

fowlertm, 11 Oct 2022 11:10 UTC
5 points
0 comments · 1 min read · LW link

What “The Message” Was For Me

Alex Beyman, 11 Oct 2022 8:08 UTC
−3 points
14 comments · 4 min read · LW link

Updates and Clarifications

SD Marlow, 11 Oct 2022 5:34 UTC
−5 points
1 comment · 1 min read · LW link

What if human reasoning is anti-inductive?

Q Home, 11 Oct 2022 5:15 UTC
4 points
2 comments · 13 min read · LW link

Fullness to Indicate Cleanliness

jefftk, 11 Oct 2022 0:40 UTC
9 points
12 comments · 1 min read · LW link
(www.jefftk.com)

[Question] What happened to the annual LW demographic surveys?

ROM, 11 Oct 2022 0:19 UTC
5 points
2 comments · 1 min read · LW link

EA & LW Forums Weekly Summary (26 Sep − 9 Oct 22′)

Zoe Williams, 10 Oct 2022 23:58 UTC
13 points
2 comments · 1 min read · LW link

Don’t expect AGI anytime soon

cveres, 10 Oct 2022 22:38 UTC
−14 points
6 comments · 1 min read · LW link

QAPR 4: Inductive biases

Quintin Pope, 10 Oct 2022 22:08 UTC
67 points
2 comments · 18 min read · LW link

Apollo

Jarred Filmer, 10 Oct 2022 21:30 UTC
46 points
0 comments · 3 min read · LW link

[Question] Does biology reliably find the global maximum, or at least get close?

Noosphere89, 10 Oct 2022 20:55 UTC
24 points
70 comments · 1 min read · LW link

Disentangling inner alignment failures

Erik Jenner, 10 Oct 2022 18:50 UTC
20 points
5 comments · 4 min read · LW link

ACX meetup [October]

sallatik, 10 Oct 2022 17:23 UTC
1 point
0 comments · 1 min read · LW link

Natural Categories Update

Logan Zoellner, 10 Oct 2022 15:19 UTC
33 points
6 comments · 2 min read · LW link

When reporting AI timelines, be clear who you’re deferring to

Sam Clarke, 10 Oct 2022 14:24 UTC
38 points
6 comments · 1 min read · LW link

Why Balsa Research is Worthwhile

Zvi, 10 Oct 2022 13:50 UTC
87 points
12 comments · 8 min read · LW link
(thezvi.wordpress.com)

Lessons learned from talking to >100 academics about AI safety

Marius Hobbhahn, 10 Oct 2022 13:16 UTC
214 points
17 comments · 12 min read · LW link · 1 review

We can do better than argmax

Jan_Kulveit, 10 Oct 2022 10:32 UTC
48 points
4 comments · 1 min read · LW link

Vegetarianism and depression

Maggy, 10 Oct 2022 9:11 UTC
2 points
2 comments · 1 min read · LW link