Exploring toy neural nets under node removal. Section 1.

Donald Hobson · 13 Apr 2022 23:30 UTC
12 points
7 comments · 8 min read · LW link

Make a Movie Showing Alignment Failures

Logan Riggs · 13 Apr 2022 21:54 UTC
75 points
11 comments · 2 min read · LW link

Summary: “How to Do Research” by OSP’s Red

Pablo Repetto · 13 Apr 2022 19:46 UTC
9 points
0 comments · 3 min read · LW link
(pabloernesto.github.io)

A Quick Guide to Confronting Doom

Ruby · 13 Apr 2022 19:30 UTC
240 points
33 comments · 2 min read · LW link

Design, Implement and Verify

rwallace · 13 Apr 2022 18:14 UTC
32 points
13 comments · 4 min read · LW link

Takeoff speeds have a huge effect on what it means to work on AI x-risk

Buck · 13 Apr 2022 17:38 UTC
139 points
27 comments · 2 min read · LW link · 2 reviews

Budapest Meetup

Richard Horvath · 13 Apr 2022 17:23 UTC
2 points
0 comments · 1 min read · LW link

[Question] What to include in a guest lecture on existential risks from AI?

Aryeh Englander · 13 Apr 2022 17:03 UTC
20 points
9 comments · 1 min read · LW link

Common Knowledge is a Circle Game for Toddlers

ryan_b · 13 Apr 2022 15:24 UTC
58 points
1 comment · 1 min read · LW link

Another list of theories of impact for interpretability

Beth Barnes · 13 Apr 2022 13:29 UTC
33 points
1 comment · 5 min read · LW link

The Cage of the Language

Martin Sustrik · 13 Apr 2022 5:20 UTC
53 points
19 comments · 2 min read · LW link

[Question] What’s a good probability distribution family (e.g. “log-normal”) to use for AGI timelines?

David Scott Krueger (formerly: capybaralet) · 13 Apr 2022 4:45 UTC
9 points
11 comments · 1 min read · LW link

How dath ilan coordinates around solving alignment

Thomas Kwa · 13 Apr 2022 4:22 UTC
64 points
45 comments · 5 min read · LW link

What more compute does for brain-like models: response to Rohin

Nathan Helm-Burger · 13 Apr 2022 3:40 UTC
22 points
14 comments · 12 min read · LW link

[Question] “Fragility of Value” vs. LLMs

Not Relevant · 13 Apr 2022 2:02 UTC
34 points
33 comments · 1 min read · LW link

The Peerless

Tamsin Leake · 13 Apr 2022 1:07 UTC
18 points
2 comments · 1 min read · LW link
(carado.moe)

Commensurable Scientific Paradigms; or, computable induction

samshap · 13 Apr 2022 0:01 UTC
14 points
0 comments · 5 min read · LW link

Convincing People of Alignment with Street Epistemology

Logan Riggs · 12 Apr 2022 23:43 UTC
54 points
4 comments · 3 min read · LW link

Useful Vices for Wicked Problems

HoldenKarnofsky · 12 Apr 2022 19:30 UTC
64 points
2 comments · 17 min read · LW link · 1 review
(www.cold-takes.com)

SSC/ACX, San Diego, Schelling Point, Meetups Everywhere

CitizenTen · 12 Apr 2022 18:50 UTC
2 points
0 comments · 1 min read · LW link

SSC/ACX San Diego Rock Climbing

CitizenTen · 12 Apr 2022 18:46 UTC
2 points
0 comments · 1 min read · LW link

[Question] Does the rationalist community have a membership funnel?

Alex_Altair · 12 Apr 2022 18:44 UTC
37 points
17 comments · 1 min read · LW link

A Small Negative Result on Debate

Sam Bowman · 12 Apr 2022 18:19 UTC
42 points
11 comments · 1 min read · LW link

US Taxes: Adjust Withholding When Donating?

jefftk · 12 Apr 2022 15:50 UTC
15 points
1 comment · 1 min read · LW link
(www.jefftk.com)

Introducing Effective Self-Help

Ben Williamson · 12 Apr 2022 15:01 UTC
18 points
0 comments · 16 min read · LW link

Ukraine Post #10: Next Phase

Zvi · 12 Apr 2022 13:40 UTC
45 points
13 comments · 14 min read · LW link
(thezvi.wordpress.com)

Is technical AI alignment research a net positive?

cranberry_bear · 12 Apr 2022 13:07 UTC
6 points
2 comments · 2 min read · LW link

[Question] What is your advice for elder care, particularly taking care of dementia patients?

JohannWolfgang · 12 Apr 2022 11:33 UTC
4 points
6 comments · 1 min read · LW link

Reward model hacking as a challenge for reward learning

Erik Jenner · 12 Apr 2022 9:39 UTC
25 points
1 comment · 9 min read · LW link

How I use Anki: expanding the scope of SRS

CallumMcDougall · 12 Apr 2022 8:28 UTC
36 points
8 comments · 19 min read · LW link

[Question] What do you think will most probably happen to our consciousness when our simulation ends?

ArtMi · 12 Apr 2022 8:23 UTC
1 point
5 comments · 1 min read · LW link

Favorites & Performers

Soma · 12 Apr 2022 5:50 UTC
9 points
0 comments · 1 min read · LW link

A broad basin of attraction around human values?

Wei Dai · 12 Apr 2022 5:15 UTC
109 points
17 comments · 2 min read · LW link

AI governance student hackathon on Saturday, April 23: register now!

mic · 12 Apr 2022 4:48 UTC
14 points
0 comments · 1 min read · LW link

The Platonist’s Dilemma: A Remix on the Prisoner’s.

James Camacho · 12 Apr 2022 3:49 UTC
5 points
2 comments · 5 min read · LW link

[Question] Three questions about mesa-optimizers

Eric Neyman · 12 Apr 2022 2:58 UTC
24 points
5 comments · 3 min read · LW link

The Amish

PeterMcCluskey · 12 Apr 2022 2:54 UTC
49 points
5 comments · 6 min read · LW link
(www.bayesianinvestor.com)

Rationalist Should Win. Not Dying with Dignity and Funding WBE.

CitizenTen · 12 Apr 2022 2:14 UTC
32 points
14 comments · 5 min read · LW link

[Question] How can I determine that Elicit is not some weak AGI’s attempt at taking over the world ?

Lucie Philippon · 12 Apr 2022 0:54 UTC
5 points
3 comments · 1 min read · LW link

Summary: “How to Write Quickly...” by John Wentworth

Pablo Repetto · 11 Apr 2022 23:26 UTC
4 points
0 comments · 2 min read · LW link
(pabloernesto.github.io)

Rambling thoughts on having multiple selves

cranberry_bear · 11 Apr 2022 22:43 UTC
15 points
1 comment · 3 min read · LW link

An AI-in-a-box success model

azsantosk · 11 Apr 2022 22:28 UTC
16 points
1 comment · 10 min read · LW link

The Regulatory Option: A response to near 0% survival odds

Matthew Lowenstein · 11 Apr 2022 22:00 UTC
46 points
21 comments · 6 min read · LW link

The Efficient LessWrong Hypothesis - Stock Investing Competition

MrThink · 11 Apr 2022 20:43 UTC
30 points
35 comments · 2 min read · LW link

Review: Structure and Interpretation of Computer Programs

L Rudolf L · 11 Apr 2022 20:27 UTC
16 points
9 comments · 10 min read · LW link
(www.strataoftheworld.com)

[Question] Underappreciated content on LessWrong

Ege Erdil · 11 Apr 2022 17:40 UTC
22 points
5 comments · 1 min read · LW link

Editing Advice for LessWrong Users

JustisMills · 11 Apr 2022 16:32 UTC
231 points
14 comments · 6 min read · LW link · 1 review

Post-history is written by the martyrs

Veedrac · 11 Apr 2022 15:45 UTC
50 points
2 comments · 19 min read · LW link
(www.royalroad.com)

What Chords Do You Need?

jefftk · 11 Apr 2022 15:00 UTC
11 points
0 comments · 3 min read · LW link
(www.jefftk.com)

What can people not smart/technical/“competent” enough for AI research/AI risk work do to reduce AI-risk/maximize AI safety? (which is most people?)

Alex K. Chen (parrot) · 11 Apr 2022 14:05 UTC
7 points
3 comments · 3 min read · LW link