Exploring toy neural nets under node removal. Section 1.

Donald Hobson · Apr 13, 2022, 11:30 PM
12 points
7 comments · 8 min read · LW link

Make a Movie Showing Alignment Failures

Logan Riggs · Apr 13, 2022, 9:54 PM
75 points
11 comments · 2 min read · LW link

Summary: “How to Do Research” by OSP’s Red

Pablo Repetto · Apr 13, 2022, 7:46 PM
9 points
0 comments · 3 min read · LW link
(pabloernesto.github.io)

A Quick Guide to Confronting Doom

Ruby · Apr 13, 2022, 7:30 PM
243 points
33 comments · 2 min read · LW link

Design, Implement and Verify

rwallace · Apr 13, 2022, 6:14 PM
32 points
13 comments · 4 min read · LW link

Takeoff speeds have a huge effect on what it means to work on AI x-risk

Buck · Apr 13, 2022, 5:38 PM
139 points
27 comments · 2 min read · LW link · 2 reviews

Budapest Meetup

Richard Horvath · Apr 13, 2022, 5:23 PM
2 points
0 comments · 1 min read · LW link

[Question] What to include in a guest lecture on existential risks from AI?

Aryeh Englander · Apr 13, 2022, 5:03 PM
20 points
9 comments · 1 min read · LW link

Common Knowledge is a Circle Game for Toddlers

ryan_b · Apr 13, 2022, 3:24 PM
58 points
1 comment · 1 min read · LW link

Another list of theories of impact for interpretability

Beth Barnes · Apr 13, 2022, 1:29 PM
33 points
1 comment · 5 min read · LW link

The Cage of the Language

Martin Sustrik · Apr 13, 2022, 5:20 AM
54 points
19 comments · 2 min read · LW link

[Question] What’s a good probability distribution family (e.g. “log-normal”) to use for AGI timelines?

David Scott Krueger (formerly: capybaralet) · Apr 13, 2022, 4:45 AM
9 points
11 comments · 1 min read · LW link

How dath ilan coordinates around solving alignment

Thomas Kwa · Apr 13, 2022, 4:22 AM
65 points
46 comments · 5 min read · LW link

What more compute does for brain-like models: response to Rohin

Nathan Helm-Burger · Apr 13, 2022, 3:40 AM
24 points
14 comments · 12 min read · LW link

[Question] “Fragility of Value” vs. LLMs

Not Relevant · Apr 13, 2022, 2:02 AM
34 points
33 comments · 1 min read · LW link

Commensurable Scientific Paradigms; or, computable induction

samshap · Apr 13, 2022, 12:01 AM
14 points
0 comments · 5 min read · LW link

Convincing People of Alignment with Street Epistemology

Logan Riggs · Apr 12, 2022, 11:43 PM
54 points
4 comments · 3 min read · LW link

Useful Vices for Wicked Problems

HoldenKarnofsky · Apr 12, 2022, 7:30 PM
76 points
2 comments · 17 min read · LW link · 1 review
(www.cold-takes.com)

SSC/ACX, San Diego, Schelling Point, Meetups Everywhere

CitizenTen · Apr 12, 2022, 6:50 PM
2 points
0 comments · 1 min read · LW link

SSC/ACX San Diego Rock Climbing

CitizenTen · Apr 12, 2022, 6:46 PM
2 points
0 comments · 1 min read · LW link

[Question] Does the rationalist community have a membership funnel?

Alex_Altair · Apr 12, 2022, 6:44 PM
38 points
17 comments · 1 min read · LW link

A Small Negative Result on Debate

Sam Bowman · Apr 12, 2022, 6:19 PM
42 points
11 comments · 1 min read · LW link

US Taxes: Adjust Withholding When Donating?

jefftk · Apr 12, 2022, 3:50 PM
15 points
1 comment · 1 min read · LW link
(www.jefftk.com)

Introducing Effective Self-Help

Ben Williamson · Apr 12, 2022, 3:01 PM
19 points
0 comments · 16 min read · LW link

Ukraine Post #10: Next Phase

Zvi · Apr 12, 2022, 1:40 PM
47 points
13 comments · 14 min read · LW link
(thezvi.wordpress.com)

Is technical AI alignment research a net positive?

cranberry_bear · Apr 12, 2022, 1:07 PM
6 points
2 comments · 2 min read · LW link

[Question] What is your advice for elder care, particularly taking care of dementia patients?

RasmusHB · Apr 12, 2022, 11:33 AM
4 points
6 comments · 1 min read · LW link

Reward model hacking as a challenge for reward learning

Erik Jenner · Apr 12, 2022, 9:39 AM
25 points
1 comment · 9 min read · LW link

How I use Anki: expanding the scope of SRS

CallumMcDougall · Apr 12, 2022, 8:28 AM
37 points
8 comments · 19 min read · LW link

[Question] What do you think will most probably happen to our consciousness when our simulation ends?

ArtMi · Apr 12, 2022, 8:23 AM
1 point
5 comments · 1 min read · LW link

Favorites & Performers

Soma · Apr 12, 2022, 5:50 AM
9 points
0 comments · 1 min read · LW link

A broad basin of attraction around human values?

Wei Dai · Apr 12, 2022, 5:15 AM
114 points
18 comments · 2 min read · LW link

AI governance student hackathon on Saturday, April 23: register now!

mic · Apr 12, 2022, 4:48 AM
14 points
0 comments · 1 min read · LW link

The Platonist’s Dilemma: A Remix on the Prisoner’s.

James Camacho · Apr 12, 2022, 3:49 AM
7 points
2 comments · 5 min read · LW link

[Question] Three questions about mesa-optimizers

Eric Neyman · Apr 12, 2022, 2:58 AM
26 points
5 comments · 3 min read · LW link

The Amish

PeterMcCluskey · Apr 12, 2022, 2:54 AM
49 points
5 comments · 6 min read · LW link
(www.bayesianinvestor.com)

Rationalist Should Win. Not Dying with Dignity and Funding WBE.

CitizenTen · Apr 12, 2022, 2:14 AM
32 points
15 comments · 5 min read · LW link

[Question] How can I determine that Elicit is not some weak AGI’s attempt at taking over the world ?

Lucie Philippon · Apr 12, 2022, 12:54 AM
5 points
3 comments · 1 min read · LW link

Summary: “How to Write Quickly...” by John Wentworth

Pablo Repetto · Apr 11, 2022, 11:26 PM
4 points
0 comments · 2 min read · LW link
(pabloernesto.github.io)

Rambling thoughts on having multiple selves

cranberry_bear · Apr 11, 2022, 10:43 PM
15 points
1 comment · 3 min read · LW link

An AI-in-a-box success model

azsantosk · Apr 11, 2022, 10:28 PM
16 points
1 comment · 10 min read · LW link

The Regulatory Option: A response to near 0% survival odds

Matthew Lowenstein · Apr 11, 2022, 10:00 PM
46 points
21 comments · 6 min read · LW link

The Efficient LessWrong Hypothesis—Stock Investing Competition

MrThink · Apr 11, 2022, 8:43 PM
30 points
35 comments · 2 min read · LW link

Review: Structure and Interpretation of Computer Programs

L Rudolf L · 11 Apr 2022 20:27 UTC
17 points
9 comments · 10 min read · LW link
(www.strataoftheworld.com)

[Question] Underappreciated content on LessWrong

Ege Erdil · 11 Apr 2022 17:40 UTC
22 points
5 comments · 1 min read · LW link

Editing Advice for LessWrong Users

JustisMills · 11 Apr 2022 16:32 UTC
234 points
14 comments · 6 min read · LW link · 1 review

Post-history is written by the martyrs

Veedrac · 11 Apr 2022 15:45 UTC
50 points
2 comments · 19 min read · LW link
(www.royalroad.com)

What Chords Do You Need?

jefftk · 11 Apr 2022 15:00 UTC
11 points
0 comments · 3 min read · LW link
(www.jefftk.com)

What can people not smart/technical/“competent” enough for AI research/AI risk work do to reduce AI-risk/maximize AI safety? (which is most people?)

Alex K. Chen (parrot) · 11 Apr 2022 14:05 UTC
7 points
3 comments · 3 min read · LW link

Goodhart’s Law Causal Diagrams

11 Apr 2022 13:52 UTC
35 points
6 comments · 6 min read · LW link