Should we “go against nature”?

jasoncrawford · 4 Nov 2022 22:14 UTC
10 points
3 comments · 2 min read · LW link
(rootsofprogress.org)

How much should we care about non-human animals?

bokov · 4 Nov 2022 21:36 UTC
16 points
8 comments · 2 min read · LW link
(www.lesswrong.com)

For ELK truth is mostly a distraction

c.trout · 4 Nov 2022 21:14 UTC
44 points
0 comments · 21 min read · LW link

Toy Models and Tegum Products

Adam Jermyn · 4 Nov 2022 18:51 UTC
28 points
7 comments · 5 min read · LW link

Ethan Caballero on Broken Neural Scaling Laws, Deception, and Recursive Self Improvement

4 Nov 2022 18:09 UTC
13 points
11 comments · 10 min read · LW link
(theinsideview.ai)

Follow up to medical miracle

Elizabeth · 4 Nov 2022 18:00 UTC
75 points
5 comments · 6 min read · LW link
(acesounderglass.com)

Cross-Void Optimization

pneumynym · 4 Nov 2022 17:47 UTC
1 point
1 comment · 8 min read · LW link

Monthly Shorts 10/22

Celer · 4 Nov 2022 16:30 UTC
12 points
0 comments · 6 min read · LW link
(keller.substack.com)

Weekly Roundup #4

Zvi · 4 Nov 2022 15:00 UTC
42 points
1 comment · 6 min read · LW link
(thezvi.wordpress.com)

A new place to discuss cognitive science, ethics and human alignment

Daniel_Friedrich · 4 Nov 2022 14:34 UTC
3 points
4 comments · 1 min read · LW link

A newcomer’s guide to the technical AI safety field

zeshen · 4 Nov 2022 14:29 UTC
42 points
3 comments · 10 min read · LW link

[Question] Are alignment researchers devoting enough time to improving their research capacity?

Carson Jones · 4 Nov 2022 0:58 UTC
13 points
3 comments · 3 min read · LW link

[Question] Don’t you think RLHF solves outer alignment?

Charbel-Raphaël · 4 Nov 2022 0:36 UTC
9 points
23 comments · 1 min read · LW link

Mechanistic Interpretability as Reverse Engineering (follow-up to “cars and elephants”)

David Scott Krueger (formerly: capybaralet) · 3 Nov 2022 23:19 UTC
28 points
3 comments · 1 min read · LW link

[Question] Could a Supreme Court suit work to solve NEPA problems?

ChristianKl · 3 Nov 2022 21:10 UTC
15 points
0 comments · 1 min read · LW link

[Video] How having Fast Fourier Transforms sooner could have helped with Nuclear Disarmament—Veritaserum

mako yass · 3 Nov 2022 21:04 UTC
17 points
1 comment · 1 min read · LW link

Further considerations on the Evidentialist’s Wager

Martín Soto · 3 Nov 2022 20:06 UTC
3 points
9 comments · 8 min read · LW link

AI as a Civilizational Risk Part 6/6: What can be done

PashaKamyshev · 3 Nov 2022 19:48 UTC
2 points
4 comments · 4 min read · LW link

A Mystery About High Dimensional Concept Encoding

Fabien Roger · 3 Nov 2022 17:05 UTC
46 points
13 comments · 7 min read · LW link

Why do we post our AI safety plans on the Internet?

Peter S. Park · 3 Nov 2022 16:02 UTC
4 points
4 comments · 11 min read · LW link

Multiple Deploy-Key Repos

jefftk · 3 Nov 2022 15:10 UTC
15 points
0 comments · 1 min read · LW link
(www.jefftk.com)

Covid 11/3/22: Asking Forgiveness

Zvi · 3 Nov 2022 13:50 UTC
23 points
3 comments · 6 min read · LW link
(thezvi.wordpress.com)

Adversarial Policies Beat Professional-Level Go AIs

sanxiyn · 3 Nov 2022 13:27 UTC
31 points
35 comments · 1 min read · LW link
(goattack.alignmentfund.org)

K-types vs T-types — what priors do you have?

Cleo Nardo · 3 Nov 2022 11:29 UTC
71 points
25 comments · 7 min read · LW link

Information Markets 2: Optimally Shaped Reward Bets

eva_ · 3 Nov 2022 11:08 UTC
9 points
0 comments · 3 min read · LW link

The Rational Utilitarian Love Movement (A Historical Retrospective)

CBiddulph · 3 Nov 2022 7:11 UTC
3 points
0 comments · 1 min read · LW link

The Mirror Chamber: A short story exploring the anthropic measure function and why it can matter

mako yass · 3 Nov 2022 6:47 UTC
30 points
13 comments · 10 min read · LW link

Open Letter Against Reckless Nuclear Escalation and Use

Max Tegmark · 3 Nov 2022 5:34 UTC
27 points
23 comments · 1 min read · LW link

Lazy Python Argument Parsing

jefftk · 3 Nov 2022 2:20 UTC
20 points
3 comments · 1 min read · LW link
(www.jefftk.com)

AI as a Civilizational Risk Part 5/6: Relationship between C-risk and X-risk

PashaKamyshev · 3 Nov 2022 2:19 UTC
2 points
0 comments · 7 min read · LW link

[Question] Is there a good way to award a fixed prize in a prediction contest?

jchan · 2 Nov 2022 21:37 UTC
18 points
5 comments · 1 min read · LW link

“Are Experiments Possible?” Seeds of Science call for reviewers

rogersbacon · 2 Nov 2022 20:05 UTC
8 points
0 comments · 1 min read · LW link

Humans do acausal coordination all the time

Adam Jermyn · 2 Nov 2022 14:40 UTC
57 points
35 comments · 3 min read · LW link

Far-UVC Light Update: No, LEDs are not around the corner (tweetstorm)

Davidmanheim · 2 Nov 2022 12:57 UTC
70 points
27 comments · 4 min read · LW link
(twitter.com)

Housing and Transit Thoughts #1

Zvi · 2 Nov 2022 12:10 UTC
35 points
5 comments · 16 min read · LW link
(thezvi.wordpress.com)

Mind is uncountable

Filip Sondej · 2 Nov 2022 11:51 UTC
18 points
22 comments · 1 min read · LW link

AI Safety Needs Great Product Builders

goodgravy · 2 Nov 2022 11:33 UTC
14 points
2 comments · 1 min read · LW link

Why is fiber good for you?

braces · 2 Nov 2022 2:04 UTC
18 points
2 comments · 2 min read · LW link

Information Markets

eva_ · 2 Nov 2022 1:24 UTC
46 points
6 comments · 12 min read · LW link

Sequence Reread: Fake Beliefs [plus sequence spotlight meta]

Raemon · 2 Nov 2022 0:09 UTC
27 points
3 comments · 1 min read · LW link

Real-Time Research Recording: Can a Transformer Re-Derive Positional Info?

Neel Nanda · 1 Nov 2022 23:56 UTC
69 points
16 comments · 1 min read · LW link
(youtu.be)

All AGI Safety questions welcome (especially basic ones) [~monthly thread]

Robert Miles · 1 Nov 2022 23:23 UTC
68 points
105 comments · 2 min read · LW link

[Question] Which Issues in Conceptual Alignment have been Formalised or Observed (or not)?

ojorgensen · 1 Nov 2022 22:32 UTC
4 points
0 comments · 1 min read · LW link

AI as a Civilizational Risk Part 4/6: Bioweapons and Philosophy of Modification

PashaKamyshev · 1 Nov 2022 20:50 UTC
7 points
1 comment · 8 min read · LW link

Open & Wel­come Thread—Novem­ber 2022

MondSemmel1 Nov 2022 18:47 UTC
14 points
46 comments1 min readLW link

Mildly Against Donor Lotteries

jefftk · 1 Nov 2022 18:10 UTC
10 points
9 comments · 3 min read · LW link
(www.jefftk.com)

Progress links and tweets, 2022-11-01

jasoncrawford · 1 Nov 2022 17:48 UTC
16 points
4 comments · 3 min read · LW link
(rootsofprogress.org)

On the correspondence between AI-misalignment and cognitive dissonance using a behavioral economics model

Stijn Bruers · 1 Nov 2022 17:39 UTC
4 points
0 comments · 6 min read · LW link

a casual intro to AI doom and alignment

Tamsin Leake · 1 Nov 2022 16:38 UTC
18 points
0 comments · 4 min read · LW link
(carado.moe)

Threat Model Literature Review

1 Nov 2022 11:03 UTC
75 points
4 comments · 25 min read · LW link