“Rudeness”, a useful coordination mechanic

Raemon · 11 Nov 2022 22:27 UTC
51 points
20 comments · 2 min read · LW link

Internalizing the damage of bad-acting partners creates incentives for due diligence

tailcalled · 11 Nov 2022 20:57 UTC
17 points
7 comments · 1 min read · LW link

Speculation on Current Opportunities for Unusually High Impact in Global Health

johnswentworth · 11 Nov 2022 20:47 UTC
114 points
31 comments · 4 min read · LW link

[Question] Is acausal extortion possible?

sisyphus · 11 Nov 2022 19:48 UTC
−20 points
35 comments · 3 min read · LW link

Catharsis in Bb

jefftk · 11 Nov 2022 17:40 UTC
6 points
0 comments · 1 min read · LW link
(www.jefftk.com)

Instrumental convergence is what makes general intelligence possible

tailcalled · 11 Nov 2022 16:38 UTC
105 points
11 comments · 4 min read · LW link

Weekly Roundup #5

Zvi · 11 Nov 2022 16:20 UTC
33 points
0 comments · 6 min read · LW link
(thezvi.wordpress.com)

Charging for the Dharma

jchan · 11 Nov 2022 14:02 UTC
32 points
18 comments · 5 min read · LW link

[Question] EA (& AI Safety) has overestimated its projected funding — which decisions must be revised?

Cleo Nardo · 11 Nov 2022 13:50 UTC
22 points
7 comments · 1 min read · LW link
(forum.effectivealtruism.org)

Where the logical fallacy is not (Generalization From Fictional Evidence)

banev · 11 Nov 2022 10:41 UTC
−12 points
14 comments · 1 min read · LW link

Why I’m Working On Model Agnostic Interpretability

Jessica Rumbelow · 11 Nov 2022 9:24 UTC
27 points
9 comments · 2 min read · LW link

How likely are malign priors over objectives? [aborted WIP]

David Johnston · 11 Nov 2022 5:36 UTC
−1 points
0 comments · 8 min read · LW link

Do Timeless Decision Theorists reject all blackmail from other Timeless Decision Theorists?

myren · 11 Nov 2022 0:38 UTC
7 points
8 comments · 3 min read · LW link

We must be very clear: fraud in the service of effective altruism is unacceptable

evhub · 10 Nov 2022 23:31 UTC
42 points
56 comments · 3 min read · LW link

[simulation] 4chan user claiming to be the attorney hired by Google’s sentient chatbot LaMDA shares wild details of encounter

janus · 10 Nov 2022 21:39 UTC
19 points
1 comment · 13 min read · LW link
(generative.ink)

divine carrot

Alok Singh · 10 Nov 2022 20:50 UTC
18 points
2 comments · 1 min read · LW link
(alok.github.io)

Metaculus Announces The Million Predictions Hackathon

ChristianWilliams · 10 Nov 2022 20:00 UTC
7 points
0 comments · 1 min read · LW link
(metaculus.medium.com)

The harnessing of complexity

geduardo · 10 Nov 2022 18:44 UTC
6 points
2 comments · 3 min read · LW link

[Question] Is there a demo of “You can’t fetch the coffee if you’re dead”?

Ram Rachum · 10 Nov 2022 18:41 UTC
8 points
9 comments · 1 min read · LW link

Mastodon Linking Norms

jefftk · 10 Nov 2022 15:10 UTC
9 points
8 comments · 2 min read · LW link
(www.jefftk.com)

Covid 11/10/22: Into the Background

Zvi · 10 Nov 2022 13:40 UTC
31 points
5 comments · 4 min read · LW link
(thezvi.wordpress.com)

LessWrong Poll on AGI

Niclas Kupper · 10 Nov 2022 13:13 UTC
12 points
6 comments · 1 min read · LW link

The optimal angle for a solar boiler is different than for a solar panel

Yair Halberstadt · 10 Nov 2022 10:32 UTC
42 points
4 comments · 2 min read · LW link

What it’s like to dissect a cadaver

Alok Singh · 10 Nov 2022 6:40 UTC
208 points
24 comments · 5 min read · LW link
(alok.github.io)

I Converted Book I of The Sequences Into A Zoomer-Readable Format

dkirmani · 10 Nov 2022 2:59 UTC
200 points
32 comments · 2 min read · LW link

Adversarial Priors: Not Paying People to Lie to You

eva_ · 10 Nov 2022 2:29 UTC
22 points
9 comments · 3 min read · LW link

Is full self-driving an AGI-complete problem?

kraemahz · 10 Nov 2022 2:04 UTC
10 points
5 comments · 1 min read · LW link

[Question] What are examples of problems that were caused by intelligence, that couldn’t be solved with intelligence?

Peter O'Malley · 10 Nov 2022 2:04 UTC
1 point
2 comments · 1 min read · LW link

Desiderata for an Adversarial Prior

Shmi · 9 Nov 2022 23:45 UTC
13 points
2 comments · 1 min read · LW link

Chord Notation

jefftk · 9 Nov 2022 21:30 UTC
12 points
5 comments · 1 min read · LW link
(www.jefftk.com)

[ASoT] Instrumental convergence is useful

Ulisse Mini · 9 Nov 2022 20:20 UTC
5 points
9 comments · 1 min read · LW link

Mesatranslation and Metatranslation

jdp · 9 Nov 2022 18:46 UTC
25 points
4 comments · 11 min read · LW link

Trying to Make a Treacherous Mesa-Optimizer

MadHatter · 9 Nov 2022 18:07 UTC
95 points
14 comments · 4 min read · LW link
(attentionspan.blog)

A caveat to the Orthogonality Thesis

Wuschel Schulz · 9 Nov 2022 15:06 UTC
38 points
10 comments · 2 min read · LW link

Wednesday South Bay Meetups, November 16

Leonard Zabarsky · 9 Nov 2022 2:21 UTC
1 point
0 comments · 1 min read · LW link

FTX Crisis. What we know and some forecasts on what will happen next

Nathan Young · 9 Nov 2022 2:14 UTC
60 points
21 comments · 3 min read · LW link

A first success story for Outer Alignment: InstructGPT

Noosphere89 · 8 Nov 2022 22:52 UTC
6 points
1 comment · 1 min read · LW link
(openai.com)

Trying Mastodon

jefftk · 8 Nov 2022 19:10 UTC
12 points
4 comments · 1 min read · LW link
(www.jefftk.com)

Inverse scaling can become U-shaped

Edouard Harris · 8 Nov 2022 19:04 UTC
27 points
15 comments · 1 min read · LW link
(arxiv.org)

People care about each other even though they have imperfect motivational pointers?

TurnTrout · 8 Nov 2022 18:15 UTC
33 points
25 comments · 7 min read · LW link

Applying superintelligence without collusion

Eric Drexler · 8 Nov 2022 18:08 UTC
109 points
63 comments · 4 min read · LW link

[Question] Binance is buying FTX.com: How did it happen and what are the implications?

Caerulean · 8 Nov 2022 17:14 UTC
16 points
6 comments · 1 min read · LW link

Some advice on independent research

Marius Hobbhahn · 8 Nov 2022 14:46 UTC
56 points
5 comments · 10 min read · LW link

Mysteries of mode collapse

janus · 8 Nov 2022 10:37 UTC
284 points
57 comments · 14 min read · LW link · 1 review

[ASoT] Thoughts on GPT-N

Ulisse Mini · 8 Nov 2022 7:14 UTC
8 points
0 comments · 1 min read · LW link

Michael Simm—Introducing Myself

Michael Simm · 8 Nov 2022 5:45 UTC
4 points
0 comments · 2 min read · LW link

EA & LW Forums Weekly Summary (31st Oct − 6th Nov 22′)

Zoe Williams · 8 Nov 2022 3:58 UTC
12 points
1 comment · 18 min read · LW link

[Question] Value of Querying 100+ People About Humanity’s Future

T431 · 8 Nov 2022 0:41 UTC
9 points
3 comments · 2 min read · LW link

How could we know that an AGI system will have good consequences?

So8res · 7 Nov 2022 22:42 UTC
111 points
25 comments · 5 min read · LW link

A Walkthrough of Interpretability in the Wild (w/ authors Kevin Wang, Arthur Conmy & Alexandre Variengien)

Neel Nanda · 7 Nov 2022 22:39 UTC
30 points
15 comments · 3 min read · LW link
(youtu.be)