Desiderata for an Adversarial Prior

shminux · 9 Nov 2022 23:45 UTC
13 points
2 comments · 1 min read · LW link

Chord Notation

jefftk · 9 Nov 2022 21:30 UTC
12 points
5 comments · 1 min read · LW link
(www.jefftk.com)

[ASoT] Instrumental convergence is useful

Ulisse Mini · 9 Nov 2022 20:20 UTC
5 points
9 comments · 1 min read · LW link

Mesatranslation and Metatranslation

jdp · 9 Nov 2022 18:46 UTC
25 points
4 comments · 11 min read · LW link

Trying to Make a Treacherous Mesa-Optimizer

MadHatter · 9 Nov 2022 18:07 UTC
95 points
14 comments · 4 min read · LW link
(attentionspan.blog)

A caveat to the Orthogonality Thesis

Wuschel Schulz · 9 Nov 2022 15:06 UTC
37 points
10 comments · 2 min read · LW link

Wednesday South Bay Meetups, November 16

Leonard Zabarsky · 9 Nov 2022 2:21 UTC
1 point
0 comments · 1 min read · LW link

FTX will probably be sold at a steep discount. What we know and some forecasts on what will happen next

Nathan Young · 9 Nov 2022 2:14 UTC
60 points
21 comments · 1 min read · LW link

A first success story for Outer Alignment: InstructGPT

Noosphere89 · 8 Nov 2022 22:52 UTC
6 points
1 comment · 1 min read · LW link
(openai.com)

Trying Mastodon

jefftk · 8 Nov 2022 19:10 UTC
12 points
4 comments · 1 min read · LW link
(www.jefftk.com)

Inverse scaling can become U-shaped

Edouard Harris · 8 Nov 2022 19:04 UTC
27 points
15 comments · 1 min read · LW link
(arxiv.org)

People care about each other even though they have imperfect motivational pointers?

TurnTrout · 8 Nov 2022 18:15 UTC
33 points
25 comments · 7 min read · LW link

Applying superintelligence without collusion

Eric Drexler · 8 Nov 2022 18:08 UTC
107 points
63 comments · 4 min read · LW link

[Question] Binance is buying FTX.com: How did it happen and what are the implications?

Caerulean · 8 Nov 2022 17:14 UTC
16 points
6 comments · 1 min read · LW link

Some advice on independent research

Marius Hobbhahn · 8 Nov 2022 14:46 UTC
53 points
5 comments · 10 min read · LW link

Mysteries of mode collapse

janus · 8 Nov 2022 10:37 UTC
282 points
57 comments · 14 min read · LW link · 1 review

[ASoT] Thoughts on GPT-N

Ulisse Mini · 8 Nov 2022 7:14 UTC
8 points
0 comments · 1 min read · LW link

Michael Simm—Introducing Myself

Michael Simm · 8 Nov 2022 5:45 UTC
4 points
0 comments · 2 min read · LW link

EA & LW Forums Weekly Summary (31st Oct - 6th Nov 22')

Zoe Williams · 8 Nov 2022 3:58 UTC
12 points
1 comment · 1 min read · LW link

[Question] Value of Querying 100+ People About Humanity’s Future

Fer32dwt34r3dfsz · 8 Nov 2022 0:41 UTC
9 points
3 comments · 2 min read · LW link

How could we know that an AGI system will have good consequences?

So8res · 7 Nov 2022 22:42 UTC
110 points
25 comments · 5 min read · LW link

A Walkthrough of Interpretability in the Wild (w/ authors Kevin Wang, Arthur Conmy & Alexandre Variengien)

Neel Nanda · 7 Nov 2022 22:39 UTC
30 points
15 comments · 3 min read · LW link
(youtu.be)

Intercept article about lab accidents

ChristianKl · 7 Nov 2022 21:10 UTC
23 points
9 comments · 1 min read · LW link
(theintercept.com)

The biological function of love for non-kin is to gain the trust of people we cannot deceive

chaosmage · 7 Nov 2022 20:26 UTC
43 points
3 comments · 8 min read · LW link

Distillation Experiment: Chunk-Knitting

DirectedEvolution · 7 Nov 2022 19:56 UTC
9 points
1 comment · 6 min read · LW link

Thinking About Mastodon

jefftk · 7 Nov 2022 19:40 UTC
33 points
17 comments · 1 min read · LW link
(www.jefftk.com)

[Question] Ideas for tiny research projects related to rationality?

Frej · 7 Nov 2022 18:45 UTC
3 points
1 comment · 1 min read · LW link

Loss of control of AI is not a likely source of AI x-risk

squek · 7 Nov 2022 18:44 UTC
−6 points
0 comments · 5 min read · LW link

AI Safety Unconference NeurIPS 2022

Orpheus · 7 Nov 2022 15:39 UTC
25 points
0 comments · 1 min read · LW link
(aisafetyevents.org)

Hacker-AI – Does it already exist?

Erland Wittkotter · 7 Nov 2022 14:01 UTC
3 points
13 comments · 11 min read · LW link

What’s the Deal with Elon Musk and Twitter?

Zvi · 7 Nov 2022 13:50 UTC
60 points
11 comments · 31 min read · LW link
(thezvi.wordpress.com)

How to Make Easy Decisions

lynettebye · 7 Nov 2022 13:17 UTC
17 points
3 comments · 2 min read · LW link

Opportunities that surprised us during our Clearer Thinking Regrants program

spencerg · 7 Nov 2022 13:09 UTC
20 points
0 comments · 1 min read · LW link

4 Key Assumptions in AI Safety

Prometheus · 7 Nov 2022 10:50 UTC
20 points
5 comments · 7 min read · LW link

Google Search as a Washed Up Service Dog: “I HALP!”

shminux · 7 Nov 2022 7:02 UTC
20 points
8 comments · 1 min read · LW link

[Book Review] “Station Eleven” by Emily St. John Mandel

lsusr · 7 Nov 2022 5:56 UTC
17 points
1 comment · 1 min read · LW link

Counterfactability

Scott Garrabrant · 7 Nov 2022 5:39 UTC
40 points
4 comments · 11 min read · LW link

2022 LessWrong Census?

SurfingOrca · 7 Nov 2022 5:16 UTC
67 points
13 comments · 1 min read · LW link

A philosopher’s critique of RLHF

ThomasW · 7 Nov 2022 2:42 UTC
55 points
8 comments · 2 min read · LW link

[Question] Is there any discussion on avoiding being Dutch-booked or otherwise taken advantage of one’s bounded rationality by refusing to engage?

shminux · 7 Nov 2022 2:36 UTC
38 points
29 comments · 1 min read · LW link

Exams-Only Universities

Mati_Roy · 6 Nov 2022 22:05 UTC
80 points
40 comments · 2 min read · LW link

Democracy Is in Danger, but Not for the Reasons You Think

ExCeph · 6 Nov 2022 21:15 UTC
−7 points
4 comments · 12 min read · LW link
(ginnungagapfoundation.wordpress.com)

Playground Game: Monster

jefftk · 6 Nov 2022 16:00 UTC
14 points
4 comments · 1 min read · LW link
(www.jefftk.com)

[Question] Has Pascal’s Mugging problem been completely solved yet?

EniScien · 6 Nov 2022 12:52 UTC
3 points
11 comments · 1 min read · LW link

[Question] Should I Pursue a PhD?

DragonGod · 6 Nov 2022 10:58 UTC
8 points
8 comments · 2 min read · LW link

You won’t solve alignment without agent foundations

Mikhail Samin · 6 Nov 2022 8:07 UTC
24 points
3 comments · 8 min read · LW link

Word-Distance vs Idea-Distance: The Case for Lanoitaring

Sable · 6 Nov 2022 5:25 UTC
7 points
7 comments · 7 min read · LW link
(affablyevil.substack.com)

Apple Cider Syrup

jefftk · 6 Nov 2022 2:10 UTC
11 points
6 comments · 1 min read · LW link
(www.jefftk.com)

What is epigenetics?

Metacelsus · 6 Nov 2022 1:24 UTC
74 points
4 comments · 6 min read · LW link
(denovo.substack.com)

Response

Jarred Filmer · 6 Nov 2022 1:03 UTC
26 points
2 comments · 12 min read · LW link