Discussion with Nate Soares on a key alignment difficulty

HoldenKarnofsky · 13 Mar 2023 21:20 UTC
250 points
38 comments · 22 min read · LW link

What Discovering Latent Knowledge Did and Did Not Find

Fabien Roger · 13 Mar 2023 19:29 UTC
164 points
16 comments · 11 min read · LW link

South Bay ACX/LW Meetup

IS · 13 Mar 2023 18:25 UTC
2 points
0 comments · 1 min read · LW link

Could Roko’s basilisk acausally bargain with a paperclip maximizer?

Christopher King · 13 Mar 2023 18:21 UTC
1 point
8 comments · 1 min read · LW link

Bayesian optimization to find molecules that bind to proteins

rotatingpaguro · 13 Mar 2023 18:17 UTC
1 point
0 comments · 1 min read · LW link
(www.youtube.com)

Linkpost: ‘Dissolving’ AI Risk – Parameter Uncertainty in AI Future Forecasting

DavidW · 13 Mar 2023 16:52 UTC
6 points
0 comments · 1 min read · LW link
(forum.effectivealtruism.org)

Decentralized Exclusion

jefftk · 13 Mar 2023 15:50 UTC
23 points
19 comments · 2 min read · LW link
(www.jefftk.com)

Linkpost: A Contra AI FOOM Reading List

DavidW · 13 Mar 2023 14:45 UTC
25 points
4 comments · 1 min read · LW link
(magnusvinding.com)

Linkpost: A tale of 2.5 orthogonality theses

DavidW · 13 Mar 2023 14:19 UTC
9 points
3 comments · 1 min read · LW link
(forum.effectivealtruism.org)

Plan for mediocre alignment of brain-like [model-based RL] AGI

Steven Byrnes · 13 Mar 2023 14:11 UTC
63 points
24 comments · 12 min read · LW link

your terminal values are complex and not objective

Tamsin Leake · 13 Mar 2023 13:34 UTC
60 points
6 comments · 2 min read · LW link
(carado.moe)

Against AGI Timelines

Jonathan Yan · 13 Mar 2023 13:33 UTC
13 points
3 comments · 1 min read · LW link
(benlandautaylor.com)

What is calibration?

AlexMennen · 13 Mar 2023 6:30 UTC
27 points
1 comment · 4 min read · LW link

On taking AI risk seriously

Eleni Angelou · 13 Mar 2023 5:50 UTC
6 points
0 comments · 1 min read · LW link
(www.nytimes.com)

Nose / throat treatments for respiratory infections

juliawise · 13 Mar 2023 2:41 UTC
46 points
6 comments · 8 min read · LW link

Gold, Silver, Red: A color scheme for understanding people

Michael Soareverix · 13 Mar 2023 1:06 UTC
17 points
2 comments · 4 min read · LW link

Yudkowsky on AGI risk on the Bankless podcast

Rob Bensinger · 13 Mar 2023 0:42 UTC
83 points
5 comments · 1 min read · LW link

Thoughts on self-inspecting neural networks.

Deruwyn · 12 Mar 2023 23:58 UTC
4 points
2 comments · 5 min read · LW link

An AI risk argument that resonates with NYTimes readers

Julian Bradshaw · 12 Mar 2023 23:09 UTC
203 points
14 comments · 1 min read · LW link

Musicians and Mouths

jefftk · 12 Mar 2023 22:50 UTC
13 points
7 comments · 2 min read · LW link
(www.jefftk.com)

Are there cognitive realms?

TsviBT · 12 Mar 2023 19:28 UTC
25 points
2 comments · 10 min read · LW link

[Question] What happened on the Extropians message board?

politicalpersuasion · 12 Mar 2023 19:22 UTC
−53 points
1 comment · 1 min read · LW link

Creating a Discord server for Mechanistic Interpretability Projects

Victor Levoso · 12 Mar 2023 18:00 UTC
30 points
6 comments · 2 min read · LW link

Paper Replication Walkthrough: Reverse-Engineering Modular Addition

Neel Nanda · 12 Mar 2023 13:25 UTC
18 points
0 comments · 1 min read · LW link
(neelnanda.io)

the quantum amplitude argument against ethics deduplication

Tamsin Leake · 12 Mar 2023 13:02 UTC
11 points
16 comments · 2 min read · LW link
(carado.moe)

What problems do African-Americans face? An initial investigation using Standpoint Epistemology and Surveys

tailcalled · 12 Mar 2023 11:42 UTC
41 points
26 comments · 15 min read · LW link

“Liquidity” vs “solvency” in bank runs (and some notes on Silicon Valley Bank)

rossry · 12 Mar 2023 9:16 UTC
107 points
27 comments · 12 min read · LW link

“You’ll Never Persuade People Like That”

Zack_M_Davis · 12 Mar 2023 5:38 UTC
14 points
31 comments · 2 min read · LW link

Parasitic Language Games: maintaining ambiguity to hide conflict while burning the commons

Hazard · 12 Mar 2023 5:25 UTC
100 points
16 comments · 13 min read · LW link

[Question] Is there a way to sort LW search results by date posted?

zeshen · 12 Mar 2023 4:56 UTC
5 points
1 comment · 1 min read · LW link

Is “Regularity” another Phlogiston?

Cole Wyeth · 12 Mar 2023 3:13 UTC
2 points
3 comments · 3 min read · LW link
(colewyeth.com)

Minor Life Optimization: Consider Ordering Your Food To-Go

sudo · 12 Mar 2023 2:08 UTC
9 points
20 comments · 1 min read · LW link

A bunch of videos for intuition building (2x speed, skip ones that bore you)

the gears to ascension · 12 Mar 2023 0:51 UTC
72 points
5 comments · 4 min read · LW link

The issue of meaning in large language models (LLMs)

Bill Benzon · 11 Mar 2023 23:00 UTC
1 point
34 comments · 8 min read · LW link

[Linkpost] Scott Alexander reacts to OpenAI’s latest post

Akash · 11 Mar 2023 22:24 UTC
27 points
0 comments · 5 min read · LW link
(astralcodexten.substack.com)

Compositional language for hypotheses about computations

Vanessa Kosoy · 11 Mar 2023 19:43 UTC
30 points
2 comments · 11 min read · LW link

Understanding and controlling a maze-solving policy network

11 Mar 2023 18:59 UTC
312 points
22 comments · 23 min read · LW link

[Question] How can we promote AI alignment in Japan?

Shoka Kadoi · 11 Mar 2023 18:52 UTC
24 points
8 comments · 1 min read · LW link

How to Support Someone Who is Struggling

David Zeller · 11 Mar 2023 18:52 UTC
76 points
13 comments · 5 min read · LW link

[Question] Given one AI, why not more?

Frank Adk · 11 Mar 2023 18:52 UTC
7 points
12 comments · 1 min read · LW link

Agents synchronization

Ben Amitay · 11 Mar 2023 18:41 UTC
12 points
1 comment · 5 min read · LW link

Against Complete Blackout Curtains For Sleep

jp · 11 Mar 2023 18:29 UTC
19 points
11 comments · 1 min read · LW link

[Question] Counterarguments to Core AI X-Risk Stories?

DavidW · 11 Mar 2023 17:55 UTC
10 points
2 comments · 1 min read · LW link

The Power of Intelligence—The Animation

Writer · 11 Mar 2023 16:15 UTC
41 points
3 comments · 1 min read · LW link
(youtu.be)

[Question] Hoarding Gmail-accounts in a post-CAPTCHA world?

Alexander Gietelink Oldenziel · 11 Mar 2023 16:08 UTC
7 points
3 comments · 1 min read · LW link

[Question] Will the Bitcoin fee market actually work?

TropicalFruit · 11 Mar 2023 0:02 UTC
10 points
6 comments · 1 min read · LW link

Rationalism and social rationalism

philosophybear · 10 Mar 2023 23:20 UTC
17 points
5 comments · 10 min read · LW link
(philosophybear.substack.com)

Meetup Tip: Nametags

Screwtape · 10 Mar 2023 21:00 UTC
16 points
2 comments · 3 min read · LW link

[Question] Is ChatGPT (or other LLMs) more ‘sentient’/’conscious/etc. then a baby without a brain?

M. Y. Zuo · 10 Mar 2023 19:00 UTC
−4 points
2 comments · 1 min read · LW link

The humanity’s biggest mistake

RomanS · 10 Mar 2023 16:30 UTC
0 points
1 comment · 2 min read · LW link