Alignment is not enough

Alan Chan · Jan 12, 2023, 12:33 AM
12 points
6 comments · 11 min read · LW link
(coordination.substack.com)

How it feels to have your mind hacked by an AI

blaked · Jan 12, 2023, 12:33 AM
367 points
222 comments · 17 min read · LW link

Categorical-measure-theoretic approach to optimal policies tending to seek power

jacek · Jan 12, 2023, 12:32 AM
31 points
3 comments · 6 min read · LW link

Any person/mind should have the right to suicide

askofa · Jan 12, 2023, 12:32 AM
14 points
13 comments · 2 min read · LW link

Have we really forsaken natural selection?

KatjaGrace · Jan 12, 2023, 12:10 AM
34 points
7 comments · 2 min read · LW link
(worldspiritsockpuppet.com)

[Question] Using Finite Factored Sets for Causal Representation Learning?

David Reber · Jan 11, 2023, 10:06 PM
2 points
3 comments · 1 min read · LW link

GWWC’s Handling of Conflicting Funding Bars

jefftk · Jan 11, 2023, 8:30 PM
19 points
0 comments · 3 min read · LW link
(www.jefftk.com)

How to write a big cartesian product symbol in MathJax

Matthias G. Mayer · Jan 11, 2023, 8:21 PM
8 points
1 comment · 1 min read · LW link

What’s the deal with AI consciousness?

TW123 · Jan 11, 2023, 4:37 PM
6 points
13 comments · 9 min read · LW link
(aiwatchtower.substack.com)

[Question] Any significant updates on long covid risk analysis?

Randomized, Controlled · Jan 11, 2023, 2:31 PM
23 points
11 comments · 1 min read · LW link

internal in nonstandard analysis

Alok Singh · Jan 11, 2023, 9:58 AM
9 points
1 comment · 1 min read · LW link

Compounding Resource X

Raemon · Jan 11, 2023, 3:14 AM
77 points
6 comments · 9 min read · LW link

Running With a Backpack

jefftk · Jan 11, 2023, 3:00 AM
19 points
11 comments · 1 min read · LW link
(www.jefftk.com)

A simple thought experiment showing why recessions are an unnecessary bug in our economic system

skogsnisse · Jan 11, 2023, 12:43 AM
1 point
1 comment · 1 min read · LW link

We don’t trade with ants

KatjaGrace · Jan 10, 2023, 11:50 PM
272 points
109 comments · 7 min read · LW link · 1 review
(worldspiritsockpuppet.com)

[Question] Who are the people who are currently profiting from inflation?

skogsnisse · Jan 10, 2023, 9:39 PM
1 point
2 comments · 1 min read · LW link

Is Progress Real?

rogersbacon · Jan 10, 2023, 5:42 PM
5 points
14 comments · 14 min read · LW link
(www.secretorum.life)

200 COP in MI: Interpreting Reinforcement Learning

Neel Nanda · Jan 10, 2023, 5:37 PM
25 points
1 comment · 10 min read · LW link

AGI and the EMH: markets are not expecting aligned or unaligned AI in the next 30 years

Jan 10, 2023, 4:06 PM
119 points
44 comments · 26 min read · LW link

The Alignment Problem from a Deep Learning Perspective (major rewrite)

Jan 10, 2023, 4:06 PM
84 points
8 comments · 39 min read · LW link
(arxiv.org)

Against using stock prices to forecast AI timelines

Jan 10, 2023, 4:03 PM
23 points
2 comments · 2 min read · LW link

Sorting Pebbles Into Correct Heaps: The Animation

Writer · Jan 10, 2023, 3:58 PM
26 points
2 comments · 1 min read · LW link
(youtu.be)

Escape Velocity from Bullshit Jobs

Zvi · Jan 10, 2023, 2:30 PM
61 points
18 comments · 5 min read · LW link
(thezvi.wordpress.com)

Scaling laws vs individual differences

beren · Jan 10, 2023, 1:22 PM
45 points
21 comments · 7 min read · LW link

Notes on writing

RP · Jan 10, 2023, 4:01 AM
35 points
11 comments · 3 min read · LW link

Idea: Learning How To Move Towards The Metagame

Algon · Jan 10, 2023, 12:58 AM
10 points
3 comments · 1 min read · LW link

Review AI Alignment posts to help figure out how to make a proper AI Alignment review

Jan 10, 2023, 12:19 AM
85 points
31 comments · 2 min read · LW link

Against the paradox of tolerance

pchvykov · Jan 10, 2023, 12:12 AM
1 point
11 comments · 3 min read · LW link

Increased Scam Quality/Quantity (Hypothesis in need of data)?

Beeblebrox · Jan 9, 2023, 10:57 PM
9 points
6 comments · 1 min read · LW link

Wentworth and Larsen on buying time

Jan 9, 2023, 9:31 PM
74 points
6 comments · 12 min read · LW link

EA & LW Forum Summaries—Holiday Edition (19th Dec − 8th Jan)

Zoe Williams · Jan 9, 2023, 9:06 PM
11 points
0 comments · LW link

GWWC Should Require Public Charity Evaluations

jefftk · Jan 9, 2023, 8:10 PM
28 points
0 comments · 4 min read · LW link
(www.jefftk.com)

[MLSN #7]: an example of an emergent internal optimizer

Jan 9, 2023, 7:39 PM
28 points
0 comments · 6 min read · LW link

Trying to isolate objectives: approaches toward high-level interpretability

Jozdien · Jan 9, 2023, 6:33 PM
49 points
14 comments · 8 min read · LW link

The special nature of special relativity

adamShimi · Jan 9, 2023, 5:30 PM
37 points
1 comment · 3 min read · LW link
(epistemologicalvigilance.substack.com)

Pierre Menard, pixel art, and entropy

Joey Marcellino · Jan 9, 2023, 4:34 PM
1 point
1 comment · 6 min read · LW link

Forecasting extreme outcomes

AidanGoth · Jan 9, 2023, 4:34 PM
4 points
1 comment · 2 min read · LW link
(docs.google.com)

Evidence under Adversarial Conditions

PeterMcCluskey · Jan 9, 2023, 4:21 PM
57 points
1 comment · 3 min read · LW link
(bayesianinvestor.com)

How to Bounded Distrust

Zvi · Jan 9, 2023, 1:10 PM
122 points
17 comments · 4 min read · LW link · 1 review
(thezvi.wordpress.com)

Reification bias

Jan 9, 2023, 12:22 PM
25 points
6 comments · 2 min read · LW link

Big list of AI safety videos

JakubK · Jan 9, 2023, 6:12 AM
11 points
2 comments · 1 min read · LW link
(docs.google.com)

Rationality Practice: Self-Deception

Darmani · 9 Jan 2023 4:07 UTC
6 points
0 comments · 1 min read · LW link

Wolf Incident Postmortem

jefftk · 9 Jan 2023 3:20 UTC
137 points
13 comments · 1 min read · LW link
(www.jefftk.com)

You’re Not One “You”—How Decision Theories Are Talking Past Each Other

keith_wynroe · 9 Jan 2023 1:21 UTC
28 points
11 comments · 8 min read · LW link

On Blogging and Podcasting

DanielFilan · 9 Jan 2023 0:40 UTC
18 points
6 comments · 11 min read · LW link
(danielfilan.com)

ChatGPT tells stories about XP-708-DQ, Eliezer, dragons, dark sorceresses, and unaligned robots becoming aligned

Bill Benzon · 8 Jan 2023 23:21 UTC
6 points
2 comments · 18 min read · LW link

Simulacra are Things

janus · 8 Jan 2023 23:03 UTC
63 points
7 comments · 2 min read · LW link

[Question] GPT learning from smarter texts?

Viliam · 8 Jan 2023 22:23 UTC
26 points
7 comments · 1 min read · LW link

Latent variable prediction markets mockup + designer request

tailcalled · 8 Jan 2023 22:18 UTC
25 points
4 comments · 1 min read · LW link

Citability of Lesswrong and the Alignment Forum

Leon Lang · 8 Jan 2023 22:12 UTC
48 points
2 comments · 1 min read · LW link