Shard Theory in Nine Theses: a Distillation and Critical Appraisal

LawrenceC · 19 Dec 2022 22:52 UTC
138 points
30 comments · 18 min read · LW link

[Question] Will research in AI risk jinx it? Consequences of training AI on AI risk arguments

Yann Dubois · 19 Dec 2022 22:42 UTC
5 points
6 comments · 1 min read · LW link

AGI Timelines in Governance: Different Strategies for Different Timeframes

19 Dec 2022 21:31 UTC
63 points
28 comments · 10 min read · LW link

Towards Hodge-podge Alignment

Cleo Nardo · 19 Dec 2022 20:12 UTC
91 points
30 comments · 9 min read · LW link

Computational signatures of psychopathy

Cameron Berg · 19 Dec 2022 17:01 UTC
28 points
3 comments · 20 min read · LW link

Results from a survey on tool use and workflows in alignment research

19 Dec 2022 15:19 UTC
79 points
2 comments · 19 min read · LW link

Does ChatGPT’s performance warrant working on a tutor for children? [It’s time to take it to the lab.]

Bill Benzon · 19 Dec 2022 15:12 UTC
13 points
5 comments · 4 min read · LW link
(new-savanna.blogspot.com)

Conditions for Superrationality-motivated Cooperation in a one-shot Prisoner’s Dilemma

Jim Buhler · 19 Dec 2022 15:00 UTC
24 points
4 comments · 5 min read · LW link

Next Level Seinfeld

Zvi · 19 Dec 2022 13:30 UTC
50 points
8 comments · 1 min read · LW link
(thezvi.wordpress.com)

CEA Disambiguation

jefftk · 19 Dec 2022 13:20 UTC
24 points
0 comments · 1 min read · LW link
(www.jefftk.com)

Why mechanistic interpretability does not and cannot contribute to long-term AGI safety (from messages with a friend)

Remmelt · 19 Dec 2022 12:02 UTC
−3 points
9 comments · 31 min read · LW link

Hacker-AI and Cyberwar 2.0+

Erland Wittkotter · 19 Dec 2022 11:46 UTC
2 points
0 comments · 15 min read · LW link

Non-Technical Preparation for Hacker-AI and Cyberwar 2.0+

Erland Wittkotter · 19 Dec 2022 11:42 UTC
2 points
0 comments · 25 min read · LW link

An Effective Grab Bag

stavros · 19 Dec 2022 10:29 UTC
20 points
1 comment · 7 min read · LW link

Slick hyperfinite Ramsey theory proof

Alok Singh · 19 Dec 2022 8:40 UTC
8 points
3 comments · 1 min read · LW link
(alok.github.io)

The True Spirit of Solstice?

Raemon · 19 Dec 2022 8:00 UTC
69 points
31 comments · 9 min read · LW link

The Risk of Orbital Debris and One (Cheap) Way to Mitigate It

clans · 19 Dec 2022 3:16 UTC
13 points
1 comment · 4 min read · LW link
(locationtbd.home.blog)

Why I think that teaching philosophy is high impact

Eleni Angelou · 19 Dec 2022 3:11 UTC
5 points
0 comments · 2 min read · LW link

A template for doing annual reviews

peterslattery · 19 Dec 2022 3:09 UTC
2 points
0 comments · 1 min read · LW link

Event [Berkeley]: Alignment Collaborator Speed-Meeting

19 Dec 2022 2:24 UTC
18 points
2 comments · 1 min read · LW link

An easier(?) end to the electoral college

ejacob · 19 Dec 2022 2:09 UTC
2 points
2 comments · 2 min read · LW link

How Death Feels

sisyphus · 18 Dec 2022 23:47 UTC
−7 points
9 comments · 1 min read · LW link

Why Are Women Hot?

Jacob Falkovich · 18 Dec 2022 23:20 UTC
17 points
19 comments · 11 min read · LW link

[Question] Can we, in principle, know the measure of counterfactual quantum branches?

sisyphus · 18 Dec 2022 22:07 UTC
1 point
15 comments · 1 min read · LW link

Boston Solstice 2022 Retrospective

jefftk · 18 Dec 2022 19:00 UTC
19 points
3 comments · 5 min read · LW link
(www.jefftk.com)

Take 11: “Aligning language models” should be weirder.

Charlie Steiner · 18 Dec 2022 14:14 UTC
32 points
0 comments · 2 min read · LW link

Bad at Arithmetic, Promising at Math

cohenmacaulay · 18 Dec 2022 5:40 UTC
100 points
19 comments · 20 min read · LW link · 1 review

Overconfidence bubbles

kaputmi · 18 Dec 2022 2:07 UTC
3 points
0 comments · 2 min read · LW link

Positive values seem more robust and lasting than prohibitions

TurnTrout · 17 Dec 2022 21:43 UTC
51 points
13 comments · 2 min read · LW link

What we owe the microbiome

weverka · 17 Dec 2022 19:40 UTC
2 points
0 comments · 1 min read · LW link
(forum.effectivealtruism.org)

Why write more: improve your epistemics, self-care, & 28 other reasons

KatWoods · 17 Dec 2022 19:25 UTC
22 points
1 comment · 6 min read · LW link

Looking for an alignment tutor

JanB · 17 Dec 2022 19:08 UTC
15 points
2 comments · 1 min read · LW link

[Question] How to Convince my Son that Drugs are Bad

concerned_dad · 17 Dec 2022 18:47 UTC
139 points
84 comments · 2 min read · LW link

Ordinary human life

David Hugh-Jones · 17 Dec 2022 16:46 UTC
24 points
1 comment · 14 min read · LW link
(wyclif.substack.com)

Predictive Processing, Heterosexuality and Delusions of Grandeur

lsusr · 17 Dec 2022 7:37 UTC
36 points
12 comments · 5 min read · LW link

[Link] Escape the Echo Chamber (2018)

CronoDAS · 17 Dec 2022 6:14 UTC
13 points
0 comments · 2 min read · LW link
(aeon.co)

“Starry Night” Solstice Cookies

maia · 17 Dec 2022 5:31 UTC
17 points
0 comments · 1 min read · LW link

There have been 3 planes (billionaire donors) and 2 have crashed

trevor · 17 Dec 2022 3:58 UTC
16 points
10 comments · 2 min read · LW link

[Question] What about non-degree seeking?

Lao Mein · 17 Dec 2022 2:22 UTC
5 points
5 comments · 1 min read · LW link

Using Information Theory to tackle AI Alignment: A Practical Approach

Daniel Salami · 17 Dec 2022 1:37 UTC
10 points
4 comments · 7 min read · LW link

Paper: Constitutional AI: Harmlessness from AI Feedback (Anthropic)

LawrenceC · 16 Dec 2022 22:12 UTC
68 points
11 comments · 1 min read · LW link
(www.anthropic.com)

Vaguely interested in Effective Altruism? Please Take the Official 2022 EA Survey

Peter Wildeford · 16 Dec 2022 21:07 UTC
22 points
4 comments · 1 min read · LW link
(rethinkpriorities.qualtrics.com)

Abstract concepts and metalingual definition: Does ChatGPT understand justice and charity?

Bill Benzon · 16 Dec 2022 21:01 UTC
2 points
0 comments · 13 min read · LW link

Beyond the moment of invention

jasoncrawford · 16 Dec 2022 20:18 UTC
35 points
0 comments · 2 min read · LW link
(rootsofprogress.org)

[Question] What’s the best time-efficient alternative to the Sequences?

trevor · 16 Dec 2022 20:17 UTC
6 points
7 comments · 1 min read · LW link

Can we efficiently explain model behaviors?

paulfchristiano · 16 Dec 2022 19:40 UTC
64 points
3 comments · 9 min read · LW link
(ai-alignment.com)

Proper scoring rules don’t guarantee predicting fixed points

16 Dec 2022 18:22 UTC
68 points
8 comments · 21 min read · LW link

A learned agent is not the same as a learning agent

Ben Amitay · 16 Dec 2022 17:27 UTC
4 points
5 comments · 4 min read · LW link

[Question] College Selection Advice for Technical Alignment

TempCollegeAsk · 16 Dec 2022 17:11 UTC
11 points
8 comments · 1 min read · LW link

How important are accurate AI timelines for the optimal spending schedule on AI risk interventions?

Tristan Cook · 16 Dec 2022 16:05 UTC
27 points
2 comments · 1 min read · LW link