[Linkpost] Personal and Psychological Dimensions of AI Researchers Confronting AI Catastrophic Risks

Bogdan Ionut Cirstea · Aug 12, 2023, 10:02 PM
42 points
0 comments · 1 min read · LW link

The Empathy Engine: A Deconstruction of the Societal Metamorphosis through Technological Empathy Augmentation

bigdickproblems · Aug 12, 2023, 6:23 PM
−30 points
3 comments · 2 min read · LW link

The Benevolent Ruler’s Handbook (Part 2): Morality Rules

FCCC · Aug 12, 2023, 2:25 PM
5 points
0 comments · 4 min read · LW link

Learning as you play: anthropic shadow in deadly games

dr_s · Aug 12, 2023, 7:34 AM
37 points
28 comments · 35 min read · LW link

Biological Anchors: The Trick that Might or Might Not Work

Scott Alexander · Aug 12, 2023, 12:53 AM
91 points
3 comments · 33 min read · LW link
(astralcodexten.substack.com)

Simulate the CEO

robotelvis · Aug 12, 2023, 12:09 AM
23 points
5 comments · 5 min read · LW link
(messyprogress.substack.com)

How to decide under low-stakes uncertainty

dkl9 · Aug 11, 2023, 6:07 PM
11 points
4 comments · 1 min read · LW link
(dkl9.net)

The Pandemic is Only Beginning: The Long COVID Disaster

salvatore mattera · Aug 11, 2023, 5:36 PM
−6 points
15 comments · 8 min read · LW link

When discussing AI risks, talk about capabilities, not intelligence

Vika · Aug 11, 2023, 1:38 PM
124 points
7 comments · 3 min read · LW link
(vkrakovna.wordpress.com)

What are the flaws in this AGI argument?

William the Kiwi · Aug 11, 2023, 11:31 AM
5 points
14 comments · 1 min read · LW link

Google DeepMind’s RT-2

SandXbox · Aug 11, 2023, 11:26 AM
9 points
1 comment · 1 min read · LW link
(robotics-transformer2.github.io)

Linkpost: We need another Expert Survey on Progress in AI, urgently

David Mears · Aug 11, 2023, 8:22 AM
25 points
2 comments · 2 min read · LW link
(open.substack.com)

What Does a Marginal Grant at LTFF Look Like? Funding Priorities and Grantmaking Thresholds at the Long-Term Future Fund

Aug 11, 2023, 3:59 AM
64 points
0 comments · 1 min read · LW link
(forum.effectivealtruism.org)

[Question] Will posting any thread on LW guarantee that a LLM will index all my content, and if questions people ask to the LLM after my name will surface up all my LW content?

Alex K. Chen (parrot) · Aug 11, 2023, 1:40 AM
0 points
0 comments · 1 min read · LW link

AI Safety Concepts Writeup: WebGPT

JustisMills · Aug 11, 2023, 1:35 AM
9 points
1 comment · 7 min read · LW link

[Question] What is science?

Adam Zerner · Aug 11, 2023, 12:00 AM
6 points
4 comments · 1 min read · LW link

Three configurable prettyprinters

philh · Aug 10, 2023, 11:10 PM
9 points
0 comments · 22 min read · LW link
(reasonableapproximation.net)

Ilya Sutskever’s thoughts on AI safety (July 2023): a transcript with my comments

mishka · Aug 10, 2023, 7:07 PM
21 points
3 comments · 5 min read · LW link

Seeking Input to AI Safety Book for non-technical audience

Darren McKee · Aug 10, 2023, 5:58 PM
10 points
4 comments · 1 min read · LW link

Evaluating GPT-4 Theory of Mind Capabilities

Aug 10, 2023, 5:57 PM
15 points
2 comments · 14 min read · LW link

Some alignment ideas

SelonNerias · Aug 10, 2023, 5:51 PM
1 point
0 comments · 11 min read · LW link

Self Supervised Learning (SSL)

Varshul Gupta · Aug 10, 2023, 5:43 PM
5 points
1 comment · 2 min read · LW link
(dubverseblack.substack.com)

Predicting Virus Relative Abundance in Wastewater

jefftk · Aug 10, 2023, 3:46 PM
33 points
2 comments · LW link
(naobservatory.org)

AI #24: Week of the Podcast

Zvi · Aug 10, 2023, 3:00 PM
49 points
5 comments · 44 min read · LW link
(thezvi.wordpress.com)

Could We Automate AI Alignment Research?

Stephen McAleese · Aug 10, 2023, 12:17 PM
34 points
10 comments · 21 min read · LW link

The positional embedding matrix and previous-token heads: how do they actually work?

AdamYedidia · Aug 10, 2023, 1:58 AM
27 points
4 comments · 13 min read · LW link

LLMs are (mostly) not helped by filler tokens

Kshitij Sachan · Aug 10, 2023, 12:48 AM
66 points
35 comments · 6 min read · LW link

2023 ACX Meetups Everywhere—Newton, MA

duck_master · Aug 9, 2023, 10:47 PM
6 points
2 comments · 1 min read · LW link

Progress links digest, 2023-08-09: US adds new nuclear, Katalin Karikó interview, and more

jasoncrawford · Aug 9, 2023, 7:22 PM
18 points
0 comments · 3 min read · LW link
(rootsofprogress.org)

Mech Interp Challenge: August—Deciphering the First Unique Character Model

CallumMcDougall · Aug 9, 2023, 7:14 PM
36 points
1 comment · 3 min read · LW link

Real Meaning of life has been found. Eliezer discovered it in 2000′s.

Jorterder · Aug 9, 2023, 6:13 PM
−15 points
1 comment · 1 min read · LW link
(docs.google.com)

Marginal Revolution unofficial birthday party

Derek M. Jones · Aug 9, 2023, 2:35 PM
4 points
0 comments · 1 min read · LW link

A content analysis of the SQ-R questionnaire and a proposal for testing EQ-SQ theory

tailcalled · Aug 9, 2023, 1:51 PM
10 points
2 comments · 13 min read · LW link

[Question] Does LessWrong allow exempting posts from being scraped by GPTBot?

mic · Aug 9, 2023, 1:02 PM
29 points
3 comments · 1 min read · LW link

If I Was An Eccentric Trillionaire

niplav · Aug 9, 2023, 7:56 AM
9 points
8 comments · 26 min read · LW link

Modulating sycophancy in an RLHF model via activation steering

Nina Panickssery · Aug 9, 2023, 7:06 AM
69 points
20 comments · 12 min read · LW link

Open Thread—August 2023

habryka · Aug 9, 2023, 3:52 AM
18 points
49 comments · 1 min read · LW link

marine cloud brightening

bhauth · Aug 9, 2023, 2:50 AM
40 points
14 comments · 3 min read · LW link
(www.bhauth.com)

Inflection.ai is a major AGI lab

Nikola Jurkovic · Aug 9, 2023, 1:05 AM
137 points
13 comments · 2 min read · LW link

Acausal Now: We could totally acausally bargain with aliens at our current tech level if desired

Christopher King · Aug 9, 2023, 12:50 AM
1 point
5 comments · 4 min read · LW link

Necromancy’s unintended consequences.

Christopher King · Aug 9, 2023, 12:08 AM
−6 points
2 comments · 2 min read · LW link

What’s A “Market”?

johnswentworth · Aug 8, 2023, 11:29 PM
94 points
16 comments · 10 min read · LW link

Podcast (+transcript): Nathan Barnard on how US financial regulation can inform AI governance

Aaron Bergman · Aug 8, 2023, 9:46 PM
8 points
0 comments · LW link
(www.aaronbergman.net)

What are the flaws in this argument about p(Doom)?

William the Kiwi · Aug 8, 2023, 8:34 PM
−2 points
26 comments · 1 min read · LW link

A Simple Theory Of Consciousness

SherlockHolmes · Aug 8, 2023, 6:05 PM
2 points
5 comments · 1 min read · LW link
(peterholmes.medium.com)

[Linkpost] Rationally awake

jpc · Aug 8, 2023, 5:59 PM
−1 points
0 comments · 4 min read · LW link
(jpc.dev)

Yet more UFO Betting: Put Up or Shut Up

MoreRatsWrongReUAP · Aug 8, 2023, 5:50 PM
10 points
18 comments · 1 min read · LW link

AISN #18: Challenges of Reinforcement Learning from Human Feedback, Microsoft’s Security Breach, and Conceptual Research on AI Safety

Dan H · Aug 8, 2023, 3:52 PM
13 points
0 comments · LW link
(newsletter.safe.ai)

[Question] Beginner’s question about RLHF

FTPickle · Aug 8, 2023, 3:48 PM
1 point
3 comments · 1 min read · LW link

My Trial Period as an Independent Alignment Researcher

Bart Bussmann · Aug 8, 2023, 2:16 PM
34 points
1 comment · 3 min read · LW link