[Question] What vegan food resources have you found useful?

Elizabeth · 25 May 2023 22:46 UTC
29 points
6 comments · 1 min read · LW link

Mob and Bailey

Screwtape · 25 May 2023 22:14 UTC
75 points
15 comments · 7 min read · LW link

Look At What’s In Front Of You (Conclusion to The Nuts and Bolts of Naturalism)

LoganStrohl · 25 May 2023 19:00 UTC
50 points
1 comment · 2 min read · LW link

[Market] Will AI xrisk seem to be handled seriously by the end of 2026?

tailcalled · 25 May 2023 18:51 UTC
15 points
2 comments · 1 min read · LW link
(manifold.markets)

[Question] What should my college major be if I want to do AI alignment research?

metachirality · 25 May 2023 18:23 UTC
8 points
7 comments · 1 min read · LW link

Is behavioral safety “solved” in non-adversarial conditions?

Robert_AIZI · 25 May 2023 17:56 UTC
26 points
8 comments · 2 min read · LW link
(aizi.substack.com)

Book Review: How Minds Change

bc4026bd4aaa5b7fe · 25 May 2023 17:55 UTC
298 points
52 comments · 15 min read · LW link

Self-administered EMDR without a therapist is very useful for a lot of things!

Anton Rodenhauser · 25 May 2023 17:54 UTC
42 points
10 comments · 11 min read · LW link

RecurrentGPT: a loom-type tool with a twist

mishka · 25 May 2023 17:09 UTC
10 points
0 comments · 3 min read · LW link
(arxiv.org)

The Genie in the Bottle: An Introduction to AI Alignment and Risk

Snorkelfarsan · 25 May 2023 16:30 UTC
5 points
1 comment · 25 min read · LW link

AI #13: Potential Algorithmic Improvements

Zvi · 25 May 2023 15:40 UTC
45 points
4 comments · 67 min read · LW link
(thezvi.wordpress.com)

Solving the Mechanistic Interpretability challenges: EIS VII Challenge 2

25 May 2023 15:37 UTC
71 points
1 comment · 13 min read · LW link

Malthusian Competition (not as bad as it seems)

Logan Zoellner · 25 May 2023 15:30 UTC
6 points
11 comments · 2 min read · LW link

You Don’t Always Need Indexes

jefftk · 25 May 2023 14:20 UTC
22 points
6 comments · 1 min read · LW link
(www.jefftk.com)

Theories of Biological Inspiration

Eric Zhang · 25 May 2023 13:07 UTC
7 points
3 comments · 1 min read · LW link

Evaluating strategic reasoning in GPT models

phelps-sg · 25 May 2023 11:51 UTC
4 points
1 comment · 8 min read · LW link

Requirements for a STEM-capable AGI Value Learner (my Case for Less Doom)

RogerDearnaley · 25 May 2023 9:26 UTC
32 points
3 comments · 15 min read · LW link

Alignment solutions for weak AI don’t (necessarily) scale to strong AI

Michael Tontchev · 25 May 2023 8:26 UTC
6 points
0 comments · 5 min read · LW link

[Question] What features would you like to see in a personal forecasting / prediction tracking app?

regnarg · 25 May 2023 8:18 UTC
9 points
0 comments · 1 min read · LW link

Announcing the Confido app: bringing forecasting to everyone

regnarg · 25 May 2023 8:18 UTC
6 points
2 comments · 10 min read · LW link
(forum.effectivealtruism.org)

But What If We Actually Want To Maximize Paperclips?

snerx · 25 May 2023 7:13 UTC
−17 points
6 comments · 7 min read · LW link

Exploiting Newcomb’s Game Show

carterallen · 25 May 2023 4:01 UTC
8 points
2 comments · 2 min read · LW link

DeepMind: Model evaluation for extreme risks

Zach Stein-Perlman · 25 May 2023 3:00 UTC
94 points
11 comments · 1 min read · LW link
(arxiv.org)

Why I’m Not (Yet) A Full-Time Technical Alignment Researcher

NicholasKross · 25 May 2023 1:26 UTC
39 points
21 comments · 4 min read · LW link
(www.thinkingmuchbetter.com)

Two ideas for alignment, perpetual mutual distrust and induction

APaleBlueDot · 25 May 2023 0:56 UTC
1 point
2 comments · 4 min read · LW link

Evaluating Evidence Reconstructions of Mock Crimes - Submission 2

Alan E Dunne · 24 May 2023 22:17 UTC
−1 points
1 comment · 3 min read · LW link

[Linkpost] Interpretability Dreams

DanielFilan · 24 May 2023 21:08 UTC
39 points
2 comments · 2 min read · LW link
(transformer-circuits.pub)

Rishi Sunak mentions “existential threats” in talk with OpenAI, DeepMind, Anthropic CEOs

24 May 2023 21:06 UTC
34 points
1 comment · 1 min read · LW link
(www.gov.uk)

If you’re not a morning person, consider quitting allergy pills

Brendan Long · 24 May 2023 20:11 UTC
8 points
3 comments · 1 min read · LW link

Adumbrations on AGI from an outsider

nicholashalden · 24 May 2023 17:41 UTC
55 points
44 comments · 8 min read · LW link
(nicholashalden.home.blog)

Open Thread With Experimental Feature: Reactions

jimrandomh · 24 May 2023 16:46 UTC
101 points
189 comments · 3 min read · LW link

A rejection of the Orthogonality Thesis

ArisC · 24 May 2023 16:37 UTC
−2 points
11 comments · 2 min read · LW link
(medium.com)

Aligned AI via monitoring objectives in AutoGPT-like systems

Paul Colognese · 24 May 2023 15:59 UTC
27 points
4 comments · 4 min read · LW link

The Office of Science and Technology Policy put out a request for information on A.I.

HiroSakuraba · 24 May 2023 13:33 UTC
59 points
4 comments · 1 min read · LW link
(www.whitehouse.gov)

ChatGPT (May 2023) on Designing Friendly Superintelligence

Mitchell_Porter · 24 May 2023 10:47 UTC
5 points
0 comments · 1 min read · LW link
(singularitypolitics.wordpress.com)

No—AI is just as energy-efficient as your brain.

Maxwell Clarke · 24 May 2023 2:30 UTC
9 points
7 comments · 1 min read · LW link

[Question] What projects and efforts are there to promote AI safety research?

Christopher King · 24 May 2023 0:33 UTC
4 points
0 comments · 1 min read · LW link

My May 2023 priorities for AI x-safety: more empathy, more unification of concerns, and less vilification of OpenAI

Andrew_Critch · 24 May 2023 0:02 UTC
272 points
39 comments · 8 min read · LW link

AI Safety Newsletter #7: Disinformation, Governance Recommendations for AI labs, and Senate Hearings on AI

23 May 2023 21:47 UTC
25 points
0 comments · 6 min read · LW link
(newsletter.safe.ai)

The Polarity Problem [Draft]

23 May 2023 21:05 UTC
24 points
3 comments · 44 min read · LW link

Progress links and tweets, 2023-05-23

jasoncrawford · 23 May 2023 20:15 UTC
16 points
0 comments · 1 min read · LW link
(rootsofprogress.org)

Yoshua Bengio: How Rogue AIs may Arise

harfe · 23 May 2023 18:28 UTC
92 points
12 comments · 18 min read · LW link
(yoshuabengio.org)

‘Fundamental’ vs ‘applied’ mechanistic interpretability research

Lee Sharkey · 23 May 2023 18:26 UTC
62 points
6 comments · 3 min read · LW link

Coercion is an adaptation to scarcity; trust is an adaptation to abundance

Richard_Ngo · 23 May 2023 18:14 UTC
86 points
11 comments · 4 min read · LW link

[Question] Is “brittle alignment” good enough?

the8thbit · 23 May 2023 17:35 UTC
9 points
5 comments · 3 min read · LW link

Will Artificial Superintelligence Kill Us?

James_Miller · 23 May 2023 16:27 UTC
33 points
2 comments · 22 min read · LW link

Phone Number Jingle

jefftk · 23 May 2023 15:20 UTC
11 points
12 comments · 1 min read · LW link
(www.jefftk.com)

GPT4 is capable of writing decent long-form science fiction (with the right prompts)

RomanS · 23 May 2023 13:41 UTC
22 points
28 comments · 65 min read · LW link

[Question] Do humans still provide value in correspondence chess?

Jonathan Paulson · 23 May 2023 12:15 UTC
24 points
16 comments · 1 min read · LW link

[Linkpost] The AGI Show podcast

Soroush Pour · 23 May 2023 9:52 UTC
4 points
0 comments · 1 min read · LW link