AI Safety Newsletter #7: Disinformation, Governance Recommendations for AI labs, and Senate Hearings on AI

May 23, 2023, 9:47 PM
25 points
0 comments
6 min read
LW link
(newsletter.safe.ai)

The Polarity Problem [Draft]

May 23, 2023, 9:05 PM
24 points
3 comments
44 min read
LW link

Progress links and tweets, 2023-05-23

jasoncrawford
May 23, 2023, 8:15 PM
16 points
0 comments
1 min read
LW link
(rootsofprogress.org)

Yoshua Bengio: How Rogue AIs may Arise

harfe
May 23, 2023, 6:28 PM
92 points
12 comments
18 min read
LW link
(yoshuabengio.org)

‘Fundamental’ vs ‘applied’ mechanistic interpretability research

Lee Sharkey
May 23, 2023, 6:26 PM
65 points
6 comments
3 min read
LW link

Coercion is an adaptation to scarcity; trust is an adaptation to abundance

Richard_Ngo
May 23, 2023, 6:14 PM
90 points
11 comments
4 min read
LW link

[Question] Is “brittle alignment” good enough?

the8thbit
May 23, 2023, 5:35 PM
9 points
5 comments
3 min read
LW link

Will Artificial Superintelligence Kill Us?

James_Miller
May 23, 2023, 4:27 PM
33 points
2 comments
22 min read
LW link

Phone Number Jingle

jefftk
May 23, 2023, 3:20 PM
11 points
12 comments
1 min read
LW link
(www.jefftk.com)

GPT4 is capable of writing decent long-form science fiction (with the right prompts)

RomanS
May 23, 2023, 1:41 PM
22 points
28 comments
65 min read
LW link

[Question] Do humans still provide value in correspondence chess?

Jonathan Paulson
May 23, 2023, 12:15 PM
24 points
16 comments
1 min read
LW link

[Linkpost] The AGI Show podcast

Soroush Pour
May 23, 2023, 9:52 AM
4 points
0 comments
1 min read
LW link

Data and “tokens” a 30 year old human “trains” on

Jose Miguel Cruz y Celis
May 23, 2023, 5:34 AM
14 points
15 comments
1 min read
LW link

How I learned to stop worrying and love skill trees

junk heap homotopy
May 23, 2023, 4:08 AM
81 points
3 comments
1 min read
LW link

T-Shirt Size Distribution

jefftk
May 23, 2023, 2:40 AM
9 points
0 comments
1 min read
LW link
(www.jefftk.com)

AI self-improvement is possible

bhauth
May 23, 2023, 2:32 AM
18 points
3 comments
8 min read
LW link

Worrying less about acausal extortion

Raemon
May 23, 2023, 2:08 AM
41 points
11 comments
13 min read
LW link

Self-leadership and self-love dissolve anger and trauma

Richard_Ngo
May 22, 2023, 10:30 PM
73 points
7 comments
5 min read
LW link

A Manifold market notice: Binance

Scrooge Mcduck
May 22, 2023, 10:24 PM
15 points
13 comments
1 min read
LW link

I don’t want to talk about AI

KirstenH
May 22, 2023, 9:23 PM
34 points
11 comments
2 min read
LW link
(ealifestyles.substack.com)

Activation additions in a small residual network

Garrett Baker
May 22, 2023, 8:28 PM
22 points
4 comments
3 min read
LW link

[Linkpost] “Governance of superintelligence” by OpenAI

Daniel_Eth
May 22, 2023, 8:15 PM
67 points
20 comments
LW link

Two Pieces of Advice About How to Remember Things

Bentham's Bulldog
May 22, 2023, 6:10 PM
13 points
3 comments
4 min read
LW link

Why I Believe LLMs Do Not Have Human-like Emotions

OneManyNone
May 22, 2023, 3:46 PM
13 points
6 comments
7 min read
LW link

AI Safety in China: Part 2

Lao Mein
May 22, 2023, 2:50 PM
103 points
28 comments
2 min read
LW link

Conjecture internal survey: AGI timelines and probability of human extinction from advanced AI

Maris Sala
May 22, 2023, 2:31 PM
155 points
5 comments
3 min read
LW link
(www.conjecture.dev)

Papers, Please #1: Various Papers on Employment, Wages and Productivity

Zvi
May 22, 2023, 12:00 PM
42 points
2 comments
8 min read
LW link
(thezvi.wordpress.com)

In Defense of «The Army of Jakoths»

MikkW
May 22, 2023, 11:59 AM
−14 points
10 comments
4 min read
LW link

Speed of information input is a bottleneck for rationality

MikkW
May 22, 2023, 10:24 AM
13 points
0 comments
4 min read
LW link

Distillation of Neurotech and Alignment Workshop January 2023

May 22, 2023, 7:17 AM
52 points
9 comments
14 min read
LW link

The Treacherous Turn is finished! (AI-takeover-themed tabletop RPG)

Daniel Kokotajlo
May 22, 2023, 5:49 AM
55 points
5 comments
2 min read
LW link
(thetreacherousturn.ai)

The Stanley Parable: Making philosophy fun

Nathan1123
May 22, 2023, 2:15 AM
6 points
3 comments
3 min read
LW link

Sea Monsters

Adam Zerner
May 22, 2023, 12:58 AM
29 points
11 comments
5 min read
LW link

The Army of Jakoths (a parable)

MikkW
May 21, 2023, 10:48 PM
−6 points
0 comments
1 min read
LW link

A&I (Rihanna ‘S&M’ parody lyrics)

nahoj
May 21, 2023, 10:34 PM
−2 points
0 comments
2 min read
LW link

Four Battlegrounds: Power in the Age of Artificial Intelligence (Book review)

PeterMcCluskey
May 21, 2023, 9:19 PM
25 points
0 comments
4 min read
LW link
(bayesianinvestor.com)

Gender Vectors in ROME’s Latent Space

Xodarap
May 21, 2023, 6:46 PM
14 points
2 comments
3 min read
LW link

Weight by Impact

Vaniver
May 21, 2023, 2:37 PM
29 points
1 comment
3 min read
LW link

[outdated] My current theory of change to mitigate existential risk by misaligned ASI

mesaoptimizer
May 21, 2023, 1:46 PM
32 points
8 comments
6 min read
LW link
(mesaoptimizer.com)

Babble on growing trust

qbolec
May 21, 2023, 1:19 PM
13 points
1 comment
5 min read
LW link

Elevator Positioning

jefftk
May 21, 2023, 11:30 AM
15 points
1 comment
1 min read
LW link
(www.jefftk.com)

Transformer Architecture Choice for Resisting Prompt Injection and Jail-Breaking Attacks

RogerDearnaley
May 21, 2023, 8:29 AM
9 points
1 comment
4 min read
LW link

Jeff Clune advertising a postdoc on twitter...and asking where he should target his posts

Joyee Chen
May 21, 2023, 1:02 AM
4 points
0 comments
1 min read
LW link

Running Sound for Yourself

jefftk
May 20, 2023, 10:10 PM
11 points
0 comments
2 min read
LW link
(www.jefftk.com)

Job Opening: SWE to help build signature vetting system for AI-related petitions

May 20, 2023, 7:02 PM
52 points
0 comments
1 min read
LW link

My Kind of Pragmatism

Nora Belrose
May 20, 2023, 6:58 PM
37 points
11 comments
3 min read
LW link

Colors Appear To Have Almost-Universal Symbolic Associations

Thoth Hermes
May 20, 2023, 6:40 PM
−33 points
4 comments
7 min read
LW link
(thothhermes.substack.com)

Twiblings, four-parent babies and other reproductive technology

GeneSmith
May 20, 2023, 5:11 PM
191 points
33 comments
6 min read
LW link

P-zombies, Compression and the Simulation Hypothesis

RussellThor
May 20, 2023, 11:36 AM
5 points
0 comments
5 min read
LW link

The possible shared Craft of deliberate Lexicogenesis

TsviBT
May 20, 2023, 5:56 AM
56 points
5 comments
5 min read
LW link