Extrapolating from Five Words

Gordon Seidoh Worley · Nov 15, 2023, 11:21 PM
40 points
11 comments · 2 min read · LW link

In Defense of Parselmouths

Screwtape · Nov 15, 2023, 11:02 PM
51 points
11 comments · 10 min read · LW link · 1 review

Life on the Grid (Part 1)

rogersbacon · Nov 15, 2023, 10:37 PM
12 points
4 comments · 9 min read · LW link
(www.secretorum.life)

Glomarization FAQ

Zane · Nov 15, 2023, 8:20 PM
33 points
5 comments · 5 min read · LW link

Testbed evals: evaluating AI safety even when it can’t be directly measured

joshc · Nov 15, 2023, 7:00 PM
71 points
2 comments · 4 min read · LW link

EA/ACX/LW November Santa Cruz Meetup

madmail · Nov 15, 2023, 6:39 PM
1 point
0 comments · 1 min read · LW link

New report: “Scheming AIs: Will AIs fake alignment during training in order to get power?”

Joe Carlsmith · Nov 15, 2023, 5:16 PM
81 points
28 comments · 30 min read · LW link · 1 review

Large Language Models can Strategically Deceive their Users when Put Under Pressure.

ReaderM · Nov 15, 2023, 4:36 PM
89 points
9 comments · 2 min read · LW link · 1 review
(arxiv.org)

AISN #26: National Institutions for AI Safety, Results From the UK Summit, and New Releases From OpenAI and xAI

Nov 15, 2023, 4:07 PM
13 points
0 comments · 6 min read · LW link
(newsletter.safe.ai)

‘Theories of Values’ and ‘Theories of Agents’: confusions, musings and desiderata

Nov 15, 2023, 4:00 PM
35 points
8 comments · 24 min read · LW link

Experiences and learnings from both sides of the AI safety job market

Marius Hobbhahn · Nov 15, 2023, 3:40 PM
110 points
4 comments · 18 min read · LW link

Good businesses create epistemic monopolies

Logan Kieller · Nov 15, 2023, 2:04 PM
−2 points
2 comments · 4 min read · LW link
(logankieller.substack.com)

A conceptual precursor to today’s language machines [Shannon]

Bill Benzon · Nov 15, 2023, 1:50 PM
24 points
6 comments · 2 min read · LW link

[Question] Should Advanced Placement High School classes discuss Israel-Palestine? If so, how? If not, why? Who should make this decision?

Gesild Muka · Nov 15, 2023, 4:50 AM
−1 points
5 comments · 1 min read · LW link

Reinforcement Via Giving People Cookies

Screwtape · Nov 15, 2023, 4:34 AM
70 points
9 comments · 6 min read · LW link

Incidental polysemanticity

Nov 15, 2023, 4:00 AM
43 points
7 comments · 11 min read · LW link

LLMs May Find It Hard to FOOM

RogerDearnaley · Nov 15, 2023, 2:52 AM
11 points
30 comments · 12 min read · LW link

Linearity Fallacies

hippo · Nov 15, 2023, 2:23 AM
15 points
0 comments · 5 min read · LW link

SIA Is Just Being a Bayesian About the Fact That One Exists

omnizoid · Nov 14, 2023, 10:55 PM
3 points
5 comments · 4 min read · LW link

AI Alignment [progress] this Week (11/12/2023)

Logan Zoellner · Nov 14, 2023, 10:21 PM
6 points
0 comments · 2 min read · LW link
(midwitalignment.substack.com)

[Question] When did Eliezer Yudkowsky change his mind about neural networks?

[deactivated] · Nov 14, 2023, 9:24 PM
31 points
15 comments · 1 min read · LW link

Betting on what is un-falsifiable and un-verifiable

Abhimanyu Pallavi Sudhir · Nov 14, 2023, 9:11 PM
13 points
0 comments · 15 min read · LW link

Facebook is Paying Me to Post

jefftk · Nov 14, 2023, 7:10 PM
26 points
5 comments · 1 min read · LW link
(www.jefftk.com)

Feelings, Nothing More than Feelings, About AI

PaulBecon · Nov 14, 2023, 6:50 PM
7 points
0 comments · 3 min read · LW link

Kids or No kids

Kids or no kids · Nov 14, 2023, 6:37 PM
98 points
10 comments · 13 min read · LW link

Raemon’s Deliberate (“Purposeful?”) Practice Club

Nov 14, 2023, 6:24 PM
61 points
11 comments · 22 min read · LW link

More metal less ore

Logan Kieller · Nov 14, 2023, 4:59 PM
6 points
3 comments · 2 min read · LW link
(logankieller.substack.com)

Monthly Roundup #12: November 2023

Zvi · Nov 14, 2023, 3:20 PM
34 points
5 comments · 33 min read · LW link
(thezvi.wordpress.com)

Do you want a first-principled preparedness guide to prepare yourself and loved ones for potential catastrophes?

Ulrik Horn · Nov 14, 2023, 12:13 PM
16 points
5 comments · 15 min read · LW link

[Question] Is there Work on Embedded Agency in Cellular Automata Toy Models?

Johannes C. Mayer · Nov 14, 2023, 9:08 AM
10 points
0 comments · 1 min read · LW link

[Question] Would this be Progress in Solving Embedded Agency?

Johannes C. Mayer · Nov 14, 2023, 9:08 AM
9 points
2 comments · 2 min read · LW link

Is Interpretability All We Need?

RogerDearnaley · Nov 14, 2023, 5:31 AM
1 point
1 comment · 1 min read · LW link

What is wisdom?

TsviBT · Nov 14, 2023, 2:13 AM
39 points
3 comments · 13 min read · LW link

Festival Stats 2023

jefftk · Nov 14, 2023, 1:20 AM
9 points
0 comments · 1 min read · LW link
(www.jefftk.com)

Out of the Box

jesseduffield · Nov 13, 2023, 11:43 PM
5 points
1 comment · 7 min read · LW link

Loudly Give Up, Don’t Quietly Fade

Screwtape · Nov 13, 2023, 11:30 PM
165 points
12 comments · 6 min read · LW link · 1 review

Great Empathy and Great Response Ability

positivesum · Nov 13, 2023, 11:04 PM
16 points
0 comments · 3 min read · LW link
(tryingtruly.substack.com)

Theories of Change for AI Auditing

Nov 13, 2023, 7:33 PM
54 points
0 comments · 18 min read · LW link
(www.apolloresearch.ai)

They are made of repeating patterns

quetzal_rainbow · Nov 13, 2023, 6:17 PM
53 points
4 comments · 2 min read · LW link

How to Upload a Mind (In Three Not-So-Easy Steps)

Nov 13, 2023, 6:13 PM
26 points
0 comments · 7 min read · LW link
(youtu.be)

Non-myopia stories

lberglund · Nov 13, 2023, 5:52 PM
29 points
10 comments · 7 min read · LW link

It’s OK to eat shrimp: EAs Make Invalid Inferences About Fish Qualia and Moral Patienthood

Mikhail Samin · Nov 13, 2023, 4:51 PM
0 points
17 comments · LW link

Suggestions for chess puzzles

Zane · Nov 13, 2023, 3:39 PM
13 points
1 comment · 1 min read · LW link

Why small phenomenons are relevant to morality

Ryo · Nov 13, 2023, 3:25 PM
1 point
0 comments · 3 min read · LW link

Optionality approach to ethics

Ryo · Nov 13, 2023, 3:23 PM
7 points
2 comments · 3 min read · LW link

Redirecting one’s own taxes as an effective altruism method

David Gross · Nov 13, 2023, 3:17 PM
−5 points
34 comments · 16 min read · LW link

AISC Project: Benchmarks for Stable Reflectivity

jacquesthibs · Nov 13, 2023, 2:51 PM
17 points
0 comments · 8 min read · LW link

Research Adenda: Modelling Trajectories of Language Models

NickyP · Nov 13, 2023, 2:33 PM
28 points
0 comments · 12 min read · LW link

Bostrom Goes Unheard

Zvi · Nov 13, 2023, 2:11 PM
81 points
9 comments · 18 min read · LW link

November hangout in Warsaw

ntoxeg · Nov 13, 2023, 1:20 PM
1 point
1 comment · 1 min read · LW link