Some Rules for an Algebra of Bayes Nets

Nov 16, 2023, 11:53 PM
85 points
45 comments · 14 min read · LW link · 1 review

How much to update on recent AI governance moves?

Nov 16, 2023, 11:46 PM
112 points
5 comments · 29 min read · LW link

New LessWrong feature: Dialogue Matching

Bird Concept · Nov 16, 2023, 9:27 PM
106 points
22 comments · 3 min read · LW link

Towards Evaluating AI Systems for Moral Status Using Self-Reports

Nov 16, 2023, 8:18 PM
45 points
3 comments · 1 min read · LW link
(arxiv.org)

Social Dark Matter

Duncan Sabien (Inactive) · Nov 16, 2023, 8:00 PM
362 points
127 comments · 34 min read · LW link · 2 reviews

AI #38: Let’s Make a Deal

Zvi · Nov 16, 2023, 7:50 PM
44 points
2 comments · 55 min read · LW link
(thezvi.wordpress.com)

Forecasting AI (Overview)

jsteinhardt · Nov 16, 2023, 7:00 PM
35 points
0 comments · 2 min read · LW link
(bounded-regret.ghost.io)

We Should Talk About This More. Epistemic World Collapse as Imminent Safety Risk of Generative AI.

Joerg Weiss · Nov 16, 2023, 6:46 PM
11 points
2 comments · 29 min read · LW link

Intelligence in systems (human, AI) can be conceptualized as the resolution and throughput at which a system can process and affect Shannon information.

AiresJL · Nov 16, 2023, 5:46 PM
0 points
0 comments · 2 min read · LW link

Life on the Grid (Part 2)

rogersbacon · Nov 16, 2023, 5:22 PM
7 points
0 comments · 15 min read · LW link
(www.secretorum.life)

The impossibility of rationally analyzing partisan news

RationalDino · Nov 16, 2023, 4:19 PM
4 points
4 comments · 1 min read · LW link

We are Peacecraft.ai!

MadHatter · Nov 16, 2023, 2:15 PM
15 points
20 comments · 2 min read · LW link

A dialectical view of the history of AI, Part 1: We’re only in the antithesis phase. [A synthesis is in the future.]

Bill Benzon · Nov 16, 2023, 12:34 PM
6 points
0 comments · 12 min read · LW link

[Question] How much fraud is there in academia?

ChristianKl · Nov 16, 2023, 11:50 AM
23 points
10 comments · 1 min read · LW link

Learning coefficient estimation: the details

Zach Furman · Nov 16, 2023, 3:19 AM
36 points
0 comments · 2 min read · LW link
(colab.research.google.com)

[Question] AI Safety orgs- what’s your biggest bottleneck right now?

Kabir Kumar · Nov 16, 2023, 2:02 AM
1 point
0 comments · 1 min read · LW link

My critique of Eliezer’s deeply irrational beliefs

Jorterder · Nov 16, 2023, 12:34 AM
−35 points
1 comment · 9 min read · LW link
(docs.google.com)

Extrapolating from Five Words

Gordon Seidoh Worley · Nov 15, 2023, 11:21 PM
40 points
11 comments · 2 min read · LW link

In Defense of Parselmouths

Screwtape · Nov 15, 2023, 11:02 PM
51 points
11 comments · 10 min read · LW link · 1 review

Life on the Grid (Part 1)

rogersbacon · Nov 15, 2023, 10:37 PM
12 points
4 comments · 9 min read · LW link
(www.secretorum.life)

Glomarization FAQ

Zane · Nov 15, 2023, 8:20 PM
33 points
5 comments · 5 min read · LW link

Testbed evals: evaluating AI safety even when it can’t be directly measured

joshc · Nov 15, 2023, 7:00 PM
71 points
2 comments · 4 min read · LW link

EA/ACX/LW November Santa Cruz Meetup

madmail · Nov 15, 2023, 6:39 PM
1 point
0 comments · 1 min read · LW link

New report: “Scheming AIs: Will AIs fake alignment during training in order to get power?”

Joe Carlsmith · Nov 15, 2023, 5:16 PM
81 points
28 comments · 30 min read · LW link · 1 review

Large Language Models can Strategically Deceive their Users when Put Under Pressure.

ReaderM · Nov 15, 2023, 4:36 PM
89 points
9 comments · 2 min read · LW link · 1 review
(arxiv.org)

AISN #26: National Institutions for AI Safety, Results From the UK Summit, and New Releases From OpenAI and xAI

Nov 15, 2023, 4:07 PM
13 points
0 comments · 6 min read · LW link
(newsletter.safe.ai)

‘Theories of Values’ and ‘Theories of Agents’: confusions, musings and desiderata

Nov 15, 2023, 4:00 PM
35 points
8 comments · 24 min read · LW link

Experiences and learnings from both sides of the AI safety job market

Marius Hobbhahn · Nov 15, 2023, 3:40 PM
110 points
4 comments · 18 min read · LW link

Good businesses create epistemic monopolies

Logan Kieller · Nov 15, 2023, 2:04 PM
−2 points
2 comments · 4 min read · LW link
(logankieller.substack.com)

A conceptual precursor to today’s language machines [Shannon]

Bill Benzon · Nov 15, 2023, 1:50 PM
24 points
6 comments · 2 min read · LW link

[Question] Should Advanced Placement High School classes discuss Israel-Palestine? If so, how? If not, why? Who should make this decision?

Gesild Muka · Nov 15, 2023, 4:50 AM
−1 points
5 comments · 1 min read · LW link

Reinforcement Via Giving People Cookies

Screwtape · Nov 15, 2023, 4:34 AM
70 points
9 comments · 6 min read · LW link

Incidental polysemanticity

Nov 15, 2023, 4:00 AM
43 points
7 comments · 11 min read · LW link

LLMs May Find It Hard to FOOM

RogerDearnaley · Nov 15, 2023, 2:52 AM
11 points
30 comments · 12 min read · LW link

Linearity Fallacies

hippo · Nov 15, 2023, 2:23 AM
15 points
0 comments · 5 min read · LW link

SIA Is Just Being a Bayesian About the Fact That One Exists

omnizoid · Nov 14, 2023, 10:55 PM
3 points
5 comments · 4 min read · LW link

AI Alignment [progress] this Week (11/12/2023)

Logan Zoellner · Nov 14, 2023, 10:21 PM
6 points
0 comments · 2 min read · LW link
(midwitalignment.substack.com)

[Question] When did Eliezer Yudkowsky change his mind about neural networks?

[deactivated] · Nov 14, 2023, 9:24 PM
31 points
15 comments · 1 min read · LW link

Betting on what is un-falsifiable and un-verifiable

Abhimanyu Pallavi Sudhir · Nov 14, 2023, 9:11 PM
13 points
0 comments · 15 min read · LW link

Facebook is Paying Me to Post

jefftk · Nov 14, 2023, 7:10 PM
26 points
5 comments · 1 min read · LW link
(www.jefftk.com)

Feelings, Nothing More than Feelings, About AI

PaulBecon · Nov 14, 2023, 6:50 PM
7 points
0 comments · 3 min read · LW link

Kids or No kids

Kids or no kids · Nov 14, 2023, 6:37 PM
98 points
10 comments · 13 min read · LW link

Raemon’s Deliberate (“Purposeful?”) Practice Club

Nov 14, 2023, 6:24 PM
61 points
11 comments · 22 min read · LW link

More metal less ore

Logan Kieller · Nov 14, 2023, 4:59 PM
6 points
3 comments · 2 min read · LW link
(logankieller.substack.com)

Monthly Roundup #12: November 2023

Zvi · Nov 14, 2023, 3:20 PM
34 points
5 comments · 33 min read · LW link
(thezvi.wordpress.com)

Do you want a first-principled preparedness guide to prepare yourself and loved ones for potential catastrophes?

Ulrik Horn · Nov 14, 2023, 12:13 PM
16 points
5 comments · 15 min read · LW link

[Question] Is there Work on Embedded Agency in Cellular Automata Toy Models?

Johannes C. Mayer · Nov 14, 2023, 9:08 AM
10 points
0 comments · 1 min read · LW link

[Question] Would this be Progress in Solving Embedded Agency?

Johannes C. Mayer · Nov 14, 2023, 9:08 AM
9 points
2 comments · 2 min read · LW link

Is Interpretability All We Need?

RogerDearnaley · Nov 14, 2023, 5:31 AM
1 point
1 comment · 1 min read · LW link

What is wisdom?

TsviBT · Nov 14, 2023, 2:13 AM
39 points
3 comments · 13 min read · LW link