Don’t want Goodhart? — Specify the damn variables

Yan Lyutnev · Nov 21, 2024, 10:45 PM
−3 points
2 comments · 5 min read · LW link

Don’t want Goodhart? — Specify the variables more

YanLyutnev · Nov 21, 2024, 10:43 PM
2 points
2 comments · 5 min read · LW link

Aligning AI Safety Projects with a Republican Administration

Deric Cheng · Nov 21, 2024, 10:12 PM
33 points
1 comment · 8 min read · LW link

Entropic strategy in Two Truths and a Lie

dkl9 · Nov 21, 2024, 10:03 PM
4 points
2 comments · 1 min read · LW link
(dkl9.net)

The Three Warnings of the Zentradi

Trevor Hill-Hand · Nov 21, 2024, 8:28 PM
13 points
1 comment · 5 min read · LW link

[Question] Which things were you surprised to learn are not metaphors?

Eric Neyman · Nov 21, 2024, 6:56 PM
135 points
88 comments · 1 min read · LW link

Epistemic status: poetry (and other poems)

Richard_Ngo · Nov 21, 2024, 6:13 PM
51 points
5 comments · 2 min read · LW link
(www.narrativeark.xyz)

OpenAI’s CBRN tests seem unclear

LucaRighetti · Nov 21, 2024, 5:28 PM
124 points
6 comments · 7 min read · LW link
(www.planned-obsolescence.org)

I Have A New Paper Out Arguing Against The Asymmetry And For The Existence of Happy People Being Very Good

omnizoid · Nov 21, 2024, 5:21 PM
9 points
3 comments · 9 min read · LW link

Dangerous capability tests should be harder

LucaRighetti · Nov 21, 2024, 5:20 PM
44 points
3 comments · 5 min read · LW link
(www.planned-obsolescence.org)

Action derivatives: You’re not doing what you think you’re doing

PatrickDFarley · Nov 21, 2024, 4:24 PM
26 points
0 comments · 3 min read · LW link

AI #91: Deep Thinking

Zvi · Nov 21, 2024, 2:30 PM
47 points
11 comments · 56 min read · LW link
(thezvi.wordpress.com)

Secular Solstice Round Up 2024

dspeyer · Nov 21, 2024, 10:49 AM
76 points
15 comments · 1 min read · LW link

An Epistemological Nightmare

Ariel Cheng · Nov 21, 2024, 2:08 AM
6 points
1 comment · 2 min read · LW link
(www.mit.edu)

A Conflicted Linkspost

Screwtape · Nov 21, 2024, 12:37 AM
52 points
0 comments · 3 min read · LW link

DeepSeek beats o1-preview on math, ties on coding; will release weights

Zach Stein-Perlman · Nov 20, 2024, 11:50 PM
113 points
26 comments · 1 min read · LW link

Expected Utility, Geometric Utility, and Other Equivalent Representations

StrivingForLegibility · Nov 20, 2024, 11:28 PM
10 points
0 comments · 11 min read · LW link

[Question] Green thumb

Pug stanky · Nov 20, 2024, 9:52 PM
−12 points
1 comment · 2 min read · LW link

Cost, Not Sacrifice

Joe Rogero · Nov 20, 2024, 9:32 PM
75 points
13 comments · LW link
(subatomicarticles.com)

China Hawks are Manufacturing an AI Arms Race

garrison · Nov 20, 2024, 6:17 PM
144 points
44 comments · LW link
(garrisonlovely.substack.com)

Why I Think All The Species Of Significantly Debated Consciousness Are Conscious And Suffer Intensely

omnizoid · Nov 20, 2024, 4:48 PM
25 points
5 comments · 33 min read · LW link

aspirational leadership

dhruvmethi · Nov 20, 2024, 4:07 PM
2 points
0 comments · 7 min read · LW link

Zvi’s Thoughts on His 2nd Round of SFF

Zvi · Nov 20, 2024, 1:40 PM
91 points
2 comments · 10 min read · LW link
(thezvi.wordpress.com)

A Little Depth Goes a Long Way: the Expressive Power of Log-Depth Transformers

Bogdan Ionut Cirstea · Nov 20, 2024, 11:48 AM
16 points
0 comments · 1 min read · LW link
(openreview.net)

[Question] What changes should happen in the HHS?

ChristianKl · Nov 20, 2024, 11:04 AM
0 points
19 comments · 1 min read · LW link

[Question] What are the good rationality films?

Ben Pace · Nov 20, 2024, 6:04 AM
83 points
54 comments · 1 min read · LW link

Valence Need Not Be Bounded; Utility Need Not Synthesize

Lorec · Nov 20, 2024, 1:37 AM
8 points
0 comments · 6 min read · LW link

Value/Utility: A History

Lorec · Nov 19, 2024, 11:01 PM
9 points
0 comments · 10 min read · LW link

Why Don’t We Just… Shoggoth+Face+Paraphraser?

Nov 19, 2024, 8:53 PM
145 points
58 comments · 14 min read · LW link

Every niche event should also be a meetup

DMMF · Nov 19, 2024, 8:47 PM
18 points
0 comments · 3 min read · LW link
(danfrank.ca)

U.S.-China Economic and Security Review Commission pushes Manhattan Project-style AI initiative

worse · Nov 19, 2024, 6:42 PM
56 points
7 comments · 1 min read · LW link

Intrinsic Power-Seeking: AI Might Seek Power for Power’s Sake

TurnTrout · Nov 19, 2024, 6:36 PM
40 points
5 comments · 1 min read · LW link
(turntrout.com)

Evolution’s selection target depends on your weighting

tailcalled · Nov 19, 2024, 6:24 PM
23 points
22 comments · 1 min read · LW link

AISN #44: The Trump Circle on AI Safety Plus, Chinese researchers used Llama to create a military tool for the PLA, a Google AI system discovered a zero-day cybersecurity vulnerability, and Complex Systems

Nov 19, 2024, 4:36 PM
9 points
0 comments · 5 min read · LW link
(newsletter.safe.ai)

Jakarta ACX December 2024 Meetup

Aud · Nov 19, 2024, 3:01 PM
1 point
0 comments · 1 min read · LW link

Visualizing small Attention-only Transformers

WCargo · Nov 19, 2024, 9:37 AM
4 points
0 comments · 8 min read · LW link

Americans are fat and sick—and it’s their fault…right?

Declan Molony · Nov 19, 2024, 6:41 AM
11 points
6 comments · 7 min read · LW link

Announcing the CLR Foundations Course and CLR S-Risk Seminars

JamesFaville · Nov 19, 2024, 1:18 AM
18 points
0 comments · LW link

No Electricity in Manchuria

winstonBosan · Nov 19, 2024, 1:11 AM
25 points
0 comments · 5 min read · LW link

Looking back on the Future of Humanity Institute—Asterisk

jakeeaton · Nov 19, 2024, 12:44 AM
48 points
0 comments · 1 min read · LW link

Don’t Dismiss on Epistemics

ggex · Nov 19, 2024, 12:44 AM
8 points
3 comments · 2 min read · LW link

Training AI agents to solve hard problems could lead to Scheming

Nov 19, 2024, 12:10 AM
61 points
12 comments · 28 min read · LW link

Proactive ‘If-Then’ Safety Cases

Nathan Helm-Burger · Nov 18, 2024, 9:16 PM
10 points
0 comments · 4 min read · LW link

[Question] Will Orion/Gemini 2/Llama-4 outperform o1

LuigiPagani · Nov 18, 2024, 9:15 PM
2 points
3 comments · 1 min read · LW link

How to use bright light to improve your life.

Nat Martin · Nov 18, 2024, 7:32 PM
40 points
10 comments · 10 min read · LW link

Social events with plausible deniability

Chipmonk · Nov 18, 2024, 6:25 PM
25 points
24 comments · 1 min read · LW link
(chrislakin.blog)

How likely is brain preservation to work?

Andy_McKenzie · Nov 18, 2024, 4:58 PM
26 points
3 comments · 6 min read · LW link

Why imperfect adversarial robustness doesn’t doom AI control

Nov 18, 2024, 4:05 PM
62 points
25 comments · 2 min read · LW link

Ethical Implications of the Quantum Multiverse

Jonah Wilberg · Nov 18, 2024, 4:00 PM
7 points
22 comments · 6 min read · LW link

Reducing x-risk might be actively harmful

MountainPath · Nov 18, 2024, 2:25 PM
5 points
5 comments · 1 min read · LW link