Singularity Survival Guide: A Bayesian Guide for Navigating the Pre-Singularity Period

mbrooks · Mar 28, 2025, 11:21 PM
6 points
4 comments · 2 min read · LW link

Softmax, Emmett Shear’s new AI startup focused on “Organic Alignment”

Chris Lakin · Mar 28, 2025, 9:23 PM
61 points
2 comments · 1 min read · LW link
(www.corememory.com)

The Pando Problem: Rethinking AI Individuality

Jan_Kulveit · Mar 28, 2025, 9:03 PM
133 points
14 comments · 13 min read · LW link

Selection Pressures on LM Personas

Raymond Douglas · Mar 28, 2025, 8:33 PM
30 points
0 comments · 3 min read · LW link

AXRP Episode 40 - Jason Gross on Compact Proofs and Interpretability

DanielFilan · Mar 28, 2025, 6:40 PM
26 points
0 comments · 89 min read · LW link

[Question] Share AI Safety Ideas: Both Crazy and Not. №2

ank · Mar 28, 2025, 5:22 PM
2 points
10 comments · 1 min read · LW link

AI x Bio Workshop

Allison Duettmann · Mar 28, 2025, 5:21 PM
16 points
0 comments · 1 min read · LW link

[Question] How many times faster can the AGI advance the science than humans do?

StanislavKrym · Mar 28, 2025, 3:16 PM
0 points
0 comments · 1 min read · LW link

Gemini 2.5 is the New SoTA

Zvi · Mar 28, 2025, 2:20 PM
52 points
1 comment · 12 min read · LW link
(thezvi.wordpress.com)

Will the Need to Retrain AI Models from Scratch Block a Software Intelligence Explosion?

Tom Davidson · Mar 28, 2025, 2:12 PM
10 points
0 comments · 3 min read · LW link

How We Might All Die in A Year

Greg C · Mar 28, 2025, 1:22 PM
6 points
13 comments · 21 min read · LW link
(x.com)

The vision of Bill Thurston

TsviBT · Mar 28, 2025, 11:45 AM
50 points
34 comments · 4 min read · LW link

What Uniparental Disomy Tells Us About Improper Imprinting in Humans

Morpheus · Mar 28, 2025, 11:24 AM
34 points
1 comment · 6 min read · LW link
(www.tassiloneubauer.com)

Explaining British Naval Dominance During the Age of Sail

Arjun Panickssery · Mar 28, 2025, 5:47 AM
206 points
16 comments · 4 min read · LW link
(arjunpanickssery.substack.com)

Will the AGIs be able to run the civilisation?

StanislavKrym · Mar 28, 2025, 4:50 AM
−7 points
2 comments · 3 min read · LW link

[Question] Is AGI actually that likely to take off given the world energy consumption?

StanislavKrym · Mar 27, 2025, 11:13 PM
2 points
2 comments · 1 min read · LW link

[Linkpost] The value of initiating a pursuit in temporal decision-making

Gunnar_Zarncke · Mar 27, 2025, 9:47 PM
13 points
0 comments · 2 min read · LW link

Alignment through atomic agents

micseydel · Mar 27, 2025, 6:43 PM
−1 points
0 comments · 1 min read · LW link

Machines of Stolen Grace

Riley Tavassoli · Mar 27, 2025, 6:15 PM
2 points
0 comments · 5 min read · LW link

An argument for asexuality

filthy_hedonist · Mar 27, 2025, 6:08 PM
−2 points
10 comments · 1 min read · LW link

On the plausibility of a “messy” rogue AI committing human-like evil

Jacob Griffith · Mar 27, 2025, 6:06 PM
8 points
0 comments · 7 min read · LW link

AI Moral Alignment: The Most Important Goal of Our Generation

Ronen Bar · Mar 27, 2025, 6:04 PM
3 points
0 comments · 8 min read · LW link
(forum.effectivealtruism.org)

Tracing the Thoughts of a Large Language Model

Adam Jermyn · Mar 27, 2025, 5:20 PM
305 points
24 comments · 10 min read · LW link
(www.anthropic.com)

Computational Superposition in a Toy Model of the U-AND Problem

Adam Newgas · Mar 27, 2025, 4:56 PM
18 points
2 comments · 11 min read · LW link

Mistral Large 2 (123B) seems to exhibit alignment faking

Mar 27, 2025, 3:39 PM
81 points
4 comments · 13 min read · LW link

AIS Netherlands is looking for a Founding Executive Director (EOI form)

Mar 27, 2025, 3:30 PM
15 points
0 comments · 4 min read · LW link

AI #109: Google Fails Marketing Forever

Zvi · Mar 27, 2025, 2:50 PM
42 points
12 comments · 35 min read · LW link
(thezvi.wordpress.com)

What life will be like for humans if aligned ASI is created

james oofou · Mar 27, 2025, 10:06 AM
3 points
6 comments · 2 min read · LW link

What is scaffolding?

Mar 27, 2025, 9:06 AM
10 points
0 comments · 2 min read · LW link
(aisafety.info)

Workflow vs interface vs implementation

Sniffnoy · Mar 27, 2025, 7:38 AM
12 points
0 comments · 1 min read · LW link

Quick thoughts on the difficulty of widely conveying a non-stereotyped position

Sniffnoy · Mar 27, 2025, 7:30 AM
12 points
0 comments · 5 min read · LW link

Doing principle-of-charity better

Sniffnoy · Mar 27, 2025, 5:19 AM
22 points
1 comment · 3 min read · LW link

X as phenomenon vs as policy, Goodhart, and the AB problem

Sniffnoy · Mar 27, 2025, 4:32 AM
13 points
0 comments · 2 min read · LW link

Consequentialism is for making decisions

Sniffnoy · Mar 27, 2025, 4:00 AM
21 points
9 comments · 1 min read · LW link

Third-wave AI safety needs sociopolitical thinking

Richard_Ngo · Mar 27, 2025, 12:55 AM
99 points
23 comments · 26 min read · LW link

Knowledge, Reasoning, and Superintelligence

owencb · Mar 26, 2025, 11:28 PM
21 points
1 comment · 7 min read · LW link
(strangecities.substack.com)

Many Common Problems are NP-Hard, and Why that Matters for AI

Andrew Keenan Richardson · Mar 26, 2025, 9:51 PM
5 points
9 comments · 5 min read · LW link

Fun With GPT-4o Image Generation

Zvi · Mar 26, 2025, 7:50 PM
76 points
3 comments · 15 min read · LW link
(thezvi.wordpress.com)

I’m hiring a Research Assistant for a nonfiction book on AI!

garrison · Mar 26, 2025, 7:46 PM
17 points
0 comments · 1 min read · LW link
(garrisonlovely.substack.com)

Automated Researchers Can Subtly Sandbag

Mar 26, 2025, 7:13 PM
44 points
0 comments · 4 min read · LW link
(alignment.anthropic.com)

Negative Results for SAEs On Downstream Tasks and Deprioritising SAE Research (GDM Mech Interp Team Progress Update #2)

Mar 26, 2025, 7:07 PM
113 points
15 comments · 29 min read · LW link
(deepmindsafetyresearch.medium.com)

AI companies should be safety-testing the most capable versions of their models

sjadler · Mar 26, 2025, 7:03 PM
17 points
6 comments · 1 min read · LW link
(stevenadler.substack.com)

Conceptual Rounding Errors

Jan_Kulveit · Mar 26, 2025, 7:00 PM
151 points
15 comments · 3 min read · LW link
(boundedlyrational.substack.com)

Personal Agents: The First Step in Emergent AI Society

Andrey Seryakov · Mar 26, 2025, 6:55 PM
3 points
0 comments · 2 min read · LW link

Will AI R&D Automation Cause a Software Intelligence Explosion?

Mar 26, 2025, 6:12 PM
19 points
3 comments · 2 min read · LW link
(www.forethought.org)

Why Does Unemployment Happen?

Nicholas D. · Mar 26, 2025, 6:02 PM
−2 points
2 comments · 1 min read · LW link
(nicholasdecker.substack.com)

Finding Emergent Misalignment

Jan Betley · Mar 26, 2025, 5:33 PM
26 points
0 comments · 3 min read · LW link

Center on Long-Term Risk: Summer Research Fellowship 2025 - Apply Now

Tristan Cook · Mar 26, 2025, 5:29 PM
33 points
0 comments · 1 min read · LW link
(longtermrisk.org)

Eukaryote Skips Town—Why I’m leaving DC

eukaryote · Mar 26, 2025, 5:16 PM
80 points
1 comment · 6 min read · LW link
(eukaryotewritesblog.com)

Apply to become a Futurekind AI Facilitator or Mentor (deadline: April 10)

superbeneficiary · Mar 26, 2025, 3:47 PM
4 points
0 comments · 1 min read · LW link
(forum.effectivealtruism.org)