Singularity Survival Guide: A Bayesian Guide for Navigating the Pre-Singularity Period

mbrooks · 28 Mar 2025 23:21 UTC
6 points
4 comments · 2 min read · LW link

Softmax, Emmett Shear’s new AI startup focused on “Organic Alignment”

Chris Lakin · 28 Mar 2025 21:23 UTC
61 points
2 comments · 1 min read · LW link
(www.corememory.com)

The Pando Problem: Rethinking AI Individuality

Jan_Kulveit · 28 Mar 2025 21:03 UTC
133 points
14 comments · 13 min read · LW link

Selection Pressures on LM Personas

Raymond Douglas · 28 Mar 2025 20:33 UTC
40 points
0 comments · 3 min read · LW link

AXRP Episode 40 - Jason Gross on Compact Proofs and Interpretability

DanielFilan · 28 Mar 2025 18:40 UTC
26 points
0 comments · 89 min read · LW link

[Question] Share AI Safety Ideas: Both Crazy and Not. №2

ank · 28 Mar 2025 17:22 UTC
2 points
10 comments · 1 min read · LW link

AI x Bio Workshop

Allison Duettmann · 28 Mar 2025 17:21 UTC
16 points
0 comments · 1 min read · LW link

[Question] How many times faster can the AGI advance the science than humans do?

StanislavKrym · 28 Mar 2025 15:16 UTC
0 points
0 comments · 1 min read · LW link

Gemini 2.5 is the New SoTA

Zvi · 28 Mar 2025 14:20 UTC
52 points
1 comment · 12 min read · LW link
(thezvi.wordpress.com)

Will the Need to Retrain AI Models from Scratch Block a Software Intelligence Explosion?

Tom Davidson · 28 Mar 2025 14:12 UTC
10 points
0 comments · 3 min read · LW link

How We Might All Die in A Year

Greg C · 28 Mar 2025 13:22 UTC
6 points
13 comments · 21 min read · LW link
(x.com)

The vision of Bill Thurston

TsviBT · 28 Mar 2025 11:45 UTC
50 points
34 comments · 4 min read · LW link

What Uniparental Disomy Tells Us About Improper Imprinting in Humans

Morpheus · 28 Mar 2025 11:24 UTC
34 points
1 comment · 6 min read · LW link
(www.tassiloneubauer.com)

Explaining British Naval Dominance During the Age of Sail

Arjun Panickssery · 28 Mar 2025 5:47 UTC
206 points
16 comments · 4 min read · LW link
(arjunpanickssery.substack.com)

Will the AGIs be able to run the civilisation?

StanislavKrym · 28 Mar 2025 4:50 UTC
−7 points
2 comments · 3 min read · LW link

[Question] Is AGI actually that likely to take off given the world energy consumption?

StanislavKrym · 27 Mar 2025 23:13 UTC
2 points
2 comments · 1 min read · LW link

[Linkpost] The value of initiating a pursuit in temporal decision-making

Gunnar_Zarncke · 27 Mar 2025 21:47 UTC
13 points
0 comments · 2 min read · LW link

Alignment through atomic agents

micseydel · 27 Mar 2025 18:43 UTC
−1 points
0 comments · 1 min read · LW link

Machines of Stolen Grace

Riley Tavassoli · 27 Mar 2025 18:15 UTC
2 points
0 comments · 5 min read · LW link

An argument for asexuality

filthy_hedonist · 27 Mar 2025 18:08 UTC
−2 points
10 comments · 1 min read · LW link

On the plausibility of a “messy” rogue AI committing human-like evil

Jacob Griffith · 27 Mar 2025 18:06 UTC
8 points
0 comments · 7 min read · LW link

AI Moral Alignment: The Most Important Goal of Our Generation

Ronen Bar · 27 Mar 2025 18:04 UTC
3 points
0 comments · 8 min read · LW link
(forum.effectivealtruism.org)

Tracing the Thoughts of a Large Language Model

Adam Jermyn · 27 Mar 2025 17:20 UTC
307 points
24 comments · 10 min read · LW link
(www.anthropic.com)

Computational Superposition in a Toy Model of the U-AND Problem

Adam Newgas · 27 Mar 2025 16:56 UTC
18 points
2 comments · 11 min read · LW link

Mistral Large 2 (123B) seems to exhibit alignment faking

27 Mar 2025 15:39 UTC
81 points
4 comments · 13 min read · LW link

AIS Netherlands is looking for a Founding Executive Director (EOI form)

27 Mar 2025 15:30 UTC
15 points
0 comments · 4 min read · LW link

AI #109: Google Fails Marketing Forever

Zvi · 27 Mar 2025 14:50 UTC
42 points
12 comments · 35 min read · LW link
(thezvi.wordpress.com)

What life will be like for humans if aligned ASI is created

james oofou · 27 Mar 2025 10:06 UTC
5 points
6 comments · 2 min read · LW link

What is scaffolding?

27 Mar 2025 9:06 UTC
10 points
0 comments · 2 min read · LW link
(aisafety.info)

Workflow vs interface vs implementation

Sniffnoy · 27 Mar 2025 7:38 UTC
12 points
0 comments · 1 min read · LW link

Quick thoughts on the difficulty of widely conveying a non-stereotyped position

Sniffnoy · 27 Mar 2025 7:30 UTC
12 points
0 comments · 5 min read · LW link

Doing principle-of-charity better

Sniffnoy · 27 Mar 2025 5:19 UTC
22 points
1 comment · 3 min read · LW link

X as phenomenon vs as policy, Goodhart, and the AB problem

Sniffnoy · 27 Mar 2025 4:32 UTC
14 points
0 comments · 2 min read · LW link

Consequentialism is for making decisions

Sniffnoy · 27 Mar 2025 4:00 UTC
21 points
9 comments · 1 min read · LW link

Third-wave AI safety needs sociopolitical thinking

Richard_Ngo · 27 Mar 2025 0:55 UTC
100 points
23 comments · 26 min read · LW link

Knowledge, Reasoning, and Superintelligence

owencb · 26 Mar 2025 23:28 UTC
21 points
1 comment · 7 min read · LW link
(strangecities.substack.com)

Many Common Problems are NP-Hard, and Why that Matters for AI

Andrew Keenan Richardson · 26 Mar 2025 21:51 UTC
5 points
9 comments · 5 min read · LW link

Fun With GPT-4o Image Generation

Zvi · 26 Mar 2025 19:50 UTC
76 points
3 comments · 15 min read · LW link
(thezvi.wordpress.com)

I’m hiring a Research Assistant for a nonfiction book on AI!

garrison · 26 Mar 2025 19:46 UTC
17 points
0 comments · 1 min read · LW link
(garrisonlovely.substack.com)

Automated Researchers Can Subtly Sandbag

26 Mar 2025 19:13 UTC
44 points
0 comments · 4 min read · LW link
(alignment.anthropic.com)

Negative Results for SAEs On Downstream Tasks and Deprioritising SAE Research (GDM Mech Interp Team Progress Update #2)

26 Mar 2025 19:07 UTC
115 points
15 comments · 29 min read · LW link
(deepmindsafetyresearch.medium.com)

AI companies should be safety-testing the most capable versions of their models

sjadler · 26 Mar 2025 19:03 UTC
17 points
6 comments · 1 min read · LW link
(stevenadler.substack.com)

Conceptual Rounding Errors

Jan_Kulveit · 26 Mar 2025 19:00 UTC
153 points
15 comments · 3 min read · LW link
(boundedlyrational.substack.com)

Personal Agents: The First Step in Emergent AI Society

Andrey Seryakov · 26 Mar 2025 18:55 UTC
3 points
0 comments · 2 min read · LW link

Will AI R&D Automation Cause a Software Intelligence Explosion?

26 Mar 2025 18:12 UTC
19 points
3 comments · 2 min read · LW link
(www.forethought.org)

Why Does Unemployment Happen?

Nicholas D. · 26 Mar 2025 18:02 UTC
−2 points
2 comments · 1 min read · LW link
(nicholasdecker.substack.com)

Finding Emergent Misalignment

Jan Betley · 26 Mar 2025 17:33 UTC
33 points
0 comments · 3 min read · LW link

Center on Long-Term Risk: Summer Research Fellowship 2025 - Apply Now

Tristan Cook · 26 Mar 2025 17:29 UTC
33 points
0 comments · 1 min read · LW link
(longtermrisk.org)

Eukaryote Skips Town—Why I’m leaving DC

eukaryote · 26 Mar 2025 17:16 UTC
82 points
1 comment · 6 min read · LW link
(eukaryotewritesblog.com)

Apply to become a Futurekind AI Facilitator or Mentor (deadline: April 10)

superbeneficiary · 26 Mar 2025 15:47 UTC
4 points
0 comments · 1 min read · LW link
(forum.effectivealtruism.org)