The Lazarus Project—TV Review

Commander Zander · 23 Jun 2025 23:53 UTC
6 points
0 comments · 1 min read · LW link

Frontier AI Labs: the Call Option to AGI

ykevinzhang · 23 Jun 2025 21:02 UTC
22 points
2 comments · 10 min read · LW link

The Rose Test: a fun way to feel with your guts (not just logically understand) why AI-safety matters right now (and get new adepts)

lovagrus · 23 Jun 2025 21:02 UTC
10 points
0 comments · 3 min read · LW link

Knowledge Extraction Plan (KEP): An alternative to reckless scaling

aswani · 23 Jun 2025 20:58 UTC
1 point
0 comments · 1 min read · LW link

History repeats itself: Frontier AI labs acting like Amazon in the early 2000s

i_am_nuts · 23 Jun 2025 20:56 UTC
8 points
0 comments · 3 min read · LW link
(iamnuts.substack.com)

Open Thread—Summer 2025

habryka · 23 Jun 2025 20:54 UTC
23 points
70 comments · 1 min read · LW link

Childhood and Education #10: Behaviors

Zvi · 23 Jun 2025 20:40 UTC
26 points
4 comments · 15 min read · LW link
(thezvi.wordpress.com)

Compressed Computation is (probably) not Computation in Superposition

23 Jun 2025 19:35 UTC
57 points
9 comments · 10 min read · LW link

Situational Awareness: A One-Year Retrospective

Nathan Delisle · 23 Jun 2025 19:15 UTC
82 points
4 comments · 12 min read · LW link

Recognizing Optimality

jimmy · 23 Jun 2025 17:55 UTC
20 points
0 comments · 7 min read · LW link

Comparing risk from internally-deployed AI to insider and outsider threats from humans

Buck · 23 Jun 2025 17:47 UTC
150 points
22 comments · 3 min read · LW link

Foom & Doom 2: Technical alignment is hard

Steven Byrnes · 23 Jun 2025 17:19 UTC
165 points
65 comments · 28 min read · LW link

Foom & Doom 1: “Brain in a box in a basement”

Steven Byrnes · 23 Jun 2025 17:18 UTC
282 points
120 comments · 29 min read · LW link

The Lies of Big Bug

Bentham's Bulldog · 23 Jun 2025 16:03 UTC
−4 points
2 comments · 4 min read · LW link

AI companies aren’t planning to secure critical model weights

Zach Stein-Perlman · 23 Jun 2025 16:00 UTC
15 points
0 comments · 1 min read · LW link

“It isn’t magic”

Ben (Berlin) · 23 Jun 2025 14:00 UTC
92 points
17 comments · 2 min read · LW link

Forecasting AI Forecasting

Alvin Ånestrand · 23 Jun 2025 13:39 UTC
15 points
4 comments · 6 min read · LW link

Recent progress on the science of evaluations

PabloAMC · 23 Jun 2025 9:41 UTC
14 points
1 comment · 8 min read · LW link

Racial Dating Preferences and Sexual Racism

koreindian · 23 Jun 2025 3:57 UTC
56 points
70 comments · 32 min read · LW link
(vishalblog.substack.com)

Mainstream Grantmaking Expertise (Post 7 of 7 on AI Governance)

Mass_Driver · 23 Jun 2025 1:39 UTC
56 points
7 comments · 37 min read · LW link

[Question] How does the LessWrong team generate the website illustrations?

Nina Panickssery · 23 Jun 2025 0:05 UTC
16 points
1 comment · 1 min read · LW link

The AI’s Toolbox: From Soggy Toast to Optimal Solutions

Thehumanproject.ai · 22 Jun 2025 20:54 UTC
1 point
0 comments · 8 min read · LW link

Black-box interpretability methodology blueprint: Probing runaway optimisation in LLMs

Roland Pihlakas · 22 Jun 2025 18:16 UTC
17 points
0 comments · 7 min read · LW link

The Croissant Principle: A Theory of AI Generalization

Jeffrey Liang · 22 Jun 2025 17:58 UTC
20 points
6 comments · 2 min read · LW link

Relational Design Can’t Be Left to Chance

Priyanka Bharadwaj · 22 Jun 2025 15:32 UTC
5 points
0 comments · 3 min read · LW link

Grounding to Avoid Airplane Delays

jefftk · 22 Jun 2025 1:50 UTC
30 points
0 comments · 2 min read · LW link
(www.jefftk.com)

Open questions on compatibilist free will and subjunctive dependence

jackmastermind · 22 Jun 2025 1:15 UTC
3 points
0 comments · 1 min read · LW link
(jacktlab.substack.com)

The Sixteen Kinds of Intimacy

Ruby · 21 Jun 2025 19:59 UTC
57 points
2 comments · 5 min read · LW link

Book review: Against Method

Valdes · 21 Jun 2025 18:59 UTC
9 points
0 comments · 6 min read · LW link

Contrived evaluations are useful evaluations

pradyuprasad · 21 Jun 2025 18:18 UTC
3 points
0 comments · 3 min read · LW link
(speculativedecoding.substack.com)

Consider chilling out in 2028

Valentine · 21 Jun 2025 17:07 UTC
189 points
143 comments · 13 min read · LW link

Upcoming workshop on Post-AGI Civilizational Equilibria

21 Jun 2025 15:57 UTC
25 points
0 comments · 1 min read · LW link

Genomic emancipation

TsviBT · 21 Jun 2025 8:15 UTC
83 points
14 comments · 26 min read · LW link

Evaluating the Risk of Job Displacement by Transformative AI Automation in Developing Countries: A Case Study on Brazil

Abubakar · 21 Jun 2025 0:48 UTC
4 points
0 comments · 15 min read · LW link

Backdoor awareness and misaligned personas in reasoning models

20 Jun 2025 23:38 UTC
35 points
8 comments · 6 min read · LW link

Agentic Misalignment: How LLMs Could be Insider Threats

20 Jun 2025 22:34 UTC
83 points
13 comments · 6 min read · LW link

Clarifying “wisdom”: Foundational topics for aligned AIs to prioritize before irreversible decisions

Anthony DiGiovanni · 20 Jun 2025 21:55 UTC
40 points
2 comments · 12 min read · LW link

Are Intelligent Agents More Ethical?

PeterMcCluskey · 20 Jun 2025 21:26 UTC
13 points
7 comments · 2 min read · LW link

An AI Arms Race Scenario

shanzson · 20 Jun 2025 19:25 UTC
2 points
2 comments · 1 min read · LW link

Making deals with early schemers

20 Jun 2025 18:21 UTC
127 points
41 comments · 15 min read · LW link

Ivan Gayton: A Right and a Duty

Elizabeth · 20 Jun 2025 18:20 UTC
21 points
0 comments · 1 min read · LW link
(acesounderglass.com)

What is the functional role of SAE errors?

20 Jun 2025 18:11 UTC
12 points
6 comments · 38 min read · LW link

Musings on AI Companies of 2025-2026 (Jun 2025)

Vladimir_Nesov · 20 Jun 2025 17:14 UTC
66 points
4 comments · 3 min read · LW link

Escaping the Jungles of Norwood: A Rationalist’s Guide to Male Pattern Baldness

AlphaAndOmega · 20 Jun 2025 16:40 UTC
12 points
10 comments · 1 min read · LW link
(open.substack.com)

Prefix cache untrusted monitors: a method to apply after you catch your AI

ryan_greenblatt · 20 Jun 2025 15:56 UTC
33 points
2 comments · 7 min read · LW link

Did the Army Poison a Bunch of Women in Minnesota?

rba · 20 Jun 2025 15:33 UTC
54 points
2 comments · 4 min read · LW link

AI #121 Part 2: The OpenAI Files

Zvi · 20 Jun 2025 14:50 UTC
37 points
9 comments · 41 min read · LW link
(thezvi.wordpress.com)

Smarter Models Lie Less

Expertium · 20 Jun 2025 13:31 UTC
6 points
0 comments · 2 min read · LW link

AI Safety Communicators Meet-up

Vishakha · 20 Jun 2025 12:34 UTC
3 points
0 comments · 1 min read · LW link

X explains Z% of the variance in Y

Leon Lang · 20 Jun 2025 12:17 UTC
160 points
36 comments · 9 min read · LW link