A regime-change power-vacuum conjecture about group belief

TsviBT 24 Jun 2025 23:16 UTC
41 points
16 comments · 3 min read · LW link

Apply to be a mentor in SPAR!

agucova 24 Jun 2025 23:00 UTC
10 points
0 comments · 1 min read · LW link

Machines of Faithful Obedience

Boaz Barak 24 Jun 2025 22:06 UTC
41 points
19 comments · 10 min read · LW link

Gradient Descent on Token Input Embeddings

KAP 24 Jun 2025 20:24 UTC
8 points
0 comments · 6 min read · LW link

A crisis simulation changed how I think about AI risk

sjadler 24 Jun 2025 20:04 UTC
5 points
0 comments · 2 min read · LW link
(open.substack.com)

Towards a theory of local altruism

DMMF 24 Jun 2025 19:39 UTC
11 points
1 comment · 5 min read · LW link
(notnottalmud.substack.com)

Why “training against scheming” is hard

Marius Hobbhahn 24 Jun 2025 19:08 UTC
66 points
2 comments · 12 min read · LW link

Analyzing A Critique Of The AI 2027 Timeline Forecasts

Zvi 24 Jun 2025 18:50 UTC
76 points
38 comments · 30 min read · LW link
(thezvi.wordpress.com)

What does 10x-ing effective compute get you?

ryan_greenblatt 24 Jun 2025 18:33 UTC
55 points
10 comments · 12 min read · LW link

My pitch for the AI Village

Daniel Kokotajlo 24 Jun 2025 15:00 UTC
178 points
35 comments · 5 min read · LW link

An Analogy for Interpretability

Roman Malov 24 Jun 2025 14:56 UTC
13 points
2 comments · 2 min read · LW link

The V&V method—A step towards safer AGI

Yoav Hollander 24 Jun 2025 13:42 UTC
20 points
1 comment · 1 min read · LW link
(blog.foretellix.com)

Try o3-pro in ChatGPT for $1 (is AI a bubble?)

Hauke Hillebrandt 24 Jun 2025 11:15 UTC
15 points
2 comments · 4 min read · LW link

Belief in continuity of personhood can be money-pumped

Filip Sondej 24 Jun 2025 9:39 UTC
3 points
6 comments · 1 min read · LW link

What can be learned from scary demos? A snitching case study

Fabien Roger 24 Jun 2025 8:40 UTC
22 points
1 comment · 7 min read · LW link

How to Host a Find Me Party

Commander Zander 24 Jun 2025 0:56 UTC
13 points
1 comment · 2 min read · LW link

Local Speech Recognition with Whisper

jefftk 24 Jun 2025 0:30 UTC
11 points
0 comments · 2 min read · LW link
(www.jefftk.com)

Twig—Fiction Review

Commander Zander 24 Jun 2025 0:04 UTC
17 points
0 comments · 1 min read · LW link

The Lazarus Project—TV Review

Commander Zander 23 Jun 2025 23:53 UTC
6 points
0 comments · 1 min read · LW link

Frontier AI Labs: the Call Option to AGI

ykevinzhang 23 Jun 2025 21:02 UTC
22 points
2 comments · 10 min read · LW link

The Rose Test: a fun way to feel with your guts (not just logically understand) why AI-safety matters right now (and get new adepts)

lovagrus 23 Jun 2025 21:02 UTC
10 points
0 comments · 3 min read · LW link

Knowledge Extraction Plan (KEP): An alternative to reckless scaling

aswani 23 Jun 2025 20:58 UTC
1 point
0 comments · 1 min read · LW link

History repeats itself: Frontier AI labs acting like Amazon in the early 2000s

i_am_nuts 23 Jun 2025 20:56 UTC
8 points
0 comments · 3 min read · LW link
(iamnuts.substack.com)

Open Thread—Summer 2025

habryka 23 Jun 2025 20:54 UTC
23 points
70 comments · 1 min read · LW link

Childhood and Education #10: Behaviors

Zvi 23 Jun 2025 20:40 UTC
26 points
4 comments · 15 min read · LW link
(thezvi.wordpress.com)

Compressed Computation is (probably) not Computation in Superposition

23 Jun 2025 19:35 UTC
57 points
9 comments · 10 min read · LW link

Situational Awareness: A One-Year Retrospective

Nathan Delisle 23 Jun 2025 19:15 UTC
82 points
4 comments · 12 min read · LW link

Recognizing Optimality

jimmy 23 Jun 2025 17:55 UTC
20 points
0 comments · 7 min read · LW link

Comparing risk from internally-deployed AI to insider and outsider threats from humans

Buck 23 Jun 2025 17:47 UTC
150 points
22 comments · 3 min read · LW link

Foom & Doom 2: Technical alignment is hard

Steven Byrnes 23 Jun 2025 17:19 UTC
165 points
65 comments · 28 min read · LW link

Foom & Doom 1: “Brain in a box in a basement”

Steven Byrnes 23 Jun 2025 17:18 UTC
282 points
120 comments · 29 min read · LW link

The Lies of Big Bug

Bentham's Bulldog 23 Jun 2025 16:03 UTC
−4 points
2 comments · 4 min read · LW link

AI companies aren’t planning to secure critical model weights

Zach Stein-Perlman 23 Jun 2025 16:00 UTC
15 points
0 comments · 1 min read · LW link

“It isn’t magic”

Ben (Berlin) 23 Jun 2025 14:00 UTC
92 points
17 comments · 2 min read · LW link

Forecasting AI Forecasting

Alvin Ånestrand 23 Jun 2025 13:39 UTC
15 points
4 comments · 6 min read · LW link

Recent progress on the science of evaluations

PabloAMC 23 Jun 2025 9:41 UTC
14 points
1 comment · 8 min read · LW link

Racial Dating Preferences and Sexual Racism

koreindian 23 Jun 2025 3:57 UTC
56 points
70 comments · 32 min read · LW link
(vishalblog.substack.com)

Mainstream Grantmaking Expertise (Post 7 of 7 on AI Governance)

Mass_Driver 23 Jun 2025 1:39 UTC
56 points
7 comments · 37 min read · LW link

[Question] How does the LessWrong team generate the website illustrations?

Nina Panickssery 23 Jun 2025 0:05 UTC
16 points
1 comment · 1 min read · LW link

The AI’s Toolbox: From Soggy Toast to Optimal Solutions

Thehumanproject.ai 22 Jun 2025 20:54 UTC
1 point
0 comments · 8 min read · LW link

Black-box interpretability methodology blueprint: Probing runaway optimisation in LLMs

Roland Pihlakas 22 Jun 2025 18:16 UTC
17 points
0 comments · 7 min read · LW link

The Croissant Principle: A Theory of AI Generalization

Jeffrey Liang 22 Jun 2025 17:58 UTC
20 points
6 comments · 2 min read · LW link

Relational Design Can’t Be Left to Chance

Priyanka Bharadwaj 22 Jun 2025 15:32 UTC
5 points
0 comments · 3 min read · LW link

Grounding to Avoid Airplane Delays

jefftk 22 Jun 2025 1:50 UTC
30 points
0 comments · 2 min read · LW link
(www.jefftk.com)

Open questions on compatibilist free will and subjunctive dependence

jackmastermind 22 Jun 2025 1:15 UTC
3 points
0 comments · 1 min read · LW link
(jacktlab.substack.com)

The Sixteen Kinds of Intimacy

Ruby 21 Jun 2025 19:59 UTC
57 points
2 comments · 5 min read · LW link

Book review: Against Method

Valdes 21 Jun 2025 18:59 UTC
9 points
0 comments · 6 min read · LW link

Contrived evaluations are useful evaluations

pradyuprasad 21 Jun 2025 18:18 UTC
3 points
0 comments · 3 min read · LW link
(speculativedecoding.substack.com)

Consider chilling out in 2028

Valentine 21 Jun 2025 17:07 UTC
189 points
143 comments · 13 min read · LW link

Upcoming workshop on Post-AGI Civilizational Equilibria

21 Jun 2025 15:57 UTC
25 points
0 comments · 1 min read · LW link