Lunar bombardment of earth is practical

anithite · 4 Jun 2026 23:25 UTC
27 points
0 comments · 4 min read · LW link

Endurance: Shackleton’s Incredible Voyage Review

nomagicpill · 4 Jun 2026 22:19 UTC
6 points
0 comments · 11 min read · LW link

Rent from oil: a goldmine

TerriLeaf · 4 Jun 2026 21:05 UTC
15 points
5 comments · 5 min read · LW link

Book of Cron Job

suchow · 4 Jun 2026 18:58 UTC
4 points
0 comments · 1 min read · LW link
(www.nature.com)

(Mis)generalization of Helpful-Only Fine-tuning

4 Jun 2026 18:40 UTC
55 points
7 comments · 11 min read · LW link

Defeating Introspection Adapters (and Why Threat Models Matter)

4 Jun 2026 18:39 UTC
10 points
0 comments · 5 min read · LW link

Building Better Activation Oracles

4 Jun 2026 18:34 UTC
62 points
1 comment · 7 min read · LW link

What Separates an Optimizer From Something We Merely Describe as Optimizing?

stewart leland jansen · 4 Jun 2026 18:30 UTC
3 points
2 comments · 1 min read · LW link

Rohin Shah on AGI Safety

anaguma · 4 Jun 2026 16:57 UTC
38 points
2 comments · 90 min read · LW link
(80000hours.org)

Training Deliberative Monitors for Black-Box Scheming Detection

4 Jun 2026 16:43 UTC
33 points
6 comments · 6 min read · LW link

When AI Builds Itself (Anthropic Institute Linkpost)

fluxxrider · 4 Jun 2026 16:37 UTC
26 points
16 comments · 1 min read · LW link

Lab Leaks, Black Holes, and Eggs: Epistemic Case Study Competition

4 Jun 2026 16:26 UTC
44 points
6 comments · 8 min read · LW link
(flf.org)

Logits as a new monitor for evaluation awareness

Santiago Aranguri · 4 Jun 2026 16:12 UTC
34 points
7 comments · 6 min read · LW link

AI #171: False Flag

Zvi · 4 Jun 2026 15:50 UTC
41 points
1 comment · 48 min read · LW link
(thezvi.wordpress.com)

What should go in a model spec?

James_T · 4 Jun 2026 14:57 UTC
8 points
0 comments · 12 min read · LW link
(www.forethought.org)

The Psychological Challenges of High-Impact Work—please participate in our survey!

spencerg · 4 Jun 2026 3:51 UTC
9 points
0 comments · 1 min read · LW link

Running An Air Purifier on Batteries

jefftk · 4 Jun 2026 2:40 UTC
15 points
0 comments · 4 min read · LW link
(www.jefftk.com)

Voluntary Paternalism

quality_qualia · 4 Jun 2026 1:34 UTC
5 points
2 comments · 1 min read · LW link
(sidkol1.github.io)

Sixteen schemes for AI safety

Austin Chen · 3 Jun 2026 21:50 UTC
32 points
4 comments · 8 min read · LW link
(manifund.substack.com)

Aligning Superintelligent Humans

Elliot Callender · 3 Jun 2026 20:39 UTC
17 points
2 comments · 3 min read · LW link

A Pipeline for Generating Synthetic Sabotage Trajectories to Red-Team Monitors

3 Jun 2026 20:33 UTC
9 points
0 comments · 12 min read · LW link

Beyond Hardcoded Evolutionary Psychology

Elliot Callender · 3 Jun 2026 20:26 UTC
27 points
10 comments · 6 min read · LW link

Trump Signs Executive Order For AI Testing Prior To Frontier Model Releases

Zvi · 3 Jun 2026 16:30 UTC
51 points
1 comment · 13 min read · LW link
(thezvi.wordpress.com)

Thoughts on ‘Learning Mechanics’

criticalpoints · 3 Jun 2026 15:36 UTC
12 points
0 comments · 10 min read · LW link

Towards Shutdownable Agents: Generalizing Stochastic Choice in RL Agents and LLMs

3 Jun 2026 14:24 UTC
20 points
3 comments · 19 min read · LW link
(arxiv.org)

Society Explained: a tool for efficiently exploring >100 theories of society

spencerg · 3 Jun 2026 14:08 UTC
48 points
5 comments · 1 min read · LW link

Don’t Edit Your Ideas Before Having Them

Hide · 3 Jun 2026 8:09 UTC
35 points
4 comments · 3 min read · LW link
(hidefromit.substack.com)

China won’t win the AI race but would it be much worse if it did?

Chastity Ruth · 3 Jun 2026 5:46 UTC
71 points
18 comments · 13 min read · LW link

Bear spray expiry dates: good news, and staggering peer-reviewed pseudoscience

Bruce Middleton · 3 Jun 2026 3:25 UTC
23 points
1 comment · 4 min read · LW link

Abstraction Boundaries and Bubbles of Legibility

Adam Chlipala · 2 Jun 2026 23:54 UTC
1 point
0 comments · 9 min read · LW link

Should AI Safety Researchers Experiment with Automated Research

Ephraiem Sarabamoun · 2 Jun 2026 23:18 UTC
1 point
0 comments · 1 min read · LW link

My favorite depiction of utopia

Caleb Biddulph · 2 Jun 2026 23:15 UTC
189 points
20 comments · 33 min read · LW link
(docs.google.com)

The Origin of Uncertainty

Gordon Seidoh Worley · 2 Jun 2026 18:20 UTC
13 points
2 comments · 2 min read · LW link
(www.uncertainupdates.com)

LURE: Alignment Evaluations to Reduce Evaluation Awareness

2 Jun 2026 18:20 UTC
26 points
5 comments · 5 min read · LW link

Why Even Experts Don’t Know What to Do About AI Risk

2 Jun 2026 17:31 UTC
78 points
22 comments · 2 min read · LW link

Where does the race to automate AI research end?

Simon Lermen · 2 Jun 2026 17:21 UTC
16 points
0 comments · 1 min read · LW link
(simonlermen.substack.com)

A Town Without Children

SeñorDingDong · 2 Jun 2026 16:35 UTC
35 points
7 comments · 4 min read · LW link

Announcing the ARC White-Box Estimation Challenge

2 Jun 2026 16:20 UTC
165 points
15 comments · 3 min read · LW link
(www.alignment.org)

Agent Foundations Reminds Me of Continental Philosophy

IanWS · 2 Jun 2026 14:34 UTC
106 points
15 comments · 5 min read · LW link
(write.ianwsperber.com)

Claude Opus 4.8: Capabilities and Reactions

Zvi · 2 Jun 2026 14:10 UTC
38 points
2 comments · 31 min read · LW link
(thezvi.wordpress.com)

Why we’re launching the Frontier Biodefense Fellowship

Tobias H · 2 Jun 2026 9:06 UTC
8 points
0 comments · 4 min read · LW link

Wood Screws and the Methods of Rationality

quanticle · 2 Jun 2026 7:49 UTC
12 points
7 comments · 4 min read · LW link

Taking the Training Wheels Off: Aligning LLMs without Personas

Matthew Khoriaty · 2 Jun 2026 6:29 UTC
23 points
16 comments · 3 min read · LW link

Compute Verification on Short Timelines

skunnavakkam · 2 Jun 2026 3:31 UTC
13 points
0 comments · 2 min read · LW link

Testing Best-Effort Solar

jefftk · 2 Jun 2026 3:00 UTC
16 points
0 comments · 2 min read · LW link
(www.jefftk.com)

May 2026 Links

nomagicpill · 2 Jun 2026 1:42 UTC
8 points
0 comments · 4 min read · LW link

% Bureaucracy

PossiblyElaine · 2 Jun 2026 0:36 UTC
11 points
1 comment · 5 min read · LW link
(possiblyelaine.substack.com)

Tech I’m skeptical of and why

harsimony · 1 Jun 2026 22:54 UTC
46 points
24 comments · 24 min read · LW link
(splittinginfinity.substack.com)

Critique of current AI safety bug bounty programs

clickyquack · 1 Jun 2026 21:26 UTC
7 points
0 comments · 7 min read · LW link

[Linkpost] Prefixing names with ‘secure_’ makes agents write more secure code

Jack · 1 Jun 2026 21:20 UTC
14 points
1 comment · 1 min read · LW link
(antimemeticai.com)