Why Reasoning Isn’t Enough: How LLM Agents Struggle with Ethics and Cooperation

28 Jun 2025 20:43 UTC
6 points
0 comments
4 min read
LW link

Support for bedrock liberal principles seems to be in pretty bad shape these days

Max H
28 Jun 2025 20:37 UTC
32 points
52 comments
4 min read
LW link

A Depressed Shrink Tries Shrooms

AlphaAndOmega
28 Jun 2025 17:16 UTC
44 points
6 comments
1 min read
LW link
(open.substack.com)

Time Machine as Existential Risk

avturchin
28 Jun 2025 15:17 UTC
15 points
7 comments
45 min read
LW link

The next wave of model improvements will be due to data quality

ChristianKl
28 Jun 2025 14:34 UTC
17 points
4 comments
1 min read
LW link

AXRP Episode 44 - Peter Salib on AI Rights for Human Safety

DanielFilan
28 Jun 2025 1:40 UTC
12 points
0 comments
103 min read
LW link

Prediction Markets Have an Anthropic Bias to Deal With

ar-sht
28 Jun 2025 1:16 UTC
7 points
1 comment
11 min read
LW link

Emergent Misalignment & Realignment

27 Jun 2025 21:31 UTC
45 points
1 comment
17 min read
LW link

Project Moonbeam

WillPetillo
27 Jun 2025 21:08 UTC
14 points
2 comments
6 min read
LW link

Proposal for making credible commitments to AIs.

Cleo Nardo
27 Jun 2025 19:43 UTC
107 points
45 comments
2 min read
LW link

Memory Decoding Journal Club: Systems consolidation reorganizes hippocampal engram circuitry

Devin Ward
27 Jun 2025 17:43 UTC
1 point
0 comments
1 min read
LW link

[Paper] Stochastic Parameter Decomposition

27 Jun 2025 16:54 UTC
47 points
14 comments
1 min read
LW link
(arxiv.org)

Epoch: What is Epoch?

Zach Stein-Perlman
27 Jun 2025 16:45 UTC
34 points
1 comment
8 min read
LW link
(epoch.ai)

Unlearning Needs to be More Selective [Progress Report]

27 Jun 2025 16:38 UTC
24 points
6 comments
3 min read
LW link

Jankily controlling superintelligence

ryan_greenblatt
27 Jun 2025 14:05 UTC
70 points
4 comments
7 min read
LW link

Childhood and Education #11: The Art of Learning

Zvi
27 Jun 2025 13:50 UTC
45 points
8 comments
12 min read
LW link
(thezvi.wordpress.com)

No, Futarchy Doesn’t Have This EDT Flaw

Mikhail Samin
27 Jun 2025 9:27 UTC
35 points
28 comments
2 min read
LW link

Misconceptions on Affordable Housing

jefftk
27 Jun 2025 2:40 UTC
14 points
0 comments
1 min read
LW link
(www.jefftk.com)

A case for courage, when speaking of AI danger

So8res
27 Jun 2025 2:15 UTC
530 points
129 comments
6 min read
LW link

Help the AI 2027 team make an online AGI wargame

Jonas V
27 Jun 2025 1:02 UTC
81 points
10 comments
1 min read
LW link

The Bellman equation does not apply to bounded rationality

Christopher King
26 Jun 2025 23:01 UTC
17 points
2 comments
1 min read
LW link

Recent and forecasted rates of software and hardware progress

elifland
26 Jun 2025 22:37 UTC
46 points
0 comments
8 min read
LW link

Too Many Definitions of Consciousness

Commander Zander
26 Jun 2025 22:22 UTC
7 points
2 comments
1 min read
LW link

May-June 2025 Progress in Guaranteed Safe AI

Quinn
26 Jun 2025 21:30 UTC
8 points
0 comments
4 min read
LW link
(gsai.substack.com)

How many GPUs are markets expecting?

CaseyMilkweed
26 Jun 2025 21:17 UTC
6 points
0 comments
3 min read
LW link
(caseymilkweed.substack.com)

A Guide For LLM-Assisted Web Research

26 Jun 2025 18:39 UTC
46 points
3 comments
7 min read
LW link

RLAIF/RLHF for Public Value Alignment Enhancing Transparency in LLMs

Jada42
26 Jun 2025 18:32 UTC
1 point
0 comments
2 min read
LW link

If Moral Realism is true, then the Orthogonality Thesis is false.

Eye You
26 Jun 2025 18:31 UTC
6 points
13 comments
1 min read
LW link

The Cadca Tran­si­tion Map—Nav­i­gat­ing the Path to the ASI Singleton

cadca
26 Jun 2025 18:30 UTC
1 point
0 comments
10 min read
LW link

Getting To and From Monism

unication
26 Jun 2025 18:28 UTC
−5 points
21 comments
3 min read
LW link

AI #122: Paying The Market Price

Zvi
26 Jun 2025 18:10 UTC
36 points
2 comments
40 min read
LW link
(thezvi.wordpress.com)

Love Island USA Season 7 Episode 20: What Could The Producers Be Thinking

Zvi
26 Jun 2025 17:31 UTC
17 points
14 comments
14 min read
LW link

The need to relativise in debate

26 Jun 2025 16:23 UTC
31 points
2 comments
5 min read
LW link

Proverbial Corollaries

Jordan Rubin
26 Jun 2025 15:25 UTC
11 points
0 comments
2 min read
LW link
(jordanmrubin.substack.com)

How metaphysical beliefs shape critical aspects of AI development

Jáchym Fibír
26 Jun 2025 15:13 UTC
−9 points
8 comments
8 min read
LW link
(www.phiand.ai)

The Industrial Explosion

26 Jun 2025 14:41 UTC
128 points
70 comments
15 min read
LW link
(www.forethought.org)

The Iceberg Theory of Meaning

Richard Juggins
26 Jun 2025 12:13 UTC
10 points
9 comments
5 min read
LW link

If we get things right, AI could have huge benefits

26 Jun 2025 8:19 UTC
5 points
0 comments
1 min read
LW link

Advanced AI is a big deal even if we don’t lose control

26 Jun 2025 8:19 UTC
8 points
0 comments
2 min read
LW link

Defeat may be irreversibly catastrophic

26 Jun 2025 8:19 UTC
5 points
0 comments
2 min read
LW link

If Not Now, When?

Yair Halberstadt
26 Jun 2025 6:10 UTC
31 points
3 comments
1 min read
LW link

How Much Data From a Sequencing Run?

jefftk
26 Jun 2025 2:30 UTC
13 points
0 comments
2 min read
LW link
(www.jefftk.com)

The Practical Value of Flawed Models: A Response to titotal’s AI 2027 Critique

Michelle_Ma
25 Jun 2025 22:15 UTC
7 points
1 comment
6 min read
LW link

I Tested LLM Agents on Simple Safety Rules. They Failed in Surprising and Informative Ways.

Ram Potham
25 Jun 2025 21:39 UTC
9 points
12 comments
6 min read
LW link

Tech for Thinking

sarahconstantin
25 Jun 2025 21:30 UTC
60 points
9 comments
7 min read
LW link
(sarahconstantin.substack.com)

Memory Decoding Journal Club: Systems consolidation reorganizes hippocampal engram circuitry

Devin Ward
25 Jun 2025 21:21 UTC
3 points
0 comments
1 min read
LW link

Making Sense of Consciousness Part 1: Perceptual Awareness

sarahconstantin
25 Jun 2025 21:10 UTC
19 points
0 comments
9 min read
LW link
(sarahconstantin.substack.com)

Double Podcast Drop on AI Safety

jacobhaimes
25 Jun 2025 20:11 UTC
5 points
0 comments
1 min read
LW link

Is there a looming Cultural Omnicide?

Jared M.
25 Jun 2025 18:18 UTC
24 points
7 comments
5 min read
LW link

A Methodologist’s Apology

adamShimi
25 Jun 2025 16:52 UTC
13 points
0 comments
9 min read
LW link
(formethods.substack.com)