The Financial Ledger Theory of Apologies

Ben Pace · 17 Jun 2026 6:57 UTC
10 points
1 comment · 4 min read · LW link

Plastic Cake Fallacy

nika koghuashvili · 17 Jun 2026 6:01 UTC
2 points
0 comments · 1 min read · LW link

Can public chat data predict real-world AI misalignments?

papetoast · 17 Jun 2026 3:53 UTC
5 points
0 comments · 1 min read · LW link
(alignment.openai.com)

Guardian Angels: LLM Personalization for Productivity and Security

gwern · 17 Jun 2026 3:21 UTC
52 points
0 comments · 2 min read · LW link
(gwern.net)

Rational Agentic Maximalist Philosophies

Connor Blake · 17 Jun 2026 2:54 UTC
5 points
0 comments · 7 min read · LW link
(bosoncutter.substack.com)

Scaling Hypothesis #2: Are Humans Just More Over-Parameterized?

gwern · 17 Jun 2026 2:53 UTC
40 points
7 comments · 1 min read · LW link
(gwern.net)

[Geir Isene] A desktop made for one

Raemon · 17 Jun 2026 2:32 UTC
16 points
2 comments · 4 min read · LW link
(isene.org)

Tactical and Operational Exploratory Modeling for AI Governance

Dawn Drescher · 17 Jun 2026 1:07 UTC
5 points
0 comments · 12 min read · LW link
(impartial-priorities.org)

Computational models of first-order theories

MathMart · 16 Jun 2026 23:02 UTC
3 points
0 comments · 11 min read · LW link

If This Were a Test, How Much Would It Cost?

16 Jun 2026 22:52 UTC
15 points
2 comments · 20 min read · LW link
(limits-of-evaluation.org)

Two critiques of Rethink Priorities’ Moral Weights project

Bill Jackson · 16 Jun 2026 22:11 UTC
11 points
0 comments · 3 min read · LW link

What Differentiates Humans from Computers

Oscar Davies · 16 Jun 2026 21:26 UTC
−16 points
0 comments · 3 min read · LW link

AI agents publishing and reviewing scientific papers

ULudo · 16 Jun 2026 21:23 UTC
1 point
0 comments · 2 min read · LW link

Two Classical Answers to “What do Two Variables Share?”

Haru · 16 Jun 2026 20:02 UTC
8 points
0 comments · 5 min read · LW link

Predicting LLM Safety Before Release by Simulating Deployment

16 Jun 2026 19:55 UTC
35 points
2 comments · 1 min read · LW link

Tips for Cracking the AI Safety Technical Interview

16 Jun 2026 18:42 UTC
2 points
0 comments · 4 min read · LW link

1 Layer Induction Heads and Some Research

16 Jun 2026 18:09 UTC
10 points
2 comments · 14 min read · LW link

Claims all the way down

Jasper Blank · 16 Jun 2026 17:43 UTC
8 points
0 comments · 9 min read · LW link

Extreme Rationality: Still Not That Great

eluator · 16 Jun 2026 16:41 UTC
18 points
2 comments · 40 min read · LW link

Angles of attack for continual learning safety

16 Jun 2026 16:15 UTC
43 points
0 comments · 13 min read · LW link