Darkness Meditation—for NZ Winter Solstice 2025

joshuamerriam · 16 Jun 2025 23:58 UTC
2 points
0 comments · 4 min read · LW link

[Question] Are superhuman savants real?

Bunthut · 16 Jun 2025 22:02 UTC
15 points
4 comments · 1 min read · LW link

Ok, AI Can Write Pretty Good Fiction Now

JustisMills · 16 Jun 2025 21:13 UTC
59 points
34 comments · 6 min read · LW link
(justismills.substack.com)

Subjective experience is most likely physical

martinkunev · 16 Jun 2025 20:54 UTC
5 points
3 comments · 4 min read · LW link

VLMs can Aggregate Scattered Training Patches

LINGJIE CHEN · 16 Jun 2025 18:25 UTC
2 points
0 comments · 4 min read · LW link

Setpoint = The experience we attend to

jimmy · 16 Jun 2025 17:34 UTC
22 points
0 comments · 7 min read · LW link

Thought Crime: Backdoors & Emergent Misalignment in Reasoning Models

16 Jun 2025 16:43 UTC
69 points
2 comments · 8 min read · LW link

How LLM Beliefs Change During Chain-of-Thought Reasoning

16 Jun 2025 16:18 UTC
32 points
3 comments · 5 min read · LW link

Convergent Linear Representations of Emergent Misalignment

16 Jun 2025 15:47 UTC
76 points
1 comment · 8 min read · LW link

Model Organisms for Emergent Misalignment

16 Jun 2025 15:46 UTC
118 points
19 comments · 5 min read · LW link

Coaching AI: A Relational Approach to AI Safety

Priyanka Bharadwaj · 16 Jun 2025 15:33 UTC
11 points
0 comments · 5 min read · LW link

Memories of the Neutral Zone

Jordan Rubin · 16 Jun 2025 15:33 UTC
7 points
0 comments · 3 min read · LW link
(jordanmrubin.substack.com)

Do LLMs Comply Differently During Tests? Is This a Hidden Variable in Safety Evaluation? And Can We Steer That?

Sahar Abdelnabi · 16 Jun 2025 13:52 UTC
17 points
0 comments · 6 min read · LW link

RTFB: The RAISE Act

Zvi · 16 Jun 2025 12:50 UTC
97 points
8 comments · 8 min read · LW link
(thezvi.wordpress.com)

[Question] Galaxy-Brain Hobo Antibiotics?

Lorec · 16 Jun 2025 12:43 UTC
3 points
9 comments · 4 min read · LW link

The EU commission seeks expert advisers on AI

PabloAMC · 16 Jun 2025 12:28 UTC
7 points
0 comments · 1 min read · LW link

Double Crux: Master the art of productive disagreement

marta_k · 16 Jun 2025 11:15 UTC
2 points
0 comments · 1 min read · LW link

From Paperclips to Bombs: The Evolution of AI Risk Discourse on LessWrong

David Harket · 16 Jun 2025 5:16 UTC
3 points
0 comments · 24 min read · LW link

Donutting is bad

Jarrah · 16 Jun 2025 4:12 UTC
20 points
4 comments · 1 min read · LW link

Futarchy using a sealed-bid auction to avoid liquidity problems

Christopher King · 16 Jun 2025 1:34 UTC
21 points
6 comments · 8 min read · LW link

Memory Decoding Journal Club: Neocortical synaptic engrams for remote contextual memories

Devin Ward · 15 Jun 2025 23:22 UTC
1 point
0 comments · 1 min read · LW link

Every Major LLM Endorses Newcomb One-Boxing

jackmastermind · 15 Jun 2025 20:44 UTC
19 points
13 comments · 1 min read · LW link
(jacktlab.substack.com)

FDT Does Not Endorse Itself in Asymmetric Games

jackmastermind · 15 Jun 2025 20:44 UTC
23 points
3 comments · 5 min read · LW link

Can We Change the Goals of a Toy RL Agent?

15 Jun 2025 20:34 UTC
20 points
0 comments · 9 min read · LW link

Some reprogenetics-related projects you could help with

TsviBT · 15 Jun 2025 20:25 UTC
80 points
1 comment · 4 min read · LW link

Risk Tokens: Economic Security in AI Safety

mhdempsey · 15 Jun 2025 19:25 UTC
1 point
0 comments · 6 min read · LW link
(www.michaeldempsey.me)

Aligned monetization of modern dating

kwang · 15 Jun 2025 16:01 UTC
0 points
0 comments · 3 min read · LW link
(kevw.substack.com)

Intelligence Is Not Magic, But Your Threshold For “Magic” Is Pretty Low

Expertium · 15 Jun 2025 15:23 UTC
215 points
27 comments · 1 min read · LW link

Estrogen: A trip report

cube_flipper · 15 Jun 2025 13:15 UTC
167 points
42 comments · 27 min read · LW link
(smoothbrains.net)

[Question] Do multimodal LLMs (like 4o) use OCR under the hood to read dense text in images?

2PuNCheeZ · 15 Jun 2025 11:20 UTC
4 points
1 comment · 1 min read · LW link

Book review: Air-borne by Carl Zimmer

eukaryote · 15 Jun 2025 5:49 UTC
34 points
0 comments · 11 min read · LW link
(eukaryotewritesblog.com)

My favorite Soviet songs

Nina Panickssery · 15 Jun 2025 2:48 UTC
22 points
1 comment · 5 min read · LW link
(ninapanickssery.substack.com)

Side quests in curriculum learning and regularization

Sandy Fraser · 15 Jun 2025 2:03 UTC
5 points
0 comments · 10 min read · LW link

AXRP Episode 43 - David Lindner on Myopic Optimization with Non-myopic Approval

DanielFilan · 15 Jun 2025 1:20 UTC
12 points
0 comments · 56 min read · LW link

Jailbreaking Claude 4 and Other Frontier Language Models

James Sullivan · 15 Jun 2025 0:31 UTC
1 point
0 comments · 3 min read · LW link
(open.substack.com)

Endometriosis is an incredibly interesting disease

Abhishaike Mahajan · 14 Jun 2025 22:14 UTC
166 points
5 comments · 16 min read · LW link
(www.owlposting.com)

Field Notes from Shipping Real Code with Claude

creatorrr · 14 Jun 2025 16:36 UTC
22 points
0 comments · 12 min read · LW link
(diwank.space)

Training Superior Sparse Autoencoders for Instruct Models

Haoran Ye · 14 Jun 2025 16:35 UTC
4 points
0 comments · 7 min read · LW link

Foresight Institute AI safety RFPs in automation, security, multi-agent, neuro

Allison Duettmann · 14 Jun 2025 16:29 UTC
6 points
0 comments · 2 min read · LW link

A Very Simple Case For Giving To Shrimp

Bentham's Bulldog · 14 Jun 2025 15:31 UTC
−6 points
1 comment · 3 min read · LW link

Why we’re still doing normal school

juliawise · 14 Jun 2025 12:40 UTC
85 points
0 comments · 3 min read · LW link

What Caused the Fertility Collapse?

Zero Contradictions · 14 Jun 2025 7:15 UTC
−3 points
2 comments · 4 min read · LW link
(expandingrationality.substack.com)

Relocation triggers

denkenberger · 14 Jun 2025 6:36 UTC
2 points
0 comments · 1 min read · LW link

Memory Decoding Journal Club: Neocortical synaptic engrams for remote contextual memories

Devin Ward · 14 Jun 2025 2:26 UTC
1 point
0 comments · 1 min read · LW link

[Question] How concerned are you about a fast takeoff due to a leap in hardware usage?

MichaelDickens · 14 Jun 2025 1:15 UTC
9 points
7 comments · 1 min read · LW link

[Question] How could I tell someone that consciousness is not the primary concern of AI Safety?

Lysandre Terrisse · 13 Jun 2025 22:44 UTC
11 points
2 comments · 3 min read · LW link

Debate experiments at The Curve, LessOnline and Manifest

Nathan Young · 13 Jun 2025 22:35 UTC
36 points
12 comments · 5 min read · LW link
(nathanpmyoung.substack.com)

Futarchy’s fundamental flaw

dynomight · 13 Jun 2025 22:08 UTC
178 points
49 comments · 9 min read · LW link
(dynomight.net)

The Pros and Cons of Being Among Your Tribe

Sable · 13 Jun 2025 21:41 UTC
32 points
0 comments · 7 min read · LW link
(affablyevil.substack.com)

Constraining Minds, Not Goals: A Structural Approach to AI Alignment

Johannes C. Mayer · 13 Jun 2025 21:06 UTC
25 points
0 comments · 9 min read · LW link