Memory Decoding Journal Club: Neocortical synaptic engrams for remote contextual memories

Devin Ward · 15 Jun 2025 23:22 UTC
1 point
0 comments · 1 min read · LW link

Every Major LLM Endorses Newcomb One-Boxing

jackmastermind · 15 Jun 2025 20:44 UTC
19 points
13 comments · 1 min read · LW link
(jacktlab.substack.com)

FDT Does Not Endorse Itself in Asymmetric Games

jackmastermind · 15 Jun 2025 20:44 UTC
23 points
3 comments · 5 min read · LW link

Can We Change the Goals of a Toy RL Agent?

15 Jun 2025 20:34 UTC
20 points
0 comments · 9 min read · LW link

Some reprogenetics-related projects you could help with

TsviBT · 15 Jun 2025 20:25 UTC
80 points
1 comment · 4 min read · LW link

Risk Tokens: Economic Security in AI Safety

mhdempsey · 15 Jun 2025 19:25 UTC
1 point
0 comments · 6 min read · LW link
(www.michaeldempsey.me)

Aligned monetization of modern dating

kwang · 15 Jun 2025 16:01 UTC
0 points
0 comments · 3 min read · LW link
(kevw.substack.com)

Intelligence Is Not Magic, But Your Threshold For “Magic” Is Pretty Low

Expertium · 15 Jun 2025 15:23 UTC
215 points
27 comments · 1 min read · LW link

Estrogen: A trip report

cube_flipper · 15 Jun 2025 13:15 UTC
167 points
42 comments · 27 min read · LW link
(smoothbrains.net)

[Question] Do multimodal LLMs (like 4o) use OCR under the hood to read dense text in images?

2PuNCheeZ · 15 Jun 2025 11:20 UTC
4 points
1 comment · 1 min read · LW link

Book review: Air-borne by Carl Zimmer

eukaryote · 15 Jun 2025 5:49 UTC
34 points
0 comments · 11 min read · LW link
(eukaryotewritesblog.com)

My favorite Soviet songs

Nina Panickssery · 15 Jun 2025 2:48 UTC
22 points
1 comment · 5 min read · LW link
(ninapanickssery.substack.com)

Side quests in curriculum learning and regularization

Sandy Fraser · 15 Jun 2025 2:03 UTC
5 points
0 comments · 10 min read · LW link

AXRP Episode 43 - David Lindner on Myopic Optimization with Non-myopic Approval

DanielFilan · 15 Jun 2025 1:20 UTC
12 points
0 comments · 56 min read · LW link

Jailbreaking Claude 4 and Other Frontier Language Models

James Sullivan · 15 Jun 2025 0:31 UTC
1 point
0 comments · 3 min read · LW link
(open.substack.com)

Endometriosis is an incredibly interesting disease

Abhishaike Mahajan · 14 Jun 2025 22:14 UTC
166 points
5 comments · 16 min read · LW link
(www.owlposting.com)

Field Notes from Shipping Real Code with Claude

creatorrr · 14 Jun 2025 16:36 UTC
22 points
0 comments · 12 min read · LW link
(diwank.space)

Training Superior Sparse Autoencoders for Instruct Models

Haoran Ye · 14 Jun 2025 16:35 UTC
4 points
0 comments · 7 min read · LW link

Foresight Institute AI safety RFPs in automation, security, multi-agent, neuro

Allison Duettmann · 14 Jun 2025 16:29 UTC
6 points
0 comments · 2 min read · LW link

A Very Simple Case For Giving To Shrimp

Bentham's Bulldog · 14 Jun 2025 15:31 UTC
−6 points
1 comment · 3 min read · LW link

Why we’re still doing normal school

juliawise · 14 Jun 2025 12:40 UTC
85 points
0 comments · 3 min read · LW link

What Caused the Fertility Collapse?

Zero Contradictions · 14 Jun 2025 7:15 UTC
−3 points
2 comments · 4 min read · LW link
(expandingrationality.substack.com)

Relocation triggers

denkenberger · 14 Jun 2025 6:36 UTC
2 points
0 comments · 1 min read · LW link

Memory Decoding Journal Club: Neocortical synaptic engrams for remote contextual memories

Devin Ward · 14 Jun 2025 2:26 UTC
1 point
0 comments · 1 min read · LW link

[Question] How concerned are you about a fast takeoff due to a leap in hardware usage?

MichaelDickens · 14 Jun 2025 1:15 UTC
9 points
7 comments · 1 min read · LW link

[Question] How could I tell someone that consciousness is not the primary concern of AI Safety?

Lysandre Terrisse · 13 Jun 2025 22:44 UTC
11 points
2 comments · 3 min read · LW link

Debate experiments at The Curve, LessOnline and Manifest

Nathan Young · 13 Jun 2025 22:35 UTC
36 points
12 comments · 5 min read · LW link
(nathanpmyoung.substack.com)

Futarchy’s fundamental flaw

dynomight · 13 Jun 2025 22:08 UTC
178 points
49 comments · 9 min read · LW link
(dynomight.net)

The Pros and Cons of Being Among Your Tribe

Sable · 13 Jun 2025 21:41 UTC
32 points
0 comments · 7 min read · LW link
(affablyevil.substack.com)

Constraining Minds, Not Goals: A Structural Approach to AI Alignment

Johannes C. Mayer · 13 Jun 2025 21:06 UTC
25 points
0 comments · 9 min read · LW link

The optimal level of optimization is suboptimal

ellifournier · 13 Jun 2025 18:06 UTC
4 points
4 comments · 1 min read · LW link
(ellifournier.substack.com)

On Pruning an Overgrown Garden

Vaatzes · 13 Jun 2025 17:54 UTC
3 points
3 comments · 6 min read · LW link

Learned helplessness about “teaching to the test”

Viliam · 13 Jun 2025 17:53 UTC
36 points
16 comments · 3 min read · LW link

Information-Dense Conference Badges

ozziegooen · 13 Jun 2025 17:52 UTC
28 points
4 comments · 4 min read · LW link
(ozziegooen.substack.com)

The Superwisdom Thesis: Why Superintelligence Does Not Pose An Existential Threat

Max Abecassis · 13 Jun 2025 17:35 UTC
−23 points
9 comments · 30 min read · LW link

The Boat Theft Theory of Consciousness

Lorec · 13 Jun 2025 16:38 UTC
41 points
36 comments · 2 min read · LW link

Monthly Roundup #31: June 2025

Zvi · 13 Jun 2025 16:20 UTC
37 points
3 comments · 50 min read · LW link
(thezvi.wordpress.com)

Unsupervised Elicitation of Language Models

13 Jun 2025 16:15 UTC
57 points
12 comments · 2 min read · LW link

Lucky Omega Problem

Tapatakt · 13 Jun 2025 14:54 UTC
10 points
4 comments · 4 min read · LW link

Distillation Robustifies Unlearning

13 Jun 2025 13:45 UTC
236 points
43 comments · 8 min read · LW link
(arxiv.org)

Self-Adapting Language Models (from MIT, arXiv preprint)

Person · 13 Jun 2025 13:08 UTC
5 points
1 comment · 1 min read · LW link

Do Not Tile the Lightcone with Your Confused Ontology

Jan_Kulveit · 13 Jun 2025 12:45 UTC
229 points
27 comments · 5 min read · LW link
(boundedlyrational.substack.com)

Corporations as Paperclip/Profit Maximizers

busssard · 13 Jun 2025 10:55 UTC
17 points
3 comments · 22 min read · LW link

4. Why existing approaches to cause prioritization are not robust to unawareness

Anthony DiGiovanni · 13 Jun 2025 8:55 UTC
26 points
0 comments · 17 min read · LW link

[Question] Under what conditions should humans stop pursuing technical AI safety careers?

S. Alex Bradt · 13 Jun 2025 5:56 UTC
6 points
0 comments · 1 min read · LW link

[linkpost] AI Alignment is About Culture, Not Control by JCorvinus

Milan W · 13 Jun 2025 0:07 UTC
1 point
8 comments · 1 min read · LW link
(jcorvinus.medium.com)

Forecast AI 2027

ChristianWilliams · 12 Jun 2025 21:12 UTC
20 points
0 comments · 1 min read · LW link
(www.metaculus.com)

CRMArena-Pro: Holistic Assessment of LLM Agents Across Diverse Business Scenarios and Interactions

Annapurna · 12 Jun 2025 19:53 UTC
8 points
0 comments · 1 min read · LW link
(arxiv.org)

When does training a model change its goals?

12 Jun 2025 18:43 UTC
78 points
3 comments · 15 min read · LW link

Restraining Factors in AI Alignment Systems

theophilus tabuke · 12 Jun 2025 18:17 UTC
1 point
1 comment · 1 min read · LW link