Maybe Social Anxiety Is Just You Failing At Mind Control

25Hour
11 Jun 2025 23:49 UTC
81 points
21 comments
16 min read
LW link

OpenAI now has an RL API which is broadly accessible

ryan_greenblatt
11 Jun 2025 23:39 UTC
43 points
1 comment
5 min read
LW link

So You Want to Work at a Frontier AI Lab

Joe Rogero
11 Jun 2025 23:11 UTC
48 points
14 comments
7 min read
LW link
(intelligence.org)

Commentary On The Turing Apocrypha

jdp
11 Jun 2025 22:52 UTC
21 points
0 comments
11 min read
LW link
(minihf.com)

[Question] My friend wants a good book recommendation to understand AI, AI safety, and the field, and probably the drama. He’s smart but non-technical and not keeping up with trends. Any recs?

JohnGreer
11 Jun 2025 22:32 UTC
9 points
0 comments
1 min read
LW link

The Dunning-Dunning-Kruger-Kruger Effect

ellifournier
11 Jun 2025 21:02 UTC
−1 points
2 comments
1 min read
LW link
(ellifournier.substack.com)

A Revision to Market Monetarism: Individual Hoarding as Rational, Competition for Dollars as Zero-Sum?

Lorec
11 Jun 2025 20:13 UTC
4 points
0 comments
4 min read
LW link

Investigating Accidental Misalignment: Causal Effects of Fine-Tuning Data on Model Vulnerability

11 Jun 2025 19:30 UTC
6 points
0 comments
5 min read
LW link

The Dream of a Gentle Singularity

Zvi
11 Jun 2025 19:30 UTC
57 points
7 comments
12 min read
LW link
(thezvi.wordpress.com)

Beware General Claims about “Generalizable Reasoning Capabilities” (of Modern AI Systems)

LawrenceC
11 Jun 2025 19:27 UTC
297 points
19 comments
16 min read
LW link

Religion for Rationalists

Gordon Seidoh Worley
11 Jun 2025 19:05 UTC
28 points
65 comments
4 min read
LW link

Difficulties of Eschatological policy making [Linkpost]

Noosphere89
11 Jun 2025 14:12 UTC
11 points
3 comments
3 min read
LW link
(jack-clark.net)

Hydra

Matrice Jacobine
11 Jun 2025 14:07 UTC
24 points
0 comments
1 min read
LW link
(philosophybear.substack.com)

SafeRLHub: An Interactive Resource for RL Safety and Interpretability

11 Jun 2025 5:47 UTC
11 points
0 comments
7 min read
LW link

More on policy arguments and the AB problem

Sniffnoy
11 Jun 2025 4:42 UTC
10 points
0 comments
4 min read
LW link

Using AI Video Generation to Re-create Memories

Annapurna
11 Jun 2025 4:06 UTC
−1 points
2 comments
1 min read
LW link

Conflicted on AI Politics

jefftk
11 Jun 2025 3:40 UTC
27 points
5 comments
2 min read
LW link
(www.jefftk.com)

the void

nostalgebraist
11 Jun 2025 3:19 UTC
397 points
107 comments
1 min read
LW link
(nostalgebraist.tumblr.com)

$500 bounty for engagement on asymmetric AI risk

YonatanK
10 Jun 2025 21:50 UTC
23 points
14 comments
2 min read
LW link

AI-2027 Response: Inter-AI Tensions, Value Distillation, US Multipolarity, & More

Gatlen Culp
10 Jun 2025 18:17 UTC
3 points
0 comments
8 min read
LW link
(gatlen.blog)

Give Me a Reason(ing Model)

Zvi
10 Jun 2025 15:10 UTC
55 points
6 comments
5 min read
LW link
(thezvi.wordpress.com)

Mech interp is not pre-paradigmatic

Lee Sharkey
10 Jun 2025 13:39 UTC
211 points
15 comments
13 min read
LW link

The Intelligence Symbiosis Manifesto—Toward a Future of Living with AI

Hiroshi Yamakawa
10 Jun 2025 10:23 UTC
7 points
2 comments
2 min read
LW link

Research Without Permission

Priyanka Bharadwaj
10 Jun 2025 7:33 UTC
28 points
1 comment
3 min read
LW link

Some Human That I Used to Know (Filk)

Gordon Seidoh Worley
10 Jun 2025 4:29 UTC
11 points
3 comments
1 min read
LW link

Read the Pricing First

Max Niederman
10 Jun 2025 2:22 UTC
174 points
14 comments
1 min read
LW link

A quick list of reward hacking interventions

Alex Mallen
10 Jun 2025 0:58 UTC
49 points
5 comments
3 min read
LW link

Ghiblification for Privacy

jefftk
10 Jun 2025 0:30 UTC
75 points
47 comments
1 min read
LW link
(www.jefftk.com)

How to help friend who needs to get better at planning?

shuffled-cantaloupe
9 Jun 2025 23:28 UTC
12 points
4 comments
1 min read
LW link

Personal Agents: AIs as trusted advisors, caretakers, and user proxies

JWJohnston
9 Jun 2025 21:26 UTC
2 points
0 comments
2 min read
LW link

Causation, Correlation, and Confounding: A Graphical Explainer

Tim Hua
9 Jun 2025 20:46 UTC
12 points
2 comments
9 min read
LW link

When is it important that open-weight models aren’t released? My thoughts on the benefits and dangers of open-weight models in response to developments in CBRN capabilities.

ryan_greenblatt
9 Jun 2025 19:19 UTC
63 points
11 comments
9 min read
LW link

METR’s Observations of Reward Hacking in Recent Frontier Models

Daniel Kokotajlo
9 Jun 2025 18:03 UTC
100 points
9 comments
11 min read
LW link
(metr.org)

Expectation = intention = setpoint

jimmy
9 Jun 2025 17:33 UTC
32 points
15 comments
13 min read
LW link

Identifying “Deception Vectors” In Models

Stephen Martin
9 Jun 2025 17:30 UTC
12 points
0 comments
1 min read
LW link
(arxiv.org)

Policy Design: Ideas into Proposals

belos
9 Jun 2025 17:26 UTC
2 points
0 comments
7 min read
LW link
(bestofagreatlot.substack.com)

Reflections on anthropic principle

Crazy philosopher
9 Jun 2025 16:51 UTC
−5 points
13 comments
1 min read
LW link

Outer Alignment is the Necessary Compliment to AI 2027’s Best Case Scenario

Josh Hickman
9 Jun 2025 15:43 UTC
4 points
2 comments
2 min read
LW link

The Unparalleled Awesomeness of Effective Altruism Conferences

Bentham's Bulldog
9 Jun 2025 15:32 UTC
5 points
0 comments
6 min read
LW link

Dwarkesh Patel on Continual Learning

Zvi
9 Jun 2025 14:50 UTC
35 points
1 comment
20 min read
LW link
(thezvi.wordpress.com)

The True Goal Fallacy

adamShimi
9 Jun 2025 14:42 UTC
50 points
1 comment
7 min read
LW link
(formethods.substack.com)

Non-technical strategies for confronting a human-level AI competitor

Jackson Emanuel
9 Jun 2025 14:07 UTC
1 point
0 comments
4 min read
LW link

AI companies’ eval reports mostly don’t support their claims

Zach Stein-Perlman
9 Jun 2025 13:00 UTC
207 points
13 comments
4 min read
LW link

Against asking if AIs are conscious

AlexMennen
9 Jun 2025 6:05 UTC
19 points
35 comments
5 min read
LW link

Beware the Delmore Effect

Lydia Nottingham
9 Jun 2025 1:08 UTC
11 points
1 comment
1 min read
LW link

Busking with Kids

jefftk
9 Jun 2025 0:30 UTC
76 points
0 comments
1 min read
LW link
(www.jefftk.com)

AI in Government: Resilience in an Era of AI Monoculture

prue
8 Jun 2025 21:00 UTC
2 points
0 comments
8 min read
LW link
(www.prue0.com)

Emergence Spirals—what Yudkowsky gets wrong

James Stephen Brown
8 Jun 2025 19:02 UTC
29 points
25 comments
9 min read
LW link

Administering immunotherapy in the morning seems to really, really matter. Why?

Abhishaike Mahajan
8 Jun 2025 16:37 UTC
35 points
0 comments
10 min read
LW link
(www.owlposting.com)

Emergent Misalignment on a Budget

8 Jun 2025 15:28 UTC
54 points
0 comments
9 min read
LW link