Forecast AI 2027

ChristianWilliams · 12 Jun 2025 21:12 UTC
20 points
0 comments · 1 min read · LW link
(www.metaculus.com)

CRMArena-Pro: Holistic Assessment of LLM Agents Across Diverse Business Scenarios and Interactions

Annapurna · 12 Jun 2025 19:53 UTC
8 points
0 comments · 1 min read · LW link
(arxiv.org)

When does training a model change its goals?

12 Jun 2025 18:43 UTC
78 points
3 comments · 15 min read · LW link

Restraining Factors in AI Alignment Systems

theophilus tabuke · 12 Jun 2025 18:17 UTC
1 point
1 comment · 1 min read · LW link

Analysis of Automated Prompt Engineering for Forecasting

ChristianWilliams · 12 Jun 2025 15:49 UTC
6 points
0 comments · 7 min read · LW link
(www.metaculus.com)

AI #120: While o3 Turned Pro

Zvi · 12 Jun 2025 15:30 UTC
51 points
3 comments · 53 min read · LW link
(thezvi.wordpress.com)

Towards mutually assured cooperation

mikko · 12 Jun 2025 15:15 UTC
5 points
0 comments · 1 min read · LW link

What If We Could Monitor Human Intent?

Saif Khan · 12 Jun 2025 8:51 UTC
−8 points
6 comments · 3 min read · LW link

The Way of a Skeptic

Martin Sustrik · 12 Jun 2025 5:40 UTC
38 points
2 comments · 6 min read · LW link
(www.250bpm.com)

[Question] When should you read a biography?

CstineSublime · 12 Jun 2025 5:19 UTC
3 points
6 comments · 3 min read · LW link

An Easily Overlooked Post on the Automation of Wisdom and Philosophy

Chris_Leong · 12 Jun 2025 2:54 UTC
19 points
0 comments · 1 min read · LW link
(blog.aiimpacts.org)

Maybe Social Anxiety Is Just You Failing At Mind Control

25Hour · 11 Jun 2025 23:49 UTC
81 points
21 comments · 16 min read · LW link

OpenAI now has an RL API which is broadly accessible

ryan_greenblatt · 11 Jun 2025 23:39 UTC
43 points
1 comment · 5 min read · LW link

So You Want to Work at a Frontier AI Lab

Joe Rogero · 11 Jun 2025 23:11 UTC
48 points
14 comments · 7 min read · LW link
(intelligence.org)

Commentary On The Turing Apocrypha

jdp · 11 Jun 2025 22:52 UTC
21 points
0 comments · 11 min read · LW link
(minihf.com)

[Question] My friend wants a good book recommendation to understand AI, AI safety, and the field, and probably the drama. He’s smart but non-technical and not keeping up with trends. Any recs?

JohnGreer · 11 Jun 2025 22:32 UTC
9 points
0 comments · 1 min read · LW link

The Dunning-Dunning-Kruger-Kruger Effect

ellifournier · 11 Jun 2025 21:02 UTC
−1 points
2 comments · 1 min read · LW link
(ellifournier.substack.com)

A Revision to Market Monetarism: Individual Hoarding as Rational, Competition for Dollars as Zero-Sum?

Lorec · 11 Jun 2025 20:13 UTC
4 points
0 comments · 4 min read · LW link

Investigating Accidental Misalignment: Causal Effects of Fine-Tuning Data on Model Vulnerability

11 Jun 2025 19:30 UTC
6 points
0 comments · 5 min read · LW link

The Dream of a Gentle Singularity

Zvi · 11 Jun 2025 19:30 UTC
57 points
7 comments · 12 min read · LW link
(thezvi.wordpress.com)

Beware General Claims about “Generalizable Reasoning Capabilities” (of Modern AI Systems)

LawrenceC · 11 Jun 2025 19:27 UTC
297 points
19 comments · 16 min read · LW link

Religion for Rationalists

Gordon Seidoh Worley · 11 Jun 2025 19:05 UTC
28 points
65 comments · 4 min read · LW link

Difficulties of Eschatological policy making [Linkpost]

Noosphere89 · 11 Jun 2025 14:12 UTC
11 points
3 comments · 3 min read · LW link
(jack-clark.net)

Hydra

Matrice Jacobine · 11 Jun 2025 14:07 UTC
24 points
0 comments · 1 min read · LW link
(philosophybear.substack.com)

SafeRLHub: An Interactive Resource for RL Safety and Interpretability

11 Jun 2025 5:47 UTC
11 points
0 comments · 7 min read · LW link

More on policy arguments and the AB problem

Sniffnoy · 11 Jun 2025 4:42 UTC
10 points
0 comments · 4 min read · LW link

Using AI Video Generation to Re-create Memories

Annapurna · 11 Jun 2025 4:06 UTC
−1 points
2 comments · 1 min read · LW link

Conflicted on AI Politics

jefftk · 11 Jun 2025 3:40 UTC
27 points
5 comments · 2 min read · LW link
(www.jefftk.com)

the void

nostalgebraist · 11 Jun 2025 3:19 UTC
397 points
107 comments · 1 min read · LW link
(nostalgebraist.tumblr.com)

$500 bounty for engagement on asymmetric AI risk

YonatanK · 10 Jun 2025 21:50 UTC
23 points
14 comments · 2 min read · LW link

AI-2027 Response: Inter-AI Tensions, Value Distillation, US Multipolarity, & More

Gatlen Culp · 10 Jun 2025 18:17 UTC
3 points
0 comments · 8 min read · LW link
(gatlen.blog)

Give Me a Reason(ing Model)

Zvi · 10 Jun 2025 15:10 UTC
55 points
6 comments · 5 min read · LW link
(thezvi.wordpress.com)

Mech interp is not pre-paradigmatic

Lee Sharkey · 10 Jun 2025 13:39 UTC
211 points
15 comments · 13 min read · LW link

The Intelligence Symbiosis Manifesto—Toward a Future of Living with AI

Hiroshi Yamakawa · 10 Jun 2025 10:23 UTC
7 points
2 comments · 2 min read · LW link

Research Without Permission

Priyanka Bharadwaj · 10 Jun 2025 7:33 UTC
28 points
1 comment · 3 min read · LW link

Some Human That I Used to Know (Filk)

Gordon Seidoh Worley · 10 Jun 2025 4:29 UTC
11 points
3 comments · 1 min read · LW link

Read the Pricing First

Max Niederman · 10 Jun 2025 2:22 UTC
174 points
14 comments · 1 min read · LW link

A quick list of reward hacking interventions

Alex Mallen · 10 Jun 2025 0:58 UTC
49 points
5 comments · 3 min read · LW link

Ghiblification for Privacy

jefftk · 10 Jun 2025 0:30 UTC
75 points
47 comments · 1 min read · LW link
(www.jefftk.com)

How to help friend who needs to get better at planning?

shuffled-cantaloupe · 9 Jun 2025 23:28 UTC
12 points
4 comments · 1 min read · LW link

Personal Agents: AIs as trusted advisors, caretakers, and user proxies

JWJohnston · 9 Jun 2025 21:26 UTC
2 points
0 comments · 2 min read · LW link

Causation, Correlation, and Confounding: A Graphical Explainer

Tim Hua · 9 Jun 2025 20:46 UTC
12 points
2 comments · 9 min read · LW link

When is it important that open-weight models aren’t released? My thoughts on the benefits and dangers of open-weight models in response to developments in CBRN capabilities.

ryan_greenblatt · 9 Jun 2025 19:19 UTC
63 points
11 comments · 9 min read · LW link

METR’s Observations of Reward Hacking in Recent Frontier Models

Daniel Kokotajlo · 9 Jun 2025 18:03 UTC
100 points
9 comments · 11 min read · LW link
(metr.org)

Expectation = intention = setpoint

jimmy · 9 Jun 2025 17:33 UTC
32 points
15 comments · 13 min read · LW link

Identifying “Deception Vectors” In Models

Stephen Martin · 9 Jun 2025 17:30 UTC
12 points
0 comments · 1 min read · LW link
(arxiv.org)

Policy Design: Ideas into Proposals

belos · 9 Jun 2025 17:26 UTC
2 points
0 comments · 7 min read · LW link
(bestofagreatlot.substack.com)

Reflections on anthropic principle

Crazy philosopher · 9 Jun 2025 16:51 UTC
−5 points
13 comments · 1 min read · LW link

Outer Alignment is the Necessary Compliment to AI 2027′s Best Case Scenario

Josh Hickman · 9 Jun 2025 15:43 UTC
4 points
2 comments · 2 min read · LW link

The Unparalleled Awesomeness of Effective Altruism Conferences

Bentham's Bulldog · 9 Jun 2025 15:32 UTC
5 points
0 comments · 6 min read · LW link