Forecast AI 2027

ChristianWilliams · 12 Jun 2025 21:12 UTC
20 points
0 comments · 1 min read · LW link
(www.metaculus.com)

CRMArena-Pro: Holistic Assessment of LLM Agents Across Diverse Business Scenarios and Interactions

Annapurna · 12 Jun 2025 19:53 UTC
8 points
0 comments · 1 min read · LW link
(arxiv.org)

When does training a model change its goals?

12 Jun 2025 18:43 UTC
78 points
3 comments · 15 min read · LW link

Restraining Factors in AI Alignment Systems

theophilus tabuke · 12 Jun 2025 18:17 UTC
1 point
1 comment · 1 min read · LW link

Analysis of Automated Prompt Engineering for Forecasting

ChristianWilliams · 12 Jun 2025 15:49 UTC
6 points
0 comments · 7 min read · LW link
(www.metaculus.com)

AI #120: While o3 Turned Pro

Zvi · 12 Jun 2025 15:30 UTC
51 points
3 comments · 53 min read · LW link
(thezvi.wordpress.com)

Towards mutually assured cooperation

mikko · 12 Jun 2025 15:15 UTC
5 points
0 comments · 1 min read · LW link

What If We Could Monitor Human Intent?

Saif Khan · 12 Jun 2025 8:51 UTC
−8 points
6 comments · 3 min read · LW link

The Way of a Skeptic

Martin Sustrik · 12 Jun 2025 5:40 UTC
38 points
2 comments · 6 min read · LW link
(www.250bpm.com)

[Question] When should you read a biography?

CstineSublime · 12 Jun 2025 5:19 UTC
3 points
6 comments · 3 min read · LW link

An Easily Overlooked Post on the Automation of Wisdom and Philosophy

Chris_Leong · 12 Jun 2025 2:54 UTC
19 points
0 comments · 1 min read · LW link
(blog.aiimpacts.org)

Maybe Social Anxiety Is Just You Failing At Mind Control

25Hour · 11 Jun 2025 23:49 UTC
81 points
21 comments · 16 min read · LW link

OpenAI now has an RL API which is broadly accessible

ryan_greenblatt · 11 Jun 2025 23:39 UTC
43 points
1 comment · 5 min read · LW link

So You Want to Work at a Frontier AI Lab

Joe Rogero · 11 Jun 2025 23:11 UTC
48 points
14 comments · 7 min read · LW link
(intelligence.org)

Commentary On The Turing Apocrypha

jdp · 11 Jun 2025 22:52 UTC
21 points
0 comments · 11 min read · LW link
(minihf.com)

[Question] My friend wants a good book recommendation to understand AI, AI safety, and the field, and probably the drama. He’s smart but non-technical and not keeping up with trends. Any recs?

JohnGreer · 11 Jun 2025 22:32 UTC
9 points
0 comments · 1 min read · LW link

The Dunning-Dunning-Kruger-Kruger Effect

ellifournier · 11 Jun 2025 21:02 UTC
−1 points
2 comments · 1 min read · LW link
(ellifournier.substack.com)

A Revision to Market Monetarism: Individual Hoarding as Rational, Competition for Dollars as Zero-Sum?

Lorec · 11 Jun 2025 20:13 UTC
4 points
0 comments · 4 min read · LW link

Investigating Accidental Misalignment: Causal Effects of Fine-Tuning Data on Model Vulnerability

11 Jun 2025 19:30 UTC
6 points
0 comments · 5 min read · LW link

The Dream of a Gentle Singularity

Zvi · 11 Jun 2025 19:30 UTC
57 points
7 comments · 12 min read · LW link
(thezvi.wordpress.com)

Beware General Claims about “Generalizable Reasoning Capabilities” (of Modern AI Systems)

LawrenceC · 11 Jun 2025 19:27 UTC
297 points
19 comments · 16 min read · LW link

Religion for Rationalists

Gordon Seidoh Worley · 11 Jun 2025 19:05 UTC
28 points
65 comments · 4 min read · LW link

Difficulties of Eschatological policy making [Linkpost]

Noosphere89 · 11 Jun 2025 14:12 UTC
11 points
3 comments · 3 min read · LW link
(jack-clark.net)

Hydra

Matrice Jacobine · 11 Jun 2025 14:07 UTC
24 points
0 comments · 1 min read · LW link
(philosophybear.substack.com)

SafeRLHub: An Interactive Resource for RL Safety and Interpretability

11 Jun 2025 5:47 UTC
11 points
0 comments · 7 min read · LW link

More on policy arguments and the AB problem

Sniffnoy · 11 Jun 2025 4:42 UTC
10 points
0 comments · 4 min read · LW link

Using AI Video Generation to Re-create Memories

Annapurna · 11 Jun 2025 4:06 UTC
−1 points
2 comments · 1 min read · LW link

Conflicted on AI Politics

jefftk · 11 Jun 2025 3:40 UTC
27 points
5 comments · 2 min read · LW link
(www.jefftk.com)

the void

nostalgebraist · 11 Jun 2025 3:19 UTC
397 points
107 comments · 1 min read · LW link
(nostalgebraist.tumblr.com)

$500 bounty for engagement on asymmetric AI risk

YonatanK · 10 Jun 2025 21:50 UTC
23 points
14 comments · 2 min read · LW link

AI-2027 Response: Inter-AI Tensions, Value Distillation, US Multipolarity, & More

Gatlen Culp · 10 Jun 2025 18:17 UTC
3 points
0 comments · 8 min read · LW link
(gatlen.blog)

Give Me a Reason(ing Model)

Zvi · 10 Jun 2025 15:10 UTC
55 points
6 comments · 5 min read · LW link
(thezvi.wordpress.com)

Mech interp is not pre-paradigmatic

Lee Sharkey · 10 Jun 2025 13:39 UTC
211 points
15 comments · 13 min read · LW link

The Intelligence Symbiosis Manifesto—Toward a Future of Living with AI

Hiroshi Yamakawa · 10 Jun 2025 10:23 UTC
7 points
2 comments · 2 min read · LW link

Research Without Permission

Priyanka Bharadwaj · 10 Jun 2025 7:33 UTC
28 points
1 comment · 3 min read · LW link

Some Human That I Used to Know (Filk)

Gordon Seidoh Worley · 10 Jun 2025 4:29 UTC
11 points
3 comments · 1 min read · LW link

Read the Pricing First

Max Niederman · 10 Jun 2025 2:22 UTC
174 points
14 comments · 1 min read · LW link

A quick list of reward hacking interventions

Alex Mallen · 10 Jun 2025 0:58 UTC
49 points
5 comments · 3 min read · LW link

Ghiblification for Privacy

jefftk · 10 Jun 2025 0:30 UTC
75 points
47 comments · 1 min read · LW link
(www.jefftk.com)

How to help friend who needs to get better at planning?

shuffled-cantaloupe · 9 Jun 2025 23:28 UTC
12 points
4 comments · 1 min read · LW link

Personal Agents: AIs as trusted advisors, caretakers, and user proxies

JWJohnston · 9 Jun 2025 21:26 UTC
2 points
0 comments · 2 min read · LW link

Causation, Correlation, and Confounding: A Graphical Explainer

Tim Hua · 9 Jun 2025 20:46 UTC
12 points
2 comments · 9 min read · LW link

When is it important that open-weight models aren’t released? My thoughts on the benefits and dangers of open-weight models in response to developments in CBRN capabilities.

ryan_greenblatt · 9 Jun 2025 19:19 UTC
63 points
11 comments · 9 min read · LW link

METR’s Observations of Reward Hacking in Recent Frontier Models

Daniel Kokotajlo · 9 Jun 2025 18:03 UTC
100 points
9 comments · 11 min read · LW link
(metr.org)

Expectation = intention = setpoint

jimmy · 9 Jun 2025 17:33 UTC
32 points
15 comments · 13 min read · LW link

Identifying “Deception Vectors” In Models

Stephen Martin · 9 Jun 2025 17:30 UTC
12 points
0 comments · 1 min read · LW link
(arxiv.org)

Policy Design: Ideas into Proposals

belos · 9 Jun 2025 17:26 UTC
2 points
0 comments · 7 min read · LW link
(bestofagreatlot.substack.com)

Reflections on anthropic principle

Crazy philosopher · 9 Jun 2025 16:51 UTC
−5 points
13 comments · 1 min read · LW link

Outer Alignment is the Necessary Compliment to AI 2027′s Best Case Scenario

Josh Hickman · 9 Jun 2025 15:43 UTC
4 points
2 comments · 2 min read · LW link

The Unparalleled Awesomeness of Effective Altruism Conferences

Bentham's Bulldog · 9 Jun 2025 15:32 UTC
5 points
0 comments · 6 min read · LW link