The Grapes of Hardness

adamShimi

11 Mar 2025 21:01 UTC

8 points

0 comments 5 min read LW link

(formethods.substack.com)

Don’t over-update on FrontierMath results

David Matolcsi

11 Mar 2025 20:44 UTC

47 points

7 comments 9 min read LW link

Response to Scott Alexander on Imprisonment

Zvi

11 Mar 2025 20:40 UTC

40 points

4 comments 9 min read LW link

(thezvi.wordpress.com)

Paths and waystations in AI safety

Joe Carlsmith

11 Mar 2025 18:52 UTC

42 points

1 comment 11 min read LW link

(joecarlsmith.substack.com)

Meridian Cambridge Visiting Researcher Programme: Turn AI safety ideas into funded projects in one week!

Meridian Cambridge

11 Mar 2025 17:46 UTC

13 points

0 comments 2 min read LW link

Elon Musk May Be Transitioning to Bipolar Type I

Cyborg25

11 Mar 2025 17:45 UTC

87 points

22 comments 4 min read LW link

Scaling AI Regulation: Realistically, what Can (and Can’t) Be Regulated?

Katalina Hernandez

11 Mar 2025 16:51 UTC

3 points

1 comment 3 min read LW link

How Language Models Understand Nullability

Anish Tondwalkar and Alex Sanchez-Stern

11 Mar 2025 15:57 UTC

5 points

0 comments 2 min read LW link

(dmodel.ai)

Forethought: a new AI macrostrategy group

Max Dalton, Tom Davidson, wdmacaskill and AmritSidhu-Brar

11 Mar 2025 15:39 UTC

32 points

0 comments 3 min read LW link

Preparing for the Intelligence Explosion

fin and wdmacaskill

11 Mar 2025 15:38 UTC

79 points

17 comments 1 min read LW link

(www.forethought.org)

stop solving problems that have already been solved

dhruvmethi

11 Mar 2025 15:30 UTC

10 points

3 comments 8 min read LW link

AI Control May Increase Existential Risk

Jan_Kulveit

11 Mar 2025 14:30 UTC

101 points

13 comments 1 min read LW link

When is it Better to Train on the Alignment Proxy?

dil-leik-og

11 Mar 2025 13:35 UTC

14 points

0 comments 9 min read LW link

A different take on the Musk v OpenAI preliminary injunction order

TFD

11 Mar 2025 12:46 UTC

8 points

0 comments 20 min read LW link

(www.thefloatingdroid.com)

Do reasoning models use their scratchpad like we do? Evidence from distilling paraphrases

Fabien Roger

11 Mar 2025 11:52 UTC

128 points

23 comments 11 min read LW link

(alignment.anthropic.com)

A Hogwarts Guide to Citizenship

WillPetillo

11 Mar 2025 5:50 UTC

7 points

1 comment 3 min read LW link

Trojan Sky

Richard_Ngo

11 Mar 2025 3:14 UTC

260 points

40 comments 12 min read LW link

(www.narrativeark.xyz)

OpenAI: Detecting misbehavior in frontier reasoning models

Daniel Kokotajlo

11 Mar 2025 2:17 UTC

183 points

26 comments 4 min read LW link

(openai.com)

HPMOR Anniversary Parties: Coordination, Resources, and Discussion

Screwtape

11 Mar 2025 1:30 UTC

52 points

6 comments 7 min read LW link

Progress links and short notes, 2025-03-10

jasoncrawford

10 Mar 2025 20:27 UTC

8 points

0 comments 4 min read LW link

(newsletter.rootsofprogress.org)

The Manus Marketing Madness

Zvi

10 Mar 2025 20:10 UTC

54 points

0 comments 24 min read LW link

(thezvi.wordpress.com)

You can just play

aswath krishnan

10 Mar 2025 20:00 UTC

−5 points

0 comments 2 min read LW link

How to Use Prompt Engineering to Rewire Your Brain

aswath krishnan

10 Mar 2025 20:00 UTC

1 point

0 comments 5 min read LW link

(www.aswathkrishnan.com)

When Independent Optimization Is Worse Than Randomness

Chaotic rationalist

10 Mar 2025 19:46 UTC

−4 points

0 comments 2 min read LW link

Stress exists only where the Mind makes it

Noahh

10 Mar 2025 19:44 UTC

5 points

2 comments 4 min read LW link

Counterargument to Godel’s Modal Ontological Argument

Wynn

10 Mar 2025 19:38 UTC

−1 points

0 comments 4 min read LW link

[Question] How much do frontier LLMs code and browse while in training?

Joe Rogero

10 Mar 2025 19:34 UTC

7 points

0 comments 1 min read LW link

Observations on self-supervised Learning for vision

Dinkar Juyal

10 Mar 2025 19:31 UTC

3 points

0 comments 5 min read LW link

Introducing 11 New AI Safety Organizations—Catalyze’s Winter 24/25 London Incubation Program Cohort

Alexandra Bos

10 Mar 2025 19:26 UTC

75 points

0 comments 14 min read LW link

The Jackpot Jinx (or why “Superintelligence Strategy” is wrong)

E.G. Blee-Goldman

10 Mar 2025 19:18 UTC

13 points

0 comments 5 min read LW link

Effective AI Outreach | A Data Driven Approach

NoahCWilson

10 Mar 2025 19:18 UTC

1 point

0 comments 15 min read LW link

Emergent AI Society. Tasks, Scarcity, Talks

Andrey Seryakov

10 Mar 2025 19:18 UTC

1 point

0 comments 5 min read LW link

Sentinel minutes #10/2025: Trump tariffs, US/China tensions, Claude code reward hacking.

NunoSempere

10 Mar 2025 19:00 UTC

25 points

0 comments 10 min read LW link

(blog.sentinel-team.org)

Have you actually tried raising the birth rate?

Yair Halberstadt

10 Mar 2025 18:06 UTC

6 points

5 comments 1 min read LW link

Split Personality Training: Revealing Latent Knowledge Through Personality-Shift Tokens

Florian_Dietz

10 Mar 2025 16:07 UTC

49 points

7 comments 9 min read LW link

We Have No Plan for Preventing Loss of Control in Open Models

Andrew Dickson

10 Mar 2025 15:35 UTC

46 points

11 comments 22 min read LW link

Lock-In Threat Models

Alfie Lamerton

10 Mar 2025 10:22 UTC

5 points

0 comments 8 min read LW link

Book Review: Affective Neuroscience

sarahconstantin

10 Mar 2025 6:50 UTC

62 points

8 comments 13 min read LW link

(sarahconstantin.substack.com)

The chessboard world

phdead

10 Mar 2025 1:26 UTC

5 points

0 comments 8 min read LW link

[Question] when will LLMs become human-level bloggers?

nostalgebraist

9 Mar 2025 21:10 UTC

127 points

36 comments 6 min read LW link

Everything I Know About Semantics I Learned From Music Notation

J Bostock

9 Mar 2025 18:09 UTC

34 points

2 comments 10 min read LW link

Phoenix Rising

Metacelsus

9 Mar 2025 11:53 UTC

67 points

7 comments 5 min read LW link

(denovo.substack.com)

How well can Claude write coding questions?

bodry

9 Mar 2025 5:29 UTC

9 points

3 comments 12 min read LW link

A model of the final phase: the current frontier AIs as de facto CEOs of their own companies

Mitchell_Porter

8 Mar 2025 22:15 UTC

23 points

2 comments 1 min read LW link

Harry Potter and the Methods of Rationality 10 Year Anniversary Party!

Robert Cousineau

8 Mar 2025 21:29 UTC

6 points

0 comments 1 min read LW link

A case for peer-reviewed conspiracy theories

Sam G

8 Mar 2025 20:41 UTC

13 points

3 comments 4 min read LW link

The machine has no mouth and it must scream

zef

8 Mar 2025 16:40 UTC

80 points

1 comment 7 min read LW link

(zephyyr.substack.com)

How Do We Fix the Education Crisis?

programjames

8 Mar 2025 2:59 UTC

15 points

5 comments 8 min read LW link

GPT-4.5 Can Play Losing Chess

GoteNoSente

8 Mar 2025 0:58 UTC

9 points

0 comments 1 min read LW link

(chatgpt.com)

[Question] are “almost-p-zombies” possible?

KvmanThinking

7 Mar 2025 22:58 UTC

4 points

3 comments 1 min read LW link