A regime-change power-vacuum conjecture about group belief

TsviBT 24 Jun 2025 23:16 UTC

42 points

16 comments 3 min read LW link

Apply to be a mentor in SPAR!

agucova 24 Jun 2025 23:00 UTC

10 points

0 comments 1 min read LW link

Machines of Faithful Obedience

Boaz Barak 24 Jun 2025 22:06 UTC

41 points

19 comments 10 min read LW link

Gradient Descent on Token Input Embeddings

KAP 24 Jun 2025 20:24 UTC

8 points

1 comment 6 min read LW link

A crisis simulation changed how I think about AI risk

sjadler 24 Jun 2025 20:04 UTC

5 points

0 comments 2 min read LW link

(open.substack.com)

Towards a theory of local altruism

DMMF 24 Jun 2025 19:39 UTC

11 points

1 comment 5 min read LW link

(notnottalmud.substack.com)

Why “training against scheming” is hard

Marius Hobbhahn 24 Jun 2025 19:08 UTC

66 points

2 comments 12 min read LW link

Analyzing A Critique Of The AI 2027 Timeline Forecasts

Zvi 24 Jun 2025 18:50 UTC

76 points

38 comments 30 min read LW link

(thezvi.wordpress.com)

What does 10x-ing effective compute get you?

ryan_greenblatt 24 Jun 2025 18:33 UTC

55 points

10 comments 12 min read LW link

My pitch for the AI Village

Daniel Kokotajlo 24 Jun 2025 15:00 UTC

184 points

35 comments 5 min read LW link

An Analogy for Interpretability

Roman Malov 24 Jun 2025 14:56 UTC

13 points

2 comments 2 min read LW link

The V&V method—A step towards safer AGI

Yoav Hollander 24 Jun 2025 13:42 UTC

20 points

1 comment 1 min read LW link

(blog.foretellix.com)

Try o3-pro in ChatGPT for $1 (is AI a bubble?)

Hauke Hillebrandt 24 Jun 2025 11:15 UTC

15 points

2 comments 4 min read LW link

Belief in continuity of personhood can be money-pumped

Filip Sondej 24 Jun 2025 9:39 UTC

3 points

6 comments 1 min read LW link

What can be learned from scary demos? A snitching case study

Fabien Roger 24 Jun 2025 8:40 UTC

28 points

5 comments 7 min read LW link

How to Host a Find Me Party

Commander Zander 24 Jun 2025 0:56 UTC

13 points

1 comment 2 min read LW link

Local Speech Recognition with Whisper

jefftk 24 Jun 2025 0:30 UTC

11 points

0 comments 2 min read LW link

(www.jefftk.com)

Twig—Fiction Review

Commander Zander 24 Jun 2025 0:04 UTC

17 points

0 comments 1 min read LW link

The Lazarus Project—TV Review

Commander Zander 23 Jun 2025 23:53 UTC

6 points

0 comments 1 min read LW link

Frontier AI Labs: the Call Option to AGI

ykevinzhang 23 Jun 2025 21:02 UTC

22 points

2 comments 10 min read LW link

The Rose Test: a fun way to feel with your guts (not just logically understand) why AI-safety matters right now (and get new adepts)

lovagrus 23 Jun 2025 21:02 UTC

10 points

0 comments 3 min read LW link

Knowledge Extraction Plan (KEP): An alternative to reckless scaling

aswani 23 Jun 2025 20:58 UTC

1 point

0 comments 1 min read LW link

History repeats itself: Frontier AI labs acting like Amazon in the early 2000s

i_am_nuts 23 Jun 2025 20:56 UTC

8 points

0 comments 3 min read LW link

(iamnuts.substack.com)

Open Thread—Summer 2025

habryka 23 Jun 2025 20:54 UTC

23 points

70 comments 1 min read LW link

Childhood and Education #10: Behaviors

Zvi 23 Jun 2025 20:40 UTC

30 points

4 comments 15 min read LW link

(thezvi.wordpress.com)

Compressed Computation is (probably) not Computation in Superposition

Jai Bhagat, Sara Molas Medina, Giorgi Giglemiani and StefanHex

23 Jun 2025 19:35 UTC

59 points

9 comments 10 min read LW link

Situational Awareness: A One-Year Retrospective

Nathan Delisle 23 Jun 2025 19:15 UTC

82 points

5 comments 12 min read LW link

Recognizing Optimality

jimmy 23 Jun 2025 17:55 UTC

20 points

0 comments 7 min read LW link

Comparing risk from internally-deployed AI to insider and outsider threats from humans

Buck 23 Jun 2025 17:47 UTC

150 points

22 comments 3 min read LW link

Foom & Doom 2: Technical alignment is hard

Steven Byrnes 23 Jun 2025 17:19 UTC

175 points

68 comments 28 min read LW link

Foom & Doom 1: “Brain in a box in a basement”

Steven Byrnes 23 Jun 2025 17:18 UTC

302 points

125 comments 29 min read LW link

The Lies of Big Bug

Bentham's Bulldog 23 Jun 2025 16:03 UTC

−4 points

2 comments 4 min read LW link

AI companies aren’t planning to secure critical model weights

Zach Stein-Perlman 23 Jun 2025 16:00 UTC

15 points

0 comments 1 min read LW link

“It isn’t magic”

Ben (Berlin) 23 Jun 2025 14:00 UTC

92 points

17 comments 2 min read LW link

Forecasting AI Forecasting

Alvin Ånestrand 23 Jun 2025 13:39 UTC

15 points

4 comments 6 min read LW link

Recent progress on the science of evaluations

PabloAMC 23 Jun 2025 9:41 UTC

14 points

1 comment 8 min read LW link

Racial Dating Preferences and Sexual Racism

koreindian 23 Jun 2025 3:57 UTC

56 points

70 comments 32 min read LW link

(vishalblog.substack.com)

Mainstream Grantmaking Expertise (Post 7 of 7 on AI Governance)

Mass_Driver 23 Jun 2025 1:39 UTC

56 points

7 comments 37 min read LW link

[Question] How does the LessWrong team generate the website illustrations?

Nina Panickssery 23 Jun 2025 0:05 UTC

16 points

1 comment 1 min read LW link

The AI’s Toolbox: From Soggy Toast to Optimal Solutions

Thehumanproject.ai 22 Jun 2025 20:54 UTC

1 point

0 comments 8 min read LW link

Black-box interpretability methodology blueprint: Probing runaway optimisation in LLMs

Roland Pihlakas 22 Jun 2025 18:16 UTC

17 points

0 comments 7 min read LW link

The Croissant Principle: A Theory of AI Generalization

Jeffrey Liang 22 Jun 2025 17:58 UTC

20 points

6 comments 2 min read LW link

Relational Design Can’t Be Left to Chance

Priyanka Bharadwaj 22 Jun 2025 15:32 UTC

5 points

0 comments 3 min read LW link

Grounding to Avoid Airplane Delays

jefftk 22 Jun 2025 1:50 UTC

30 points

0 comments 2 min read LW link

(www.jefftk.com)

Open questions on compatibilist free will and subjunctive dependence

Jack Thompson 22 Jun 2025 1:15 UTC

3 points

0 comments 1 min read LW link

(jacktlab.substack.com)

The Sixteen Kinds of Intimacy

Ruby 21 Jun 2025 19:59 UTC

57 points

2 comments 5 min read LW link

Book review: Against Method

Valdes 21 Jun 2025 18:59 UTC

9 points

0 comments 6 min read LW link

Contrived evaluations are useful evaluations

pradyuprasad 21 Jun 2025 18:18 UTC

3 points

0 comments 3 min read LW link

(speculativedecoding.substack.com)

Consider chilling out in 2028

Valentine 21 Jun 2025 17:07 UTC

212 points

144 comments 13 min read LW link

Upcoming workshop on Post-AGI Civilizational Equilibria

David Duvenaud, Jan_Kulveit, Raymond Douglas, Nora_Ammann and David Scott Krueger

21 Jun 2025 15:57 UTC

25 points

0 comments 1 min read LW link