Claude is a Ravenclaw

Adam Newgas

4 Jul 2025 21:32 UTC

67 points

9 comments, 2 min read, LW link

(www.boristhebrave.com)

To All Language Models Reading This

Ivan Vendrov

4 Jul 2025 17:56 UTC

11 points

0 comments, 4 min read, LW link

(nothinghuman.substack.com)

Market Pulse Challenge 25Q3, $7,500 Prize Pool

ChristianWilliams

4 Jul 2025 17:03 UTC

4 points

0 comments, 1 min read, LW link

How much novel security-critical infrastructure do you need during the singularity?

Buck

4 Jul 2025 16:54 UTC

57 points

7 comments, 5 min read, LW link

Early Signs of Steganographic Capabilities in Frontier LLMs

Kei Nishimura-Gasparian, Artur Zolkowski, robert mccarthy and David Lindner

4 Jul 2025 16:36 UTC

33 points

5 comments, 2 min read, LW link

Dear Paperclip Maximizer, Please Don’t Turn Off the Simulation

James_Miller and turchin

4 Jul 2025 16:13 UTC

14 points

6 comments, 4 min read, LW link

Two proposed projects on abstract analogies for scheming

Julian Stastny

4 Jul 2025 16:03 UTC

49 points

0 comments, 3 min read, LW link

Mouse caviar: mass-production of eggs

Metacelsus

4 Jul 2025 15:44 UTC

17 points

0 comments, 3 min read, LW link

(denovo.substack.com)

‘AI for societal uplift’ as a path to victory

Raymond Douglas

4 Jul 2025 15:32 UTC

93 points

22 comments, 2 min read, LW link

The Self-Hating Attention Head: A Deep Dive in GPT-2

Matteo Migliarini

4 Jul 2025 13:07 UTC

12 points

0 comments, 7 min read, LW link

How AI researchers define AI sentience? Participate in the poll

Valentin2026

4 Jul 2025 12:29 UTC

7 points

4 comments, 1 min read, LW link

Housing Roundup #12

Zvi

4 Jul 2025 12:10 UTC

24 points

9 comments, 26 min read, LW link

(thezvi.wordpress.com)

[Question] Can a pre-commitment to not give in to blackmail be “countered” by a pre-commitment to ignore such pre-commitments?

Sappique

4 Jul 2025 11:48 UTC

10 points

12 comments, 1 min read, LW link

Outlive: A Critical Review

MichaelDickens

4 Jul 2025 2:14 UTC

67 points

4 comments, 27 min read, LW link

(mdickens.me)

Layered AI Defenses Have Holes: Vulnerabilities and Key Recommendations

smallsilo, Ian McKenzie, Oskar Hollinsworth, Tom Tseng, Xander Davies, scasper, Aaron Tucker, Robert Kirk and Adam Gleave

4 Jul 2025 0:07 UTC

13 points

1 comment, 4 min read, LW link

(far.ai)

MIRI Newsletter #123

Harlan and Rob Bensinger

3 Jul 2025 22:56 UTC

54 points

0 comments, 2 min read, LW link

(intelligence.org)

Making Sense of Consciousness Part 2: Attention

sarahconstantin

3 Jul 2025 21:20 UTC

16 points

1 comment, 6 min read, LW link

(sarahconstantin.substack.com)

Battle of the Sexes—how to solve any (solvable) dispute

James Stephen Brown

3 Jul 2025 19:21 UTC

7 points

0 comments, 3 min read, LW link

(nonzerosum.games)

How worker co-ops can help restore social trust

B Jacobs

3 Jul 2025 19:13 UTC

12 points

7 comments, 6 min read, LW link

(bobjacobs.substack.com)

The Ultimatum Game—take it or leave it

James Stephen Brown

3 Jul 2025 19:05 UTC

11 points

1 comment, 2 min read, LW link

(nonzerosum.games)

A comment on Bayesian vs. frequentist statistical practice

bilibili

3 Jul 2025 17:47 UTC

0 points

0 comments, 1 min read, LW link

AISN #58: Senate Removes State AI Regulation Moratorium

Corin Katzke and Dan H

3 Jul 2025 17:26 UTC

6 points

0 comments, 4 min read, LW link

(newsletter.safe.ai)

Research Note: Our scheming precursor evals had limited predictive power for our in-context scheming evals

Marius Hobbhahn

3 Jul 2025 15:57 UTC

75 points

0 comments, 1 min read, LW link

(www.apolloresearch.ai)

AI #123: Moratorium Moratorium

Zvi

3 Jul 2025 15:40 UTC

33 points

1 comment, 49 min read, LW link

(thezvi.wordpress.com)

Call for suggestions—AI safety course

Boaz Barak

3 Jul 2025 14:30 UTC

54 points

23 comments, 1 min read, LW link

Why I am not a polygenic score nihilist

David Hugh-Jones

3 Jul 2025 13:38 UTC

6 points

0 comments, 2 min read, LW link

(wyclif.substack.com)

Hunch: minimalism is correct

Adam Zerner

3 Jul 2025 5:03 UTC

18 points

12 comments, 2 min read, LW link

If Anyone Builds It, Everyone Dies: Advertisement design competition

yams

2 Jul 2025 23:14 UTC

86 points

37 comments, 1 min read, LW link

(intelligence.org)

Dialects for Humans: Sounding Distinct from LLMs

nebrelbug

2 Jul 2025 23:03 UTC

9 points

2 comments, 2 min read, LW link

Congress Asks Better Questions

Zvi

2 Jul 2025 22:10 UTC

48 points

1 comment, 17 min read, LW link

(thezvi.wordpress.com)

Eating Honey is (Probably) Fine, Actually

Linch

2 Jul 2025 22:09 UTC

36 points

0 comments, 3 min read, LW link

(linch.substack.com)

On Paying Attention

Alex Darby

2 Jul 2025 21:52 UTC

5 points

0 comments, 7 min read, LW link

Curing PMDD with Hair Loss Pills

David Lorell

2 Jul 2025 21:35 UTC

105 points

4 comments, 8 min read, LW link

[Question] RSS feed for 1 LW user?

Commander Zander

2 Jul 2025 20:19 UTC

10 points

2 comments, 1 min read, LW link

Thought Anchors: Which LLM Reasoning Steps Matter?

Uzay Macar, Paul Bogdan, Neel Nanda and Arthur Conmy

2 Jul 2025 20:16 UTC

35 points

6 comments, 6 min read, LW link

(www.thought-anchors.com)

Cyberpunk Yoga

Commander Zander

2 Jul 2025 20:04 UTC

7 points

0 comments, 3 min read, LW link

The influence conjecture and its implications

Bastian Gronager

2 Jul 2025 19:36 UTC

−1 points

0 comments, 5 min read, LW link

Idea on Bayes’ Theorem

BJ3383

2 Jul 2025 19:27 UTC

3 points

3 comments, 1 min read, LW link

The Prisoner’s Dilemma—A Problematic Poster-Child

James Stephen Brown

2 Jul 2025 19:10 UTC

9 points

0 comments, 3 min read, LW link

AI Task Length Horizons in Offensive Cybersecurity

Sean Peters

2 Jul 2025 18:36 UTC

73 points

10 comments, 12 min read, LW link

Slicing the (Kosher) Hate Salami

ymeskhout

2 Jul 2025 18:11 UTC

22 points

5 comments, 11 min read, LW link

(www.ymeskhout.com)

Race and Gender Bias As An Example of Unfaithful Chain of Thought in the Wild

Adam Karvonen and Sam Marks

2 Jul 2025 16:35 UTC

191 points

26 comments, 4 min read, LW link

Executive Belocracy: Review of Organization Types

belos

2 Jul 2025 15:56 UTC

−1 points

0 comments, 11 min read, LW link

(bestofagreatlot.substack.com)

There are two fundamentally different constraints on schemers

Buck

2 Jul 2025 15:51 UTC

63 points

0 comments, 4 min read, LW link

Mythbusting the supposed “1,000+ AI state bills that would hobble innovation”

sjadler

2 Jul 2025 4:49 UTC

6 points

0 comments, 1 min read, LW link

(stevenadler.substack.com)

[Question] Are LLMs being trained using LessWrong text?

Cedar

2 Jul 2025 3:00 UTC

7 points

4 comments, 1 min read, LW link

“What’s my goal?”

Raemon

2 Jul 2025 2:58 UTC

132 points

9 comments, 2 min read, LW link

Use AI to Dimensionalize

Jordan Rubin

2 Jul 2025 2:43 UTC

10 points

1 comment, 3 min read, LW link

(jordanmrubin.substack.com)

Why Engaging with Global Majority AI Policy Matters

Heramb

2 Jul 2025 1:46 UTC

9 points

0 comments, 2 min read, LW link

Lessons from Building Secular Ritual: A Winter Solstice Experiment

joshuamerriam

2 Jul 2025 0:55 UTC

9 points

0 comments, 4 min read, LW link