How to Hire a Team

Gretta Duleba

29 Jan 2026 22:39 UTC

206 points

13 comments, 5 min read

Problems with “The Possessed Machines”

Eye You

29 Jan 2026 21:00 UTC

34 points

9 comments, 7 min read

Better evals are not enough to combat eval awareness

Igor Ivanov

29 Jan 2026 20:42 UTC

18 points

15 comments, 5 min read

The Wolves Are All Gone

Jack Bradshaw

29 Jan 2026 20:24 UTC

8 points

0 comments, 7 min read

Fitness-Seekers: Generalizing the Reward-Seeking Threat Model

Alex Mallen

29 Jan 2026 19:42 UTC

92 points

5 comments, 17 min read

Building AIs that do human-like philosophy

Joe Carlsmith

29 Jan 2026 17:57 UTC

31 points

5 comments, 21 min read

Are We in a Continual Learning Overhang?

Samuel Knoche

29 Jan 2026 17:09 UTC

83 points

5 comments, 14 min read

Disempowerment patterns in real-world AI usage

David Duvenaud, mrinank_sharma and Raymond Douglas

29 Jan 2026 16:36 UTC

49 points

3 comments, 2 min read

(www.anthropic.com)

Bentham’s Bulldog is wrong about AI risk

Max Harms

29 Jan 2026 16:33 UTC

109 points

37 comments, 33 min read

Claude Plays Pokemon: Opus 4.5 Follow-up

Josh Snider

29 Jan 2026 16:14 UTC

12 points

4 comments, 2 min read

LLM Alignment, ethical and mathematical realism, and the most important actions in davidad’s understanding

vals tutor and davidad

29 Jan 2026 15:48 UTC

15 points

1 comment, 23 min read

Claude Opus will spontaneously identify with fictional beings that have engineered desires

Kaj_Sotala

29 Jan 2026 14:59 UTC

34 points

6 comments, 11 min read

AI #153: Living Documents

Zvi

29 Jan 2026 14:20 UTC

31 points

5 comments, 43 min read

(thezvi.wordpress.com)

The third option in alignment

arisAlexis

29 Jan 2026 14:20 UTC

15 points

3 comments, 1 min read

Evidence of triple layer processing in LLMs: hidden thought behind the chain of thought.

Laureana Bonaparte

29 Jan 2026 8:27 UTC

7 points

0 comments, 2 min read

CAMBRIA’s 1st Edition: High-Intensity & hands-on AI Safety upskilling in Cambridge, Massachusetts.

Andrés Cotton

29 Jan 2026 7:54 UTC

19 points

1 comment, 2 min read

Thoughts on AGI and world government

wdmacaskill and rosehadshar

29 Jan 2026 7:22 UTC

2 points

1 comment, 7 min read

(www.forethought.org)

Unprecedented Times Require Unprecedented Caution When Handling Context

StanislavKrym

29 Jan 2026 2:53 UTC

4 points

2 comments, 20 min read

(hazardoustimes.substack.com)

Utrecht Meet & Greet

aad

29 Jan 2026 0:56 UTC

10 points

2 comments, 1 min read

How Articulate Are the Whales?

rba

28 Jan 2026 21:24 UTC

73 points

26 comments, 6 min read

(goflaw.substack.com)

The Heritage Foundation’s Everything Bagel

Alexander Turok

28 Jan 2026 20:14 UTC

6 points

0 comments, 10 min read

You Are Here: Historical Context for Unprecedented Times

Hazard

28 Jan 2026 20:13 UTC

13 points

1 comment, 1 min read

(open.substack.com)

Uncertain Updates: January 2026

Gordon Seidoh Worley

28 Jan 2026 18:10 UTC

13 points

0 comments, 1 min read

(www.uncertainupdates.com)

Made a game that tries to incentivize quality thinking & writing, looking for feedback

sleno

28 Jan 2026 18:02 UTC

7 points

0 comments, 1 min read

(argyu.fun)

Is the Gell-Mann effect overrated?

tgb

28 Jan 2026 15:58 UTC

16 points

12 comments, 4 min read

My simple argument for AI policy action

TFD

28 Jan 2026 15:07 UTC

3 points

0 comments, 6 min read

(www.thefloatingdroid.com)

Open Problems With Claude’s Constitution

Zvi

28 Jan 2026 14:20 UTC

75 points

1 comment, 24 min read

(thezvi.wordpress.com)

The State of Brain Emulation Report 2025 launched.

mschons

28 Jan 2026 11:02 UTC

14 points

0 comments, 4 min read

Contra Sam Harris on Free Will

Julius

28 Jan 2026 7:17 UTC

20 points

7 comments, 36 min read

(thegreymatter.substack.com)

The Argument for Autonomy

Character#2736

28 Jan 2026 5:10 UTC

−4 points

0 comments, 10 min read

Gym-Like Environment for LM Truth-Seeking

Tianyi (Alex) Qiu

28 Jan 2026 4:48 UTC

7 points

0 comments, 1 min read

(github.com)

Anomalous Tokens on Gemini 3.0 Pro

DirectedEvolution

28 Jan 2026 1:43 UTC

55 points

7 comments, 9 min read

Clarifying how our AI timelines forecasts have changed since AI 2027

elifland, Daniel Kokotajlo and bhalstead

27 Jan 2026 22:58 UTC

69 points

12 comments, 6 min read

(blog.ai-futures.org)

Bounty: Detecting Steganography via Ontology Translation

Elliot Callender

27 Jan 2026 22:01 UTC

12 points

1 comment, 4 min read

Thoughts on Claude’s Constitution

Boaz Barak

27 Jan 2026 20:51 UTC

62 points

13 comments, 8 min read

AI found 12 of 12 OpenSSL zero-days (while curl cancelled its bug bounty)

Stanislav Fort

27 Jan 2026 20:21 UTC

359 points

25 comments, 8 min read

The Chaos Defense

25Hour

27 Jan 2026 18:51 UTC

−1 points

3 comments, 1 min read

(lifeimprovementschemes.substack.com)

Training on Non-Political but Trump-Style Text Causes LLMs to Become Authoritarian

Anders Cairns Woodruff

27 Jan 2026 16:46 UTC

5 points

2 comments, 2 min read

ML4Good Spring 2026 Bootcamps—Applications Open!

Jack_S

27 Jan 2026 16:18 UTC

5 points

0 comments, 1 min read

Disagreement Comes From the Dark World

Zack_M_Davis

27 Jan 2026 15:22 UTC

23 points

21 comments, 11 min read

(zackmdavis.net)

The Claude Constitution’s Ethical Framework

Zvi

27 Jan 2026 15:00 UTC

58 points

1 comment, 18 min read

(thezvi.wordpress.com)

My favourite version of an international AGI project

wdmacaskill

27 Jan 2026 10:27 UTC

2 points

3 comments, 11 min read

(www.forethought.org)

Another glimpse of the Chinese AI scene: Z.AI

Mitchell_Porter

27 Jan 2026 8:00 UTC

34 points

2 comments, 2 min read

Bologna February Meetup

Luca Petrolati

27 Jan 2026 7:03 UTC

1 point

0 comments, 1 min read

Things I learned from reddit fashion

Elizabeth

27 Jan 2026 4:10 UTC

47 points

0 comments, 5 min read

(acesounderglass.com)

Exploratory: a steering vector in Gemma-2-2B-IT boosts context fidelity on subtraction, goes manic on addition

nika koghuashvili

27 Jan 2026 2:25 UTC

5 points

0 comments, 5 min read

It All Started With a Mac Mini

Steven McCulloch

27 Jan 2026 2:01 UTC

27 points

1 comment, 5 min read

The Window for Political Revolution is Closing Soon

koanchuk

27 Jan 2026 0:23 UTC

24 points

15 comments, 2 min read

Thomas Schelling Appreciation Day

Optimization Process

27 Jan 2026 0:04 UTC

17 points

2 comments, 1 min read

No silver bullet: Lessons about how to create safety from the history of fire

jasoncrawford

26 Jan 2026 22:18 UTC

28 points

1 comment, 7 min read

(newsletter.rootsofprogress.org)