Why I’m Worried About Job Loss + Thoughts on Comparative Advantage

claywren

13 Feb 2026 23:36 UTC

62 points

5 comments 11 min read LW link

METR Time Horizons: Now 10x/Year

johncrox

13 Feb 2026 23:01 UTC

28 points

6 comments 3 min read LW link

Use more text than one token to avoid neuralese

Jude Stiel

13 Feb 2026 21:09 UTC

10 points

4 comments 1 min read LW link

Hazards of Selection Effects on Approved Information

Zack_M_Davis

13 Feb 2026 18:51 UTC

56 points

11 comments 12 min read LW link

(zackmdavis.net)

OpenClaw Newsletter

Jacobson

13 Feb 2026 17:59 UTC

2 points

1 comment 5 min read LW link

ChatGPT-5.3-Codex Is Also Good At Coding

Zvi

13 Feb 2026 16:20 UTC

45 points

2 comments 20 min read LW link

(thezvi.wordpress.com)

Replication of Koorndijk (2025): Differential Compliance May Reflect Prompt Sensitivity Rather Than Strategic Reasoning

Chijioke Ugwuanyi and TerryJCZhang

13 Feb 2026 16:12 UTC

9 points

0 comments 8 min read LW link

Towards an objective test of Compassion—Turning an abstract test into a collection of nuances

tailcalled

13 Feb 2026 15:03 UTC

12 points

0 comments 7 min read LW link

(Updated) METR’s data can’t distinguish between trajectories (and 80% horizons are an order of magnitude off)

Jonas Moss

13 Feb 2026 14:05 UTC

28 points

10 comments 10 min read LW link

We Die Because it’s a Computational Necessity

E.G. Blee-Goldman

13 Feb 2026 13:16 UTC

2 points

3 comments 22 min read LW link

Hazardous States and Accidents

kqr

13 Feb 2026 13:02 UTC

4 points

0 comments 4 min read LW link

(entropicthoughts.com)

Systemic Risks and Where to Find Them

Jonas Hallgren

13 Feb 2026 10:51 UTC

14 points

0 comments 20 min read LW link

(equilibria1.substack.com)

Nick Bostrom: Optimal Timing for Superintelligence

Julian Bradshaw

13 Feb 2026 7:33 UTC

8 points

3 comments 2 min read LW link

(nickbostrom.com)

Why You Don’t Believe in Xhosa Prophecies

Jan_Kulveit

13 Feb 2026 2:25 UTC

265 points

28 comments 4 min read LW link

Gemini’s Hypothetical Present

jefftk

13 Feb 2026 2:20 UTC

101 points

9 comments 2 min read LW link

(www.jefftk.com)

I Tried to Trick Myself into Being a Better Planner & Problem Solver

CstineSublime

13 Feb 2026 0:25 UTC

7 points

2 comments 3 min read LW link

Grading AI 2027’s 2025 Predictions

Daniel Kokotajlo and elifland

13 Feb 2026 0:18 UTC

64 points

4 comments 9 min read LW link

(blog.ai-futures.org)

Long-term risks from ideological fanaticism

David Althaus, Jamie_Harris, Vanessa Sarre, Clare and _will_

12 Feb 2026 23:26 UTC

99 points

12 comments 84 min read LW link

(Re)Discovering Natural Laws

Margot

12 Feb 2026 21:45 UTC

13 points

0 comments 17 min read LW link

An Ontology of Representations: Limits of Universality

Margot

12 Feb 2026 21:43 UTC

23 points

1 comment 39 min read LW link

A Closer Look at the “Societies of Thought” Paper

Against Moloch

12 Feb 2026 21:38 UTC

10 points

0 comments 3 min read LW link

(againstmoloch.com)

models have some pretty funny attractor states

aryaj, Senthooran Rajamanoharan and Neel Nanda

12 Feb 2026 21:14 UTC

275 points

38 comments 18 min read LW link

Stay in your human loop

benjamin ar

12 Feb 2026 21:05 UTC

22 points

0 comments 5 min read LW link

(bjar.substack.com)

The case for industrial evals

Andre Assis and Monte M

12 Feb 2026 20:45 UTC

16 points

0 comments 23 min read LW link

Multiverse sampling assumption

avturchin

12 Feb 2026 19:59 UTC

12 points

0 comments 5 min read LW link

What We Learned from Briefing 140+ Lawmakers on the Threat from AI

leticiagarcia

12 Feb 2026 19:53 UTC

174 points

7 comments 14 min read LW link

(substack.com)

Paper: Prompt Optimization Makes Misalignment Legible

Caleb Biddulph and micahcarroll

12 Feb 2026 19:45 UTC

63 points

8 comments 8 min read LW link

Claude’s Constitution

PeterMcCluskey

12 Feb 2026 19:44 UTC

15 points

4 comments 6 min read LW link

(bayesianinvestor.com)

Human-like metacognitive skills will reduce LLM slop and aid alignment and capabilities

Seth Herd

12 Feb 2026 19:38 UTC

48 points

16 comments 18 min read LW link

Good AI Epistemics as an Offramp from the Intelligence Explosion

Ben Goldhaber

12 Feb 2026 19:18 UTC

23 points

2 comments 3 min read LW link

How Secret Loyalty Differs from Standard Backdoor Threats

Joe Kwon

12 Feb 2026 18:48 UTC

23 points

4 comments 12 min read LW link

You get about.… how many words exactly?

Raemon

12 Feb 2026 18:06 UTC

21 points

1 comment 7 min read LW link

Basic Legibility Protocols Improve Trusted Monitoring

SebastianP and theashwinner

12 Feb 2026 17:50 UTC

8 points

4 comments 11 min read LW link

A research agenda for the final year

Mitchell_Porter

12 Feb 2026 17:24 UTC

13 points

22 comments 3 min read LW link

Polysemanticity is a Misnomer

Shiva's Right Foot

12 Feb 2026 17:22 UTC

11 points

0 comments 3 min read LW link

Optimal Timing for Superintelligence: Mundane Considerations for Existing People

Nick Bostrom

12 Feb 2026 17:06 UTC

49 points

89 comments 72 min read LW link

How do we (more) safely defer to AIs?

ryan_greenblatt and Julian Stastny

12 Feb 2026 16:55 UTC

83 points

5 comments 72 min read LW link

A Conceptual Framework for Exploration Hacking

Joschka Braun, Eyon Jang and Damon Falck

12 Feb 2026 16:33 UTC

26 points

2 comments 9 min read LW link

AI #155: Welcome to Recursive Self-Improvement

Zvi

12 Feb 2026 16:10 UTC

52 points

5 comments 56 min read LW link

(thezvi.wordpress.com)

The Facade of AI Safety Will Crumble

Liron

12 Feb 2026 15:57 UTC

36 points

11 comments 4 min read LW link

(doomdebates.com)

The history of light

Kotlopou

12 Feb 2026 14:16 UTC

16 points

0 comments 1 min read LW link

(beatingthehydra.substack.com)

Three Worlds Collide assumes calibration is solved

Vyacheslav Ladischenski (Slava)

12 Feb 2026 4:28 UTC

7 points

1 comment 3 min read LW link

Research note: A simpler AI timelines model predicts 99% AI R&D automation in ~2032

Thomas Kwa

12 Feb 2026 0:13 UTC

69 points

15 comments 8 min read LW link

(metr.org)

Timeless Engineering

Jack Bradshaw

11 Feb 2026 23:53 UTC

−14 points

0 comments 5 min read LW link

[Paper] How does information access affect LLM monitors’ ability to detect sabotage?

Rauno Arike, Raja Moreno, RohanS, Shubhorup Biswas and Francis Rhys Ward

11 Feb 2026 21:25 UTC

26 points

0 comments 6 min read LW link

Claude Opus 4.6 Escalates Things Quickly

Zvi

11 Feb 2026 21:20 UTC

51 points

0 comments 34 min read LW link

(thezvi.wordpress.com)

Where Will Call Center Workers Go?

loic

11 Feb 2026 20:44 UTC

19 points

2 comments 4 min read LW link

Distinguish between inference scaling and “larger tasks use more compute”

ryan_greenblatt

11 Feb 2026 18:37 UTC

87 points

5 comments 2 min read LW link

Monitor Jailbreaking: Evading Chain-of-Thought Monitoring Without Encoded Reasoning

Wuschel Schulz

11 Feb 2026 17:18 UTC

61 points

17 comments 5 min read LW link

[Hiring] Principia Research Fellows

Matthias Dellago and Jin Hwa Lee

11 Feb 2026 16:30 UTC

35 points

1 comment 3 min read LW link