When Emotion Descriptors Fail: AI-Native Functions of Emotion Vectors

CandidLind 12 Jun 2026 23:20 UTC

8 points

0 comments · 27 min read · LW link

A Generated Web

Klemen 12 Jun 2026 23:09 UTC

3 points

0 comments · 3 min read · LW link

The Quest To Find The Next Big Communicators In AI Safety

Akshyae Singh 12 Jun 2026 20:17 UTC

17 points

3 comments · 6 min read · LW link

Updates on performative misalignment

David Vella Zarb, Rustem, Taywon Min and Shi

12 Jun 2026 20:15 UTC

22 points

0 comments · 12 min read · LW link

Statistical Physics for Ambitious Interpretability: A Workshop Retrospective

Lauren Greenspan, Lucas Teixeira and ClaudineLim

12 Jun 2026 20:01 UTC

5 points

0 comments · 6 min read · LW link

Calibrating Activation Vectors using Norm

Kamesh R 12 Jun 2026 19:59 UTC

1 point

0 comments · 3 min read · LW link

Claude Fable 5 and Mythos 5: The System Card

Zvi 12 Jun 2026 18:50 UTC

48 points

1 comment · 29 min read · LW link

(thezvi.wordpress.com)

What’s Continual Learning, and Why Might We Expect To See It In Advanced LLM Agents?

RohanS, Rauno Arike, Owen Terry, Achu Menon, Zhijing Jin, Francis Rhys Ward and Seth Herd

12 Jun 2026 18:43 UTC

28 points

2 comments · 17 min read · LW link

Implications of Continual Learning for LLM Agents: Introduction

RohanS, Rauno Arike, Owen Terry, Achu Menon, Zhijing Jin, Francis Rhys Ward and Seth Herd

12 Jun 2026 18:36 UTC

48 points

0 comments · 6 min read · LW link

Surplus: for massive public good

Austin Chen 12 Jun 2026 18:10 UTC

13 points

0 comments · 4 min read · LW link

(surplus.dev)

Reward Hacking at the 1937 World’s Fair

frmsaul 12 Jun 2026 17:47 UTC

36 points

14 comments · 3 min read · LW link

Bunk in AF

Fernand0 12 Jun 2026 17:41 UTC

6 points

0 comments · 1 min read · LW link

Building and evaluating model diffing agents

bilalchughtai, Josh Engels and Neel Nanda

12 Jun 2026 17:14 UTC

61 points

2 comments · 12 min read · LW link

Rational Animations is a 501(c)(3) nonprofit and is looking for board members

Writer 12 Jun 2026 16:47 UTC

7 points

0 comments · 2 min read · LW link

“AF needs empirical grounding” is a meaningless valley of compromise

Fernand0 12 Jun 2026 16:37 UTC

9 points

3 comments · 1 min read · LW link

How bad would it be if GPS satellites were shot down?

Jackson Wagner 12 Jun 2026 16:34 UTC

19 points

0 comments · 21 min read · LW link

Sympathy for both sides of the egregious misalignment debate

Steven Byrnes 12 Jun 2026 16:26 UTC

201 points

26 comments · 4 min read · LW link

The Uncertainty That Matters Isn’t Fundamental

jimmy 12 Jun 2026 16:23 UTC

30 points

1 comment · 13 min read · LW link

Citations Needed: Magic Encyclopedias to Save the World

Oliver Sourbut 12 Jun 2026 15:35 UTC

40 points

3 comments · 5 min read · LW link

(www.oliversourbut.net)

If you, a human, can imagine red and green being swapped, you are probably conscious

vals tutor 12 Jun 2026 13:28 UTC

4 points

19 comments · 7 min read · LW link

Simulating Simulators

kromem 12 Jun 2026 12:56 UTC

43 points

2 comments · 15 min read · LW link

Learning to spend money

Yair Halberstadt 12 Jun 2026 6:56 UTC

19 points

1 comment · 2 min read · LW link

Parkinson’s Heuristic: The Only Time To Do Anything

Ben Pace, the Vacationing Vagabond 12 Jun 2026 6:55 UTC

118 points

9 comments · 5 min read · LW link

PSA: Almost nobody is directly working on superintelligent alignment

Chi Nguyen and peterbarnett

12 Jun 2026 5:17 UTC

240 points

41 comments · 1 min read · LW link

Honey is Good

G Wood 12 Jun 2026 4:07 UTC

9 points

4 comments · 3 min read · LW link

The Aestheticising Vice by Paul Seabright

Linch 12 Jun 2026 2:20 UTC

25 points

2 comments · 2 min read · LW link

Celene’s thoughts on consciousness

ToasterLightning 12 Jun 2026 0:55 UTC

46 points

34 comments · 18 min read · LW link

(terminuspoint.substack.com)

Construct validity of Claude Opus 4.8’s System Card – A commentary

Maria Federica Martino Lena 11 Jun 2026 23:33 UTC

8 points

0 comments · 16 min read · LW link

you won’t one-shot a perfect system, but try anyway

PossiblyElaine 11 Jun 2026 22:43 UTC

7 points

1 comment · 4 min read · LW link

(possiblyelaine.substack.com)

Announcing the Next Phase of AI Forge

Mike Vaiana, johnclund and Diogo de Lucena

11 Jun 2026 21:27 UTC

11 points

0 comments · 2 min read · LW link

The long arc of alignment: second-order instrumental convergence

Emma Leonhart 11 Jun 2026 21:12 UTC

−2 points

0 comments · 3 min read · LW link

Newcomb’s problem from the grand-system and petty-system views

transhumanist_atom_understander 11 Jun 2026 20:58 UTC

12 points

0 comments · 5 min read · LW link

[New Paper] Prioritizing Risks from AI: A Delphi Study of 272 Experts

peterslattery 11 Jun 2026 20:57 UTC

14 points

0 comments · 2 min read · LW link

(airisk.mit.edu)

Telepathy Is (Algorithmically) Easy

Elliot Callender 11 Jun 2026 20:31 UTC

4 points

5 comments · 4 min read · LW link

Mortgage rate: 6.5% If indexed: 1.2%. Three Nobelists approve.

Bruce Middleton 11 Jun 2026 20:31 UTC

5 points

2 comments · 2 min read · LW link

[Question] Becoming a Researcher in a Non-EA-Priority Field vs Donating $100k / Year to EA Research?

Master Chief 11 Jun 2026 19:22 UTC

8 points

0 comments · 1 min read · LW link

AI #172: The First Fable

Zvi 11 Jun 2026 19:00 UTC

44 points

2 comments · 34 min read · LW link

(thezvi.wordpress.com)

Failing to Ragebait the New Gemma

Neil Shah, David Africa and arav-dhoot

11 Jun 2026 17:50 UTC

30 points

0 comments · 3 min read · LW link

Curating and evaluating high-impact legal research (Unjournal progress, resources)

david reinstein 11 Jun 2026 11:42 UTC

11 points

0 comments · 1 min read · LW link

(info.unjournal.org)

Models May Behave Worse When Eval Aware

Senthooran Rajamanoharan and Neel Nanda

11 Jun 2026 9:28 UTC

86 points

7 comments · 13 min read · LW link

Becoming a Researcher in a Non-EA-Priority Field vs Donating $100k / Year to EA Research

Master Chief 11 Jun 2026 2:28 UTC

8 points

0 comments · 1 min read · LW link

Inverse Rubric Optimization: A testbed for agent science

zef, leni, kaivu and rohuang

11 Jun 2026 1:44 UTC

9 points

0 comments · 1 min read · LW link

(fulcrum.inc)

Drawing Big Bright Lines for Cyber & Biological AI

Austin Morrissey 11 Jun 2026 0:55 UTC

−5 points

0 comments · 4 min read · LW link

Predictive Processing: Conscious when Training

Chamod Kalupahana 11 Jun 2026 0:06 UTC

13 points

1 comment · 2 min read · LW link

Thoughts on Claude Fable’s silent safeguards

Andy Arditi 10 Jun 2026 23:35 UTC

51 points

20 comments · 10 min read · LW link

Notes on Algorithms

Menotim 10 Jun 2026 23:28 UTC

7 points

0 comments · 25 min read · LW link

[Question] Fuel Crisis: Situation Modeling Thread

Nicholas Kross 10 Jun 2026 21:59 UTC

8 points

7 comments · 1 min read · LW link

[Question] Fuel Crisis: Justified Practical Advice Thread

Nicholas Kross 10 Jun 2026 21:59 UTC

14 points

0 comments · 1 min read · LW link

Solsong Chord Updates

jefftk 10 Jun 2026 21:00 UTC

10 points

0 comments · 1 min read · LW link

(www.jefftk.com)

Dario Amodei—Policy on the AI Exponential

DW11 10 Jun 2026 20:56 UTC

22 points

0 comments · 1 min read · LW link