How to build a cancer vaccine, and whether they will work this time

Abhishaike Mahajan

8 Jun 2026 20:45 UTC

58 points

9 comments

25 min read

LW link

(www.owlposting.com)

Efficient tradeoffs and the safety-usefulness tradeoff model

Buck

8 Jun 2026 20:28 UTC

42 points

1 comment

8 min read

LW link

Accelerated Skill Learning via Dream Engineering and Biofeedback

Elliot Callender

8 Jun 2026 20:08 UTC

5 points

2 comments

3 min read

LW link

How valuable are weak AI safety regulations?

MichaelDickens

8 Jun 2026 18:24 UTC

28 points

0 comments

6 min read

LW link

How to reduce capability degradation from off-model SFT

Dylan Xu, SebastianP and Alek Westover

8 Jun 2026 16:24 UTC

21 points

0 comments

3 min read

LW link

The Next Swan: Frank Ramsey, Variable Hypotheticals, and the Bet on Induction

Ramseyian

8 Jun 2026 12:01 UTC

4 points

0 comments

18 min read

LW link

Coverage-driven alignment—What ‘Teaching Claude Why’ can borrow from AV verification

Yoav Hollander

8 Jun 2026 11:42 UTC

16 points

4 comments

14 min read

LW link

(blog.foretellix.com)

Bun’s Migration from Zig to Rust as a Potential Case Study for Gradual Disempowerment

Sayhan Yalvaçer

8 Jun 2026 7:06 UTC

96 points

8 comments

3 min read

LW link

Contra Dance at LessOnline

jefftk

8 Jun 2026 5:50 UTC

23 points

0 comments

1 min read

LW link

(www.jefftk.com)

Honking is good

PossiblyElaine

8 Jun 2026 4:36 UTC

9 points

7 comments

4 min read

LW link

(open.substack.com)

The CIA believes everything

volpe

8 Jun 2026 0:43 UTC

22 points

10 comments

2 min read

LW link

(volpe.envs.net)

How do people stop spiraling about Roko’s Basilisk & acausal extortion?

anon202

8 Jun 2026 0:39 UTC

9 points

6 comments

1 min read

LW link

Contextual Identity Laundering: How Claude’s Image Refusal Can Be Routed Through Web Search

Failfinder70

8 Jun 2026 0:39 UTC

7 points

2 comments

9 min read

LW link

Mental causation is not load-bearing

jessicata

7 Jun 2026 20:43 UTC

38 points

4 comments

10 min read

LW link

How Far Apart Does a Model Think Its Tokens Are?

Brendan Long

7 Jun 2026 20:20 UTC

47 points

9 comments

10 min read

LW link

(www.brendanlong.com)

Autopilot Thinking

XelaP

7 Jun 2026 20:20 UTC

10 points

4 comments

6 min read

LW link

Secret Loyalties Likely Raise Remote-Influenceability

Kaustubh Kislay

7 Jun 2026 17:51 UTC

13 points

0 comments

6 min read

LW link

From One Piece to One Pace - Vision and mission in coordination of agents

a unemployed pastor- de S Brito

7 Jun 2026 17:07 UTC

2 points

0 comments

4 min read

LW link

Neglected Basics of AI Alignment

Quirinus_Quirrell

7 Jun 2026 9:02 UTC

28 points

2 comments

6 min read

LW link

The Hats of LessOnline

AprilSR

7 Jun 2026 8:57 UTC

15 points

2 comments

3 min read

LW link

(aprilsr.substack.com)

Can activation verbalizers surface an internal chain of thought?

oakhu and ryan_greenblatt

7 Jun 2026 4:24 UTC

122 points

0 comments

16 min read

LW link

Frontier Models Still Lag Behind Humans at Robust Belief-State Tracking

Lukas Frei

6 Jun 2026 23:54 UTC

13 points

6 comments

5 min read

LW link

Coming Around To Political Donations

jefftk

6 Jun 2026 21:30 UTC

59 points

8 comments

2 min read

LW link

(www.jefftk.com)

Analysis of Metastable States in the Transformer Activation Space

Zach Baker

6 Jun 2026 21:30 UTC

10 points

0 comments

20 min read

LW link

The Diamond Lemma

Isaac Newton

6 Jun 2026 21:15 UTC

21 points

0 comments

7 min read

LW link

(archimedeanmonoid.substack.com)

Iliad is Hiring

Peter Jean

6 Jun 2026 21:08 UTC

13 points

0 comments

1 min read

LW link

Against Corrigibility

peralice

6 Jun 2026 20:28 UTC

66 points

17 comments

12 min read

LW link

The Residual Stream Has a Geometry of Time

Fodenthal

6 Jun 2026 19:57 UTC

23 points

0 comments

8 min read

LW link

Exponential Solitude

PeterMaui

6 Jun 2026 19:49 UTC

5 points

1 comment

9 min read

LW link

Freud heard a rumor that Science existed, and had a wonderful dream

Bruce Middleton

6 Jun 2026 14:47 UTC

8 points

8 comments

6 min read

LW link

Coalitional Darwinism and the Instrumental Utility of Individuality

CarolusRenniusVitellius

6 Jun 2026 12:53 UTC

25 points

5 comments

17 min read

LW link

(charlesr-w.github.io)

Why Software Automation Is Hard

silentbob

6 Jun 2026 8:56 UTC

114 points

20 comments

12 min read

LW link

What if Anthropic unilaterally paused capabilities development right now?

Karl von Wendt

6 Jun 2026 7:39 UTC

61 points

15 comments

3 min read

LW link

Optimisation over non-stationary distributions creates weirder minds

Samuel Ratnam and Pjain

6 Jun 2026 0:05 UTC

36 points

8 comments

4 min read

LW link

[Question] Does robotics capabilities research accelerate AGI timelines?

Master Chief

5 Jun 2026 23:32 UTC

4 points

3 comments

1 min read

LW link

Evaluating using Mock Tool Calls to Quarantine Untrusted Prompt Inputs

dgros

5 Jun 2026 22:43 UTC

15 points

0 comments

11 min read

LW link

Two More Methods for Consistency Training and Some New Ways to Apply It

David Africa, Sukrati_Gautam, Neil Shah and arav-dhoot

5 Jun 2026 21:06 UTC

18 points

0 comments

7 min read

LW link

Revisiting GSM-Symbolic: models seem to reason okay, actually

Sturb

5 Jun 2026 20:54 UTC

24 points

0 comments

5 min read

LW link

Accepting Death & Adult Responsibility

Unreal

5 Jun 2026 19:23 UTC

−19 points

10 comments

4 min read

LW link

The Masochistic Prior

Modulo.Roland

5 Jun 2026 19:05 UTC

12 points

2 comments

2 min read

LW link

(substack.com)

Beyond the lexical personality traits: What is the structure of personality?

tailcalled

5 Jun 2026 19:05 UTC

60 points

1 comment

5 min read

LW link

Do not try to write your first research publication as a single author

Mikhail Mironov

5 Jun 2026 18:31 UTC

12 points

0 comments

5 min read

LW link

Do We Want a Superintelligent People-Pleaser?

GenericHousewife_B

5 Jun 2026 18:07 UTC

1 point

0 comments

6 min read

LW link

Explaining SAE Features With Foreign Natural Language Autoencoders

fzaffino

5 Jun 2026 17:51 UTC

17 points

1 comment

8 min read

LW link

SecureBio Detection is Hiring Software Engineers

jefftk

5 Jun 2026 16:50 UTC

33 points

2 comments

1 min read

LW link

(www.jefftk.com)

One Year of PauseAI UK

Joseph Miller and PauseAI UK

5 Jun 2026 16:41 UTC

94 points

7 comments

11 min read

LW link

(pauseai.uk)

Learnings from starting an AI safety research team

draganover and Erin Robertson

5 Jun 2026 16:27 UTC

101 points

7 comments

6 min read

LW link

Preparing for Warning Shots to Catalyze International Cooperation on AGI Risks

Mark Kagach, EliasSchlie, Thomas Van Damme and JustinShovelain

5 Jun 2026 15:49 UTC

40 points

1 comment

5 min read

LW link

My research: a computational cognitive neuroscience perspective on alignment

Seth Herd

5 Jun 2026 14:19 UTC

52 points

0 comments

18 min read

LW link

Editing is Easy, but Revision is Hard

IanWS

5 Jun 2026 11:58 UTC

5 points

0 comments

3 min read

LW link

(write.ianwsperber.com)